RFM Segmentation
Hello, everyone! Welcome back to our journey through the fascinating world of Customer & Marketing Analytics, where data-driven decisions light the path to business success.
In this post, I want to share a Customer & Marketing Analytics project that focuses on RFM Segmentation.
Outline
- Business Understanding
- Modelling Workflow
- Data Pipelines
- EDA
- Preprocessing
- Clustering
- Business Recommendation
Business Understanding
Let’s dive right in. Company CoM has amassed a year’s worth of data through scanning, and it intends to use this information to assess and expand its business operations. The primary objective is to gain a deeper understanding of customer personas. This initiative is driven by the need to evaluate sales performance and devise strategic plans for the coming year.
Business Objective
- Enhanced Customer Understanding: RFM segmentation helps in categorizing customers based on their recent interactions (Recency), frequency of transactions (Frequency), and the monetary value of those transactions (Monetary Value). By applying RFM segmentation to the collected data, Company CoM can gain a more nuanced understanding of different customer groups and their behaviors.
- Strategic Planning: The insights gained from RFM segmentation can inform the overall marketing strategy for the coming year.
Business Questions / Problem Statement
Now, let’s set the stage.
How can we seamlessly slice and dice our customer base to unearth their deepest insights? Our inputs include records of recency, frequency, and monetary value, derived from the magic of scanning. The output we seek? Well-defined customer segments. With these segments, our marketing maestros can orchestrate strategies designed to optimize company revenue.
Define the Problem
- What are the inputs? Scanner records, from which we derive recency, frequency, and monetary value.
- What are the outputs? Customer Segments
- What do we do with the segmentation? The marketing team will strategize for maximizing company revenue by tailoring their approach to each segment identified through RFM analysis.
Data Description
- Date: This column likely contains the date on which a particular transaction took place.
- Customer_ID: This column likely contains unique identifiers for each customer.
- Transaction_ID: This column likely contains unique identifiers for each transaction.
- SKU_Category: This column likely categorizes the products or services purchased in the transaction.
- SKU: SKU stands for Stock Keeping Unit and represents unique identifiers for specific products or services.
- Quantity: This column likely indicates the quantity of the SKU (product or service) purchased in each transaction.
- Sales_Amount: This column likely represents the total sales amount or revenue generated from each transaction. It typically includes the price of the SKU multiplied by the quantity purchased.
Preprocessing Data
Now, let’s roll up our sleeves and prepare our data for exploration.
We know that:
- There are no duplicates.
- Date is time series data, so we change its data type to datetime.
# change the data type of the Date column to datetime
data['Date'] = pd.to_datetime(data['Date'])
Nice! Now we are ready to explore the data
Exploratory Data Analysis
- A single transaction can contain many items. Transaction_ID 28731, for example, contains 19 items. To calculate Frequency, therefore, do not use the count of Transaction_ID rows; use the count of unique Transaction_ID values.
- The first transaction is on January 2, 2016 and the last transaction is on December 31, 2016, so the time frame spans the entire year of 2016.
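To make the difference between the two counts concrete, here is a minimal sketch on a made-up table. The `toy` DataFrame and its values are invented for illustration; only the column names come from the data description.

```python
import pandas as pd

# Hypothetical transactions: T1 contains three line items, T2 and T3 one each.
toy = pd.DataFrame({
    "Customer_ID": [1, 1, 1, 1, 2],
    "Transaction_ID": ["T1", "T1", "T1", "T2", "T3"],
    "SKU": ["A", "B", "C", "A", "B"],
})

# count() tallies rows (line items), so multi-item baskets inflate the number.
line_items = toy.groupby("Customer_ID")["Transaction_ID"].count()

# nunique() tallies distinct transactions, which is what Frequency should measure.
frequency = toy.groupby("Customer_ID")["Transaction_ID"].nunique()

print(line_items.to_dict())  # {1: 4, 2: 1}
print(frequency.to_dict())   # {1: 2, 2: 1}
```

Customer 1 made two purchases, but a plain row count would report four.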
Create RFM features
Now it’s time to craft our RFM features. First, set the reference date. Because the last transaction is on 2016-12-31, we can use 2017-01-02 as the reference day (not too far from the day of the last transaction). Recency is then the number of days between a customer's last transaction and this reference day.
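One way this step could look in pandas is sketched below. It is an illustration rather than the project's exact code: the `toy` rows are invented, and it assumes Monetary is the sum of Sales_Amount per customer.

```python
import pandas as pd

# Hypothetical sample rows using the column names from the data description.
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2016-01-02", "2016-06-15", "2016-06-15", "2016-12-31"]),
    "Customer_ID": [1, 1, 1, 2],
    "Transaction_ID": ["T1", "T2", "T2", "T3"],
    "Sales_Amount": [10.0, 5.0, 7.5, 20.0],
})

# Reference day chosen shortly after the last transaction in the data.
reference_date = pd.Timestamp("2017-01-02")

rfm = toy.groupby("Customer_ID").agg(
    last_purchase=("Date", "max"),            # most recent purchase date
    Frequency=("Transaction_ID", "nunique"),  # distinct transactions, not rows
    Monetary=("Sales_Amount", "sum"),         # assumed: total spend per customer
)

# Recency = days between the last purchase and the reference day.
rfm["Recency"] = (reference_date - rfm["last_purchase"]).dt.days
rfm = rfm[["Recency", "Frequency", "Monetary"]]
print(rfm)
```

In this toy table, customer 1 last bought on 2016-06-15, which gives a Recency of 201 days, while customer 2 bought on the last day of the year and gets a Recency of 2.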
Let's see the data distribution.
- The data for both Monetary and Frequency is heavily skewed.
- Handle outliers in both Monetary and Frequency using the IQR method.
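The write-up does not say whether the outliers were removed or capped. The sketch below assumes removal, with the common 1.5 × IQR fences. The helper name `drop_iqr_outliers`, the factor `k`, and the toy values are all illustrative assumptions.

```python
import pandas as pd

def drop_iqr_outliers(df, columns, k=1.5):
    """Keep rows whose values lie within [Q1 - k*IQR, Q3 + k*IQR] for every listed column."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]

# Hypothetical RFM values with one extreme customer in the last row.
toy_rfm = pd.DataFrame({
    "Frequency": [1, 2, 2, 3, 3, 40],
    "Monetary": [10.0, 20.0, 25.0, 30.0, 35.0, 900.0],
})

cleaned = drop_iqr_outliers(toy_rfm, ["Frequency", "Monetary"])
print(len(toy_rfm), "->", len(cleaned))  # 6 -> 5
```

If dropping customers is undesirable, clipping the values to the same fences is an alternative that keeps every row.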
Next, we normalize the data.
It’s evident that using min-max scaling has constrained the values of the data within the range of 0 to 1.
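Min-max scaling maps each value x to (x - min) / (max - min), computed per column. A minimal sketch in plain pandas follows; the helper name `min_max_scale` and the toy values are made up for illustration.

```python
import pandas as pd

def min_max_scale(df):
    """Rescale each column to [0, 1] via (x - min) / (max - min)."""
    return (df - df.min()) / (df.max() - df.min())

# Hypothetical RFM values on very different scales.
toy_rfm = pd.DataFrame({
    "Recency": [2, 60, 240],
    "Frequency": [6, 3, 1],
    "Monetary": [130.0, 45.0, 20.0],
})

scaled = min_max_scale(toy_rfm)
print(scaled.min().tolist())  # [0.0, 0.0, 0.0]
print(scaled.max().tolist())  # [1.0, 1.0, 1.0]
```

Putting all three features on the same 0 to 1 range keeps any single feature from dominating the distance calculations used by K-Means. Note that a constant column would produce a division by zero in this sketch.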
Next, we choose the number of clusters using the elbow method.
- The optimal number of clusters is the point where adding another cluster does not significantly decrease the WCSS (within-cluster sum of squares).
- This point can be found by looking for a sharp bend, or elbow, in the WCSS graph.
- Beyond that point, adding more clusters does not meaningfully improve the clustering performance.
- So we consider using either n_cluster = 3 or n_cluster = 4, because our budget is still sufficient to cover up to 5 clusters (based on the assumption in the RFM ranking method above).
- We decide to use n_cluster = 3, because we can't see a bend in the curve at n_cluster = 4 or n_cluster = 5.
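The elbow check described above can be sketched with scikit-learn's `KMeans`, whose `inertia_` attribute is the WCSS. The data here is synthetic (three artificial blobs standing in for the scaled RFM table), and the range of k and the random seeds are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for scaled RFM data: three blobs inside the unit cube.
rng = np.random.default_rng(0)
centers = np.array([[0.1, 0.2, 0.2], [0.8, 0.1, 0.1], [0.1, 0.8, 0.9]])
X = np.vstack([c + rng.normal(scale=0.03, size=(50, 3)) for c in centers])

# Fit K-Means for several k and record the within-cluster sum of squares.
wcss = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    wcss[k] = model.inertia_

for k, value in wcss.items():
    print(k, round(value, 3))
```

On data like this, the WCSS falls steeply up to k = 3 and flattens afterwards. Plotting `wcss` against k shows that bend as the elbow.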
K-Means Clustering
- We want to cluster using K-Means with n_cluster = 3.
- Find the centroid coordinates of the clusters.
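A minimal sketch of these two steps with scikit-learn is shown below. The `scaled` table is synthetic, and the cluster numbers K-Means assigns are arbitrary, so the mapping from cluster number to segment name has to be read off the centroids each time.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in for the scaled RFM table (three artificial groups).
rng = np.random.default_rng(1)
cols = ["Recency", "Frequency", "Monetary"]
centers = np.array([[0.2, 0.4, 0.4], [0.9, 0.1, 0.1], [0.1, 0.9, 0.9]])
scaled = pd.DataFrame(
    np.vstack([c + rng.normal(scale=0.03, size=(40, 3)) for c in centers]),
    columns=cols,
)

# Fit K-Means with three clusters and label every customer.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
scaled["Cluster"] = kmeans.fit_predict(scaled[cols])

# Centroid coordinates, one row per cluster, in scaled units.
centroids = pd.DataFrame(kmeans.cluster_centers_, columns=cols)
centroids.index.name = "Cluster"
print(centroids.round(2))
```

A centroid with low Recency and high Frequency and Monetary points to the best customers, while high Recency with low Frequency and Monetary points to customers at risk.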
Cluster 0 → Average Customers
- Characteristics: low recency, medium frequency & monetary
Cluster 1 → At Risk Customers
- Characteristics: high recency, low frequency & monetary
Cluster 2 → Best Customers
- Characteristics: low recency, high frequency & monetary
General Recommendation
Best Customers
General Characteristic (Low recency, high frequency & monetary)
- Recency : 58 days
- Frequency : 6 transactions
- Monetary : $ 130
General marketing initiatives:
- Loyalty Programs: Implement a loyalty program that rewards customers for their frequent transactions and high spending.
- Personalized Recommendations: Leverage customer data to provide personalized product recommendations based on their previous purchases.
- Targeted Email Campaigns: Send targeted email campaigns to these best customers, featuring new arrivals, special promotions, or exclusive offers tailored to their preferences.
Average Customers
General Characteristic (low recency, medium frequency & monetary)
- Recency : 63 days
- Frequency : 2.8 transactions
- Monetary : $ 45
General marketing initiatives:
- Segmentation: Further segment the “Average Customers” group based on their specific transaction behaviors.
At Risk Customers
General Characteristic (high recency, low frequency & monetary)
- Recency : 236 days
- Frequency : 2~3 transactions (specifically 2.5 transactions)
- Monetary : $ 47.2
General marketing initiatives:
- Feedback and Surveys: Seek feedback from these customers to understand their preferences and any potential pain points.
- Frequency-Based Discounts: Offer discounts or rewards based on the number of transactions.
In Depth Analysis & Recommendation
- We want to perform a more specific analysis of the RFM data, taking conditions such as the short-term and long-term game into consideration when making specific recommendations.
- Identify suitable business recommendations and implement marketing initiatives based on the analysis.
- We want to know how most customers behave when purchasing, so that we have a foundation for specific recommendations based on a specific condition and segment.
- Here we characterize each customer segment using average values.
- In terms of average revenue per segment, the highest comes from best customers.
- Most of our customers are best customers.
- Most of our revenue comes from the best customer segment.
- In the short term, we can focus on marketing initiatives for best and average customers with several types of promotion, e.g., discounts, upselling, and cross-selling.
- In the long term, we need to retain all customers. In simple terms, we should understand why customers leave or stay with our company, offer a loyalty program for each segment, and treat potential churn customers with discount promotions.
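The per-segment averages and revenue contribution discussed above could be computed along these lines. This is a sketch on a hypothetical five-customer table. The `Segment` column, the values, and the summary column names are assumptions for illustration, not the project's figures.

```python
import pandas as pd

# Hypothetical labeled customers; values are invented, not the project's results.
toy = pd.DataFrame({
    "Segment": ["Best", "Best", "Best", "Average", "At Risk"],
    "Recency": [30, 60, 90, 63, 236],
    "Frequency": [6, 7, 5, 3, 2],
    "Monetary": [120.0, 150.0, 130.0, 50.0, 50.0],
})

# Average RFM values, customer count, and total revenue per segment.
summary = toy.groupby("Segment").agg(
    customers=("Recency", "size"),
    avg_recency=("Recency", "mean"),
    avg_frequency=("Frequency", "mean"),
    avg_monetary=("Monetary", "mean"),
    total_revenue=("Monetary", "sum"),
)

# Share of overall revenue contributed by each segment.
summary["revenue_share"] = summary["total_revenue"] / summary["total_revenue"].sum()
print(summary)
```

Reading `customers` next to `revenue_share` shows whether a segment matters because it is large, because its customers spend more, or both.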
Recommendation for Short & Long Term Game
Business Objective
- Company CoM aims to personalize marketing strategies to increase customer retention (achieve average monthly retention > 30%) and maximize revenue with effective marketing spending in the next 4 months
- The company also wants to identify high-value customers to give them personalized marketing
At Risk Customers:
Short Term:
- Send personalized re-engagement emails with special offers to entice them to make another purchase.
- Implement a loyalty program with rewards to incentivize repeat purchases.
- Use retargeting ads on social media and online platforms to remind them of your products or services.
Long Term:
- Continuously monitor and improve the customer experience to prevent them from becoming “At Risk” in the future.
- Offer exclusive membership benefits to encourage long-term loyalty.
- Implement a referral program to turn satisfied customers into brand advocates.
Average Customers:
Short Term:
- Send appreciation emails with a limited-time discount to boost their next purchase.
- Recommend complementary products or services based on their previous purchases.
- Create urgency with flash sales and limited-time promotions.
Long Term:
- Gradually introduce them to higher-tier products or services as they become more loyal.
- Personalize their experience based on their browsing and purchase history.
- Provide excellent customer service to build a long-term relationship.
Best Customers:
Short Term:
- Offer early access to new products or exclusive sales events.
- Send personalized thank-you notes or gifts to express appreciation.
- Provide VIP customer support with faster response times.
Long Term:
- Maintain regular communication to keep them engaged and informed about new offerings.
- Create a VIP loyalty program with tiered benefits and rewards.
- Solicit feedback and involve them in product development to strengthen their loyalty.
Conclusion
- For best customers, the focus is on providing a special experience, maintaining communication, and providing incentives through loyalty programs.
- For average customers, efforts are concentrated on increasing transaction value, personalizing the experience, and providing good customer service.
- For at-risk customers, the strategy involves making efforts to rekindle their interest and building long-term loyalty.
Thank you for taking the time to explore my project.