SALES ANALYSIS USING RFM METHOD

Rakesh Sethu NP
5 min readOct 27, 2022

--

Recently I have worked on a project using SQL AND TABLEAU. This is the first time I came across Tableau usually works around Power-Bi. I really enjoyed creating the dashboard which i am going to share with you.

Let’s get Started …

Step 1 :

Loading the dataset.

Step 2 :

Few basic Analyses by using Select method, Grouping method

-- LETS EXPLORE THE DATA --SELECT*FROM[dbo].[sales_data_sample]--UNIQUE VALUE--SELECTDISTINCT STATUSFROM[dbo].[sales_data_sample] --MUCH NEEDED QUERY TO PLOT IN VISUALIZATIONSELECTDISTINCT YEAR_IDFROM[dbo].[sales_data_sample]SELECTDISTINCT PRODUCTLINEFROM[dbo].[sales_data_sample] -- THIS CATEGORY IS REQUIRED TO VISUALIZE SALES IN TERM OF CATEGORYSELECTDISTINCT COUNTRYFROM[dbo].[sales_data_sample] --MUCH NEEDED QUERY TO PLOT IN VISUALIZATIONSELECTDISTINCT DEALSIZEFROM[dbo].[sales_data_sample]SELECTDISTINCT TERRITORYFROM[dbo].[sales_data_sample]--GROUPING BY PRODUCT LINE-- Using Aggregate FunctionSELECTPRODUCTLINE,SUM(SALES) as REVENUEFROM[dbo].[sales_data_sample]group byPRODUCTLINEorder by2 desc -- Revenue generated by each category-- SALES BY YEAR --SELECTYEAR_ID,SUM(SALES) as REVENUEFROM[dbo].[sales_data_sample]group byYEAR_IDorder by2 desc -- Revenue generated by each category--- AS PER BELOW RESULT THERE SEEMS REVENUE IS LESS IN YEAR 2005. LETS SEE SAME ON HOW MANY MONTHS IS BEING OPERATED --SELECTDISTINCT (MONTH_ID)FROM[dbo].[sales_data_sample]whereYEAR_ID = 2005 -- IN 2005 THEY HAVE OPERATED FOR ONLY 5 MONTHS!-- DEAL SIZE--SELECTDEALSIZE,SUM(SALES) as REVENUEFROM[dbo].[sales_data_sample]group byDEALSIZEorder by2 desc--- AFTER ALL THESE ANALYSIS LET US DEEP DIVE INTO MONTHLY SALES TO KNOW WHICH MONTH HAS HIGHEST SALES --SELECTMONTH_ID,SUM(SALES) as REVENUE,COUNT(ORDERNUMBER) FREQUENCYFROM[dbo].[sales_data_sample]WhereYEAR_ID = 2003 --We gonna see 2003 --group byMONTH_IDorder by2 descSELECTMONTH_ID,SUM(SALES) as REVENUE,COUNT(ORDERNUMBER) FREQUENCYFROM[dbo].[sales_data_sample]WhereYEAR_ID = 2004 --We gonna see 2004 --group byMONTH_IDorder by2 desc-- AS PER BOTH YEAR NOV MONTH HAS HIGH FREQ OF REVENUE TO THEM . SO WE CAN SEE WHICH CATEGORY IS SEELING IN THAT MONTH --SELECTMONTH_ID,PRODUCTLINE,SUM(SALES) as REVENUE,COUNT(ORDERNUMBER) FREQUENCYFROM[dbo].[sales_data_sample]WhereYEAR_ID = 2003 and MONTH_ID=11 --We gonna see 2003 AND MONTH =11 --group byMONTH_ID ,PRODUCTLINEorder by3 descSELECTMONTH_ID,PRODUCTLINE,SUM(SALES) as REVENUE,COUNT(ORDERNUMBER) FREQUENCYFROM[dbo].[sales_data_sample]WhereYEAR_ID = 2004 and MONTH_ID=11 --We gonna see 2004 AND MONTH =11 --group byMONTH_ID ,PRODUCTLINEorder by3 desc-- IN BOTH ANALYSIS WE CAN SEE ITS CLASSIC CAR THEY HAVE FREQ SALES WITH MAJORITY OF REVENUE GENERATED FROM THAT CATEGORY---- NOW WE HAVE TO SEE WHO ARE HIGHEST BUYER FROM THEM --

Step 3 :

RFM METHOD

RFM means— Recency, Frequency, Monetary

THIS TECHNIQUE IS USED TO CATEGORIZE CUSTOMERS BASED ON PAST PURCHASING BEHAVIOUR —

— IN THE RFM TECHNIQUE WE SEGMENT CUSTOMERS AS BELOW —

— RECENCY — WHEN WAS THE LAST TIME PURCHASING HAPPENED (LAST ORDER DATE) —

— FREQUENCY — WHAT IS FREQUENCY OF PURCHASE i.e HOW OFTEN THEY PURCHASE? (COUNT OF TOTAL ORDERS) —

— MONETARY — WHAT IS THE PURCHASE VALUE?(TOTAL INVOICE VALUE)

-- to group into 4 equal bucket we gonna use CTE METHOD---- here in result you can see less the number is most recent and most frequent--drop table if exists #rfm --Will help in executing the table again and again;with rfm as(SELECTCUSTOMERNAME,SUM(sales) as monetoryvalue,avg(sales) as avgmonetaryvalue,count(ORDERNUMBER) as Frequency,max(ORDERDATE) as last_order_date,(select MAX(ORDERDATE) from [dbo].[sales_data_sample]) as max_order_date,DATEDIFF(DD,max(ORDERDATE),(select MAX(ORDERDATE) from [dbo].[sales_data_sample])) as RecencyFROM[dbo].[sales_data_sample]group byCUSTOMERNAME),rfm_calc as(select r.*,NTILE(4) over (order by Recency desc) as rfm_recency,NTILE(4) over (order by Frequency) as rfm_frequency,NTILE(4) over (order by monetoryvalue) as rfm_monetoryfrom rfm r)Select c.*, rfm_recency+ rfm_frequency+rfm_monetory as rfm_newcell,cast(rfm_recency as varchar) +cast(rfm_frequency as varchar) + cast(rfm_monetory as varchar) as rfm_newcell_stringinto #rfmfrom rfm_calc as c-- here in result you can see less the number is most recent and most frequent--SelectCUSTOMERNAME,rfm_recency,rfm_frequency,rfm_monetory,Case when rfm_newcell_string in (111, 112, 121, 122, 123, 132, 211, 212,114, 141) then 'lost_customers' --Customer we lost-when rfm_newcell_string in (133, 134, 143, 244, 334, 343, 344, 144) then 'Going to lose Customers' --They are bigspender but slipping away--when rfm_newcell_string in (311, 411, 331) then 'new customers' --new customer --when rfm_newcell_string in (222, 223, 233, 322) then 'potential_churn_customers' --There are chances of losing them--when rfm_newcell_string in (323, 333, 321, 422, 332, 432) then 'Active customers' --Active Customer--when rfm_newcell_string in (433, 434, 443, 444) then 'Loyal Customer' end rfm_segmentfrom#rfm
SEGMENT HAS BEEN CATEGORIZED BASED OF RFM METHOD
--NOW WE HAVE SEGMENTED THE CUSTOMER IN RFM METHOD---- LET SEE WHAT PRODUCT ARE BROUGHT TOGETHER BY CUSTOMERS--SELECTORDERNUMBER, COUNT(*) AS rnFROM[dbo].[sales_data_sample]where STATUS = 'Shipped'group by ORDERNUMBER

— There may be chances of multiple line items for a particular order

— There are multiple orders from same order no

select distinct ORDERNUMBER,STUFF((select ',' + PRODUCTCODEfrom  [dbo].[sales_data_sample] as pwhere ORDERNUMBER in(SELECT ORDERNUMBER FROM(SELECTORDERNUMBER, COUNT(*) AS rnFROM[dbo].[sales_data_sample]where STATUS = 'Shipped'group by ORDERNUMBER)mwhere rn = 2 -- can also try with different no to get no of product wrt to order no)and p.ORDERNUMBER=s.ORDERNUMBERfor xml path ('')),1,1, '') as productcodesfrom [dbo].[sales_data_sample] as sorder by 2 desc

The above query helps us find the multiple products for a particular OrderNumber using the XML path method. Post that Stuff method is used to group the same in Order number as seen above

NOW LET’S SEE SOME VISUALIZATION

Below visualization is done of different parameters such as sales distribution, Revenue by country, monthly sales by country etc

Thank you for reading . Please like and support me if you liked my work.

Link for GitHub repository

Please follow me on GitHub

Links for Tableau public Dashboard

Please follow me on Tableau as well

https://public.tableau.com/views/RFMSALESANALYSISDASHBOARD-1/SALESDASH1?:language=en-US&:display_count=n&:origin=viz_share_link

https://public.tableau.com/views/RFMSALESANALYSISDASHBOARD-2/SALESDASH2?:language=en-US&:display_count=n&:origin=viz_share_link

--

--

Rakesh Sethu NP
Rakesh Sethu NP

Written by Rakesh Sethu NP

Senior Procurement Executive |Data Analyst | Expertise in Excel, Power BI

No responses yet