9/27/2024

Sachin Chandra

Strategic Market Segmentation: Customer Behavior Analysis

This project analyzes customer segmentation to identify and understand distinct customer groups within a business's target market. Analyzing demographic data, purchasing ...

Introduction: Beyond the Average Customer

In the digital age of retail, the myth of the "average customer" has been shattered. Every purchase tells a story, every transaction holds a pattern, and within these patterns lie the keys to understanding the diverse tapestry of consumer behavior. This analysis delves deep into customer data to uncover these hidden patterns.

Modern retailers face a paradox: while they have more customer data than ever before, translating this data into meaningful insights remains challenging. Through sophisticated machine learning techniques, we transform raw customer data – from purchase history to campaign responses – into clear, actionable segments. Each segment represents not just a group of customers, but a unique perspective on how different consumers interact with our brand.

Our journey takes us through three critical phases: data exploration and cleaning, advanced dimensionality reduction using PCA, and finally, cluster analysis using Agglomerative Clustering. The result? Four distinct customer personas, each with their own purchasing patterns, preferences, and potential for growth.

As a first step, we import the dataset into the notebook environment using the convenient "Import to Notebook" option.

import numpy as np
import pandas as pd
data = pd.read_csv(r'''/app/market Segmentation.csv''', sep="\t")
data.describe()

In this section

Data Cleaning
Feature Engineering

SHOW TABLES

From the output we get by executing the above code in a SQL cell, we can note that:

There are missing values in Income.
Dt_Customer, which indicates the date a customer joined the database, is not parsed as DateTime.
There are some categorical features in our data frame (the features with dtype: object), so we will need to encode them into numeric forms later.

First of all, for the missing values, we are simply going to drop the rows that have missing income values.

print("The total number of data-points before removing the rows with missing values are:", len(data))
data = data.dropna()
print("The total number of data-points after removing the rows with missing values are:", len(data))
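
Before dropping rows, it can help to see how many values are missing in each column. The following is a minimal sketch on a small made-up frame; the `toy` data and its values are illustrative assumptions, not taken from the dataset. Note that it drops only rows with a missing Income, whereas the cell above drops rows with a missing value in any column.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the customer data (values are made up)
toy = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Income": [58000.0, np.nan, 71000.0, np.nan],
    "Recency": [10, 20, 30, 40],
})

# Count missing values per column
missing_per_column = toy.isna().sum()

# Drop only the rows where Income is missing and report how many were removed
rows_before = len(toy)
cleaned = toy.dropna(subset=["Income"])
rows_dropped = rows_before - len(cleaned)

print(missing_per_column)
print("Rows dropped:", rows_dropped)
```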

In the next step, we are going to create a feature out of "Dt_Customer" that indicates the number of days a customer has been registered in the firm's database. However, to keep it simple, we take this value relative to the most recent customer in the record.

Thus, to get the values, we must check the newest and oldest recorded dates.

data.columns = data.columns.str.strip()

data["Dt_Customer"] = pd.to_datetime(data["Dt_Customer"], dayfirst=True)
dates = []
for i in data["Dt_Customer"]:
    i = i.date()
    dates.append(i)

print("The newest customer's enrolment date in the records:", max(dates))
print("The oldest customer's enrolment date in the records:", min(dates))
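
The `dayfirst=True` argument matters for dates such as 04-09-2012, which can be read as either 4 September or 9 April. A small sketch of the difference, using a made-up date string as an assumption about the day-month-year layout of the column:

```python
import pandas as pd

# The same string parsed under the two conventions
day_first = pd.to_datetime("04-09-2012", dayfirst=True)     # read as day-month-year
month_first = pd.to_datetime("04-09-2012", dayfirst=False)  # read as month-day-year

print("dayfirst=True :", day_first.date())
print("dayfirst=False:", month_first.date())
```

If the wrong convention is used, enrolment dates shift by months and any tenure feature built on them becomes unreliable.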

Creating a feature ("Customer_For"): the number of days since each customer enrolled, relative to the newest enrolment date in the records

# Days between each enrolment date and the newest enrolment date on record
d1 = max(dates)
days = [(d1 - i).days for i in dates]
data["Customer_For"] = days

Exploratory Data Analysis

print("Total categories in the feature Marital_Status:\n", data["Marital_Status"].value_counts(), "\n")
print("Total categories in the feature Education:\n", data["Education"].value_counts())

In the next bit, we will be performing the following steps to engineer some new features:

Extract the "Age" of a customer from "Year_Birth", the birth year of the respective person
Create another feature "Spent" indicating the total amount spent by the customer in various categories over the span of two years
Create another feature "Living_With" out of "Marital_Status" to extract the living situation of couples
Create a feature "Children" to indicate the total children in a household, that is, kids and teenagers
To get further clarity on the household, create a feature "Family_Size"
Create a feature "Is_Parent" to indicate parenthood status
Lastly, create three categories in "Education" by simplifying its value counts
Drop some of the redundant features

data["Age"] = 2024 - data["Year_Birth"]

data["Spent"] = data["MntWines"]+ data["MntFruits"]+ data["MntMeatProducts"]+ data["MntFishProducts"]+ data["MntSweetProducts"]+ data["MntGoldProds"]

data["Living_With"]=data["Marital_Status"].replace({"Married":"Partner", "Together":"Partner", "Absurd":"Alone", "Widow":"Alone", "YOLO":"Alone", "Divorced":"Alone", "Single":"Alone",})

data["Children"]=data["Kidhome"]+data["Teenhome"]

data["Family_Size"] = data["Living_With"].replace({"Alone": 1, "Partner": 2}).astype("int64") + data["Children"]

data["Is_Parent"] = np.where(data.Children> 0, 1, 0)

data["Education"]=data["Education"].replace({"Basic":"Undergraduate","2n Cycle":"Undergraduate", "Graduation":"Graduate", "Master":"Postgraduate", "PhD":"Postgraduate"})

data=data.rename(columns={"MntWines": "Wines","MntFruits":"Fruits","MntMeatProducts":"Meat","MntFishProducts":"Fish","MntSweetProducts":"Sweets","MntGoldProds":"Gold"})
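
One thing to watch in the "Living_With" step: `replace` leaves any value that is not in the mapping unchanged, and the later `astype("int64")` for "Family_Size" then fails on it. A minimal sketch for spotting unmapped categories beforehand, using a made-up series in which the "Unknown" value is a hypothetical example:

```python
import pandas as pd

# Same mapping as in the cell above
living_map = {
    "Married": "Partner", "Together": "Partner",
    "Absurd": "Alone", "Widow": "Alone", "YOLO": "Alone",
    "Divorced": "Alone", "Single": "Alone",
}

# Made-up statuses; "Unknown" is a hypothetical value not covered by the mapping
status = pd.Series(["Married", "Single", "Together", "Widow", "Unknown"])

# Unlike replace, map returns NaN for keys that are missing from the mapping
living_with = status.map(living_map)
unmapped = sorted(status[living_with.isna()].unique())

print("Unmapped categories:", unmapped)
```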

Now that we have some new features let's have a look at the data's stats.

print(data.describe())

The above stats show large gaps between the mean and the maximum values of Income and Age.

Do note that the maximum age is well over 100 years, as we calculated the age relative to 2024 and the data is old.

We must take a look at the broader view of the data. We will plot some of the selected features.

import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import colors
import pandas as pd

# "Customer_For" (days since enrolment, relative to the newest customer) was created earlier,
# so it is not recomputed here

# Set background color for axes and figure
sns.set(rc={"axes.facecolor": "#FFF9ED", "figure.facecolor": "#FFF9ED"})

# Define color palette and colormap
pallet = ["#682F2F", "#9E726F", "#D6B2B1", "#B9C0C9", "#9F8A78", "#F3AB60"]
cmap = colors.ListedColormap(pallet)

# Selected features to plot
To_Plot = ["Income", "Recency", "Customer_For", "Age", "Spent", "Is_Parent"]

# Plot pairplot with hue
print("Relative Plot of Some Selected Features: A Data Subset")
sns.pairplot(data[To_Plot], hue="Is_Parent", palette=["#682F2F", "#F3AB60"])
plt.show()

Clearly, there are a few outliers in the Income and Age features. I will be deleting the outliers in the data.

import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import colors

# Set background color for axes and figure
sns.set(rc={"axes.facecolor": "#FFF9ED", "figure.facecolor": "#FFF9ED"})

# Define color palette and colormap
pallet = ["#682F2F", "#9E726F", "#D6B2B1", "#B9C0C9", "#9F8A78", "#F3AB60"]

# Selected features to plot
To_Plot = ["Income", "Recency", "Customer_For", "Age", "Spent"]

# Create subplots for each feature to visualize with box plots
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(18, 12))

# Flatten the axes array for easier iteration
axes = axes.flatten()

# Plot box plots for each feature
for i, feature in enumerate(To_Plot):
    sns.boxplot(data=data, x="Is_Parent", y=feature, palette=pallet, ax=axes[i])
    axes[i].set_title(f"Box Plot of {feature} by Is_Parent")
    axes[i].set_xlabel("Is Parent")
    axes[i].set_ylabel(feature)

# Remove any unused axes
for j in range(i+1, len(axes)):
    fig.delaxes(axes[j])

# Adjust the layout and display the plots
plt.tight_layout()
plt.show()

data = data[(data["Age"]<90)]
data = data[(data["Income"]<600000)]
print("The total number of data-points after removing the outliers are:", len(data))
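
The thresholds above (90 for Age, 600000 for Income) are chosen by eye. As an alternative that the original analysis does not use, an interquartile-range rule derives the cut-offs from the data. A minimal sketch, in which the helper `iqr_bounds` and the income values are illustrative assumptions:

```python
import pandas as pd

def iqr_bounds(series, k=1.5):
    """Return (low, high) limits at k * IQR beyond the first and third quartiles."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Made-up incomes with one extreme value
incomes = pd.Series([30000, 42000, 51000, 58000, 64000, 71000, 666666])

low, high = iqr_bounds(incomes)
kept = incomes[incomes.between(low, high)]

print("Bounds:", low, high)
print("Rows kept:", len(kept), "of", len(incomes))
```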

Next, let us look at the correlation amongst the features. Categorical attributes are one-hot encoded first so that they can be included.

data_encoded = pd.get_dummies(data, drop_first=True)

# numeric_only=True skips non-numeric columns such as the parsed Dt_Customer dates
corrmat = data_encoded.corr(numeric_only=True)

plt.figure(figsize=(20,20))
sns.heatmap(corrmat, annot=True, cmap="coolwarm", center=0)
plt.show()
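
A large annotated heatmap can be hard to read. As a complement, the strongest correlations can be listed as pairs. This is a sketch on synthetic data; the column names mirror the dataset, but the values are generated and only illustrative:

```python
import numpy as np
import pandas as pd

# Synthetic data: Spent is constructed to depend strongly on Income
rng = np.random.default_rng(0)
income = rng.normal(50000, 15000, 200)
toy = pd.DataFrame({
    "Income": income,
    "Spent": income * 0.02 + rng.normal(0, 50, 200),
    "Recency": rng.integers(0, 100, 200),
})

corr = toy.corr()

# Keep only the upper triangle so each pair appears once, then rank by absolute value
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()
pairs = pairs.sort_values(key=abs, ascending=False)

print(pairs)
```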

The data is quite clean and the new features have been included. We will proceed to the next step. That is, preprocessing the data.

DATA PREPROCESSING

In this section, we will be preprocessing the data to perform clustering operations.

The following steps are applied to preprocess the data:

Label encoding the categorical features
Scaling the features using the standard scaler
Creating a subset dataframe for dimensionality reduction

s = (data.dtypes == 'object')
object_cols = list(s[s].index)

print("Categorical variables in the dataset:", object_cols)

from sklearn.preprocessing import LabelEncoder

# Replace each categorical column with integer codes
LE = LabelEncoder()
for i in object_cols:
    data[i] = LE.fit_transform(data[i])

print("All features are now numerical")
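
To see what label encoding does to a column, here is a small sketch on made-up "Education" values. `LabelEncoder` assigns codes in sorted order of the class names, so the integers carry no meaningful ranking:

```python
from sklearn.preprocessing import LabelEncoder

# Made-up sample of the simplified Education categories
education = ["Graduate", "Postgraduate", "Undergraduate", "Graduate"]

le = LabelEncoder()
codes = le.fit_transform(education)

# Mapping from category to integer code
mapping = {label: int(code) for label, code in zip(le.classes_, le.transform(le.classes_))}

print(mapping)
print(list(codes))
```

Because the codes follow alphabetical order, "Undergraduate" receives the largest value here. Distance-based methods treat that as a real ordering, which is worth keeping in mind when interpreting clusters.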

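from sklearn.preprocessing import StandardScaler

ds = data.copy()
# Drop the campaign-response and complaint flags before clustering
cols_del = ['AcceptedCmp3', 'AcceptedCmp4', 'AcceptedCmp5', 'AcceptedCmp1','AcceptedCmp2', 'Complain', 'Response']
ds = ds.drop(cols_del, axis=1)
# Keep numeric columns only; a datetime column such as Dt_Customer cannot be scaled
ds = ds.select_dtypes(include="number")
#Scaling
scaler = StandardScaler()
scaler.fit(ds)
scaled_ds = pd.DataFrame(scaler.transform(ds),columns= ds.columns )
print("All features are now scaled")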
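
Standard scaling subtracts each column's mean and divides by its standard deviation, so that features measured on very different scales (Income versus Age, for instance) contribute comparably to distances. A minimal sketch on made-up numbers that checks `StandardScaler` against the manual formula:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up rows of (Income, Age)
X = np.array([[30000.0, 25.0],
              [60000.0, 40.0],
              [90000.0, 70.0]])

scaled = StandardScaler().fit_transform(X)

# Manual z-score using the population standard deviation (ddof=0), as StandardScaler does
manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(np.round(scaled, 3))
print("Matches manual formula:", np.allclose(scaled, manual))
```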

print("Dataframe to be used for further modelling:")
scaled_ds.head()

DIMENSIONALITY REDUCTION

In this problem, the final clustering depends on many factors. These factors are the attributes, or features, of each customer. The higher the number of features, the harder the data is to work with. Many of these features are correlated, and hence redundant. This is why I will be performing dimensionality reduction on the selected features before clustering them. Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables.

Principal component analysis (PCA) is a technique for reducing the dimensionality of such datasets, increasing interpretability but at the same time minimizing information loss.

Steps in this section:

Dimensionality reduction with PCA
Plotting the reduced dataframe

Dimensionality reduction with PCA

For this project, we will be reducing the dimensions to 3.

from sklearn.decomposition import PCA

pca = PCA(n_components=3)
pca.fit(scaled_ds)
PCA_ds = pd.DataFrame(pca.transform(scaled_ds), columns=(["col1","col2", "col3"]))
PCA_ds.describe().T
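
How much information three components retain can be checked with the fitted model's `explained_variance_ratio_`. The sketch below uses synthetic data with deliberately redundant columns, so the numbers are illustrative only and say nothing about the customer dataset:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: five columns, two of which are near-copies of two others
rng = np.random.default_rng(42)
base = rng.normal(size=(300, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=300),
    base[:, 1],
    base[:, 1] + 0.1 * rng.normal(size=300),
    rng.normal(size=300),
])

pca = PCA(n_components=3).fit(X)
ratio = pca.explained_variance_ratio_
cumulative = ratio.cumsum()

print("Variance explained per component:", np.round(ratio, 3))
print("Cumulative:", np.round(cumulative, 3))
```

On real data, a low cumulative share would suggest that three components discard a lot of structure and that the resulting clusters should be read with caution.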

x =PCA_ds["col1"]
y =PCA_ds["col2"]
z =PCA_ds["col3"]
#To plot
fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111, projection="3d")
ax.scatter(x,y,z, c="maroon", marker="o" )
ax.set_title("A 3D Projection Of Data In The Reduced Dimension")
plt.show()

CLUSTERING

Now that we have reduced the attributes to three dimensions, we will be performing clustering via Agglomerative Clustering. Agglomerative clustering is a hierarchical clustering method. It involves merging examples until the desired number of clusters is achieved.

Steps involved in the Clustering

Elbow Method to determine the number of clusters to be formed
Clustering via Agglomerative Clustering
Examining the clusters formed via scatter plot

from sklearn.cluster import KMeans
from yellowbrick.cluster import KElbowVisualizer

print('Elbow Method to determine the number of clusters to be formed:')
Elbow_M = KElbowVisualizer(KMeans(), k=10)
Elbow_M.fit(PCA_ds)
Elbow_M.show()
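
The elbow method fits K-Means for a range of k and looks for the point where the within-cluster sum of squares (inertia) stops dropping sharply. A sketch of the same idea without the visualizer, on synthetic blobs whose four centres are an assumption made for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 3-D data drawn around four well-separated centres
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0], [6, 6, 0], [0, 6, 6], [6, 0, 6]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 3)) for c in centers])

# Inertia for each candidate number of clusters
inertias = {}
for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_

for k, value in inertias.items():
    print(k, round(value, 1))
```

Note that the elbow here is computed with K-Means, while the final model in this analysis is Agglomerative Clustering, so the chosen k is a guide rather than a guarantee.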

The above cell indicates that four will be an optimal number of clusters for this data. Next, we will be fitting the Agglomerative Clustering Model to get the final clusters.

from sklearn.cluster import AgglomerativeClustering

AC = AgglomerativeClustering(n_clusters=4)
yhat_AC = AC.fit_predict(PCA_ds)
PCA_ds["Clusters"] = yhat_AC
data["Clusters"]= yhat_AC

To examine the clusters formed let's have a look at the 3-D distribution of the clusters.

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111, projection="3d")
ax.scatter(x, y, z, s=40, c=PCA_ds["Clusters"], marker='o', cmap = cmap )
ax.set_title("The Plot Of The Clusters")
plt.show()

EVALUATING MODELS

Since this is unsupervised clustering, we do not have a tagged feature to evaluate or score our model. The purpose of this section is to study the patterns in the clusters formed and determine the nature of the clusters' patterns.

For that, we will be having a look at the data in light of clusters via exploratory data analysis and drawing conclusions.

Firstly, let us have a look at the group distribution of the clustering.
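
Although there are no labels to score against, internal measures such as the silhouette score can still give a rough sense of how well separated the clusters are. This is an optional check that the original analysis does not include; the sketch below runs it on synthetic blobs rather than on the customer data:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import silhouette_score

# Synthetic 3-D data drawn around four well-separated centres
rng = np.random.default_rng(1)
centers = np.array([[0, 0, 0], [6, 6, 0], [0, 6, 6], [6, 0, 6]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 3)) for c in centers])

labels = AgglomerativeClustering(n_clusters=4).fit_predict(X)

# Ranges from -1 to 1; higher values mean tighter, better separated clusters
score = silhouette_score(X, labels)
print("Silhouette score:", round(score, 3))
```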

pal = ["#682F2F","#B9C0C9", "#9F8A78","#F3AB60"]
pl = sns.countplot(x=data["Clusters"], palette= pal)
pl.set_title("Distribution Of The Clusters")
plt.show()

The clusters are fairly evenly distributed.

pl = sns.scatterplot(data=data, x="Spent", y="Income", hue="Clusters", palette=pal)
pl.set_title("Cluster Profiles Based On Income And Spending")
plt.legend()
plt.show()

The income vs spending plot shows the following cluster patterns:

group 0: high spending & average income
group 1: high spending & high income
group 2: low spending & low income
group 3: high spending & low income
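
Readings like these can be backed by a per-cluster summary table. A minimal sketch using `groupby` on a made-up frame; the numbers are invented for illustration and are not results from the dataset:

```python
import pandas as pd

# Made-up customers with cluster labels
toy = pd.DataFrame({
    "Clusters": [0, 0, 1, 1, 2, 2, 3, 3],
    "Income":   [52000, 56000, 78000, 84000, 28000, 32000, 40000, 44000],
    "Spent":    [700, 900, 1400, 1600, 60, 100, 250, 350],
})

# Median income and spending per cluster
profile = toy.groupby("Clusters")[["Income", "Spent"]].median()
print(profile)
```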

Next, we will be looking at the detailed distribution of total spending in each cluster. "Spent" combines the various products in the data, namely: Wines, Fruits, Meat, Fish, Sweets and Gold.

plt.figure()
pl=sns.swarmplot(x=data["Clusters"], y=data["Spent"], color= "#CBEDDD", alpha=0.5 )
pl=sns.boxenplot(x=data["Clusters"], y=data["Spent"], palette=pal)
plt.show()

From the above plot, it can be clearly seen that cluster 1 is our highest-spending set of customers, closely followed by cluster 0. We can explore what each cluster is spending on for the targeted marketing strategies.

Let us next explore how our campaigns did in the past.

#Creating a feature to get a sum of accepted promotions
data["Total_Promos"] = data["AcceptedCmp1"]+ data["AcceptedCmp2"]+ data["AcceptedCmp3"]+ data["AcceptedCmp4"]+ data["AcceptedCmp5"]
#Plotting count of total campaign accepted.
plt.figure()
pl = sns.countplot(x=data["Total_Promos"],hue=data["Clusters"], palette= pal)
pl.set_title("Count Of Promotion Accepted")
pl.set_xlabel("Number Of Total Accepted Promotions")
plt.show()

There has not been an overwhelming response to the campaigns so far. Very few customers took part overall, and no one took part in all 5 of them. Perhaps better-targeted and well-planned campaigns are required to boost sales.
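
The count plot can be complemented by the share of customers in each cluster who accepted at least one campaign. A sketch on a made-up frame, where the column names follow the dataset and the values are invented:

```python
import pandas as pd

campaign_cols = [f"AcceptedCmp{i}" for i in range(1, 6)]

# Made-up acceptance flags for six customers in three clusters
toy = pd.DataFrame({
    "Clusters":     [0, 0, 1, 1, 2, 2],
    "AcceptedCmp1": [0, 0, 1, 1, 0, 0],
    "AcceptedCmp2": [0, 0, 0, 0, 0, 0],
    "AcceptedCmp3": [0, 1, 0, 1, 0, 0],
    "AcceptedCmp4": [0, 0, 1, 0, 0, 0],
    "AcceptedCmp5": [0, 0, 1, 1, 0, 0],
})

# Total accepted campaigns per customer
toy["Total_Promos"] = toy[campaign_cols].sum(axis=1)

# Share of customers per cluster who accepted at least one campaign
accept_rate = (toy["Total_Promos"] > 0).groupby(toy["Clusters"]).mean()
print(accept_rate)
```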

plt.figure()
pl=sns.boxenplot(y=data["NumDealsPurchases"],x=data["Clusters"], palette= pal)
pl.set_title("Number of Deals Purchased")
plt.show()

Unlike the campaigns, the deals offered did well. They had the best outcome with cluster 0 and cluster 3. However, our star customers in cluster 1 are not much into the deals. Nothing seems to attract cluster 2 overwhelmingly.

#for more details on the purchasing style
Places =["NumWebPurchases", "NumCatalogPurchases", "NumStorePurchases",  "NumWebVisitsMonth"]

for i in Places:
    # jointplot creates its own figure, so no plt.figure() call is needed
    sns.jointplot(x=data[i],y = data["Spent"],hue=data["Clusters"], palette= pal)
    plt.show()

PROFILING

Now that we have formed the clusters and looked at their purchasing habits, let us see who is in these clusters. For that, we will be profiling the clusters formed and come to a conclusion about who is our star customer and who needs more attention from the retail store's marketing team.

To decide that, I will be plotting some of the features that are indicative of the customer's personal traits in light of the cluster they are in. On the basis of the outcomes, I will be arriving at the conclusions.

Personal = [ "Kidhome","Teenhome","Customer_For", "Age", "Children", "Family_Size", "Is_Parent", "Education","Living_With"]

for i in Personal:
    # jointplot creates its own figure, so no plt.figure() call is needed
    sns.jointplot(x=data[i], y=data["Spent"], hue =data["Clusters"], kind="kde", palette=pal)
    plt.show()

Profiling The Clusters

About Cluster Number: 0

• Are definitely parents

• Have at most 4 members in the family and at least 2

• Single parents are a subset of this group

• Most have a teenager at home

• Relatively older

About Cluster Number: 1

• Are definitely not parents

• Have at most 2 members in the family

• A slight majority of couples over single people

• Span all ages

• A high-income group

About Cluster Number: 2

• The majority of these people are parents

• Have at most 3 members in the family

• Mostly have one kid (and typically not teenagers)

• Relatively younger

About Cluster Number: 3

• Are definitely parents

• Have at most 5 members in the family and at least 2

• The majority have a teenager at home

• Relatively older

• A lower-income group
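
Statements such as "definitely parents" can be checked numerically with a normalized cross-tabulation. A minimal sketch on a made-up frame, with labels and flags invented for illustration:

```python
import pandas as pd

# Made-up cluster labels and parenthood flags
toy = pd.DataFrame({
    "Clusters":  [0, 0, 1, 1, 2, 2, 2, 3],
    "Is_Parent": [1, 1, 0, 0, 1, 1, 0, 1],
})

# Row-normalized table: share of non-parents (0) and parents (1) within each cluster
parent_share = pd.crosstab(toy["Clusters"], toy["Is_Parent"], normalize="index")
print(parent_share)
```

A share of 1.0 in the parent column supports a "definitely parents" reading, while a mixed share corresponds to wording such as "the majority are parents".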

Conclusion: From Customer Segmentation Insights to Action

Our analysis reveals a nuanced picture of customer behavior that challenges conventional marketing wisdom. The four distinct clusters we've identified – from high-spending parents to deal-seeking young families – each represent unique opportunities and challenges for targeted marketing strategies.

Key Discoveries:

The Premium Segment (Cluster 1): High-income, typically childless customers who respond well to luxury products but show limited interest in deals
Family Value Seekers (Cluster 0): Middle-income parents who actively engage with deals and promotions
Budget Conscious (Cluster 2): Younger families with modest spending patterns
Aspirational Buyers (Cluster 3): Lower-income families with selective high-value purchases

These insights provide a foundation for:

Tailored marketing campaigns aligned with each segment's preferences
Product mix optimization based on segment-specific purchasing patterns
Improved customer engagement through personalized communication strategies
More efficient allocation of marketing resources

The path forward is clear: transform these segments into actionable marketing strategies that speak directly to each group's unique needs and preferences. Success in modern retail isn't just about what you sell – it's about who you're selling to and how well you understand their story.
