
3.10 Describing and Labeling Clusters


Even after building a k-means clustering model, determining the optimal number of clusters to use, and describing those clusters through summary statistics and visualizations, the modeler’s work is still not complete.

While an audience cannot be expected to remember the meaningful differences between “Cluster 1”, “Cluster 2”, and “Cluster 3”, they can quickly grasp (and retain!) the meanings of labels such as “Carefree Young Singles,” “Carpools and Cartoons”, or “Soccer Practice and Piano Lessons.”

For this reason, naming your clusters is one of the most important steps in the entire segmentation process.  The accompanying descriptions are essential, too; a few sentences that highlight a cluster’s distinguishing features will help an audience to see what makes this group unique.  

For Cluster 0 in the model above, we can start by looking for the features on which the group stands out most.  This group has the highest average household income, by far, along with the second-largest average home size.  Its spending on entertainment and travel is above average, but not the highest.  While its mean household size is exactly average for the dataset, it has more children under 12 than average.  Given the high incomes, the above-average number of young children, and relatively moderate discretionary spending, we’ll call them “Saving for College.”
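One way to spot standout features like these is to compare each cluster's mean on every variable with the mean for the whole dataset. The sketch below assumes a pandas DataFrame with one row per household and a `cluster` column holding the k-means assignment; the column names and values are invented for illustration and are not this chapter's data.

```python
import pandas as pd

# Hypothetical toy data: column names and values are made up for illustration.
households = pd.DataFrame({
    "cluster":       [0, 0, 1, 1, 2, 2],
    "income":        [150, 170, 60, 70, 40, 50],
    "entertainment": [30, 34, 58, 62, 20, 24],
    "travel":        [40, 44, 20, 24, 70, 74],
})

features = ["income", "entertainment", "travel"]

# Mean of each feature within each cluster
cluster_means = households.groupby("cluster")[features].mean()

# Distance of each cluster mean from the overall mean, measured in
# overall standard deviations (positive = above the dataset average)
overall_mean = households[features].mean()
overall_std = households[features].std()
standout = (cluster_means - overall_mean) / overall_std

# Rank the clusters within each feature: 1 = highest cluster mean
ranks = cluster_means.rank(ascending=False).astype(int)

print(standout.round(2))
print(ranks)
```

Large positive or negative values in `standout` point to the features worth mentioning in a description, and `ranks` supports statements such as "highest" or "second-lowest" among the clusters.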

Cluster 1’s members have the second-lowest average household income of any group, yet the highest average entertainment spending by far.  This looks like a group of people who like to live for the moment, so we can call them “YOLO,” short for You Only Live Once.

Cluster 2 members have the largest average home sizes and the highest average travel spending of any group.  Yet their average income is the lowest of any cluster.  Perhaps they are retirees who love to travel – we can call them “Golden Age Globetrotters.”

Cluster 3 has the lowest average spending on travel and entertainment, but the greatest average number of pets as well as the greatest average number of children under 12.  We can call this group “Pets First, Pets Always.”

Cluster 4 has the lowest average number of children under 12 and the smallest average number of people per household.  Its members also tend to live in small spaces and spend relatively little.  Many could be singles who live in apartments and have not yet started families.  We can call this group “Waiting for that Special Someone.”

Cluster 5 has the largest average number of people per household (by far) and a below-average number of children under 12.  With most other variables hovering just a bit above or below the dataset means, this cluster might consist of groups of friends living together after college or graduate school.  This group could be called “Twentysomethings on the Move.”

Cluster 6 has the highest average number of children under 12, with low discretionary spending (travel and entertainment), the fewest pets, and small properties.  These families have their hands full, and they spend little on luxuries.  We can call this group “Busy and Blue Collar.”
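Once the names are chosen, they can be attached to the data so that every later table and chart shows the label instead of the cluster number. A minimal sketch, assuming the cluster assignments sit in a `cluster` column of a pandas DataFrame (the assignments shown here are placeholders, not model output):

```python
import pandas as pd

# Names from the descriptions above, keyed by cluster number
cluster_labels = {
    0: "Saving for College",
    1: "YOLO",
    2: "Golden Age Globetrotters",
    3: "Pets First, Pets Always",
    4: "Waiting for that Special Someone",
    5: "Twentysomethings on the Move",
    6: "Busy and Blue Collar",
}

# Hypothetical cluster assignments, standing in for the model's output
households = pd.DataFrame({"cluster": [0, 3, 6, 1, 4, 2, 5, 0]})

# Translate each cluster number into its descriptive name
households["segment"] = households["cluster"].map(cluster_labels)

# Segment sizes, reported by name rather than by number
segment_sizes = households["segment"].value_counts()
print(segment_sizes)
```

Keeping the names in a single dictionary means a label can be revised in one place if the audience suggests a better one.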