Archetypal Analysis is an unsupervised learning method that’s gaining popularity in marketing research. Rather than describing data by “average” observations (cluster centers), the analysis represents the data by extreme points in the multidimensional space (the archetypes). This article compares Archetypal Analysis with traditional approaches to market segmentation and shows how it can lead to higher-quality, more actionable solutions.
The objective of a segmentation is to divide consumers into approachable groups based on demographics, needs, attitudes, interests, and other psychographic and behavioral criteria. Within each group, we want consumers to be similar to each other and as different as possible from consumers in other groups.
But how do we define “similar” and “different” to efficiently exploit uncovered heterogeneity in product or service marketing? If there were a small set of variables responsible for the groups, the problem would be trivial. If we have lots of respondents described by many variables, however, the task becomes significantly more challenging. Traditionally, to develop a segmentation, researchers use a wide range of unsupervised machine learning algorithms called clustering methods. Cluster analysis focuses on groupings within the cloud of individual respondents. Generally, groups are formed around “average” members that are used as a “prototype” for each cluster.
Archetypal Analysis offers an alternative approach to grouping that can bring several benefits to a market segmentation. It searches the periphery of the data cloud and focuses on extreme individuals. The goal of the analysis in this case is to identify the pure types (the archetypes) and then to describe each point in the data set as a convex combination of those archetypes.
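To make the term “convex combination” concrete, it can be written out as below. The notation is introduced here for illustration and is not taken from the article: x_i is the vector of traits for respondent i, z_k is archetype k, K is the number of archetypes, and alpha_ik is the share of archetype k in respondent i.

```latex
x_i \;\approx\; \sum_{k=1}^{K} \alpha_{ik}\, z_k,
\qquad \alpha_{ik} \ge 0,
\qquad \sum_{k=1}^{K} \alpha_{ik} = 1 .
```

Because the shares are non-negative and sum to one, every respondent is described as a blend of the pure types, never as an extrapolation beyond them.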
Archetypal Analysis approximates the convex hull of a set of data. For a large dataset, the number of points on the convex hull might be relatively large and the dimensionality might be high, so the first step is to look for an approximate hull, defined by a manageable number of points, that minimizes the residual of the approximation. As with a clustering approach, a researcher might need to consider and compare multiple solutions derived using Archetypal Analysis and choose the most appropriate one for the segmentation.
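As a rough illustration of why the exact convex hull is rarely usable as a set of segments, the sketch below counts the hull vertices of a simulated two-dimensional cloud. Everything in it is an assumption made for illustration: the data are random draws rather than survey responses, the hull is computed with Andrew's monotone chain algorithm (which only works in two dimensions), and the function names are hypothetical.

```python
import random


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the 2D convex hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:  # left-to-right sweep builds the lower boundary
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):  # right-to-left sweep builds the upper boundary
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# Simulated respondents (illustrative data only).
rng = random.Random(0)
cloud = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
hull = convex_hull(cloud)
print(f"{len(hull)} of {len(cloud)} simulated respondents lie on the exact hull")
```

Even in two dimensions the exact hull usually has more vertices than a marketer would want segments, which is why the analysis settles for an approximate hull built from a handful of archetypes.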
Points inside the approximate hull can be represented as a convex combination of archetypes (points outside it are represented by their nearest point on the archetype hull). Similar to methods like latent class analysis, Archetypal Analysis leads to a fuzzy segmentation. For any individual, we know exactly the share of each archetype contributing to their traits. We can interpret these shares as probabilities of belonging to each of the archetypes and classify each individual to the archetype with the highest probability.
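The sketch below shows one way the shares and the final classification could be computed once the archetypes are known. It is a minimal illustration under stated assumptions, not the procedure of any particular software package: the three archetypes are made-up toy coordinates, the function names are hypothetical, and the shares are found by projected gradient descent on the squared distance between the point and its reconstruction, which is one of several possible solvers.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:  # keep the threshold from the last index that passes
            theta = t
    return [max(x - theta, 0.0) for x in v]


def archetype_shares(point, archetypes, n_iter=500):
    """Shares that bring the convex combination of archetypes closest to point."""
    k = len(archetypes)
    # Upper bound on the curvature of the squared error, used as a safe step size.
    bound = 2.0 * sum(c * c for z in archetypes for c in z)
    step = 1.0 / bound if bound > 0 else 1.0
    alpha = [1.0 / k] * k  # start from equal shares
    for _ in range(n_iter):
        recon = [sum(a * z[d] for a, z in zip(alpha, archetypes))
                 for d in range(len(point))]
        residual = [r - p for r, p in zip(recon, point)]
        grad = [2.0 * sum(c * r for c, r in zip(z, residual)) for z in archetypes]
        alpha = project_to_simplex([a - step * g for a, g in zip(alpha, grad)])
    return alpha


# Toy archetypes in two dimensions (illustrative values only).
archetypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

inside = archetype_shares((0.2, 0.3), archetypes)    # point inside the hull
outside = archetype_shares((1.0, 1.0), archetypes)   # point outside the hull

# Hard assignment: the archetype with the largest share.
segment = max(range(len(inside)), key=lambda i: inside[i])
print([round(s, 3) for s in inside], "-> archetype", segment)
print([round(s, 3) for s in outside])
```

For the inside point the shares come out at roughly 0.5, 0.2 and 0.3, so the respondent is assigned to the first archetype. The outside point is matched to the nearest point on the hull, halfway between the second and third archetypes, and receives no share of the first.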
Archetypal Analysis offers an interesting perspective on market segmentation.
Should we define and describe each segment by its extreme (archetype) or by its average (prototype)? If the goal is accuracy and the clusters are compact, then the average is a good group descriptor. However, the average does not perform well when clusters are elongated or sparse, which is a realistic scenario in marketing applications. In some cases, the use of traditional clustering techniques leaves researchers with solutions and personas that are hard to define. Our clients are often looking for a set of contrastive categories to describe the market, and Archetypal Analysis can be most helpful in achieving this goal in segmentation studies.
Let us consider a simple example.
Imagine two matchmakers are evaluating their pool of 100 candidates to introduce the best matches to their most demanding clients (Figure 1). The candidates are described by multiple traits, but for visualization we represent each candidate as a point on a two-dimensional chart.