In the previous article we described the problem: an energy company with 22,912 support tickets and no structural insight into what customers are actually asking. In this article we cover how we solved it — including the attempts that didn't work.
The approach in four steps
Step 1: Embeddings — from text to meaning
The first step was converting each ticket into a vector: a numerical representation of the message's meaning. We did this with an embedding model trained to understand language — not to match keywords.
The difference is crucial. A keyword-based system sees "cancel contract" and "terminate subscription" as two different things. An embedding model sees them as nearly identical, because the meaning is the same.
This is the power of embeddings for customer service analysis: customers don't use the terms from your dropdown menu. They use their own words. And those words vary enormously — but the intent behind them doesn't.
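The article doesn't name the embedding model used, but the core idea can be sketched with cosine similarity on toy vectors. The numbers below are invented for illustration; a real model would produce them from the ticket text, in hundreds or thousands of dimensions rather than four:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: close to 1.0 means the vectors point the
    same way, i.e. the texts mean roughly the same thing."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy embeddings (a real model outputs these from text).
emb_cancel = np.array([0.9, 0.1, 0.8, 0.2])        # "cancel contract"
emb_terminate = np.array([0.85, 0.15, 0.75, 0.25])  # "terminate subscription"
emb_invoice = np.array([0.1, 0.9, 0.2, 0.8])        # "question about my invoice"

print(cosine_similarity(emb_cancel, emb_terminate))  # close to 1.0
print(cosine_similarity(emb_cancel, emb_invoice))    # much lower
```

Note that "cancel contract" and "terminate subscription" share not a single word, which is exactly why a keyword system treats them as unrelated while their vectors sit almost on top of each other.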
Step 2: Dimensionality reduction — from thousands of dimensions to a map
An embedding is a vector with hundreds or thousands of dimensions. Useful for a computer, useless for a human. To spot patterns — and to let clustering algorithms work efficiently — we reduced the dimensions.
The result is a two-dimensional map where each ticket is a point. Tickets with similar meaning are positioned close together. Suddenly you see structure emerge. Clusters of points that form a theme. Areas where many tickets converge and areas that are sparsely populated.
This is the moment where data turns into insight. Not through a table of numbers, but through a visual representation that makes patterns visible that are invisible in spreadsheets.
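The article doesn't say which reduction method was used; as a minimal sketch, here is plain PCA via SVD, though in practice nonlinear methods such as UMAP or t-SNE often preserve cluster structure better for this kind of map:

```python
import numpy as np

def reduce_to_2d(embeddings: np.ndarray) -> np.ndarray:
    """Project high-dimensional embeddings onto their two strongest
    principal components (plain PCA)."""
    centered = embeddings - embeddings.mean(axis=0)
    # SVD of the centered matrix gives the principal axes in vt.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Stand-in data: 100 tickets with 384-dimensional embeddings.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(100, 384))
points = reduce_to_2d(fake_embeddings)
print(points.shape)  # (100, 2)
```

Each row of `points` is one ticket's position on the map; plotting them is what makes the clusters visible to a human.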
Step 3: Clustering — finding structure
With the reduced vectors in hand, we let a clustering algorithm identify groups: which tickets belong together, based on their position in vector space?
This is where we learned our most important lesson. The first attempt produced 120+ clusters. Technically correct — the algorithm found real groups. But unusable. Too much overlap. Clusters that described almost the same theme but with a slightly different nuance. No human could take action on them.
The second attempt: far fewer clusters, 12 in total. Now everything was too vague. One cluster called "miscellaneous questions" contained 40% of all tickets. As if you stuffed the entire dataset into a "various" category.
On the third attempt we found the balance: 40 clusters with a hierarchical structure. Enough detail to understand what customers are asking. Enough structure to keep things manageable. And crucially: every cluster was actionable — you could tie a concrete improvement to it.
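The article doesn't name the clustering algorithm, so here is a minimal k-means sketch on synthetic 2-D points. The `k` parameter is the knob the three attempts above were effectively turning: too high fragments the themes, too low blurs them:

```python
import numpy as np

def kmeans(points: np.ndarray, k: int, iters: int = 50, seed: int = 0):
    """Minimal k-means: alternate between assigning points to their
    nearest center and moving each center to its members' mean."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every point to every center, shape (n, k).
        dists = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated blobs: the algorithm should recover them.
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=(0, 0), scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=(5, 5), scale=0.3, size=(50, 2))
labels, _ = kmeans(np.vstack([blob_a, blob_b]), k=2)
print(len(set(labels.tolist())))  # 2
```

For a hierarchical structure like the 40-cluster result described above, a hierarchical method (agglomerative clustering, or density-based algorithms such as HDBSCAN) is the more natural fit than flat k-means.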
Step 4: AI labeling and FAQ generation
The final step: automatically labeling each cluster and providing FAQ answers. We had a language model read the tickets in each cluster and generate:
A descriptive name for the cluster. A summary of the types of questions it contains. And — the most concrete outcome — FAQ answers that are directly usable as a foundation for automated customer service.
This is where the circle closes. From unstructured tickets to a structured knowledge base. Not manually compiled by someone who read a few hundred tickets, but systematically generated from the full corpus.
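A sketch of what the labeling step might look like in code. The prompt wording, sample size, and output format below are illustrative assumptions, not the actual prompt used; the language-model call itself is left out:

```python
def build_label_prompt(tickets: list[str], sample_size: int = 20) -> str:
    """Build the prompt sent to a language model for one cluster.
    (Illustrative: the article does not specify the real prompt.)"""
    sample = tickets[:sample_size]
    numbered = "\n".join(f"{i + 1}. {t}" for i, t in enumerate(sample))
    return (
        "Below are customer support tickets that a clustering step "
        "grouped together.\n\n"
        f"{numbered}\n\n"
        "Return: (1) a short descriptive name for this cluster, "
        "(2) a summary of the question types it contains, and "
        "(3) an FAQ-style answer usable in automated customer service."
    )

prompt = build_label_prompt([
    "How do I cancel my contract?",
    "I want to terminate my subscription early.",
])
print(prompt.splitlines()[0])
```

Running this per cluster, rather than per ticket, is what keeps the step cheap: 40 model calls instead of 22,912.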
What we got wrong
It would be nice to say this process was linear. It wasn't.
The first clustering attempt was useless. Not because of a technical error, but because of a wrong assumption: that more clusters are better. They're not. The optimal number of clusters isn't the number that best describes the data — it's the number a human can oversee and act on.
Data quality was a challenge. Some tickets consisted of just a subject line. Others had been forwarded four times with "see below" on top. A few were in German. The reflex is to clean the data first. The better approach: build a system that can handle messy data. Because the data is never going to be perfect.
The parameters needed tuning. Which embedding model, which dimensionality reduction method, which clustering hyperparameters — each of these choices affects the result. It took multiple iterations to find the right combination. Not because there's one right combination, but because "right" depends on what you want to do with it.
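The size of that search space is easy to underestimate. A sketch of the grid of choices, with all option names invented for illustration:

```python
from itertools import product

# Illustrative options; the article does not name the actual
# models or algorithms that were compared.
embedding_models = ["model-small", "model-large"]
reducers = ["pca", "umap", "tsne"]
cluster_params = [{"min_cluster_size": s} for s in (10, 25, 50)]

combinations = list(product(embedding_models, reducers, cluster_params))
print(len(combinations))  # 18 pipeline variants to evaluate
for model, reducer, params in combinations[:2]:
    print(model, reducer, params)
```

Even this modest grid yields 18 pipelines, and no automatic metric decides the winner: the evaluation in the end is whether a human can act on the resulting clusters.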
The lesson
AI analysis of customer service data isn't a black box where you throw in data and insights come out. It's an iterative process where technical choices are guided by the question: what does the organization need in order to take action?
The best embedding in the world is useless if the results aren't actionable. The prettiest cluster visualization is pointless if nobody knows what to do with it.
Technology in service of the business. Not the other way around.
