Man vs. machine: how to analyze free text data

Picture the scene. You’ve just set a market research survey free into the world. Perhaps you’re gathering customer feedback, investigating consumer trends, or testing a new brand or product idea. Whatever the end goal, you’ve spent hours agonizing over the right questions to ask and can’t wait to get your hands on the juicy insights.

There’s just one problem: you’re already sweating about what to do with your free text responses. Reams, even novels’ worth, of text data are headed your way, and you’re going to have to find a way to analyze it all…

The goliath challenge of free text responses

The fact is, free text analysis isn’t always easy. If the sheer volume of words in free text data isn’t challenge enough, there are a few other reasons why finding insights in this kind of data can be tricky.

For a start, human analysis has its downsides. There’s the fact that it’s highly time-consuming for a person to go through extensive text data. Then there’s the issue of bias. If you approach a feedback survey, for example, expecting to find that customers love your product variety but hate the price tag, those are the comments that are likely to jump out at you when analyzing – no matter how hard you try to be objective.

Software analysis can be equally flawed. Consider this: you’re conducting research into how people use public spaces. The survey shows that many people love the local park, while many others are struggling to find spaces to park their cars. Some tools handle this kind of ambiguity well, but unsophisticated software will struggle to tell the difference between these context-specific meanings of “park”.

The crux of it is: words are not black and white in the way that numbers are, making reliable analysis a challenge. The market for qualitative analysis tools is therefore much less developed than for quantitative data, which has historically been the primary focus of business intelligence teams.

So, what options do you have if you’re sitting on a pile of unanalyzed free text comments that needs unpicking?

Three common methods for analyzing text

When deciding the most appropriate way to analyze free text data, it’s important to consider the purpose and design of the tool(s) in question.

We’ll walk through three methods for analyzing text – manual coding, Excel text frequency analysis, and purpose-built text analysis – so you can decide which fits your needs best.

1. Manually coding free text responses

When in doubt, there is always the option of setting a human on the case for manually coding free text responses. While this approach is often time intensive, people can understand the nuances, contexts, and ‘unsaid’ insights implicit in text responses.

Consider an employee survey, for example, in which employees have made vague comments as broad as: “if you can stick it out, you know you’re something special”, “there are definitely fluffier companies to work for”, or “we struggle to accept anyone who is less than great”. A computer may not spot any consistencies or similarities in comments as disparate as these, whereas a human can grasp the underlying theme at play: don’t these comments imply a challenging, perhaps problematic, work culture?

For smaller sets of text data, manual coding can be an appropriate way to work through the insights, but this can quickly become unmanageable for larger datasets or be compromised by a person’s inadvertent bias.

2. Excel text frequency analysis

When you’re met with large quantities of text data, it makes sense to look for ways that digital tools can help you. Excel text frequency analysis provides a rudimentary method that can support manual analysis by quantifying how often particular words appear in your dataset.

Take a consumer research survey into the launch of a new soft drink, as an example. It is clear through manual analysis that the product’s taste is a big sticking point for consumers – many find it too sweet. Text processing in Excel can then help to quantify this finding, indicating the number of times that respondents have used words like ‘sweet’, ‘sugar’, and ‘taste’. This can also be supported by a sentiment analysis plugin, which can then sort references to sweetness into positive, negative, and neutral categories.
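The counting step described above can be sketched in a few lines of Python rather than Excel. This is a minimal illustration of the same idea, not a recreation of any particular Excel workflow: the sample responses are invented for the example, and the function name `keyword_counts` is hypothetical.

```python
import re
from collections import Counter

# Hypothetical survey responses, invented purely for illustration.
responses = [
    "Far too sweet for me",
    "Nice taste but too much sugar",
    "Sweet, but I like the taste",
    "Worried about the sugar content for my kids",
]

# Keywords taken from the soft drink example in the text.
KEYWORDS = ("sweet", "sugar", "taste")


def keyword_counts(texts, keywords=KEYWORDS):
    """Count how many times each keyword appears across all responses."""
    counts = Counter()
    for text in texts:
        # Lowercase and split into words so that "Sweet," matches "sweet".
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(word for word in words if word in keywords)
    return counts


counts = keyword_counts(responses)
for word in KEYWORDS:
    print(word, counts[word])
```

Note that the count treats every mention of ‘sugar’ the same way, whether the respondent is talking about flavor or about health.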

While this method can help to quickly quantify and sort emerging themes in text data, there are watch-outs. For example, imagine that a secondary theme emerging from this survey is that respondents are concerned about sugar content from a health perspective. Excel text frequency analysis can only quantify how often language around sugar appears; it cannot separate the comments into concerns about flavor and concerns about health.

Similarly, text analysis Excel add-ins can struggle with nuance, often oversimplifying how people feel. A comment like “so much sugar my kids were bouncing off the walls for hours”, for example, could leave this kind of tool flummoxed.
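To see why a comment like this is hard for simple tools, here is a rough sketch of a word-list sentiment scorer in Python. It is an assumption about how a basic tool might work, not a description of any specific Excel add-in; the word lists and the function name `naive_sentiment` are hypothetical and deliberately tiny.

```python
import re

# Tiny hypothetical word lists, invented for illustration.
# Real sentiment lexicons are far larger.
POSITIVE = {"love", "great", "tasty", "delicious"}
NEGATIVE = {"hate", "awful", "bad", "disgusting"}


def naive_sentiment(text):
    """Label text by counting positive and negative words, ignoring context."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comment = "so much sugar my kids were bouncing off the walls for hours"
label = naive_sentiment(comment)
print(label)
```

Because the comment contains none of the listed words, the scorer falls back to ‘neutral’, even though a human reader would recognize a clear complaint.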

Ultimately, Excel text frequency analysis can only ever deliver basic insights as it lacks the linguistic expertise necessary to make informed judgements about the emotions, topics, and nuance at play in the text.

3. Purpose-built text analytics tools

Evidently, there are flaws in both the manual and digital analysis of free text. But there is a solution that combines the benefits of both. Purpose-built text analysis tools pair the fast, unbiased, and quantifiable outcomes of digital tools with the nuance and contextual understanding of human analysis.

These are tools powered by AI, data science, and natural language processing, and are often built by experts in linguistics. Such tools can reveal the deeper trends that may be unseen by the human eye, allowing data to be segmented, visualized, and contextualized.

To return to our previous hypothetical examples, quality text analytics tools can look at the broader context around public parks and car parking and separate these themes. They can navigate the subtle topic of lack of support in company culture reviews, without overemphasizing it in light of more positive topics. They can group comments around product flavor and separate them from comments around product health.

Such tools can reveal the golden insights that live beyond the headlines, and often include free text visualization features that present these insights as compelling visuals, statistics, or insight cards that bring everything together. They may even surface an insight that has remained resolutely hidden in human or basic digital analysis.

All this and more is exactly what we do at Relative Insight. Book a discovery call to learn how we can help you analyze your free text data with our purpose-built text analytics platform.