PUBLISHED JULY 2015
by Jeff Sauro, statistical analyst
Collecting data, analyzing it, and making decisions from it are the heart of customer analytics. But whether you’re new to data analysis or have been doing it for a while, 10 common mistakes can affect the quality of your results. Be on the lookout for them. Each one is described below, along with some ideas on how to avoid it.
Optimizing Around the Wrong Metric
Metrics exist for just about anything in an organization and most probably are collected for a good reason. Be sure the metric you want to optimize will achieve not just your goals, but also your customers’ goals.
If airlines optimize around on-time departure instead of on-time arrival, an airplane that pulls away from the gate on time and then sits on the tarmac counts as a success on the metric, even though customers are disappointed when they arrive at their destination an hour late. If you optimize around the number of calls answered per hour at a call center, you are placing quantity over quality. Customers generally want a quick resolution, but are their issues being properly addressed?
Be sure your metrics are meaningful to your customer and that optimizing those metrics makes for a better experience.
Relying Too Much on Behavioral or Attitudinal Data
Mining customer transactions can reveal a lot of patterns in things like what products customers purchase together or the average time between purchases. But this behavioral data doesn’t necessarily help you understand the attitudes and motivations behind why customers purchase things together.
Attitudinal data of this kind is more easily collected with surveys or other methods of asking customers directly. Use both kinds of data rather than leaning too heavily on either one.
Not Having a Large Enough Sample Size
If you’re looking to detect small differences in metrics, like conversion rates or customer attitudes, and you’re measuring a sample of customers or data, be sure your sample size is large enough to detect that difference. Use sample size tables or consult a statistician to know what sample size you’ll need ahead of time.
A lot of cost and effort is wasted looking for very small differences in customer attitudes (such as satisfaction, perceived usability, or likelihood to recommend) after very small changes to products or websites, using a sample that is too small to detect them.
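As a rough illustration of how quickly the required sample grows as the difference shrinks, here is a minimal Python sketch. It assumes you are comparing two conversion rates with a two-sided test and uses the common normal-approximation formula for two independent proportions. The function name, the 5 percent significance level, the 80 percent power, and the example rates are all illustrative assumptions, not recommendations from the book.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate customers needed in each group to tell p1 from p2.

    Normal-approximation formula for comparing two independent
    proportions with a two-sided test (an assumption of this sketch).
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical conversion rates: a 1-point lift versus a 5-point lift.
small_lift = sample_size_per_group(0.05, 0.06)
large_lift = sample_size_per_group(0.05, 0.10)
print(f"5% vs. 6%: {small_lift} per group; 5% vs. 10%: {large_lift} per group")
```

Under these assumptions the small lift needs many times more customers per group than the large one. A statistician may prefer a different formula or settings for your situation.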
Eyeballing Data and Patterns
I call it “eyeballing statistics.” It’s the tendency to think you can detect patterns from data by examining it without any statistics.
You can see very large patterns easily without any computations, but these sorts of obvious patterns rarely show up. To minimize the chance that you’re being fooled by randomness in data, use statistics and mathematical computations to differentiate the news from the noise.
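One simple way to check an eyeballed pattern is a permutation test: shuffle the group labels many times and see how often chance alone produces a gap as large as the one you observed. The sketch below is one possible approach, not the only one. The daily order counts are made up, and the function name and number of shuffles are arbitrary choices for illustration.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=5_000, seed=0):
    """Share of random relabelings whose gap in means is at least the observed gap."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        shuffled_a = pooled[:len(group_a)]
        shuffled_b = pooled[len(group_a):]
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed:
            hits += 1
    return hits / n_shuffles


# Hypothetical daily order counts for the week before and after a site change.
before = [102, 98, 110, 95, 105, 99, 101]
after = [104, 100, 97, 112, 103, 96, 108]
p_value = permutation_p_value(before, after)
print(f"p-value: {p_value:.3f}")
```

The "after" week has a slightly higher average, which is easy to read as an improvement. In this made-up example the p-value comes out well above 0.05, so a gap of this size is the kind that random day-to-day variation produces regularly.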
Confusing Statistical Significance with Practical Significance
With a large sample size, you’ll be able to detect very small differences and patterns that are statistically significant. Statistical significance just means that the pattern or difference is unlikely to be due to random noise alone.
But that doesn’t mean that what’s detected will have much practical importance. Analytics programs will flag all sorts of patterns and differences, but you need to determine whether a 1 percent difference in conversion rates will have a major or negligible impact.
The answer depends on the context, which means you’ll need to exercise judgment and not blindly follow the software. Don’t assume every statistically significant result is meaningful. Think through the business implications of the result carefully.
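To make the distinction concrete, the following Python sketch runs a standard two-proportion z-test on made-up numbers: one million visitors per version, converting at 5.0 percent and 5.1 percent. The counts and the function name are hypothetical, and the z-test is just one common way to compare two rates.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (absolute difference, two-sided p-value) for two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, p_value


# Hypothetical counts: 1,000,000 visitors per version, 5.0% vs. 5.1% conversion.
diff, p_value = two_proportion_z_test(50_000, 1_000_000, 51_000, 1_000_000)
extra_orders_per_10k = diff * 10_000
print(f"difference: {diff:.4f}")
print(f"p-value: {p_value:.4f}")
print(f"extra orders per 10,000 visitors: {extra_orders_per_10k:.0f}")
```

With samples this large, the p-value falls below the usual 0.05 threshold, so the software would flag the result. The difference itself is about 10 extra orders per 10,000 visitors. Whether that justifies the cost of the change is a business judgment the test cannot make for you.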
Not Having an Interdisciplinary Team
If you have a stats PhD crunching numbers in your company basement, that may generate the right insights; but if sales, marketing, service, or product teams aren’t involved, it’s going to be difficult to get buy-in and implement the insights.
Get the right people and teams involved in your initiative early and look to have complementary skills, including mathematical, software, business, marketing, and product experience.
Not Cleaning Your Data First
Garbage in, garbage out (GIGO) is a common phrase data junkies like to use to explain that data that has problems before analysis will have problems after analysis. The problems can be anything from mismatched data pulled from databases (customer names that don’t match transactions) to missing values.
If the data is bad going in, you’ll have bad insights coming out. Before running any analysis, do a quality check on your data by selecting a sample of data and auditing it for quality. Corroborate it with other sources to verify its accuracy.
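A quality check like this can start very small. The sketch below draws a random sample of transactions and flags rows with missing values or with customer IDs that don’t appear in the customer list. The records, field names, and the audit function are all hypothetical; a real audit would use your own fields and would also compare values against an outside source.

```python
import random

# Hypothetical records; the field names are illustrative only.
customers = {"C001": "Ana Ruiz", "C002": "Ben Ito", "C003": "Cy Lund"}
transactions = [
    {"order_id": 1, "customer_id": "C001", "amount": 25.00},
    {"order_id": 2, "customer_id": "C004", "amount": 40.00},  # no such customer
    {"order_id": 3, "customer_id": "C002", "amount": None},   # missing value
    {"order_id": 4, "customer_id": "C003", "amount": 12.50},
]


def audit(rows, known_customer_ids, sample_size, seed=0):
    """Check a random sample of rows for missing fields and unmatched customers."""
    rng = random.Random(seed)
    sample = rng.sample(rows, min(sample_size, len(rows)))
    missing = [r["order_id"] for r in sample
               if any(value is None for value in r.values())]
    unmatched = [r["order_id"] for r in sample
                 if r["customer_id"] not in known_customer_ids]
    return {
        "checked": len(sample),
        "missing": sorted(missing),
        "unmatched": sorted(unmatched),
    }


# The sample size equals the whole toy dataset here, so every row is checked.
report = audit(transactions, set(customers), sample_size=4)
print(report)
```

On a real dataset you would sample far fewer rows than the total and treat a high share of flagged rows as a sign that the full dataset needs cleaning before analysis.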
Improperly Formatted Data
When you analyze your data, at least half the effort is spent formatting it so your software can properly analyze it. This often involves disaggregating the data and getting customer transactions or survey responses into rows and columns. Skimping on proper formatting usually means a lot of rework later, so be sure your data is formatted properly, and early.
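As one illustration of disaggregating, the Python sketch below takes a made-up export with one record per customer and a nested list of purchases, and flattens it into one row per purchase with consistent columns. The data layout, field names, and function name are assumptions for the example; your source systems will differ.

```python
import csv
import io

# Hypothetical aggregated export: one record per customer with nested purchases.
aggregated = [
    {"customer_id": "C001",
     "purchases": [("2015-06-01", "widget", 2), ("2015-06-15", "gadget", 1)]},
    {"customer_id": "C002",
     "purchases": [("2015-06-03", "widget", 1)]},
]


def to_rows(records):
    """Disaggregate nested purchases into one flat row per purchase."""
    rows = []
    for record in records:
        for date, product, quantity in record["purchases"]:
            rows.append({
                "customer_id": record["customer_id"],
                "date": date,
                "product": product,
                "quantity": quantity,
            })
    return rows


rows = to_rows(aggregated)

# Write the flat rows as CSV text, a layout most analysis software can read.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["customer_id", "date", "product", "quantity"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Once each purchase is its own row, questions such as the average time between purchases or which products sell together become simple operations on columns.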
Not Having Clear Research Questions to Answer
Sometimes it’s fine to have a fishing expedition and examine patterns in data. But don’t stop with the fishing expedition; use what you find to form hypotheses about customer behavior and look to confirm, refine, or reject these hypotheses with additional data.
Waiting for Perfect Data
Every dataset tends to have some problem of some sort. Some are minor, like a few missing fields; others are major, with lots of missing fields and mismatched data. For survey data, there always seems to be a concern about how a question was asked and to whom it was asked.
That said, expect some imperfection in all your datasets and surveys. But don’t let it stop you from working with what you have. Just be cautious about your interpretation.
Jeff Sauro is a Six-Sigma trained statistical analyst and pioneer in quantifying the customer experience. He writes a weekly column at Measuringu.com and has been a speaker at Fortune 500 companies and industry conferences. This article is excerpted with permission from the publisher, Wiley, from Customer Analytics for Dummies by Jeff Sauro. Copyright © 2015.