Carl Gold, Chief Data Scientist for Zuora, digs into data science to decipher how to reduce churn in subscription services. If you want to learn more about how to fight churn with data, check out his website fightchurnwithdata.com with excerpts and insights from his upcoming book “Fighting Churn with Data” (publication date late 2019).
How can you fight churn with data science? The answer will surprise you.
Over the last few years, I have analyzed churn at more than 50 companies in my role as Chief Data Scientist at Zuora. Through this work, what I have found is that the way to stop churn with data science does not involve using the latest AI technology…
What works best is a return to a traditional notion of “science”: test some hypotheses about subscribers and churn and then communicate the results to the people who actually prevent churn: product and content creators, marketers, and customer representatives.
To many who are newer to data science, this may come as a surprise, because we have come of age in a time when the hype cycle is completely infatuated with neural networks and other black box products and technologies, so much so that being a data scientist is practically synonymous with deploying black box predictive systems.
But AI has not advanced to the point where it can perform high-value, high-risk tasks like calling customers and designing email campaigns. By and large, those jobs still belong to account managers, marketers, and customer support and success representatives.
I have some bad news for all of my fellow data scientists and analysts out there…
Predicting churn is hard.
Usually (hopefully) churn is rare in comparison to the continuation of a subscription, so churn is what data scientists call a rare outcome. As a result, false positives will be common even with the best predictive algorithms.
It’s easy to see why predicting churn is hard and prone to false positives: consider your own behavior the last time you unsubscribed from something. You probably were not taking full advantage of the subscription for months, but it took you that long to cancel because you were too busy or uncertain. If a churn warning system had been observing your behavior during that time, it would have flagged you as a risk every month and been wrong every time, until the right moment to cancel finally came. But that moment was determined by too many extraneous factors to be predicted with high precision.
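The base-rate effect behind those false positives can be sketched with a little arithmetic. The numbers below (a 5% monthly churn rate, and a model that catches 80% of churners while correctly clearing 90% of non-churners) are hypothetical values chosen for illustration, not figures from any real subscription business, and `flag_precision` is a helper name made up for this sketch.

```python
def flag_precision(churn_rate, sensitivity, specificity):
    """Return the share of flagged subscribers who actually churn.

    churn_rate:  fraction of subscribers who churn in the period
    sensitivity: fraction of true churners the model flags
    specificity: fraction of non-churners the model correctly leaves alone
    """
    true_positives = churn_rate * sensitivity
    false_positives = (1.0 - churn_rate) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs for illustration only.
precision = flag_precision(churn_rate=0.05, sensitivity=0.80, specificity=0.90)
print(f"Share of flagged subscribers who really churn: {precision:.1%}")
```

Under these assumed inputs, fewer than a third of the flagged subscribers actually churn, even though the model looks respectable on paper. The rarer churn is, the lower that share goes.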
Preventing churn is even harder than predicting churn.
Preventing churn is what’s really hard. If you think about it, every subscription has a cost that must be outweighed by a benefit delivered. If the cost outweighs the benefit, churn is just a matter of time. The cost may be a concrete transactional amount, or it can simply be the attentional cost of a subscription to a free service such as a game or YouTube channel (is it worth the space it takes in your subscription feed?).
This means that in order to prevent churn in a long-term and reliable way, a company must actually move the needle on benefit delivered or cost incurred from using the service. This can be harder than getting people to sign up in the first place, because now they know what the product is actually like.
People often ask me for “silver bullets” to reduce churn, and here’s the bad news: there are no silver bullets to reduce churn, if by a silver bullet you mean a low-cost, reliable fix that always works. In the words of the great startup CEO and venture capitalist Ben Horowitz, “There are no silver bullets for this, only lead bullets.”
In other words, you have to do the hard work of increasing the value you provide to subscribers or reducing the cost. Reducing the cost is the nuclear option for a paid service: revenue churn or downsells may be better than complete and total churn, but they are still churn. (The downsell is a “diamond bullet” against churn: it always works, but you can’t always afford it.)
There have been remarkable advances in AI and data science in the past years, but for the most part actually preventing churn is still something that has to be done by people who either a) make the product, service or content; or b) interact with customers. It varies by the type of subscription offering and organization, but generally speaking the people who prevent churn are product and content creators, marketers, account managers, and customer support and success representatives.
From the point of view of the data scientist or analyst, these are the “customers” or “users” of the data analysis. At small organizations these may all be the same people or just one person, but that doesn’t change the question: what can data science do to really help people perform these tasks?
For all these reasons, the data science needed to reduce churn is not the kind of black box AI algorithm that gets most of the attention in the media nowadays. Instead, the real deal is closer to traditional scientific and statistical analysis. Predictions of churn can be useful, but only when the prediction is the natural extension of a program of investigation and knowledge transfer from the data scientist or analyst to the product and customer teams.
So a data scientist working to help reduce churn needs to act more like a social scientist or economist than a computer scientist. The data scientist needs to test specific understandable hypotheses about the causes of churn, like what content is stickiest or which behavioral metrics are most closely aligned to value attainment.
Many of these hypotheses should come from the product and customer teams, but a good data scientist should be able to guide the process, challenge assumptions, and uncover some surprises. Then all of this has to be translated into knowledge that actually helps the real churn preventers do their job.
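As one illustration of what testing a specific, understandable hypothesis can look like, here is a minimal sketch of a two-proportion z-test. It checks whether subscribers who use some feature churn at a different rate than those who do not. The feature, the counts, and the function name are all hypothetical, and a z-test is just one reasonable choice of test, not a method prescribed here.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(churned_a, total_a, churned_b, total_b):
    """Compare churn rates of two groups with a pooled two-sided z-test."""
    rate_a = churned_a / total_a
    rate_b = churned_b / total_b
    # Pooled churn rate under the null hypothesis that both groups are alike.
    pooled = (churned_a + churned_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_a, rate_b, z, p_value


# Hypothetical counts: group A uses the feature, group B does not.
rate_a, rate_b, z, p_value = two_proportion_z_test(30, 1000, 60, 1000)
print(f"Churn with feature: {rate_a:.1%}, without: {rate_b:.1%}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value says the difference is unlikely to be chance, but not that the feature causes retention. That interpretation still needs the product and customer teams.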
This point of view is actually well known to people who invest large portfolios on Wall Street. No one trusts long-term, high-value investments to black box predictive systems. (Though it is common to use black box AI for high-frequency systems making small trades that complete in seconds or less. In that scenario it is easier to halt a failing algorithm and course correct before much damage has been done.) If you have to move money in and out of large positions, it means a long investment horizon and high transaction costs. Statistical methods are used to verify and quantify hypotheses that the decision makers already have about the markets, but not to make predictions the decision makers cannot see the reasoning behind.
Likewise for churn prevention, the value at stake is high: your company’s survival depends on it. And the cost of interventions can be high too. Poorly planned or executed interventions to prevent churn can be disastrous.
I point all of this out only because, with so much hype around machine learning and black box AI technologies these days, inexperienced data scientists may not realize how inappropriate these approaches are for churn prevention.
So what is a data scientist to do?
1. Leave the Kaggle Mindset Behind
Data scientists and analysts have to stop thinking that accuracy on a predictive problem is the only metric that matters. This is a common attitude among academics developing algorithms on fixed benchmark datasets, and it is amplified by competitions like Kaggle. However, this approach falls flat when the problem is a business decision with high stakes. Old-fashioned hypothesis testing is the way to start.
2. Listen to the Business
Data scientists need to ask the business stakeholders what they are really interested in achieving, and, just as importantly, find out what hypotheses the business already has about the data they work with. The prior beliefs of people with deep domain knowledge are worth much more than any algorithm!
3. Talk to the Business
Data scientists need to answer the questions the business asks, not just apply algorithms. And the answers need to be in terms the business can understand. Black box models are usually disqualified, and a lot of statistical jargon also has to go. Teach the business the most important findings with simple charts or cohort analyses they can reproduce in Excel. Once the data scientist has gained the confidence of the business there is room for more advanced approaches, but it has to start with a solid foundation.
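For a sense of how simple such a cohort analysis can be, here is a minimal sketch. It ranks subscribers by one behavioral metric, splits them into equal-sized cohorts, and reports the churn rate of each. The metric (logins per month), the sample data, and the function name are hypothetical, and the sketch assumes there are at least as many subscribers as cohorts.

```python
def cohort_churn_rates(customers, n_cohorts=3):
    """Group (metric_value, churned) pairs into cohorts ranked by the metric.

    Returns one (average_metric, churn_rate) row per cohort, lowest first.
    """
    ranked = sorted(customers, key=lambda c: c[0])
    size = len(ranked) // n_cohorts
    rows = []
    for i in range(n_cohorts):
        start = i * size
        # The last cohort absorbs any remainder.
        end = (i + 1) * size if i < n_cohorts - 1 else len(ranked)
        cohort = ranked[start:end]
        avg_metric = sum(metric for metric, _ in cohort) / len(cohort)
        churn_rate = sum(1 for _, churned in cohort if churned) / len(cohort)
        rows.append((avg_metric, churn_rate))
    return rows


# Hypothetical data: (logins per month, churned?) for nine subscribers.
sample = [
    (1, True), (2, True), (3, False),
    (8, True), (10, False), (12, False),
    (20, False), (25, False), (30, False),
]
for avg_logins, churn_rate in cohort_churn_rates(sample):
    print(f"avg logins {avg_logins:5.1f} -> churn rate {churn_rate:.0%}")
```

The same table can be rebuilt in a spreadsheet by sorting on the metric and averaging the churn flag within each block of rows, which keeps the result easy for the business to verify.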
For more information on churn, tune into Carl Gold’s session on Fighting Churn with Data at Data Council San Francisco from April 17-19.