Small is Beautiful in Data Science

By Aarthi Rayapura November 14, 2016

Q&A with Carl Gold, Zuora’s Chief Data Scientist. Gold holds a PhD in Computation & Neural Systems and has spent most of his career as a quantitative analyst on Wall Street.

What’s the difference between Big Data and Small Data?
When you have hundreds of millions or billions of accounts, then you have big data. Companies with truly big data are the likes of Spotify, Netflix, Facebook, Google, and Amazon. Those big companies have tons of data, and they have massive budgets to hire dozens of data scientists, pay for super expensive clusters of servers, and do all kinds of great stuff. The thing with big data is that you can answer more complicated questions than you can with small data, because with that many observations you are far less constrained by the need to reach statistical significance.

But small is just as beautiful in data science, and companies with small data can use it to up their game, too! Companies with less data tend to think they need to get more data, but that doesn’t necessarily make sense. B2B companies naturally have less data than B2C companies because they have fewer customers. But each customer of a B2B company is worth more, and there are smart ways to reach useful conclusions even if you don’t have big data. As long as they have enough data to draw out actionable insights, companies should focus on what they can learn from the data they already have. Moreover, with the cost of collecting and analyzing data coming down so much, if you’re not doing it, you’re at a competitive disadvantage.
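To make the point about statistical significance concrete, here is a minimal sketch of how the uncertainty around a measured rate shrinks as the number of accounts grows. The 5 percent churn rate and the account counts are made-up illustrative numbers, not Zuora data, and the function name `margin_of_error` is hypothetical. It uses the standard normal approximation for a proportion, which is only a rough guide for small samples.

```python
import math


def margin_of_error(rate, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion.

    Uses the normal approximation: z * sqrt(p * (1 - p) / n).
    """
    return z * math.sqrt(rate * (1 - rate) / n)


# Illustrative account counts only: a small B2B base, a mid-sized base,
# and a large consumer-scale base.
for n in (200, 20_000, 2_000_000):
    moe = margin_of_error(0.05, n)
    print(f"{n:>9,} accounts: 5% churn, plus or minus {moe:.2%}")
```

Under these assumptions, every hundredfold increase in accounts narrows the interval by a factor of ten. With a few hundred accounts you can still tell a large difference from noise, but not a subtle one, which is why small-data companies do better asking simpler questions of the data they have.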

What advice do you have for companies with “Small Data” who want to walk down a data-driven path?
Many small and medium-sized companies gather a lot of data but don’t invest in the required tools and expertise. People tend to think spreadsheets and plots are enough, but data science is so much more! Having untrained people in various departments making plots and staring at spreadsheets is not really the answer.
Sure, you can probably catch easy differences between your regions or things like that. But when you actually get into detecting the effects of different factors and analyzing the differences between finer, more specific groups, you won’t get very far that way. Here’s what you need to do:

1. Clean up your in-house data, set up effective systems and processes for communicating about the data between departments, and use specific analytical tools to gain actionable insights. If you don’t have a foundation of clean data and processes built around it, it’s going to hurt your analyses big time.

2. You need at least a little in-house expertise. But it doesn’t have to be someone who studied data science or has a PhD; a working knowledge of undergraduate statistics is all you really need. Then you can amplify that expertise with the right tool set.

3. Just like in software, it doesn’t always make sense to build what you need in-house. I’m a 100 percent believer in using specialized data analysis tools to answer different questions. For instance, your Marketing team’s needs will be very different from your Finance team’s needs, so you shouldn’t be using the same tool and approach for both. Find the right tools for the job.

Any favorite projects at Zuora?
Our new Insights product is definitely my favorite! It’s a predictive tool that will help companies gain insights from subscriber life cycle events. It’ll help businesses at both the strategic and the practical level. At a strategic level, it’s going to let you identify what factors contribute to customers changing their status, such as upgrading or downgrading, and what in your data predicts those events. On a practical level, you can look at each account and ask, “What’s the most likely thing that’s going to happen to this account? Are they a churn risk? Are they an upsell opportunity?” And you can see why. That’s extremely valuable information for any business to have. We’ve followed a very rigorous process with some serious testing of our predictive accuracy. We’re providing the predictions, as well as a clear understanding of the behaviors driving each prediction. There’s an emphasis on interpretability. It’s really cool!

Another interesting project is our new Subscription Economy Index, which takes a look at the health of the global Subscription Economy. There’s so much buzz about the Subscription Economy, so it’s good to dive into the data and get an in-depth view. We are now able to say, “Look, these companies have been growing at a faster pace than the rest of the economy for the past couple of years. And here’s why.” A lot of businesses wonder about it, but no one has had the data to show the trends until now. It has been a fascinating project to work on!
