OUR RESEARCH PROCESS
We developed this tool together with a contributor experienced in Biz Ops at LinkedIn and Harvard Business School.
IF YOU REMEMBER NOTHING ELSE
- Cohort analysis lets you visualize how retention changes over time.
- Use this to look back and spot problems to fix, or look forward and measure new initiative success.
- Unpaid products that monetize with ads can use built-in cohort analysis from tools like Google Analytics.
- Paid products should start with manual analysis, can migrate to tools like Mixpanel, and will eventually want to migrate back to internal tools at scale.
- Interpret results top to bottom and look for trends. Positive trends mean you’re making meaningful improvements; negative trends mean that your product is getting worse.
Cohort analysis shows how revenue or user retention changes over time, by comparing cohorts of users to each other.
Companies use cohort analysis in two main ways:
- Retrospective: Look backward to see if retention is getting better or worse, and how this changes with time. This lets you see what is driving retention changes so you can do more of the good and correct the bad, and show progress when raising capital.
- Prospective: Look forward to see if major initiatives pay off. This serves a similar function to A/B testing, showing whether major features or initiatives impact retention.
Use our spreadsheet template to run cohort analysis, in four steps:
1. Get your data.
In order to run a cohort analysis, you need to know:
- Customers or revenue acquired in any given month
- Customers or revenue retained in each subsequent month
Getting data depends heavily on your existing infrastructure. For some companies this may be a simple SQL query; for others you’ll need the help of your data science team. The “raw data” section in the spreadsheet above shows the format required to measure retention and churn by cohort.
2. Input your raw data.
Add your customer and revenue data in cells B46 through P58 on the "Cohort Analysis" spreadsheet.
3. Double check your time periods.
Be sure that your time periods are correct and consistent: use 12 periods of the same unit throughout (12 weeks, 12 months, 12 quarters, or 12 years of data).
4. Check the results.
The customer and revenue cohort retention and cohort churn analysis will update automatically at the top of the spreadsheet.
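The calculation behind these results can be sketched in a few lines of Python. This is a minimal sketch, not the template's actual formulas: the `raw` layout (one entry per cohort, where the first value is customers acquired and later values are customers still active in each subsequent month) and the sample numbers are assumptions for illustration.

```python
# Hypothetical raw data: cohort label -> customers active in lifetime month 0, 1, 2, ...
# Month 0 is the number acquired. Newer cohorts have fewer months of history.
raw = {
    "Jan": [100, 80, 70, 65],
    "Feb": [120, 90, 84],
    "Mar": [150, 120],
}

def retention_table(raw):
    """Return {cohort: [retention rate per lifetime month]}, relative to month 0."""
    table = {}
    for cohort, counts in raw.items():
        acquired = counts[0]
        if acquired <= 0:
            raise ValueError(f"cohort {cohort!r} has no acquired customers")
        table[cohort] = [count / acquired for count in counts]
    return table

def churn_table(retention):
    """Churn is the complement of retention in each lifetime month."""
    return {c: [1 - r for r in rates] for c, rates in retention.items()}

retention = retention_table(raw)
churn = churn_table(retention)

# Print one row per cohort, one column per lifetime month.
for cohort, rates in retention.items():
    print(cohort, " ".join(f"{r:.0%}" for r in rates))
```

The same functions work for revenue: replace the customer counts with revenue retained from each cohort in each month.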
For unpaid products:
Ad-driven products like content sites or mobile apps, where you care about user retention rather than revenue retention, can use built-in tools like Google Analytics for cohort analysis.
For paid products:
Paid products should start with manual analysis, can migrate to tools like Mixpanel, and will eventually want to migrate back to internal tools at scale.
Do this manually until you feel that you have the processes, tools, data infrastructure, and budget required to use a third-party tool. Third-party tools often require engineering support and may require setting up an API so you can flow the data directly through to the tool. Don’t waste money on external tools too soon!
Use third-party tools when possible as you scale up and implement better data infrastructure, and keep these tools in place for as long as possible. The only reason to transition to something home-grown or another large Excel model is that your data may eventually exceed the limits of what third-party tools can manage. Mixpanel is a great example of a common tool used to measure retention and cohorts.
If you’ve reached a volume of data that would be difficult or prohibitively expensive to run through a third-party tool, then you can revert to your own models. Typically at this scale, you’ll gather the data directly from your data warehouse with the help of a data scientist (or pull it directly using SQL or other languages) and run it through your own Excel model.
The reason you use “cohorts”, or groups of customers who started during a specific time period, is that you want to see how retention and churn trend over time for users who experience the product AFTER you’ve implemented changes.
For each of the charts in this spreadsheet, you interpret results top to bottom and look for trends. Positive trends mean you’re making meaningful improvements; negative trends mean that your product is getting worse.
Don’t waste time calculating statistical significance:
If you don’t have many users, your data is going to be noisier, but we wouldn’t advise spending time trying to calculate statistical significance.
Our contributors worked at large, well-funded tech companies and spent hundreds of hours testing different approaches to establish whether differences between cohorts were statistically significant, to no avail.
“We had a huge team in the valley and tried tons of different approaches to telling whether differences in cohort retention were statistically significant. It’s such a research suck, and we ultimately couldn’t find a scientific way to do it.”
Cohort analysis helps you answer questions like:
- How do retention and churn change over time?
- Did retention and churn change immediately after a major initiative?
- Are these changes linked to your experiments and product changes?
- As we make investments and roll out product changes, are we actually improving the experience?
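One way to approach the second question is to compare the same lifetime month for cohorts that started before and after an initiative launched. The following is a minimal sketch under stated assumptions: the `month3_retention` mapping, the `launch` date, and all numbers are hypothetical. The result is a directional read, not a test of statistical significance.

```python
from datetime import date
from statistics import mean

# Hypothetical Month 3 retention by cohort date (illustrative numbers only).
month3_retention = {
    date(2023, 1, 31): 0.52,
    date(2023, 2, 28): 0.54,
    date(2023, 3, 31): 0.53,
    date(2023, 4, 30): 0.60,
    date(2023, 5, 31): 0.62,
}
launch = date(2023, 4, 1)  # assumed launch date of the initiative

def before_after(retention_by_cohort, launch_date):
    """Average retention for cohorts dated before vs. on or after the launch."""
    before = [r for d, r in retention_by_cohort.items() if d < launch_date]
    after = [r for d, r in retention_by_cohort.items() if d >= launch_date]
    if not before or not after:
        raise ValueError("need at least one cohort on each side of the launch date")
    return mean(before), mean(after)

avg_before, avg_after = before_after(month3_retention, launch)
print(f"before: {avg_before:.1%}  after: {avg_after:.1%}  "
      f"change: {avg_after - avg_before:+.1%}")
```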
How to know if you’re improving over time:
The most common conclusion from a cohort analysis is whether you’re improving your product and experience over time.
You assess this by looking at changes in retention and churn in each lifetime month for each cohort. For example, if we look at the cells below, we can see that Month 3 retention is gradually improving over time. This means that we’re gradually improving our product and keeping more customers around for longer. We can also see that something happened in the 7/31 cohort, but that this issue went away.
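Reading one lifetime month top to bottom amounts to comparing each cohort with the one before it. The sketch below illustrates that reading in Python; the `month3` values are made up for illustration (they are not the figures from the spreadsheet) and are shaped to mirror the pattern described above, a gradual improvement with a one-off dip in the 7/31 cohort.

```python
# Hypothetical Month 3 retention per cohort, listed top to bottom (oldest first).
month3 = [
    ("4/30", 0.50),
    ("5/31", 0.52),
    ("6/30", 0.53),
    ("7/31", 0.45),
    ("8/31", 0.55),
    ("9/30", 0.56),
]

def cohort_changes(series):
    """Change in retention from each cohort to the next, in percentage points."""
    return [
        (label, round((rate - prev_rate) * 100, 1))
        for (_, prev_rate), (label, rate) in zip(series, series[1:])
    ]

def dips(series):
    """Cohorts whose retention fell below the cohort immediately before them."""
    return [label for label, change in cohort_changes(series) if change < 0]

print(cohort_changes(month3))
print(dips(month3))
```

A single dip followed by a recovery, as in this made-up series, points to a one-off issue with that cohort rather than a lasting decline.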