Why product-driven data science is the only kind of data science
Getting the most from data science within a product team setup, from a data scientist's perspective.
Imagine the most powerful car engine in the world on the floor of a garage. Top speed? 0mph. Data science projects are no different.
Take a churn prediction engine - a core pillar of data science that, when delivered correctly, has the potential to deliver millions of pounds in revenue and cost savings.
To unlock this bounty, two things need to happen:
- The engine itself has to be well designed and engineered
- It needs to be placed inside a product that perfectly complements the output of the engine
The first point is well documented - there are hundreds of excellent blog posts, courses, and tools that build machine learning models to predict a given response.
The second point unlocks the value. And incredibly, it’s hardly ever talked about. We’re here to change that.
Here are five rules for delivering true product-driven data science.
Think backwards
The day after the project finishes, somebody will be using the output from your model. Who is using it? What does that look like? When do they use it? Why are they using it?
Never lose sight of the answers to these four questions - they drive everything you do. Before you start doing anything with the data, map out the ‘endgame’ scenario where the solution is deployed and fully operational. Once this target is fixed, you can start to work backwards to understand the tools, technologies and people that need to be involved to deliver the project.
Build a data pipeline before a model
A model without a data pipeline delivers no value. A data pipeline without a model delivers a benchmark value that can be built upon.
Don’t spend months tuning a model locally before thinking about how the output is surfaced to your internal teams. It’s surprising how often something simple can deliver real value, even before applying game-changing machine learning techniques.
Taking a product-driven approach ensures you can start delivering value straight away. Build a flow of data from source to insight first, so that end users are involved from the start of the project and see a product that is growing and improving.
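As a rough illustration of the idea, here is a minimal sketch of a pipeline that runs end to end with a simple rule in place of a model. Everything in it is hypothetical: the `Customer` fields, the 30-day inactivity rule and the function names are assumptions made for the example, not part of any real project.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Customer:
    # Hypothetical fields; a real source table will look different.
    customer_id: str
    days_since_last_login: int


def baseline_score(customer: Customer) -> float:
    """Assumed placeholder rule: 30 or more inactive days counts as full risk."""
    return min(customer.days_since_last_login / 30.0, 1.0)


def run_pipeline(
    customers: Iterable[Customer],
    score: Callable[[Customer], float] = baseline_score,
    top_n: int = 2,
) -> List[Tuple[str, float]]:
    """Score every customer and return the top_n highest-risk ones."""
    scored = [(c.customer_id, score(c)) for c in customers]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Because `score` is just a parameter, a trained model can later replace `baseline_score` without touching the rest of the flow, so end users keep seeing the same output while it improves.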
Deliver actions over accuracy
The success of a model is defined solely by how it affects the actions taken by your company. Even if the model is almost perfectly accurate, it only adds value if it has a ‘so-what’ outcome.
Here’s a striking example. Two data scientists come to you with two different models for churn prediction. The first is 100% accurate - it tells you exactly who will churn next week - but is so complex that there is no way to tell why each predicted customer will churn. The second is less accurate, but is immediately actionable by your marketing team because it tells you the factors that are influencing each customer's decision to leave.
Which would you choose?
A product-driven approach to data science says the second model wins every time. A prediction is just a number - on its own, it delivers no value. Only actions delivered through data products deliver value.
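To make the second kind of model concrete, here is a minimal sketch of a prediction that comes with its reasons attached. It assumes a simple linear (logistic) scoring model; the feature names, weights and bias are invented for illustration and are not fitted to any data.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical, hand-picked weights; a real model would learn these from data.
WEIGHTS: Dict[str, float] = {
    "support_tickets": 0.8,
    "days_inactive": 0.05,
    "months_as_customer": -0.1,
}
BIAS = -2.0


def explain_churn(features: Dict[str, float], top_n: int = 2) -> Tuple[float, List[str]]:
    """Return a churn probability plus the factors pushing it upwards."""
    # Each feature's contribution to the score is weight * value.
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # Keep only factors that increase risk, largest contribution first.
    drivers = sorted(
        (name for name, value in contributions.items() if value > 0),
        key=lambda name: contributions[name],
        reverse=True,
    )[:top_n]
    return probability, drivers
```

The list of drivers is what a marketing team can act on: it turns "this customer is at risk" into "this customer is at risk because of these factors".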
Modularise and abstract
The best products have well defined user-flows, modules and messaging to ensure the best possible experience for the user. Data science projects should be delivered in the same way.
Ensure one day a week is spent refactoring and tidying code, in the same way that your designers would spend time tweaking the product to improve user experience.
Write code and documentation in modules so that each function only performs one job. Nobody likes products that contain superfluous buttons and the same is true for code with unnecessary complexity.
Create a technical ‘readme’ describing how the solution works at a high level. Automate everything. Log everything. You’ll thank yourselves in a year’s time when you come to upgrade the codebase.
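As one possible shape for this, here is a small sketch in which each function does a single job and every step is logged. The step names, the `monthly_spend` field and the logger name are assumptions made for the example.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("churn_pipeline")  # hypothetical logger name

Row = Dict[str, object]


def drop_incomplete(rows: List[Row]) -> List[Row]:
    """One job: remove rows with no monthly spend recorded."""
    return [r for r in rows if r.get("monthly_spend") is not None]


def add_annual_spend(rows: List[Row]) -> List[Row]:
    """One job: derive annual spend from monthly spend."""
    return [{**r, "annual_spend": r["monthly_spend"] * 12} for r in rows]


# The pipeline is just an ordered list of single-purpose steps.
STEPS: List[Callable[[List[Row]], List[Row]]] = [drop_incomplete, add_annual_spend]


def run(rows: List[Row]) -> List[Row]:
    """Apply each step in order, logging the row count after every step."""
    for step in STEPS:
        rows = step(rows)
        logger.info("%s: %d rows remaining", step.__name__, len(rows))
    return rows
```

Adding, removing or reordering a step means editing `STEPS` only, and the log shows where rows were dropped when someone revisits the code a year later.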
Brand and sell the solution
People don’t just buy trainers, they buy Nikes. Products that are branded correctly stick around. With data science, your model is a product with a user base and a purpose and therefore it also deserves its own branding.
This can be something simple like a catchy name and logo. Or if it’s a suite of dashboards, make sure they’re all styled in the same way and navigation between them is seamless.
It’s imperative that data scientists work closely with designers and the whole project team to ensure the model is embedded into business-as-usual processes. It takes time to sell a project into the business after completion. But if it’s branded correctly, it will sell itself.
These are the five rules that ensure your data science project is product-driven. Get these right and your data science project will flourish every time.
David Foster is Co-Founder of Applied Data Science, Made by Many's data science partners.