Because data is now everywhere, data science problems are becoming increasingly prevalent. Some enterprises have counted 50 to 100 data science use cases. To cope with the sheer number of projects, some leading organizations are creating data science teams whose general mission is to serve as a shared resource across the organization.
Data science projects can be characterized by their business impact, which helps analytics leaders understand the business benefits they deliver, from innovation and business understanding to prototyping, process refinement and firefighting. At the macro level, data science projects can deliver the following high-level business impacts, which we discuss in more detail throughout the note:
Innovation — Foster New Thinking Based on Data Science
Without data scientists and their knowledge, many issues surrounding the digital business age will remain unresolved, possibly even untouched. Data scientists frame complex business problems as machine-learning or operations research problems. They also know which new information should be collected internally or acquired from external sources to solve long-standing business issues in radically new ways. There are many examples of disruptive projects and new “business moments” made possible through data:
- In the 1990s, Google achieved its incredible success by using a previously untapped data source: the hyperlinks encoded in Web pages.
- Also in the 1990s, Amazon started one of the earliest recommendation services, which became one of the most prominent and lucrative data science projects in history. Reportedly, 15% to 20% of Amazon’s retail business is attributable to these simple product recommendations. The service also became a desirable feature in its own right, with customers wanting to explore related items for any given product.
- UPS On-Road Integrated Optimization and Navigation (ORION) revamped route optimization using many new data sources. It has enabled UPS to significantly improve its routing schedules, saving hundreds of millions of dollars per year while improving customer service.
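To make the hyperlink example concrete, the following is a minimal sketch of PageRank-style link analysis, the publicly documented idea behind Google's early ranking. It is a simplified illustration, not Google's production system. The function name, the damping value of 0.85, the iteration count and the four-page toy web are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by link structure using simple power iteration.

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: every other page links to "home".
toy_web = {
    "home": ["products"],
    "products": ["home"],
    "blog": ["home", "products"],
    "about": ["home"],
}
scores = pagerank(toy_web)
top_page = max(scores, key=scores.get)
```

The point of the sketch is that the ranking uses only the links between pages, which is exactly the kind of data source that was available to everyone but exploited by few.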
Recommendations for analytics leaders:
- Your own thinking — You are your most important source of inspiration. Constantly think about your own business model, industry and understanding of new types of customer or equipment interaction points.
- Technology screening — Learn what you can from successful case studies from your own industry or other industries. But be cautious, because many publicly available case studies may not fully reflect exactly what happened.
- Induction from data — Examine how data expeditions can support your thinking process and how they can uncover novel and insightful patterns that teach you more about the underlying business mechanics.
Business Understanding — Explore yet Unknown Patterns in Data
Data scientists must engage with big data expeditions, especially when there is no clear objective other than to explore the data for insights and tidbits. Such expeditions are a form of inductive thinking or inductive reasoning (see Note 4) — an example of “letting the data speak.” The process can be tactical and ad hoc. Alternatively, it can be part of a more systematic practice in which you give the data science team a data dump for diving into and exploring.
Recommendations for analytics leaders:
- Use your data science team to spot anomalies in data before you notice any problems, not after a crisis happens. View it as a form of prevention or a means of solving problems early, as with police doing regular patrols or people going for routine medical checks.
- Ask your data science team to look at the data again when new information sources appear or when you gain new understanding. This can prove very worthwhile.
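As one illustration of spotting anomalies before a problem becomes visible, a team might start with a simple robust outlier check. The sketch below uses the modified z-score, based on the median and the median absolute deviation, with the commonly cited cutoff of 3.5. The function name and the daily complaint counts are hypothetical, and real monitoring would usually also account for trend and seasonality.

```python
from statistics import median


def flag_anomalies(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold."""
    center = median(values)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # No measurable spread, so nothing can be scored as unusual.
        return []
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]


# Hypothetical daily complaint counts with one unusual day (index 8).
daily_complaints = [21, 19, 22, 20, 23, 18, 21, 20, 58, 22]
suspicious_days = flag_anomalies(daily_complaints)
```

Using the median rather than the mean keeps a single extreme day from masking itself by inflating the baseline.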
Prototyping — Challenge the Status Quo with Radical New Solutions
Data science, and especially machine learning, excels at solving complex, data-rich business problems where traditional approaches, such as human judgment and exact solutions, increasingly fail or deliver inferior results. Data science methods have often been shown to deliver superior results when the space of critical variables is high-dimensional and very noisy.
Hundreds of new business problems exist that data science teams could tackle. Companies are already using data science teams for tasks such as:
- Improving product categorization. Many large online retailers realize that their product classification may have errors or not fit the way customers think about products or want to access them. Data science teams are seeking to improve product categorization by using all available features. These include: look, shape, purpose, codes (such as European Article Numbering and North American Industry Classification System codes), product text descriptions and user-generated tags.
- Predicting more accurately which passengers who buy airline tickets will fail to arrive for their flights. More accurate predictions enable airlines to overbook their flights more precisely. This minimizes lost revenue from empty seats while reducing the risk that passengers arrive to find no seat available for them.
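The link between no-show prediction and overbooking can be sketched with a small calculation. The example below assumes, purely for illustration, that every passenger shows up independently with the same probability, so the number of arrivals follows a binomial distribution. The function names, the 90% show rate and the 5% risk tolerance are hypothetical. A production model would predict a separate probability per passenger.

```python
from math import comb


def prob_overflow(tickets, capacity, show_rate):
    """Probability that more passengers arrive than there are seats."""
    return sum(
        comb(tickets, k) * show_rate**k * (1 - show_rate) ** (tickets - k)
        for k in range(capacity + 1, tickets + 1)
    )


def max_tickets(capacity, show_rate, max_risk):
    """Largest number of tickets whose overflow probability stays within max_risk."""
    tickets = capacity
    # Cap the search at twice the capacity as a simple safety limit.
    while (
        tickets < 2 * capacity
        and prob_overflow(tickets + 1, capacity, show_rate) <= max_risk
    ):
        tickets += 1
    return tickets


# Hypothetical flight: 100 seats, 90% show rate, 5% tolerated overflow risk.
sellable = max_tickets(capacity=100, show_rate=0.9, max_risk=0.05)
```

The better the show-rate estimate, the closer the airline can sell to the true limit without raising the chance of denied boarding.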
Refinement — Continuously Improve Existing In-Production Solutions
Most data scientists in the industry work in the production part of the business. In such areas, established models are already “in production.” For example:
- Banks, retailers, telcos and insurance companies are constantly refining their existing customer segmentation to gain a better understanding of customer profitability and behavior, and to optimize customer engagement.
- Retailers keep recalibrating propensity-to-buy models, while online retailers are constantly improving and updating their price elasticity predictions to optimize dynamic pricing.
- Financial services providers are continuously working to improve their risk models — the more accurate their assessment of risk, the better their chances of profitability.
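As a small illustration of the kind of model that gets recalibrated in production, price elasticity is often estimated as the slope of a regression of log quantity on log price. The sketch below fits that slope by ordinary least squares. The function name and the data are assumptions: the quantities are synthetic, generated from a known elasticity of -1.5 so that the estimate can be checked.

```python
from math import log


def estimate_elasticity(prices, quantities):
    """Estimate price elasticity as the OLS slope of ln(quantity) on ln(price)."""
    xs = [log(p) for p in prices]
    ys = [log(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Synthetic demand data generated with a known elasticity of -1.5.
prices = [8, 9, 10, 11, 12]
quantities = [1000 * p**-1.5 for p in prices]
elasticity = estimate_elasticity(prices, quantities)
```

Refitting such an estimate as new sales data arrives is one simple form of the continuous refinement described above.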
Recommendations for analytics leaders:
- Use your data science team to support production teams in creating and improving enterprise-wide model management and performance monitoring.
- Use your data science team to help production teams create a more homogeneous and cutting-edge compute architecture in terms of hardware, cloud and software stack.
- Ensure your data science and production teams jointly explore the external data landscape and deploy cutting-edge algorithms (for example, ensemble techniques).
Firefighting — Identify the Drivers of Certain Upcoming Situations
Sometimes a crisis is almost impossible to avoid, because the insights that would have revealed the underlying issues are so well hidden. In such cases, use your data science team to help resolve the crisis. This use is a variation of the big data expedition, and many analytics projects are triggered by crises. When you engage a data science team in this way, you already know the “symptom” of the crisis. For example:
- Customer complaints have suddenly risen
- Customer retention has fallen dramatically
- Quality defects have increased
- Profitability has dropped
This means that the data science team has to identify “only” the cause, which narrows the datasets it must scrutinize.
Everything else in this use scenario is very similar to the work the team does in big data expeditions. As in those expeditions, the team does not know at the outset whether it can identify the cause of the problem. Indeed, it may never be able to identify the cause.
Ensure that senior data scientists are part of innovation projects; only then can you be sure not to miss innovations that can be framed as data science projects. Also, use your data science team to support production teams in continuously improving enterprise-wide model management and performance monitoring. Create a portfolio of analytical scenarios, including those your organization is already executing or planning, to better rationalize funding decisions for data science projects.