The three key ingredients of successful AI
What your organization needs to get business value from machine learning and artificial intelligence
In 2020, many organizations are convinced of the added value of data science. Significant resources have been invested to start new, modern data departments. Once these departments were up and running, however, turning their output into concrete business value proved considerably more difficult than expected. What happened?
To explain this phenomenon we will look into the key ingredients that are required for the successful implementation of machine learning in business processes. Of these three important pillars, two are frequently neglected, leading to disappointing results. These ingredients are:
- Strong machine learning capabilities
- Fast and reliable deployment of machine learning solutions
- Seamless integration of machine learning solutions in the business process
The need for a strong data team to get results is obvious and gets plenty of attention. The effort required for fast and reliable deployment of solutions is often underestimated, and business process integration usually follows only as an afterthought. We will look at this last ingredient first because, in our opinion, this is where projects should start.
Most machine learning projects only add value once properly integrated into a business process. While this may appear obvious, you would be surprised how frequently the required effort is underestimated. A lack of trust in the output of the new solution is one of the leading causes of machine learning project failure. Another common issue is a lack of understanding between data scientists and their business partners, which can lead to a group of competent people solving the wrong business problem. Giving enough attention to business integration is essential to a successful machine learning project.
Impact and continuity
Machine learning can generally improve business processes in two ways: by improving the quality of complex decision-making and by automating a decision-making process at scale. Algorithmic trading and predictive maintenance are examples of complex decisions that can be improved by training machine learning models on massive amounts of data. Automatic decision-making offers a scaling advantage: while an expert employee might make better decisions, a machine learning model can scale to making billions of these decisions at sub-second processing times. Common examples of this type of business case are recommender systems to personalize websites and natural language processing to analyze live feedback from customers.
It is important to measure the performance of the solution and the improvement it brings to the business process. Because of this, machine learning solutions are often trialed in the form of a proof of concept. At that stage the integration is by definition improvised, and the performance measurements are ad hoc and opportunistic. This is acceptable at the start, but inadequate for the long term.
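As a minimal sketch of what a repeatable measurement could look like, the snippet below compares a business metric for the existing process against the same metric for the model-assisted process. The function name `relative_uplift` and the sample figures are hypothetical and chosen only for illustration; a real evaluation would also need a sound experimental setup and a significance test.

```python
from statistics import mean


def relative_uplift(baseline: list[float], treatment: list[float]) -> float:
    """Relative improvement of the treatment group's mean over the baseline mean.

    Both lists hold the same business metric (for example, revenue per day),
    measured without and with the machine learning solution respectively.
    """
    if not baseline or not treatment:
        raise ValueError("both groups need at least one measurement")
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean is zero, relative uplift is undefined")
    return (mean(treatment) - base) / base


# Hypothetical daily figures for the old process and the model-assisted one.
old_process = [100.0, 110.0, 90.0]
with_model = [115.0, 105.0, 110.0]
uplift = relative_uplift(old_process, with_model)
print(f"Relative uplift: {uplift:.1%}")
```

Computing the same number in the same way on every release is what turns an ad hoc proof-of-concept measurement into a structural one.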
This gap can be closed by anticipating the preferred end situation and by investing time, money and resources in the solution to accelerate to the required level as soon as possible. Long term impact requires structural quality and continuity.
The business integration of machine learning solutions is human labor. It requires expertise in the business domain, software development, business leadership, and strong machine learning knowledge. Making an impact using machine learning requires gathering a team with the right mix of skills.
Deployment: continuous delivery for machine learning
Integration in a business process is only truly possible once a machine learning model is available on a production system. Making a model available in this way is known as deploying it. A successful deployment process is fast, reliable and predictable. In a perfect world, your organization is capable of continuously improving and deploying machine learning models. There are few things more frustrating than waiting endlessly for a promising application to become available for use.
A slow deployment process hinders user acceptance and decreases trust in the application. This can be disastrous for data applications because trust is one of the critical success factors of successful business integration.
Most machine learning applications have an additional risk: they show decay over time. The world is changing continuously, including the relationship between the input and the output of the models. This means the value of a model decreases with time. The more volatile the changes in a domain, the stronger this effect will be. This concept is commonly referred to as drift.
A slow deployment process directly hurts the business value of a solution when drift occurs. The remedy seems obvious: frequently retrain the model with new data. Unfortunately, this is difficult in practice. Architecting a continuous integration and delivery process for machine learning requires a high level of organizational maturity and a large degree of standardization and automation.
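To make the idea of drift concrete, here is a minimal sketch of one simple way to detect it: track the model's recent prediction error and flag the model for retraining once that error rises well above the level measured at deployment. The class name `DriftMonitor`, the 20% tolerance and the window size are assumptions made for this illustration, not a prescribed method; production setups typically also monitor the input data itself.

```python
from collections import deque
from statistics import mean


class DriftMonitor:
    """Tracks recent absolute prediction errors and compares them to a baseline."""

    def __init__(self, baseline_error: float, tolerance: float = 0.2, window: int = 100):
        self.baseline_error = baseline_error  # error measured when the model was deployed
        self.tolerance = tolerance            # allowed relative increase before retraining
        self.errors = deque(maxlen=window)    # only the most recent `window` errors are kept

    def record(self, predicted: float, actual: float) -> None:
        """Store the absolute error of one prediction once the true outcome is known."""
        self.errors.append(abs(predicted - actual))

    def rolling_error(self) -> float:
        """Mean absolute error over the current window (0.0 if nothing is recorded)."""
        return mean(self.errors) if self.errors else 0.0

    def needs_retraining(self) -> bool:
        """True once the window is full and the error exceeds baseline plus tolerance."""
        if len(self.errors) < self.errors.maxlen:
            return False
        return self.rolling_error() > self.baseline_error * (1 + self.tolerance)


# Hypothetical usage: a tiny window keeps the example short.
monitor = DriftMonitor(baseline_error=1.0, tolerance=0.2, window=3)
for predicted, actual in [(10, 9), (10, 9), (10, 9)]:
    monitor.record(predicted, actual)
print(monitor.needs_retraining())  # error still at the baseline level

for predicted, actual in [(10, 7), (10, 7), (10, 7)]:
    monitor.record(predicted, actual)
print(monitor.needs_retraining())  # error has tripled
```

A check like this only signals that retraining is needed. Acting on it quickly still depends on the automated deployment process described above.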
Standardization and automation
The level of automation required for continuous delivery in the machine learning domain is rarely reached. Achieving it often becomes the responsibility of the people who build the machine learning models. While there is no lack of intelligence or creativity there, most of the work in getting to this high level of automation involves software engineering and IT operations, which are far from the core competencies of data scientists. The result is often a codebase full of stitched-together scripts that are difficult to maintain and hand over. Even relatively young and tech-focused companies like LinkedIn suffer from this.
We have been working on a solution that helps with both standardization and automation in a user-friendly manner. Cubonacci is a code-first platform that helps organizations manage and deploy machine learning at scale. It lets data scientists and machine learning engineers customize what they need, while taking care of scaling the infrastructure, keeping track of experiments and deployments, and scheduling jobs to keep models up to date. This addresses part of the issues we have discussed. You can follow us on LinkedIn here.
Data science at scale: human labor
No matter how powerful your tools are, there is no such thing as successful machine learning without a functioning data team. The field is developing rapidly and teams cannot wait to try out innovations and learn new skills. How do you manage that properly? This is a difficult question. A dense but excellent practical manual on this topic, “The Care and Feeding of Data Scientists: How to Build, Manage and Retain a Data Science Team”, can be found here as a free pdf.
From our experience in the field, we see that the following aspects are fundamental for the development of a successful data team:
- The right mix of people
- The means for proper support tooling
- A good understanding of, and relationship with, business partners
- Clear objectives that are connected to higher-level business goals
- Regular exchanges with experts outside of the team
- Autonomy on the machine learning approach
- Deployment as a first-class citizen
- Management has to be a full-fledged partner
The key to a good data team is having access to the right people. Finding and retaining specialists requires an environment that allows them to be creative and to have an impact. Give your team the attention, space, and the right tools. You will be surprised by the speed at which they will deliver innovative solutions for your organization.
Written together with Jan van der Vegt