If you are building out an MLOps practice in your organization, you must start by making sure your data scientists can work efficiently. The last thing you want is data scientists who spend most of their time on tasks like data cleaning or manual model deployments, which should be almost entirely automated.
To enable your data science practice, you must adopt machine learning infrastructure appropriate to your organization's business case. This provides better experimentation tooling and makes your data scientists more productive. The process includes building a strategy to adopt the right tools for data analysis, experimentation, feature stores, training, ML pipelines, model registries, monitoring, and many other key ML activities. With the right infrastructure, your organization will see faster model iteration and ultimately more results in day-to-day business operations.
Here at Bitstrapped, we have navigated the pains and blind spots of developing MLOps practices through our work with clients. Because this is such a recurring challenge for organizations, we have decided to break down the production ML lifecycle into three distinct architectures. Each architecture fits a stage of the ML lifecycle, and our aim in writing this article is to help your organization understand how to adopt MLOps at your given stage.
Here are the architecture strategies for each stage of MLOps:
The ability to excel at each stage, and ultimately to progress, depends on an organization's infrastructure readiness. As an organization matures in its machine learning journey and increases the number of models available to it, it will want to evolve from one architecture to the next. This is analogous to how a startup goes from “0-1”, and in this case it continues with “1-2” and “2-3”.
Below are the three architecture patterns we have observed as common solutions to the requirements present at each of the three maturity levels.
MLOps for organizations experimenting with their first model
The following architecture suits organizations that are focused primarily on validating a machine learning use case or proof of concept. From a technical perspective, at this stage you are training models manually in Jupyter notebooks (or similar data science tools) on data scientists' local laptops, exporting them, and “throwing them over the wall” to engineering to move into production. This architecture primarily supports the single-product, single-model organization.
MLOps for organizations aiming to deploy more than one model into production
As an organization, you’ve been able to deploy a model but are unable to monitor its performance or improve it. You may also be thinking about how best to A/B test or roll out new versions of your model. Furthermore, you have started collecting labels to improve model performance but lack automation for re-triggering training pipelines. This tier is a good fit for the single-product, multi-model organization.
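To make the two gaps at this stage concrete, here is a minimal Python sketch of what "automation for re-triggering training" and a basic A/B rollout might look like. It is an illustration only, not a description of any particular tool: the names `RetrainPolicy`, `should_retrain`, and `assign_variant`, along with the threshold values, are hypothetical and would need to be adapted to your own pipeline and metrics.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrainPolicy:
    """Hypothetical thresholds; tune these for your own use case."""
    min_new_labels: int = 1000        # retrain once this many new labels have arrived
    max_accuracy_drop: float = 0.05   # or once live accuracy falls this far below baseline


def should_retrain(new_labels: int, baseline_accuracy: float,
                   live_accuracy: float,
                   policy: RetrainPolicy = RetrainPolicy()) -> bool:
    """Return True when either trigger condition in the policy is met."""
    enough_labels = new_labels >= policy.min_new_labels
    degraded = (baseline_accuracy - live_accuracy) >= policy.max_accuracy_drop
    return enough_labels or degraded


def assign_variant(user_id: str, candidate_share: float) -> str:
    """Deterministically route a user to the 'current' or 'candidate' model.

    The user ID is hashed into a bucket in [0, 1), so the same user always
    sees the same model version for a given candidate_share.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 2**32
    return "candidate" if bucket < candidate_share else "current"
```

In practice a scheduler or monitoring job would call something like `should_retrain` with real label counts and metrics, and kick off the training pipeline when it returns `True`; `assign_variant` would sit in front of the serving layer so that a fixed share of traffic reaches the new model version.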
MLOps for teams that are leveraging machine learning across multiple products, teams, or subdivisions of the organization, who need a consistent way to unify and manage operations, data, and model lifecycles
As an organization, you are now not only running multiple features powered by ML, but you also have distinct product teams that each manage and run their own machine learning systems. Some models share data, and you are finding either data inconsistency across models or far too much data duplication. Each team creates its own version of the “same” data features to train its models. This tier is for the multi-product, multi-model organization.
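The duplicated-feature problem described above is the one that shared feature definitions (the idea behind a feature store) aim to solve. The Python sketch below shows the core idea under simplifying assumptions: an in-memory registry where each feature is defined once and reused by every team. `FeatureRegistry` and the example feature names are hypothetical, and a real feature store would also handle storage, versioning, and serving.

```python
from typing import Callable, Dict, List

FeatureFn = Callable[[dict], float]


class FeatureRegistry:
    """Hypothetical in-memory registry: one definition per feature name."""

    def __init__(self) -> None:
        self._features: Dict[str, FeatureFn] = {}

    def register(self, name: str) -> Callable[[FeatureFn], FeatureFn]:
        """Decorator that registers a feature and rejects duplicate names."""
        def decorator(fn: FeatureFn) -> FeatureFn:
            if name in self._features:
                raise ValueError(f"feature {name!r} is already defined")
            self._features[name] = fn
            return fn
        return decorator

    def compute(self, names: List[str], record: dict) -> Dict[str, float]:
        """Compute the requested features for one raw record."""
        return {name: self._features[name](record) for name in names}


registry = FeatureRegistry()


@registry.register("order_value_usd")
def order_value_usd(record: dict) -> float:
    return record["quantity"] * record["unit_price_usd"]


@registry.register("is_repeat_customer")
def is_repeat_customer(record: dict) -> float:
    return 1.0 if record["previous_orders"] > 0 else 0.0
```

Two teams training different models would both call `registry.compute([...], record)` with the feature names they need, so that a feature such as `order_value_usd` is computed the same way everywhere instead of being reimplemented per team.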
We view building a durable MLOps practice as a prerequisite to transforming your product or service offering using machine learning. Instituting the right MLOps architecture for your stage in the ML journey will ensure your engineering and data science teams have a suitable environment for model deployment, A/B testing, and rollouts of new model versions. Whether you are at the Minimum Viable stage, the Production stage, or building out an Enterprise ML architecture, the infrastructure you choose to invest in will contribute greatly to your likelihood of success.
If you are serious about ML and what it can do for your business, you must be just as serious about the infrastructure that makes this all possible. These infrastructure environments can decide the fate of your ML projects and the satisfaction of your team. Getting infrastructure right at each stage unlocks the possibility of graduating to the next stage. The ultimate goal is to incorporate ML into your product or service to give you a competitive edge in your business.
If you'd like to dive deeper into this ML architecture approach, please reach out via our contact page.