Chapter 1: The Case for MLOps

Typical data science work involves extensive data cleaning, exploration, and analysis. In some projects, you may need to run statistical analyses or create data visualizations to help policymakers understand the impact of an intervention or program. In others, you might experiment with different algorithms and hyperparameters to build the best-performing ML model for a given objective, such as predicting fraudulent transactions or developing a recommendation system.

However, productionizing ML models is a different challenge altogether. ML training scripts must be developed, tested, and packaged into an automated pipeline. Data, models, and code all need to be versioned carefully to ensure traceability. Infrastructure must be set up and secured properly to run the model and store the data. All of this ensures your code and models are reliable, secure, and resilient.

Given the significant differences between these tasks, it's no surprise that even experienced data scientists often struggle with productionizing ML models. We explore these differences in detail below through three key areas:

Capabilities: The skills required go beyond what data scientists typically need.
Tools: The tools used for production are quite different from those familiar to data scientists.
Processes: Deploying into production involves strict, rigorous processes that can feel at odds with the flexibility and experimentation that data science work often requires.

1.1. Tools & Skills Required Are Different

The core expertise of data scientists lies in understanding ML algorithms, statistical methods, and data visualization principles. When it comes to coding, data scientists typically focus on processing data and training ML models. However, productionizing ML models requires foundational knowledge of DevOps and software engineering. This includes writing robust code, creating tests, managing version control, and handling dependencies.

Consider Python as an example. For most data scientists, installing Python is straightforward: download and install Anaconda, run conda install jupyterlab, and write Python code within the JupyterLab environment. If extra libraries are needed, they can be easily installed through Anaconda.

However, deploying ML models introduces additional complexities. Some key challenges include:

Managing Python version and library dependencies: ML scripts often rely on libraries like scikit-learn or pandas, each of which is released in many versions that can behave differently. The exact versions used during development need to be recorded and packaged alongside the ML script so that it runs the same way in production.
Deploying as an API: A common method for deploying ML models today is as an API. This requires a basic understanding of APIs, web applications, and networking.
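The first challenge above can be illustrated with a minimal sketch. It records the Python version and the installed versions of a few libraries, then compares that record against another environment. The function names build_manifest and find_mismatches and the package list are hypothetical, chosen only for illustration. In practice, a pinned requirements file or a conda environment file usually does this job.

```python
"""Record and compare the Python and library versions an ML script depends on."""
import json
import sys
from importlib import metadata


def build_manifest(packages):
    """Return the running Python version and the installed version of each package."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "packages": versions,
    }


def find_mismatches(expected, actual):
    """List readable differences between an expected and an actual manifest."""
    problems = []
    if expected["python"] != actual["python"]:
        problems.append(
            f"python: expected {expected['python']}, found {actual['python']}"
        )
    for name, wanted in expected["packages"].items():
        found = actual["packages"].get(name)
        if found != wanted:
            problems.append(f"{name}: expected {wanted}, found {found}")
    return problems


if __name__ == "__main__":
    # Hypothetical package list; replace it with the libraries your script imports.
    manifest = build_manifest(["scikit-learn", "pandas"])
    print(json.dumps(manifest, indent=2))
```

One way to use this is to save the manifest when the model is trained, then call find_mismatches at start-up in production and refuse to serve predictions if the list is not empty.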

For most data scientists, Jupyter notebooks are the main tool for data preparation, exploratory data analysis, data visualization, and training ML models. While Jupyter notebooks are convenient for these purposes, they are not well suited for production deployments. Here are a few reasons:

Inconsistent code execution: Have you ever found that your code no longer runs correctly after you restart your Jupyter notebook? This usually happens because the cells were originally run in a different order, or because you made ad hoc changes that were never saved. Jupyter notebooks allow such situations to happen, which can make it hard for others to replicate your steps correctly.
Lack of proper version control: Collaborating with other data scientists in Jupyter notebooks can be quite hard, especially if you make changes to the same parts of the notebook. Notebook files store code and cell outputs together as JSON, which makes changes difficult to compare and merge. As a result, you may end up with multiple notebooks titled notebook_v1_editedbyA.ipynb or notebook_v1_editedbyA_byB.ipynb. While this may be alright when rushing for a presentation, it is very unreliable when you are planning to deploy an ML model for a system that has to work 24/7.
Not designed to run as a script: Jupyter notebooks are built to be run interactively in the JupyterLab interface, whereas most production systems execute scripts from the command line. Tools such as nbconvert and papermill can execute a notebook without the interface, but this adds another layer to maintain. Moreover, the interactive interface itself uses some computational resources, so the code may not run as efficiently as it could.
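By contrast, the same kind of work can be written as a script that always runs from top to bottom and takes its inputs from the command line. The sketch below is only illustrative: the file name train.py, the functions load_column and fit_baseline, and the mean and standard deviation "model" are all hypothetical stand-ins for a real training step.

```python
"""train.py: a hypothetical training step written as a command-line script."""
import argparse
import csv
import json
import statistics


def load_column(path, column):
    """Read one numeric column from a CSV file that has a header row."""
    with open(path, newline="") as f:
        return [float(row[column]) for row in csv.DictReader(f)]


def fit_baseline(values):
    """Stand-in for real model training: summarise the column."""
    return {"mean": statistics.fmean(values), "stdev": statistics.pstdev(values)}


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Fit a baseline model from a CSV column."
    )
    parser.add_argument("--data", help="path to the input CSV file")
    parser.add_argument("--column", help="name of the numeric column to use")
    parser.add_argument("--out", default="model.json", help="where to write the model")
    args = parser.parse_args(argv)
    if args.data is None or args.column is None:
        # Called without the needed inputs: show usage instead of guessing.
        parser.print_usage()
        return None

    # Each run starts from a clean process, so no state is left over from earlier runs.
    model = fit_baseline(load_column(args.data, args.column))
    with open(args.out, "w") as f:
        json.dump(model, f, indent=2)
    return model


if __name__ == "__main__":
    main()
```

Assuming a file called transactions.csv with an amount column, the script could be run as python train.py --data transactions.csv --column amount --out model.json. Because inputs and outputs are explicit arguments, a scheduler or pipeline can call it in the same way every time.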

1.2. Deployment Processes Are Different

IT systems are treated very carefully in many agencies, and with very good reason. These systems support mission-critical business functions, be it grant applications, student e-learning, or port operations. They also hold sensitive data, such as income declarations or health records. If any of these systems were to go down, it would severely affect the agency's ability to do its job.

As such, agencies follow strict processes when code changes are introduced to IT systems, with numerous safeguards and checks in place to ensure that these changes do not cause problems or introduce vulnerabilities. This can add significant friction to the entire process, causing even simple changes to take days or weeks to implement.

Most importantly, this is often at odds with how data science work is done. Having access to production data and being able to update ML models frequently are both necessary for data scientists to build the best-performing ML models, but both are extremely difficult under a risk-averse approach to managing IT systems.

1.3. Remarks

We hope this chapter has piqued your interest. In the next chapter, we’ll explore the different MLOps maturity levels and help you identify the key areas to focus on based on your agency’s starting point. Building MLOps capabilities is a journey that takes time, but with steady progress, you’ll gain valuable insights into effectively deploying ML in the public sector.