This post was originally published by V Sharma at Medium [AI]
A widely cited 2019 study (link) discusses why more than 80% of data science projects never make it to production. Since then, machine learning operationalization has been gathering pace: an Oct 2020 report shows MLOps trending across all platforms. Various organizations are now composing their data science teams with both development and deployment in mind.
The chart below describes the various roles involved in an ML model deployment team (Link). If you look closely, you will notice two key roles during machine learning model development: the ML engineer and the data scientist. ML engineers are supposed to know the complete model deployment lifecycle and share that know-how with data scientists during development. In practice, however, we often see data scientists developing ML models in silos, running various experiments and selecting the best model based on its performance metrics. They then hand the finished model over to the ML engineers for deployment.
If data scientists develop models independently, without knowing the ML model deployment lifecycle, we may not end up with the best model for the organization in the long term. No doubt the data scientist will have selected the most accurate model with low variance and bias; however, accuracy is not the only metric that matters from a bird's-eye view of ML model management.
Now look at the chart below (link given): it clearly shows that ML model development is a small part of a much larger picture, and all of these components are interconnected. For any ML model to succeed at the enterprise level, all of these pieces must work in sync. Yet downstream MLOps choices often depend on the algorithms and data preprocessing steps selected by data scientists.
In this article, I would like to share my thoughts on the repercussions of the various decisions data scientists make during the model development stage. This is not a research paper or an exhaustive list; it is simply my opinion on the end-to-end ML model management lifecycle. In my view, if data scientists understand the complete ML lifecycle, they can make better choices. Nothing beats hands-on experience, but failing that, at least an understanding of the steps involved and the choices available at each one will help data scientists.
Now, let's look at chart 3 below, which shows the automated ML pipeline deployment cycle. The "Source Code" block is often the point at which data scientists complete the model (after various development experiments) and hand it to ML engineers. Let's see how the choices made at this stage can have downstream repercussions. The following are the points I could think of; feel free to mention in the comments any I have missed.
Continuous Integration Testing
At this stage, you incorporate various ML-specific tests into your continuous integration lifecycle, i.e. unit tests for your model and features, differential tests, extreme tests, and so on. The main reason for building this CI pipeline is that when the model is updated in the future, you should not have to spend any time on its integration. Hence you need robust built-in testing in the CI pipeline.
For IT applications, CI testing practices are well evolved and most DevOps engineers are well aware of them. For ML models, this responsibility falls on the shoulders of ML engineers. ML models are very different from conventional IT applications: they behave according to the data provided to them. If ML engineers were not deeply involved during model development, they will not be able to envision the many data- and model-related scenarios needed to create ML-specific built-in tests in the pipeline.
Hence, data scientists who understand the nuances of the CI pipeline can provide better inputs for its testing stage.
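As an illustration, a minimal differential test of the kind mentioned above might look like the following sketch. Here `old_predict` and `new_predict` are hypothetical stand-ins for the deployed and candidate model versions, and the 5% disagreement threshold is an arbitrary illustrative choice.

```python
# Sketch: a differential test that compares a candidate model's predictions
# against the currently deployed model on a fixed evaluation dataset, and
# fails the CI build if too many predictions change between versions.

def differential_test(old_predict, new_predict, dataset, max_disagreement=0.05):
    """Return (passed, disagreement_rate) for the two model versions."""
    changed = sum(1 for x in dataset if old_predict(x) != new_predict(x))
    rate = changed / len(dataset)
    return rate <= max_disagreement, rate

# Toy stand-ins: two threshold classifiers that disagree on a narrow band.
def old_predict(x): return int(x > 0.50)
def new_predict(x): return int(x > 0.52)

dataset = [i / 100 for i in range(100)]   # fixed evaluation set
passed, rate = differential_test(old_predict, new_predict, dataset)
print(passed, rate)  # True 0.02 — only 2% of predictions changed
```

In a real pipeline the same idea extends to regression outputs (compare within a tolerance) and to feature pipelines (assert schema and value ranges).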
Continuous Deployment Testing
At this stage, you incorporate various ML-specific steps into your continuous deployment lifecycle, i.e. testing your HTTP endpoint, deploying to a pre-production environment, and a manual or automated push to the production environment.
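To illustrate the endpoint-testing step, here is a minimal smoke-test sketch. `predict_endpoint` is a hypothetical stub standing in for an HTTP call to a pre-production model service; a real pipeline would make a network request and the response fields would depend on your serving framework.

```python
# Sketch: a pre-deployment smoke test for a prediction endpoint.

def predict_endpoint(payload):
    # Stub response in a shape many model servers return (illustrative only).
    return {"status": 200, "prediction": 0.87, "model_version": "v2"}

def smoke_test(endpoint):
    resp = endpoint({"features": [1.0, 2.0, 3.0]})
    assert resp["status"] == 200, "endpoint not healthy"
    assert 0.0 <= resp["prediction"] <= 1.0, "prediction out of range"
    assert "model_version" in resp, "version tag missing (needed for rollback)"
    return True

print(smoke_test(predict_endpoint))  # True
```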
You do not want to push your ML model straight to production without knowing how it behaves in a real-world scenario. There are various deployment strategies to choose from (canary, multi-armed bandit, etc.). Again, this choice depends heavily on your business problem, the deployment iteration (major change vs. minor update), the cost of business decisions, infrastructure cost, and so on.
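For example, a canary rollout can be sketched as a stable hash-based traffic split, so that each user consistently sees the same model version. The function name, the 10% canary fraction, and the user-id scheme below are all illustrative assumptions.

```python
import hashlib

# Sketch: route a fixed fraction of traffic to the canary model version,
# keyed on a stable hash of the user id (not random per request, so a given
# user always lands on the same version).

def route(user_id: str, canary_fraction: float = 0.1) -> str:
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "canary" if bucket < canary_fraction * 100 else "stable"

assignments = [route(f"user-{i}") for i in range(1000)]
print(assignments.count("canary"))  # roughly 10% of the 1000 users
```

A multi-armed bandit strategy would replace the fixed fraction with one that shifts traffic toward whichever version is performing better.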
Hence, data scientists who understand the nuances of the CD pipeline and the deployment strategies can provide better inputs for it.
Model Monitoring Driven Continuous Training
Now you need to monitor your deployed model for data and concept drift. What do you do when you detect drift? You will obviously want to take action: either retrain the model (an automatic action) or trigger a workflow asking a data scientist to revalidate it (a manual action). For either of these to work effectively, you need to find out why the drift is happening, and this is where explainable and interpretable AI comes into the picture.
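As one concrete drift signal, here is a sketch of the Population Stability Index (PSI) in pure Python. The bin count and the rule-of-thumb thresholds (below 0.1 stable, above 0.2 meaningful drift) are common conventions, not part of any specific pipeline, and the Gaussian data is synthetic.

```python
import math
import random

# Sketch: Population Stability Index (PSI) as a simple data-drift signal,
# comparing the live feature distribution against the training distribution.

def psi(expected, actual, bins=10):
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(bins + 1)]

    def frac(sample, i):
        count = sum(1 for x in sample if edges[i] <= x < edges[i + 1])
        return max(count / len(sample), 1e-6)   # floor avoids log(0)

    return sum((frac(actual, i) - frac(expected, i))
               * math.log(frac(actual, i) / frac(expected, i))
               for i in range(bins))

random.seed(0)
train = [random.gauss(0, 1) for _ in range(5000)]
live_same = [random.gauss(0, 1) for _ in range(5000)]
live_drift = [random.gauss(1, 1) for _ in range(5000)]   # mean shifted by 1 std

print(psi(train, live_same) < 0.1)    # True — no drift detected
print(psi(train, live_drift) > 0.2)   # True — drift would trigger retraining
```

In practice the boolean check would feed an alerting or retraining workflow rather than a print statement.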
Again, that choice has to be made during the model development cycle. Perhaps an interpretable model makes more sense for your business problem than a complex neural-network-based model. As we know, explainable AI algorithms come with various ifs and buts for business-critical decisions. Hence, early in the ML project cycle, data scientists can decide to trade some model accuracy for interpretability.
Technical Debt
Technical debt is a very broad topic, and I will probably write another article on it. However, without looking at the bigger picture, many of a data scientist's choices can incur technical debt for the deployed ML model.
To exemplify one technical-debt concept, assume you are a data scientist at a bicycle manufacturer with an e-commerce platform. You have developed and deployed an ML model that recommends the top 10 bicycles based on customer searches. Now suppose another data science team in your company has, at the same time, deployed a dynamic pricing model. Ideally, the data scientists in both teams would be well aware of the interaction effects between these two models. But often the two deployments happen separately, ignoring the interactions, which can in turn hurt ROI quantification and the efficacy of model monitoring systems. This is known as the hidden feedback loop form of technical debt.
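The feedback loop in this example can be illustrated with a deliberately simplistic toy simulation, in which the pricing model discounts whatever the recommender ranks first, and cheaper items then attract more clicks. Every number and formula here is made up for illustration.

```python
# Toy simulation of a hidden feedback loop between two independently
# deployed models: a click-based recommender and a dynamic pricing model.

clicks = {"bike_a": 100, "bike_b": 95}    # nearly identical starting popularity
price = {"bike_a": 500.0, "bike_b": 500.0}

for day in range(30):
    top = max(clicks, key=clicks.get)     # recommender: rank by click count
    price[top] *= 0.99                    # pricing model: discount the top item
    for bike in clicks:                   # cheaper items attract more clicks
        demand = 10 * (550 - price[bike]) / 50
        clicks[bike] += max(int(demand), 0)

# The recommender's tiny initial preference is amplified day after day,
# because each model's output feeds the other's input.
print(clicks["bike_a"], clicks["bike_b"])
```

Neither model is wrong in isolation; the debt comes from monitoring and ROI numbers that silently reflect the interaction rather than either model's true quality.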
I have captured only my initial thoughts on this topic; it is a very broad one and many details can be added. Please leave your thoughts in the comments.