Description
Managing and monitoring AI models in production, also known as machine learning operations (MLOps), has become essential, creating a need for highly reliable MLOps platforms and frameworks. In the AI4EOSC project, to provide our customers with the best available tools, we reviewed the field of open-source MLOps and examined the platforms that serve as the backbone of machine learning systems. In particular, we recognized that experiment tracking can improve how the results of machine learning experiments are organised and analysed, as well as team collaboration and knowledge sharing. Every aspect of the machine learning lifecycle was reviewed, from workflow orchestration to drift detection.
Based on that study, and to help scientists achieve high model standards and implement MLOps practices, we deployed the MLflow platform for AI4EOSC and iMagine users, offer Frouros, a Python library for drift detection, and are developing a monitoring system for logging drift detection runs. The MLflow platform features a central remote tracking server, so that every AI experiment run, whether on the AI4EOSC platform or on any other resources, can be individually tracked and shared with other registered users if desired. The Frouros library combines classical and more recent algorithms for both concept and data drift detection.
In this contribution, the global MLOps landscape of the continuously growing AI world will be presented together with our practical implementation in the AI4EOSC project and lessons learned from our users.
| Topic | Needs and solutions in scientific computing: Platforms and gateway |
|---|---|