Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Hydrosphere allows you to A/B test your ML models in production.
A/B testing is a great way of measuring how well your models perform or which of your model versions is more effective and taking data-driven decisions upon this knowledge.
Production ML applications always have specific goals, for example driving as many users as possible to perform some action. To achieve these goals, it’s necessary to run online experiments and compare model versions using metrics in order to measure your progress against them. This approach allows to track whether your development efforts lead to desired outcomes.
To perform a basic A/B experiment on an application consisting of 2 variants of a model, you need to train and upload both versions to Hydrosphere, create an application with a single execution stage from them, invoke it by simulating production data flow, then analyze production data using metrics of your choice.
Learn how to set up an A/B application:
Hydrosphere has an internal Model Registry as centralized storage for Model Versions. When you build a Dockerized model and upload it to Hydrosphere or create new model versions, they get uploaded/stored to the configured model registry in the form of images. This organizes and simplifies model management across the platform and production lifecycle.
Hydrosphere is a language-agnostic platform. You can use it with models written in any language and trained in any framework. Your ML models can come from any background, without restrictions of your choices regarding ML model development tools.
In Hydrosphere you operate ML models as Runtimes, which are Docker containers packed with predefined dependencies and gRPC interfaces for loading and serving them on the platform with a model inside. All models that you upload to Hydrosphere must have the corresponding runtimes.
Runtimes are created by building a Docker container with dependencies required for the language that matches your model. You can either use our pre-made runtimes or create your own runtime.
The Hydrosphere component responsible for building Docker images from models for deployment, storing them in the registry, versioning, and more is Manager.
Hydrosphere users can use multiple model versions inside of the same Application stage. Hydrosphere shadows traffic to all model versions inside of an application stage.
Users can specify the likelihood that a model output will be selected as an application stage output by using the weight
argument.
Hydrosphere shadows traffic to all model versions inside of an application stage.
If you want to shadow your traffic between model versions without producing output from them simply set weight
parameter to 0
. This way your model version will receive all incoming traffic, but its output will never be chosen as an output of an application stage.
A Hydrosphere user can create a linear inference pipeline from multiple model versions. Such pipelines are called Applications.
For each model with uploaded training data, Hydrosphere creates an outlier detection (Auto OD) metric, which assigns an outlier score to each request. A request is labeled as an outlier if the outlier score is greater than the 97th percentile of training data outlier scores distribution.
You can observe those models deployed as metrics in your monitoring dashboard. These metrics provide you with information about how novel/anomalous your data is.
If these values of the metric deviate significantly from the average, you can tell that you experience a data drift and need to re-evaluate your ML pipeline to check for errors.
Right now Auto OD feature works only for Models with numerical scalar fields and uploaded training data.
Prediction Explanation service is designed to help Hydrosphere users understand the underlying causes of changes in predictions coming from their models.
Prediction Explanation generates explanations of predictions produced by your models and tells you why a model made a particular prediction. Depending on the type of data your model uses, Prediction Explanation provides an explanation as either a set of logical predicates (if your data is in a tabular format) or a saliency map (if your data is in the image format). A saliency map is a heat map that highlights parts of a picture that a prediction was based on.
Hydrosphere uses model-agnostic methods for explaining your model predictions. Such methods can be used on any machine learning model after they've been uploaded to the platform.
As of now, Hydrosphere supports explaining tabular and image data with Anchor and RISE tools correspondingly.
****Sonar sends data about any failed health checks of live production models and applications to Prometheus AlertManager. Once a user deploys a model to production, adds training data and starts sending production requests, these requests start getting checked by Sonar. If Sonar detects an anomaly (for example, a data check failed, or a metric value exceeded the threshold), AlertManager sends an appropriate alert.
Users can manage alerts by setting up AlertManager for Prometheus on Kubernetes. This can be helpful when you have models that you get too many alerts from and need to filter, group, or partly silence them. AlertManager can take care of grouping, inhibition, silencing of alerts, and routing them to the receiver integration of your choice. To configure alerts, modify the prometheus-am-configmap-<release_name>
ConfigMap.
For more information about Prometheus AlertManager please refer to its official documentation.
Data Projection is a service that visualizes high-dimensional data in a 2D scatter plot with an automatically trained transformer to let you evaluate the data structure and spot clusters, outliers, novel data, or any other patterns. This is especially helpful if your model works with high-dimensional data, such as images or text embeddings.
Data Projection is an important tool, which helps to describe complex things in a simple way. One good visualization can show more than text or data. Monitoring and interpretation of machine learning models are hard tasks that require analyzing a lot of raw data: training data, production requests, as well as model outputs.
Essentially, this data is just numbers that in their original form of vectors and matrices do not have any meaning since it is hard to extract any meaning from thousands of vectors of numbers. In Hydrosphere we want to make monitoring easier and clearer that is why we created a data projection service that can visualize your data in a single plot.
To start working with Data Projection you need to create a model that has an output field with an embedding of your data. Embeddings are real-valued vectors that represent the input features in a lower dimensionality.
Create a model with an embedding
field
Data Projection service delegates the creation of embeddings to the user. It expects that model will create embedding from input features and pass it as output vector. Thus embedding
field is required, models without this field are not supported. Data Projection also expects that output labels field is called class
and model confidence is called respectively confidence
. Other outputs are ignored.
Send data through your model
Check Data Projection service inside the Model Details menu
Inside Data Projection service you can see your requests features projected on a 2D space:
Each point in the plot presents a request. Requests with similar features are close to each other. You can select a specific request point and inspect what it consists of.
Above plot, there are several scores: global score, stability score, MSID score, etc. These scores reflect the quality of projection of multidimensional requests to 2D. To interpret scores you refer to technical documentation on Data Projection service.
In the Colorize menu, you can choose how to colorize model requests: by class, by monitoring metric or by confidence. Data Projection searchers specifically for output scalars class and confidence.
In the Accent Points menu, you can highlight the nearest in original space points to the selected one by picking the nearest variant. Counterfactuals will show you nearest points to selected but with a different predicted label.
Monitoring Dashboard lets you track your performance metrics and get a high-level view of your data health.
Monitoring Dashboard plots all requests streaming through a model version which are colored in respect with how "healthy" they are. On the horizontal axis we group our data by batches and on the vertical axis we group data by signature fields. In this plot cells are determined by their batch and field. Cells are colored from green to red, depending on the average request health inside the batch.
Drift Report service creates a statistical report based on a comparison of training and production data distributions. It compares these two sets of data by a set of statistical tests and finds deviations.
Drift report uses multiple different tests with p=.95 for different features:
Numerical features:
Levene's test with a trimmed mean
Welch's t-test
Mood's test
Kolmogorov–Smirnov test
Categorical features:
Chi-Square test
Unseen categories
Right now Drift Report feature works only for Models with numerical scalar fields.
Hydrosphere Serving Components for Kubeflow Pipelines provide integration between Hydrosphere model serving benefits and Kubeflow orchestration capabilities. This allows launching training jobs as well as serving the same models in Kubernetes in a single pipeline.
You can find examples of sample pipelines here.
The Deploy component allows you to upload a model, trained in a Kubeflow pipelines workflow to a Hydrosphere platform.
For more information, check Hydrosphere Deploy Kubeflow Component
The Release component allows you to create an Application from a model previously uploaded to Hydrosphere platform. This application will be capable of serving prediction requests by HTTP or gRPC.
For more information, check Hydrosphere Release Kubeflow Component