Hydrosphere is a language-agnostic platform. You can use it with models written in any language and trained in any framework. Your ML models can come from any background; Hydrosphere places no restrictions on your choice of model development tools.
In Hydrosphere you operate ML models as Runtimes: Docker containers that pack a model together with its predefined dependencies and a gRPC interface for loading and serving it on the platform. Every model you upload to Hydrosphere must have a corresponding runtime.
Runtimes are created by building a Docker container with dependencies required for the language that matches your model. You can either use our pre-made runtimes or create your own runtime.
The Hydrosphere component responsible for building Docker images from models for deployment, storing them in the registry, versioning, and more is Manager.
Features that make up Hydrosphere Platform
Drift Report service creates a statistical report based on a comparison of training and production data distributions. It compares these two sets of data by a set of statistical tests and finds deviations.
Drift Report runs multiple statistical tests at the 95% confidence level, chosen per feature type:
Numerical features:
Levene's test with a trimmed mean
Welch's t-test
Mood's test
Kolmogorov–Smirnov test
Categorical features:
Chi-Square test
Unseen categories
Right now the Drift Report feature works only for models with numerical scalar fields.
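The per-feature checks listed above can be sketched with SciPy. The snippet below runs the numerical tests on a synthetic training/production pair and a chi-square test on a categorical feature, flagging drift at the 95% confidence level. This is an illustration of the listed tests, not Hydrosphere's internal implementation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
training = rng.normal(loc=0.0, scale=1.0, size=1000)    # reference (training) sample
production = rng.normal(loc=0.5, scale=1.0, size=1000)  # mean-shifted production sample

alpha = 0.05  # 95% confidence level
pvalues = {
    "levene_trimmed": stats.levene(training, production, center="trimmed")[1],
    "welch_t": stats.ttest_ind(training, production, equal_var=False)[1],
    "mood": stats.mood(training, production)[1],
    "kolmogorov_smirnov": stats.ks_2samp(training, production)[1],
}

# Categorical feature: chi-square over a contingency table of category counts
train_cats = rng.choice(3, size=1000, p=[0.5, 0.3, 0.2])
prod_cats = rng.choice(3, size=1000, p=[0.3, 0.4, 0.3])
table = [np.bincount(train_cats, minlength=3), np.bincount(prod_cats, minlength=3)]
pvalues["chi_square"] = stats.chi2_contingency(table)[1]

drifted = {name: p < alpha for name, p in pvalues.items()}
print(drifted)
```

Note that tests sensitive to different properties disagree by design: the mean shift above is caught by Welch's t-test and Kolmogorov–Smirnov, while scale-oriented tests like Mood's may not flag it.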
Hydrosphere allows you to A/B test your ML models in production.
A/B testing is a great way of measuring how well your models perform or which of your model versions is more effective and taking data-driven decisions upon this knowledge.
Production ML applications always have specific goals, for example driving as many users as possible to perform some action. To achieve these goals, it’s necessary to run online experiments and compare model versions using metrics that measure your progress toward them. This approach lets you track whether your development efforts lead to the desired outcomes.
To perform a basic A/B experiment on an application consisting of two variants of a model, you need to train and upload both versions to Hydrosphere, create an application with a single execution stage from them, invoke it by simulating the production data flow, and then analyze the production data using metrics of your choice.
Learn how to set up an A/B application:
Hydrosphere has an internal Model Registry that serves as centralized storage for model versions. When you build a Dockerized model and upload it to Hydrosphere, or create new model versions, they are stored in the configured model registry as Docker images. This organizes and simplifies model management across the platform and the production lifecycle.
Hydrosphere users can use multiple model versions inside of the same Application stage. Hydrosphere shadows traffic to all model versions inside of an application stage.
Users can specify the likelihood that a model output will be selected as an application stage output by using the weight argument.
If you want to shadow traffic to a model version without producing output from it, simply set its weight parameter to 0. This way the model version will receive all incoming traffic, but its output will never be chosen as the output of the application stage.
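This selection behavior can be sketched as a weighted draw: traffic is shadowed to every version, and the weights only decide whose response becomes the stage output, with a weight of 0 meaning pure shadow mode. The snippet below is a minimal illustration, not Hydrosphere's actual routing code.

```python
import random

def pick_stage_output(outputs, weights, rng=random.Random(42)):
    """Pick which model version's output becomes the stage output.

    Every version in `outputs` has already processed the request
    (traffic is shadowed to all of them); the weights only decide
    whose response is returned. A weight of 0 means the version is
    never chosen, i.e. it runs in pure shadow mode.
    """
    candidates = [(o, w) for o, w in zip(outputs, weights) if w > 0]
    outs, ws = zip(*candidates)
    return rng.choices(outs, weights=ws, k=1)[0]

# model-a and model-b split output 80/20; model-c is a pure shadow
outputs = ["model-a", "model-b", "model-c"]
weights = [80, 20, 0]
picks = [pick_stage_output(outputs, weights) for _ in range(1000)]
print(picks.count("model-c"))  # 0: a shadow version's output is never selected
```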
Prediction Explanation service is designed to help Hydrosphere users understand the underlying causes of changes in predictions coming from their models.
Prediction Explanation generates explanations of predictions produced by your models and tells you why a model made a particular prediction. Depending on the type of data your model uses, Prediction Explanation provides an explanation as either a set of logical predicates (if your data is in a tabular format) or a saliency map (if your data is in the image format). A saliency map is a heat map that highlights parts of a picture that a prediction was based on.
Hydrosphere uses model-agnostic methods for explaining your model predictions. Such methods can be applied to any machine learning model after it has been uploaded to the platform.
As of now, Hydrosphere supports explaining tabular and image data with the Anchor and RISE tools, respectively.
A Hydrosphere user can create a linear inference pipeline from multiple model versions. Such pipelines are called Applications.
Anomaly detection focuses on identifying data objects that differ from our expectations. Such data may be produced by noise, errors, or unexpected events, but unusual data can also result from rare yet correct behavior, which often leads to interesting findings and motivates further investigation. For these reasons, it is necessary to develop techniques that allow us to identify such unusual events. We assume that such events may produce objects generated by a "different mechanism", which indicates that these objects might contain unexpected patterns that do not conform to normal behavior.
At first sight, anomaly detection looks like a classification task: differentiate between normal and abnormal events. However, we usually have very few such abnormalities, or none at all, which turns outlier detection into an unsupervised problem, a type of machine learning that looks for previously undetected patterns in a data set with no pre-existing labels and a minimum of human supervision. For all these reasons, it becomes very complicated to establish the most relevant outlier detection algorithm for each case.
Hydrosphere has a built-in engine that automatically creates an outlier detection metric for each model. The whole process consists of sequential steps and can be split into several stages:
As a starting point, Hydrosphere checks that the training data has an appropriate format for each feature
Applying the Mass Volume curve method to find an appropriate anomaly detection model based on the training data
Uploading that model to the cluster and assigning it as an anomaly detection metric to the initially trained model
As you may have noticed, the core mechanism that chooses the model is based on the Mass Volume curve, which can be regarded as a performance metric for unsupervised anomaly detection. You can learn more about this method and metric by following this link. At the moment, Hydrosphere chooses among three anomaly detection algorithms: Isolation Forest, Local Outlier Factor, and One-Class Support Vector Machines. Additional models will be added later.
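Hydrosphere's selection via the Mass Volume curve is internal, but all three candidate algorithms are available in scikit-learn. The sketch below fits each of them on training data and scores incoming requests, with scores negated so that higher means more anomalous; it only illustrates the candidates, not Hydrosphere's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
train = rng.normal(0, 1, size=(500, 3))           # normal training data
requests = np.vstack([rng.normal(0, 1, size=(5, 3)),
                      [[8.0, 8.0, 8.0]]])         # last request is an obvious outlier

# The three candidate detectors Hydrosphere currently chooses among.
candidates = {
    "isolation_forest": IsolationForest(random_state=0),
    "local_outlier_factor": LocalOutlierFactor(novelty=True),  # novelty=True enables scoring new data
    "one_class_svm": OneClassSVM(gamma="scale"),
}
for name, model in candidates.items():
    model.fit(train)
    # score_samples: higher = more normal; negate so higher = more anomalous
    scores = -model.score_samples(requests)
    print(name, scores.round(2))
```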
The last aspect of an outlier detection algorithm is choosing an appropriate threshold. Most methods calculate an outlier score for each object and then threshold the scores to detect outliers. The most widely used thresholding techniques are based on statistics such as the standard deviation around the mean, the median absolute deviation, and the interquartile range. Unfortunately, these statistics can be significantly biased by the very outliers that are present when they are calculated. In Hydrosphere, we empirically decided to use the 97th percentile: a request is labeled as an outlier if its outlier score is greater than the 97th percentile of the training data's outlier score distribution.
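The thresholding rule itself takes only a few lines. Below is a sketch using one of the candidate detectors; Hydrosphere applies this logic internally, so the code is illustrative only.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
train = rng.normal(0, 1, size=(1000, 2))

detector = IsolationForest(random_state=0).fit(train)
train_scores = -detector.score_samples(train)  # higher = more anomalous

# A request is labeled an outlier if its score exceeds the
# 97th percentile of the training outlier score distribution.
threshold = np.percentile(train_scores, 97)

def is_outlier(request):
    return -detector.score_samples([request])[0] > threshold

print(is_outlier([0.0, 0.0]))  # a typical point near the training mean
print(is_outlier([9.0, 9.0]))  # far outside the training distribution
```

By construction, roughly 3% of the training requests themselves score above this threshold, which is the accepted false-positive budget of the rule.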
You can observe these models deployed as metrics in your monitoring dashboard. The metrics provide information about how novel or anomalous your data is. If metric values deviate significantly from the average, you can tell that you are experiencing a potential abnormal event. If you observe a gradually increasing number of such events, it may indicate data drift, which means you should re-evaluate your ML pipeline and check for errors.
Right now the Auto OD feature works only for models with numerical scalar fields and uploaded training data.
Sonar sends data about any failed health checks of live production models and applications to Prometheus AlertManager. Once a user deploys a model to production, adds training data and starts sending production requests, these requests start getting checked by Sonar. If Sonar detects an anomaly (for example, a data check failed, or a metric value exceeded the threshold), AlertManager sends an appropriate alert.
Users can manage alerts by setting up AlertManager for Prometheus on Kubernetes. This can be helpful when some models generate too many alerts and you need to filter, group, or partly silence them. AlertManager can take care of grouping, inhibiting, and silencing alerts, and routing them to the receiver integration of your choice. To configure alerts, modify the prometheus-am-configmap-<release_name> ConfigMap.
For more information about Prometheus AlertManager please refer to its official documentation.
Monitoring Dashboard lets you track your performance metrics and get a high-level view of your data health.
The Monitoring Dashboard plots all requests streaming through a model version, colored according to how "healthy" they are. On the horizontal axis, data is grouped into batches; on the vertical axis, it is grouped by signature fields. Each cell in this plot is determined by its batch and field, and is colored from green to red depending on the average request health inside that batch.
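The batch coloring described above can be illustrated with a small sketch: a hypothetical helper (not Hydrosphere's actual rendering code) that maps a batch's average request health to a green-to-red hex color.

```python
def batch_color(health_values):
    """Map a batch's average request health (1.0 = healthy, 0.0 = failing)
    to a hex color, interpolating linearly from green to red.
    Illustrative only; not Hydrosphere's actual rendering code."""
    avg = sum(health_values) / len(health_values)
    red, green = int(255 * (1 - avg)), int(255 * avg)
    return f"#{red:02x}{green:02x}00"

print(batch_color([1.0, 1.0, 1.0]))  # "#00ff00": a fully healthy batch is green
print(batch_color([0.0, 0.0]))       # "#ff0000": a failing batch is red
```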
Data Projection is a service that visualizes high-dimensional data in a 2D scatter plot with an automatically trained transformer to let you evaluate the data structure and spot clusters, outliers, novel data, or any other patterns. This is especially helpful if your model works with high-dimensional data, such as images or text embeddings.
Data Projection is an important tool that helps describe complex things in a simple way; one good visualization can show more than text or data. Monitoring and interpreting machine learning models are hard tasks that require analyzing a lot of raw data: training data, production requests, and model outputs.
Essentially, this data is just numbers that, in their original form of vectors and matrices, carry no visible meaning, since it is hard to extract insight from thousands of numeric vectors. In Hydrosphere we want to make monitoring easier and clearer, which is why we created the Data Projection service that can visualize your data in a single plot.
To start working with Data Projection you need to create a model that has an output field with an embedding of your data. Embeddings are real-valued vectors that represent the input features in a lower dimensionality.
Create a model with an embedding field
The Data Projection service delegates the creation of embeddings to the user: it expects the model to create an embedding from its input features and pass it as an output vector. The embedding field is therefore required; models without it are not supported. Data Projection also expects the output label field to be called class and the model confidence field to be called confidence, respectively. Other outputs are ignored.
Send data through your model
Check Data Projection service inside the Model Details menu
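Putting the field contract together: below is a hypothetical predict function whose outputs satisfy Data Projection's expectations. The field names embedding, class, and confidence come from the contract above; the function body is purely illustrative.

```python
import numpy as np

def predict(features: np.ndarray) -> dict:
    """Hypothetical serving function whose outputs satisfy the
    Data Projection contract: an `embedding` vector is required,
    while `class` and `confidence` enable colorizing by label/score."""
    embedding = features[:16]              # stand-in for a real encoder's output
    logits = np.array([0.1, 2.3, 0.4])    # stand-in for real model scores
    probs = np.exp(logits) / np.exp(logits).sum()
    return {
        "embedding": embedding,            # required by Data Projection
        "class": int(probs.argmax()),      # optional: used for colorizing
        "confidence": float(probs.max()),  # optional: used for colorizing
    }

out = predict(np.random.default_rng(0).normal(size=32))
print(sorted(out))  # ['class', 'confidence', 'embedding']
```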
Each point in the plot represents a request. Requests with similar features are close to each other. You can select a specific request point and inspect what it consists of.
Above the plot there are several scores: global score, stability score, MSID score, etc. These scores reflect the quality of the projection of multidimensional requests into 2D. To interpret the scores, refer to the technical documentation on the Data Projection service.
In the Colorize menu, you can choose how to colorize model requests: by class, by monitoring metric, or by confidence. Data Projection looks specifically for the output scalars class and confidence.
In the Accent Points menu, you can highlight the points nearest in the original space to the selected one by picking the nearest variant. Counterfactuals will show you the points nearest to the selected one but with a different predicted label.
Inside the Data Projection service you can see your requests' features projected onto a 2D space:
Hydrosphere Serving Components for Kubeflow Pipelines provide integration between Hydrosphere model serving benefits and Kubeflow orchestration capabilities. This allows launching training jobs as well as serving the same models in Kubernetes in a single pipeline.
You can find examples of sample pipelines here.
The Deploy component allows you to upload a model trained in a Kubeflow Pipelines workflow to the Hydrosphere platform.
For more information, check Hydrosphere Deploy Kubeflow Component
The Release component allows you to create an Application from a model previously uploaded to the Hydrosphere platform. This application will be capable of serving prediction requests over HTTP or gRPC.
For more information, check Hydrosphere Release Kubeflow Component