Key Features

Features that make up Hydrosphere Platform

Serving

Model Registry
Inference Pipelines
A/B Model Version Deployment
Traffic Shadowing
Language-Agnostic Deployment

Monitoring

Automatic Outlier Detection
Data Drift Report
Monitoring Dashboard and Data Health Metrics
Alerts

Interpretability

Prediction Explanations
Data Projection

Third-Party Integrations

Kubeflow Components
AWS Sagemaker

Model Registry

Hydrosphere has an internal Model Registry that serves as centralized storage for Model Versions. When you build a Dockerized model and upload it to Hydrosphere, or create new model versions, they are stored in the configured model registry as images. This organizes and simplifies model management across the platform and production lifecycle.

Inference Pipelines

A Hydrosphere user can create a linear inference pipeline from multiple model versions. Such pipelines are called Applications.
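A linear pipeline can be pictured as function composition: the output of one stage becomes the input of the next. The sketch below is only an illustration of that idea in plain Python. The `run_pipeline` helper and the two toy stages are hypothetical and are not part of the Hydrosphere API.

```python
from typing import Callable, Dict, List

# A stage is modeled as a function from named fields to named fields.
Stage = Callable[[Dict[str, float]], Dict[str, float]]


def run_pipeline(stages: List[Stage], request: Dict[str, float]) -> Dict[str, float]:
    """Pass the request through each stage in order (hypothetical helper)."""
    data = request
    for stage in stages:
        data = stage(data)
    return data


# Two toy "model versions" standing in for real models.
def normalize(fields: Dict[str, float]) -> Dict[str, float]:
    return {"x": fields["x"] / 10.0}


def classify(fields: Dict[str, float]) -> Dict[str, float]:
    return {"label": 1.0 if fields["x"] > 0.5 else 0.0}


result = run_pipeline([normalize, classify], {"x": 7.0})
```

Here `result` holds only the output of the last stage, which mirrors how a linear pipeline exposes a single final response.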

A/B Model Deployments

Hydrosphere allows you to A/B test your ML models in production.

A/B testing is a great way to measure how well your models perform or which of your model versions is more effective, and to make data-driven decisions based on this knowledge.

Production ML applications always have specific goals, for example driving as many users as possible to perform some action. To achieve these goals, it's necessary to run online experiments and compare model versions using metrics in order to measure your progress against them. This approach lets you track whether your development efforts lead to the desired outcomes.

To perform a basic A/B experiment on an application consisting of 2 variants of a model, you need to train and upload both versions to Hydrosphere, create an application with a single execution stage from them, invoke it by simulating production data flow, then analyze production data using metrics of your choice.

Learn how to set up an A/B application:

A/B Analysis for a Recommendation Model

Traffic Shadowing

A/B Deployment

Hydrosphere users can use multiple model versions inside of the same Application stage. Hydrosphere shadows traffic to all model versions inside of an application stage.

Users can specify the likelihood that a model output will be selected as an application stage output by using the weight argument.

Traffic Shadowing

Hydrosphere shadows traffic to all model versions inside of an application stage.

If you want to shadow traffic to a model version without ever using its output, simply set its weight parameter to 0. The model version will then receive all incoming traffic, but its output will never be chosen as the output of the application stage.
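The interaction between weights and shadowing can be sketched as follows. This is a minimal simulation, not Hydrosphere's implementation: the `run_stage` helper, the variant names, and the 80/20/0 weights are assumptions made for illustration. Every variant receives the request, and one output is then drawn in proportion to the weights.

```python
import random
from typing import Callable, Dict, List, Tuple

# (name, model function, weight); names and weights are illustrative only.
Variant = Tuple[str, Callable[[float], float], int]


def run_stage(variants: List[Variant], x: float, rng: random.Random):
    """Send x to every variant, then pick one output according to the weights."""
    # Shadowing: all variants are invoked, regardless of weight.
    outputs: Dict[str, float] = {name: model(x) for name, model, _ in variants}
    names = [name for name, _, _ in variants]
    weights = [weight for _, _, weight in variants]
    # Selection: a variant with weight 0 can never be drawn.
    chosen = rng.choices(names, weights=weights, k=1)[0]
    return chosen, outputs[chosen], outputs


variants: List[Variant] = [
    ("v1", lambda x: x + 1.0, 80),
    ("v2", lambda x: x + 2.0, 20),
    ("shadow", lambda x: x * 10.0, 0),  # receives traffic, never selected
]

rng = random.Random(42)
results = [run_stage(variants, 1.0, rng) for _ in range(1000)]
chosen_names = {name for name, _, _ in results}
```

In this simulation the `shadow` variant appears in every `outputs` dictionary, so its predictions can still be inspected, but it never appears in `chosen_names`.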

Language-Agnostic

Hydrosphere is a language-agnostic platform. You can use it with models written in any language and trained in any framework. Your ML models can come from any background, with no restrictions on your choice of ML development tools.

In Hydrosphere, you operate ML models as Runtimes: Docker containers packed with predefined dependencies and gRPC interfaces for loading and serving a model on the platform. Every model that you upload to Hydrosphere must have a corresponding runtime.

Runtimes are created by building a Docker container with dependencies required for the language that matches your model. You can either use our pre-made runtimes or create your own runtime.

The Hydrosphere component responsible for building Docker images from models for deployment, storing them in the registry, versioning, and more is Manager.

Automatic Outlier Detection

Anomaly detection focuses on identifying data objects that differ from our expectations. Such objects can result from noise, errors, or unexpected events. Unusual data points can also be due to rare but correct behaviour, which often leads to interesting findings and motivates further investigation. For these reasons, we need techniques that can identify such unusual events. We assume that such events may produce objects generated by a "different mechanism", which indicates that these objects might contain unexpected patterns that do not conform to normal behaviour.

For each model with uploaded training data, Hydrosphere creates an outlier detection (Auto OD) metric, which assigns an outlier score to each request. A request is labeled as an outlier if the outlier score is greater than the 97th percentile of training data outlier scores distribution.

At first sight, anomaly detection looks like a classification problem that differentiates between normal and abnormal events. That is usually not the case: abnormalities are rarely represented well enough to form a separate labeled class, or may be absent completely. This moves outlier detection into the unsupervised context, a type of machine learning that looks for previously undetected patterns in a dataset with no pre-existing labels and with a minimum of human supervision. Within these constraints we have no choice but to rely on unsupervised machine learning algorithms. In practice, those algorithms should look at the data and model normal behaviour as well as possible. After this step they can detect potentially risky events, without a priori knowledge of what malicious and benign behaviour looks like, by checking whether a new event is dissimilar enough from the baseline.

To choose the right algorithm, it is essential to measure performance in terms of specific metrics such as accuracy, F1-score, ROC-AUC, etc. For this we would normally need labels that tell whether an event is in fact unusual. But, as stated above, the datasets we use do not have labels. How can we still estimate performance? For this purpose we can apply a metric called the Area Under the Mass Volume curve, which was developed specifically for unsupervised anomaly ranking. It can be thought of as a performance metric similar to the ROC curve, but for unsupervised anomaly detection. Briefly, MV measures the extent to which the spread of the distribution of anomaly scores for training data differs from that of a randomly generated uniform distribution. You can learn more about this method and this metric at this link.
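To make the Mass Volume idea more concrete, here is a rough two-dimensional sketch in plain Python. It is a simplified illustration under stated assumptions, not the procedure Hydrosphere runs: the function name `mass_volume_auc`, the mass levels, the Monte Carlo volume estimate over the bounding box, and the two toy scoring functions are all chosen for the example. The sketch also assumes that a higher score means "more normal", so an outlier score would have to be negated first.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def mass_volume_auc(
    score: Callable[[Point], float],
    data: Sequence[Point],
    alphas: Sequence[float] = (0.90, 0.925, 0.95, 0.975, 0.99),
    n_uniform: int = 5000,
    seed: int = 0,
) -> float:
    """Area under an empirical Mass Volume curve (lower is better)."""
    rng = random.Random(seed)
    lo_x, hi_x = min(p[0] for p in data), max(p[0] for p in data)
    lo_y, hi_y = min(p[1] for p in data), max(p[1] for p in data)
    box_volume = (hi_x - lo_x) * (hi_y - lo_y)

    # Uniform reference sample over the bounding box of the data.
    uniform_scores = [
        score((rng.uniform(lo_x, hi_x), rng.uniform(lo_y, hi_y)))
        for _ in range(n_uniform)
    ]
    data_scores = sorted(score(p) for p in data)

    volumes: List[float] = []
    for alpha in alphas:
        # Threshold such that roughly a fraction alpha of the data scores >= t.
        t = data_scores[int((1.0 - alpha) * len(data_scores))]
        # Volume of the level set {x : score(x) >= t}, estimated by Monte Carlo.
        inside = sum(1 for s in uniform_scores if s >= t)
        volumes.append(box_volume * inside / n_uniform)

    # Trapezoidal integration of volume over mass.
    return sum(
        (alphas[i + 1] - alphas[i]) * (volumes[i] + volumes[i + 1]) / 2.0
        for i in range(len(alphas) - 1)
    )


def distance_score(p: Point) -> float:
    """Points near the origin are considered more normal."""
    return -math.hypot(p[0], p[1])


def constant_score(p: Point) -> float:
    """An uninformative scorer that ranks every point the same."""
    return 0.0


data_rng = random.Random(1)
data = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(500)]
auc_informative = mass_volume_auc(distance_score, data)
auc_constant = mass_volume_auc(constant_score, data)
```

A scorer that concentrates most of the data mass in a small region gets a smaller area, so `auc_informative` comes out below `auc_constant`. Candidate detectors can be ranked the same way without any labels.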

Hydrosphere has a specific engine inside the platform that automatically creates an outlier detection metric and assigns it to each uploaded model accordingly. The whole process can be divided into several consecutive stages:

As a starting point, Hydrosphere uses the training data to check whether each feature has an appropriate format
Then it applies the Mass Volume curve method to find an appropriate anomaly detection model. At the moment Hydrosphere chooses among three anomaly detection algorithms: Isolation Forest, Local Outlier Factor, and One-Class Support Vector Machines with a prewritten set of hyperparameters. Additional models will be added later.
Finally, it uploads the chosen model to the cluster and assigns it as an anomaly detection metric to the previously trained model

An important aspect of any outlier detection algorithm is choosing an appropriate threshold. Most outlier detection models calculate an outlier score for each sample of the training data and then establish a threshold score for detecting potential anomalies. There are several thresholding techniques for finding this value, based on statistics such as standard deviation around the mean, median absolute deviation, and interquartile range. Unfortunately, these statistics can be significantly biased by potential outliers, such as noise or errors, present in the data they are calculated from. In Hydrosphere, we ran preliminary experiments with different datasets to find a value that maximizes the predictive ability of anomaly detection models and found that the 97th percentile of raw outlier scores looks most promising. This means that a request is labeled as an outlier if its anomaly score is greater than the 97th percentile of the training data outlier score distribution.
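The percentile rule described above can be sketched in a few lines of standard-library Python. The helper names `fit_threshold` and `is_outlier`, the toy training scores, and the choice of the "inclusive" interpolation method are assumptions for illustration. The source does not specify how Hydrosphere interpolates the percentile.

```python
from statistics import quantiles
from typing import List


def fit_threshold(training_scores: List[float], percentile: int = 97) -> float:
    """Return the given percentile of the training outlier scores."""
    # quantiles(..., n=100) returns the 99 cut points for percentiles 1..99.
    cut_points = quantiles(training_scores, n=100, method="inclusive")
    return cut_points[percentile - 1]


def is_outlier(score: float, threshold: float) -> bool:
    """A request is an outlier if its score exceeds the training threshold."""
    return score > threshold


# Toy outlier scores for 100 training samples: 1.0, 2.0, ..., 100.0.
training_scores = [float(i) for i in range(1, 101)]
threshold = fit_threshold(training_scores)
flagged = [s for s in training_scores if is_outlier(s, threshold)]
```

With these toy scores the threshold falls between 97 and 98, so the three largest training scores are flagged. That is the roughly 3% of training data the rule marks as outliers by construction.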

You can observe these assigned models deployed as metrics in your Monitoring dashboard. These metrics tell you how novel or anomalous your data is. If the metric values deviate significantly from the usual range, you may be experiencing an abnormal event. If you observe a gradually increasing number of such events, it might indicate data drift, in which case you need to re-evaluate your ML pipeline to check for errors.

High-dimensional cases

For more details about the high-dimensional problem and the algorithms designed to overcome it, you can read here.

Supported Models

Right now, the Auto OD feature works only for Models with numerical scalar fields and uploaded training data.

Data Drift Report

The Drift Report service creates a statistical report based on a comparison of training and production data distributions. It compares these two sets of data using a set of statistical tests and finds deviations.

The Drift Report uses several tests with p=.95, depending on the feature type:

Numerical features:

Levene's test with a trimmed mean
Welch's t-test
Mood's test
Kolmogorov–Smirnov test

Categorical features:

Chi-Square test
Unseen categories
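As an illustration of two of the checks listed above, the sketch below implements a two-sample Kolmogorov–Smirnov test and an unseen-categories check in plain Python. It is not the Drift Report's code. The function names are hypothetical, the critical value uses the standard large-sample approximation, and reading "p=.95" as a 95% confidence level (alpha = 0.05) is an assumption.

```python
import math
from bisect import bisect_right
from typing import Hashable, Iterable, Sequence, Set


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest gap between the two empirical cumulative distributions."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    for v in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, v) / n
        cdf_b = bisect_right(b_sorted, v) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_drift(a: Sequence[float], b: Sequence[float], alpha: float = 0.05) -> bool:
    """True if the samples differ at level alpha (large-sample approximation)."""
    n, m = len(a), len(b)
    critical = math.sqrt(-math.log(alpha / 2.0) / 2.0) * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) > critical


def unseen_categories(
    training: Iterable[Hashable], production: Iterable[Hashable]
) -> Set[Hashable]:
    """Categories that occur in production but never occurred in training."""
    return set(production) - set(training)


training = [float(i) for i in range(100)]
production_same = list(training)
production_shifted = [x + 50.0 for x in training]

drift_same = ks_drift(training, production_same)
drift_shifted = ks_drift(training, production_shifted)
new_values = unseen_categories(["red", "green"], ["green", "blue"])
```

In this toy data the shifted production sample is reported as drifted while the identical sample is not, and `new_values` contains only the category that was absent from training.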

Supported Models

Right now, the Drift Report feature works only for Models with numerical scalar fields.

Monitoring Dashboard

The Monitoring Dashboard lets you track your performance metrics and get a high-level view of your data health.

The Monitoring Dashboard plots all requests streaming through a model version, colored according to how "healthy" they are. On the horizontal axis, data is grouped by batches; on the vertical axis, by signature fields. Each cell in the plot is therefore determined by its batch and field. Cells are colored from green to red, depending on the average request health inside the batch.
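The grid described above can be sketched as a simple aggregation. This is only an illustration of the batch-by-field layout. The input format (one dictionary of passed/failed checks per request), the helper names, and the linear green-to-red RGB mapping are assumptions, not the dashboard's actual logic.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def health_grid(
    requests: List[Dict[str, bool]], batch_size: int
) -> Dict[Tuple[int, str], float]:
    """Average health per (batch index, field name) cell."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [passed checks, total checks]
    for index, checks in enumerate(requests):
        batch = index // batch_size
        for field, passed in checks.items():
            cell = counts[(batch, field)]
            cell[0] += int(passed)
            cell[1] += 1
    return {cell: passed / total for cell, (passed, total) in counts.items()}


def cell_color(health: float) -> Tuple[int, int, int]:
    """Map health in [0, 1] to an RGB color from red (0) to green (1)."""
    return (round(255 * (1.0 - health)), round(255 * health), 0)


requests = [
    {"age": True, "income": True},
    {"age": False, "income": True},
    {"age": True, "income": True},
    {"age": True, "income": False},
]
grid = health_grid(requests, batch_size=2)
colors = {cell: cell_color(health) for cell, health in grid.items()}
```

Each key of `grid` corresponds to one cell of the plot: the first element is the position on the horizontal axis (batch) and the second is the position on the vertical axis (field).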

Alerts

Overview

Sonar sends data about any failed health checks of live production models and applications to Prometheus AlertManager. Once a user deploys a model to production, adds training data, and starts sending production requests, these requests start getting checked by Sonar. If Sonar detects an anomaly (for example, a data check failed, or a metric value exceeded the threshold), AlertManager sends an appropriate alert.

Users can manage alerts by setting up AlertManager for Prometheus on Kubernetes. This can be helpful when you have models that you get too many alerts from and need to filter, group, or partly silence them. AlertManager can take care of grouping, inhibition, silencing of alerts, and routing them to the receiver integration of your choice. To configure alerts, modify the prometheus-am-configmap-<release_name> ConfigMap.

For more information about Prometheus AlertManager, please refer to its official documentation.

Prediction Explanation

Prediction Explanation service is designed to help Hydrosphere users understand the underlying causes of changes in predictions coming from their models.

Prediction Explanation generates explanations of predictions produced by your models and tells you why a model made a particular prediction. Depending on the type of data your model uses, Prediction Explanation provides an explanation as either a set of logical predicates (if your data is in a tabular format) or a saliency map (if your data is in the image format). A saliency map is a heat map that highlights parts of a picture that a prediction was based on.

Hydrosphere uses model-agnostic methods for explaining your model predictions. Such methods can be used on any machine learning model after it has been uploaded to the platform.

As of now, Hydrosphere supports explaining tabular and image data with the Anchor and RISE tools, respectively.

Data Projection

Data Projection is a service that visualizes high-dimensional data in a 2D scatter plot with an automatically trained transformer to let you evaluate the data structure and spot clusters, outliers, novel data, or any other patterns. This is especially helpful if your model works with high-dimensional data, such as images or text embeddings.

Data Projection is an important tool, which helps to describe complex things in a simple way. One good visualization can show more than text or data. Monitoring and interpretation of machine learning models are hard tasks that require analyzing a lot of raw data: training data, production requests, as well as model outputs.

Essentially, this data is just numbers that, in their original form of vectors and matrices, carry no obvious meaning, since it is hard to extract meaning from thousands of vectors of numbers. In Hydrosphere we want to make monitoring easier and clearer, which is why we created a Data Projection service that can visualize your data in a single plot.

Usage

To start working with Data Projection you need to create a model that has an output field with an embedding of your data. Embeddings are real-valued vectors that represent the input features in a lower dimensionality.

Create a model with an embedding field
The Data Projection service delegates the creation of embeddings to the user. It expects that the model will create an embedding from the input features and pass it as an output vector. The embedding field is therefore required; models without this field are not supported. Data Projection also expects the output label field to be called class and the model confidence field to be called confidence. Other outputs are ignored.
Send data through your model
Check Data Projection service inside the Model Details menu
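The output contract from the first step can be sketched as a plain prediction function. Only the three output names (embedding, class, confidence) come from the text above. The toy `predict` function, its two-value embedding, and the logistic confidence are invented for illustration, and the Hydrosphere-specific signature declaration and upload steps are intentionally left out.

```python
import math
from typing import Dict, List, Union

Output = Dict[str, Union[List[float], int, float]]


def predict(features: List[float]) -> Output:
    """Toy model that returns the three outputs Data Projection looks for."""
    # Toy embedding: compress the input features into two values.
    embedding = [sum(features[0::2]), sum(features[1::2])]

    # Toy binary classifier on top of the embedding.
    probability = 1.0 / (1.0 + math.exp(-(embedding[0] - embedding[1])))
    label = 1 if probability >= 0.5 else 0
    confidence = probability if label == 1 else 1.0 - probability

    return {
        "embedding": embedding,    # required output vector
        "class": label,            # predicted label, must be named "class"
        "confidence": confidence,  # model confidence, must be named "confidence"
    }


output = predict([1.0, 0.0, 2.0, 0.5])
```

Any additional keys returned next to these three would simply be ignored by Data Projection, according to the description above.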

Inside the Data Projection service you can see your request features projected onto a 2D space:

Each point in the plot represents a request. Requests with similar features are close to each other. You can select a specific request point and inspect what it consists of.

Above the plot, there are several scores: global score, stability score, MSID score, etc. These scores reflect the quality of the projection of multidimensional requests to 2D. To interpret the scores, refer to the technical documentation on the Data Projection service.

In the Colorize menu, you can choose how to colorize model requests: by class, by monitoring metric, or by confidence. Data Projection searches specifically for the output scalars class and confidence.

In the Accent Points menu, you can highlight the points nearest to the selected one in the original space by picking the nearest option. The Counterfactuals option shows the points nearest to the selected one that have a different predicted label.
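The difference between the two Accent Points options can be sketched with a brute-force neighbor search. This is a minimal illustration, assuming Euclidean distance in the original embedding space. The `nearest_points` helper and the toy data are hypothetical and do not describe how the service computes neighbors internally.

```python
import math
from typing import List, Sequence, Tuple


def nearest_points(
    points: Sequence[Tuple[float, ...]],
    labels: Sequence[str],
    selected: int,
    k: int,
    counterfactual: bool = False,
) -> List[int]:
    """Indices of the k points closest to points[selected].

    With counterfactual=True, only points whose predicted label differs
    from the selected point's label are considered.
    """
    candidates = []
    for index, point in enumerate(points):
        if index == selected:
            continue
        if counterfactual and labels[index] == labels[selected]:
            continue
        candidates.append((math.dist(point, points[selected]), index))
    return [index for _, index in sorted(candidates)[:k]]


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
labels = ["cat", "cat", "dog", "dog"]

neighbors = nearest_points(points, labels, selected=0, k=2)
counterfactuals = nearest_points(points, labels, selected=0, k=2, counterfactual=True)
```

For the first point, the plain neighbor search returns the closest points regardless of label, while the counterfactual search skips the closest point because it shares the same predicted label.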

Kubeflow Components

Hydrosphere Serving Components for Kubeflow Pipelines provide integration between Hydrosphere model serving benefits and Kubeflow orchestration capabilities. This allows launching training jobs as well as serving the same models in Kubernetes in a single pipeline.

You can find examples of sample pipelines here.

Serving components

Deploy

The Deploy component allows you to upload a model trained in a Kubeflow Pipelines workflow to the Hydrosphere platform.

For more information, check Hydrosphere Deploy Kubeflow Component

Release

The Release component allows you to create an Application from a model previously uploaded to the Hydrosphere platform. This application will be capable of serving prediction requests over HTTP or gRPC.

For more information, check Hydrosphere Release Kubeflow Component

AWS Sagemaker

Automatic Outlier Detection

High-dimensional cases

As mentioned above, outlier detection has turned out to be an important problem in many research fields. Still, for high-dimensional data, detecting such rare behaviour is not a trivial task. High dimensionality refers to datasets that have a large number of independent variables, components, features, or attributes available for analysis. The complexity of data analysis increases with the number of dimensions, requiring more sophisticated methods to process the data. As a result, different methods may suffer from different problems. For example, in high-dimensional spaces the distances between observations can become very similar to one another, which reduces the effectiveness of distance-based outlier detection methods. Likewise, irrelevant attributes may impede the separability of outliers from normal samples. Although Hydrosphere's models are able to detect critical anomalies in some 'big data' cases, it is not recommended to rely entirely on the results of Auto OD for high-dimensional cases. You can choose a specific algorithm that is better adapted to such cases. Hydrosphere Automatic Outlier Detection allows you to train your own model that is not part of Hydrosphere's engine. Creating your own custom outlier detection metric is covered separately. As examples, you can find a couple of algorithms for this task in the PyOD toolbox.

For more details about the high-dimensional problem and the algorithms designed to overcome it, you can read here.