Platform Architecture

Hydrosphere is composed of several microservices that work together to serve and monitor machine learning models in production. This section describes each service and the features it provides.

UI / nginx

Interpretability

Interpretability provides EDA (Exploratory Data Analysis) and explanations for predictions made by your models to make predictions understandable and actionable. It also produces explanations for monitoring metrics to let you know why a particular request was marked as an outlier. The component consists of 2 services:

  • Explanations

  • Data Projections

Both services use Celery to run asynchronous tasks from apps. Each service consists of a client, a worker, and a broker that mediates between them. A client creates a task and initiates it by adding a message to a queue, the broker delivers the message to a worker, and the worker executes the task.
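The client, broker, and worker roles described above can be illustrated with a minimal sketch. It does not use Celery itself: a standard-library queue stands in for the broker and a thread stands in for the worker, and the names `submit`, `worker`, `results`, and `add` are hypothetical.

```python
import queue
import threading

# Hypothetical stand-in for the broker: a thread-safe message queue.
broker = queue.Queue()
# Hypothetical stand-in for the result backend, keyed by task id.
results = {}


def worker():
    """Take messages off the broker and execute them until told to stop."""
    while True:
        message = broker.get()
        if message is None:  # sentinel: shut the worker down
            break
        task_id, func, args = message
        results[task_id] = func(*args)


def submit(task_id, func, *args):
    """Client side: describe the task as a message and enqueue it."""
    broker.put((task_id, func, args))


def add(x, y):
    return x + y


worker_thread = threading.Thread(target=worker)
worker_thread.start()

submit("task-1", add, 2, 3)  # the client does not wait for the result
broker.put(None)             # ask the worker to stop after pending tasks
worker_thread.join()         # wait until the worker has drained the queue

print(results["task-1"])
```

The client never calls `add` directly. It only describes the work, which is what lets the worker run in a separate process or on a separate machine in a real deployment.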

Interpretability services use MongoDB as both a Celery broker and backend storage to save task results. To save and retrieve model training and production data, the Interpretability component uses S3 storage.

When Explanations or Data Projections receives a task, it creates a new temporary Servable specifically for the model that needs an explanation. The service runs data through this Servable to obtain new predictions and deletes the Servable afterwards.
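The create, use, and delete lifecycle of a temporary Servable maps naturally onto a context manager. The sketch below is only an illustration of that pattern: `FakeCluster`, `temporary_servable`, and the lambda model are hypothetical and are not part of the Hydrosphere API.

```python
from contextlib import contextmanager


class FakeCluster:
    """Hypothetical in-memory stand-in for whatever manages Servables."""

    def __init__(self):
        self.servables = {}
        self._counter = 0

    def create_servable(self, model_version, predict_fn):
        self._counter += 1
        name = f"{model_version}-tmp-{self._counter}"
        self.servables[name] = predict_fn
        return name

    def delete_servable(self, name):
        del self.servables[name]


@contextmanager
def temporary_servable(cluster, model_version, predict_fn):
    """Create a Servable for one task and always delete it afterwards."""
    name = cluster.create_servable(model_version, predict_fn)
    try:
        yield cluster.servables[name]
    finally:
        # Runs even if the task fails, so no temporary Servable is leaked.
        cluster.delete_servable(name)


cluster = FakeCluster()
with temporary_servable(cluster, "my-model-v1", lambda x: x * 2) as predict:
    predictions = [predict(x) for x in [1, 2, 3]]
    alive_during_task = len(cluster.servables)

print(predictions, alive_during_task, len(cluster.servables))
```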

Prediction Explanations

Prediction Explanations generates explanations of model predictions to help you understand them. Depending on the type of data your model uses, it provides an explanation as either a set of logical predicates, if your data is tabular, or a saliency map, if your data consists of images. A saliency map is a heat map that highlights the parts of a picture that a prediction was based on.
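To make the idea of a saliency map concrete, here is a small sketch based on occlusion sensitivity: each pixel is masked in turn, and the change in the prediction becomes that pixel's heat value. This is one common way to build such a map and is an assumption for illustration, not necessarily the method Hydrosphere uses. `toy_model` and the 3x3 image are hypothetical.

```python
def occlusion_saliency(image, predict, fill=0.0):
    """Score each pixel by how much the prediction changes when it is masked."""
    baseline = predict(image)
    saliency = []
    for r, row in enumerate(image):
        saliency_row = []
        for c in range(len(row)):
            occluded = [list(line) for line in image]  # copy the image
            occluded[r][c] = fill                      # mask a single pixel
            saliency_row.append(abs(baseline - predict(occluded)))
        saliency.append(saliency_row)
    return saliency


def toy_model(image):
    # Hypothetical model that only looks at the top-left and center pixels.
    return 0.75 * image[0][0] + 0.25 * image[1][1]


image = [[1.0, 1.0, 1.0],
         [1.0, 1.0, 1.0],
         [1.0, 1.0, 1.0]]

heat_map = occlusion_saliency(image, toy_model)
for row in heat_map:
    print(row)
```

Only the two pixels the toy model actually reads receive a non-zero score, which is the behavior a saliency map is meant to expose.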

Data Projections

Data Projection visualizes high-dimensional data in a 2D scatter plot with an automatically trained UMAP transformer to let you evaluate data structure and spot clusters, outliers, novel data, or any other patterns. It is especially helpful if your model works with high-dimensional data, such as images or text embeddings.

Serving

Gateway

Gateway is a service responsible for routing requests to, from, and between Servables and Applications, and for validating that these requests match the signature of the target Model or Application.

The Gateway maps a model’s name to a corresponding container. Whenever it receives a request via the HTTP API, gRPC, or Kafka Streams, it communicates with that container via the gRPC protocol.

Gateway also enables data flow between the different stages of an Application pipeline.
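The two Gateway duties described in this section, name-to-container routing and signature validation, can be sketched as follows. The model name, the container address, the signature format, and the functions `validate` and `route` are all hypothetical. A real signature describes tensors with shapes and data types rather than plain Python types.

```python
# Hypothetical signature: field name -> expected Python type.
SIGNATURES = {"adult-classifier": {"age": int, "workclass": str}}

# Hypothetical mapping from a model name to its container address.
CONTAINERS = {"adult-classifier": "adult-classifier-v1:9091"}


def validate(model_name, request):
    """Return a list of mismatches between a request and the model signature."""
    signature = SIGNATURES[model_name]
    errors = []
    for field, expected_type in signature.items():
        if field not in request:
            errors.append(f"missing field: {field}")
        elif not isinstance(request[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    for field in request:
        if field not in signature:
            errors.append(f"unexpected field: {field}")
    return errors


def route(model_name, request):
    """Validate the request, then resolve the container that should serve it."""
    if model_name not in CONTAINERS:
        raise KeyError(f"unknown model: {model_name}")
    errors = validate(model_name, request)
    if errors:
        raise ValueError("; ".join(errors))
    return CONTAINERS[model_name]


print(route("adult-classifier", {"age": 42, "workclass": "Private"}))
```

Validating before forwarding means a malformed request is rejected at the edge and never reaches the model container.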

Manager

Manager is responsible for:

  • Building a Docker Image from your ML model for future deployment

  • Storing these images inside a Docker Registry deployed alongside the Manager service

  • Versioning these images as Model Versions

  • Creating running instances of these Model Versions, called Servables, inside a Kubernetes cluster

  • Combining multiple Model Versions into a linear graph with a single endpoint, called an Application
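The last responsibility, combining Model Versions into a linear graph behind a single endpoint, amounts to function composition. The sketch below illustrates only that idea: `make_application` and the two stages are hypothetical, and in the platform each stage would be a running Servable rather than a local function.

```python
from functools import reduce


def make_application(*stages):
    """Chain stages so that the output of each one feeds the next."""
    def endpoint(request):
        return reduce(lambda data, stage: stage(data), stages, request)
    return endpoint


def tokenize(request):
    # Hypothetical first stage: split raw text into lowercase tokens.
    return {"tokens": request["text"].lower().split()}


def count_tokens(data):
    # Hypothetical second stage: consume the output of the first stage.
    return {"length": len(data["tokens"])}


app = make_application(tokenize, count_tokens)
result = app({"text": "Hello Hydrosphere users"})
print(result)
```

Callers see a single endpoint, `app`, and do not need to know how many stages sit behind it.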

Monitoring

Automatic Outlier Detection

Sonar

The Sonar service is responsible for managing metrics, storing training and production data, calculating profiles, and shadowing data to the Model Versions that are used as outlier detection metrics.
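Shadowing can be pictured as sending a copy of each production request to every metric model and recording the scores. The sketch below is a rough illustration under that assumption. The metric, the training mean of 40, the threshold, and all function names are hypothetical.

```python
def shadow(request, metric_models):
    """Send a copy of the request to each metric model and collect the scores."""
    return {name: model(dict(request)) for name, model in metric_models.items()}


def distance_from_training_mean(request):
    # Hypothetical metric: distance from an assumed training mean age of 40.
    return abs(request["age"] - 40)


THRESHOLD = 30  # hypothetical cutoff above which a request counts as an outlier
metric_models = {"age_distance": distance_from_training_mean}

scores = shadow({"age": 95}, metric_models)
is_outlier = any(score > THRESHOLD for score in scores.values())
print(scores, is_outlier)
```

Each metric model receives its own copy of the request, so scoring cannot alter the data that the serving path handles.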

Drift Report