1 of 5

How-To

This section offers guides that address technical aspects of working with the Hydrosphere platform.

Write definitions
Invoke applications
Monitoring external models
Develop runtimes
Use private pip repository

Invoke applications

Inferencing applications can be achieved using any of the methods described below.

Hydrosphere UI

To send a sample request using Hydrosphere UI, open the desired application, and press the Test button at the upper right corner. We will generate dummy inputs based on your model's contract and send an HTTP request to the model's endpoint.

HTTP Inference

POST /gateway/application/<application_name>

To send an HTTP request, you should send a POST request to the /gateway/application/<applicationName> endpoint with the JSON body containing your request data, composed with respect to the model's contract.

Path Parameters

Name

Type

Description

Request Body

Name

Type

Description

gRPC

To send a gRPC request you need to create a specific client.

import grpc 
import hydro_serving_grpc as hs  # pip install hydro-serving-grpc

# connect to your ML Lamba instance
channel = grpc.insecure_channel("<host>")
stub = hs.PredictionServiceStub(channel)

# 1. define a model, that you'll use
model_spec = hs.ModelSpec(name="model")

# 2. define tensor_shape for Tensor instance
tensor_shape = hs.TensorShapeProto(
    dim=[hs.TensorShapeProto.Dim(size=-1), hs.TensorShapeProto.Dim(size=2)])

# 3. define tensor with needed data
tensor = hs.TensorProto(dtype=hs.DT_DOUBLE, tensor_shape=tensor_shape, double_val=[1,1,1,1])

# 4. create PredictRequest instance
request = hs.PredictRequest(model_spec=model_spec, inputs={"x": tensor})

# call Predict method
result = stub.Predict(request)

import com.google.protobuf.Int64Value;
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.hydrosphere.serving.tensorflow.DataType;
import io.hydrosphere.serving.tensorflow.TensorProto;
import io.hydrosphere.serving.tensorflow.TensorShapeProto;
import io.hydrosphere.serving.tensorflow.api.Model;
import io.hydrosphere.serving.tensorflow.api.Predict;
import io.hydrosphere.serving.tensorflow.api.PredictionServiceGrpc;

import java.util.Random;

public class HydrosphereClient {

    private final String modelName;         // Actual model name, registered within Hydrosphere platform
    private final Int64Value modelVersion;  // Model version of the registered model within Hydrosphere platform
    private final ManagedChannel channel;
    private final PredictionServiceGrpc.PredictionServiceBlockingStub blockingStub;

    public HydrosphereClient2(String target, String modelName, long modelVersion) {
        this(ManagedChannelBuilder.forTarget(target).build(), modelName, modelVersion);
    }

    HydrosphereClient2(ManagedChannel channel, String modelName, long modelVersion) {
        this.channel = channel;
        this.modelName = modelName;
        this.modelVersion = Int64Value.newBuilder().setValue(modelVersion).build();
        this.blockingStub = PredictionServiceGrpc.newBlockingStub(this.channel);
    }

    private Model.ModelSpec getModelSpec() {
        /*
        Helper method to generate ModelSpec.
         */
        return Model.ModelSpec.newBuilder()
                .setName(this.modelName)
                .setVersion(this.modelVersion)
                .build();
    }

    private TensorProto generateDoubleTensorProto() {
        /*
        Helper method generating random TensorProto object for double values.
        */
        return TensorProto.newBuilder()
                .addDoubleVal(new Random().nextDouble())
                .setDtype(DataType.DT_DOUBLE)
                .setTensorShape(TensorShapeProto.newBuilder().build())  // Empty TensorShape indicates scalar shape
                .build();
    }

    public Predict.PredictRequest generatePredictRequest() {
        /*
        PredictRequest is used to define the data passed to the model for inference.
        */
        return Predict.PredictRequest.newBuilder()
                .putInputs("in", this.generateDoubleTensorProto())
                .setModelSpec(this.getModelSpec())
                .build();
    }


    public Predict.PredictResponse predict(Predict.PredictRequest request) {
        /*
        The actual use of RPC method Predict of the PredictionService to invoke prediction.
        */
        return this.blockingStub.predict(request);
    }

    public static void main(String[] args) throws Exception {
        HydrosphereClient client = new HydrosphereClient("<host>", "example", 2);
        Predict.PredictRequest request = client.generatePredictRequest();
        Predict.PredictResponse response = client.predict(request);
        System.out.println(response);
    }
}

Python SDK

You can learn more about our Python SDK here.

import hydrosdk as hs

hs_cluster = hs.Cluster(http_address='{HTTP_CLUSTER_ADDRESS}',
                         grpc_address='{GRPC_CLUSTER_ADDRESS}',)

app = hs.Application.find(hs_cluster, "{APP_NAME}")

predictor = adult_servable.predictor()

data  = ...  # your data
predictor.predict(data)

Write definitions

Resource definitions describe Hydrosphere entities.

An entity could be your model, application, or deployment configuration. Each definition is represented by a .yaml file.

Base definition

Every definition must include the following fields:

kind: defines the type of a resource
name: defines the name of a resource

The only valid options for kind are:

Model
Application
DeploymentConfiguration

kind: Model

A model definition must contain the following fields:

runtime: a string defining the runtime Docker image that will be used to run a model. You can learn more about runtimes here.
contract: an object defining the inputs and outputs of a model.

A model definition can contain the following fields:

payload: a list of files that should be added to the container.
install-command: a string defining a command that should be executed during the container build.
training-data: a string defining a path to the file that will be uploaded to Hydrosphere and used as a training data reference. It can be either a local file or a URI to an S3 object. At the moment we only support .csv files.
metadata: an object defining additional user metadata that will be displayed on the Hydrosphere UI.

The example below shows how a model can be defined on the top level.

serving.yaml

kind: "Model"
name: "sample_model"
training-data: "s3://bucket/train.csv" | "/temp/file.csv"
runtime: "hydrosphere/serving-runtime-python-3.6:$released_version$"
install-command: "sudo apt install jq && pip install -r requirements.txt" 
payload: 
  - "./requirements.txt"
contract:
  ...
metadata:
  ...

Contract object

contract object must contain the following fields:

inputs: an object, defining all inputs of a model
outputs: an object, defining all outputs of a model

contract object can contain the following fields:

name: a string defining the signature of the model that should be used to process requests

Field object

field object must contain the following fields:

shape: either "scalar" or a list of integers, defining the shape of your data. If a shape is defined as a list of integers, it can have -1 value at the very beginning of the list, indicating that this field has an arbitrary number of "entities". -1 cannot be put anywhere aside from the beginning of the list.
type: a string defining the type of data.

field object can contain the following fields:

profile: a string, defining the profile type of your data.

The only valid options for type are:

bool — Boolean
string — String in bytes
half — 16-bit half-precision floating-point
float16 — 16-bit half-precision floating-point
float32 — 32-bit single-precision floating-point
double — 64-bit double-precision floating-point
float64 — 64-bit double-precision floating-point
uint8 — 8-bit unsigned integer
uint16 — 16-bit unsigned integer
uint32 — 32-bit unsigned integer
uint64 — 64-bit unsigned integer
int8 — 8-bit signed integer
int16 — 16-bit signed integer
int32 — 32-bit signed integer
int64 — 64-bit signed integer
qint8 — Quantized 8-bit signed integer
quint8 — Quantized 8-bit unsigned integer
qint16 — Quantized 16-bit signed integer
quint16 — Quantized 16-bit unsigned integer
complex64 — 64-bit single-precision complex
complex128 — 128-bit double-precision complex

The only valid options for profile are:

text — monitoring such fields will be done with text-oriented algorithms.
image — monitoring such fields will be done with image-oriented algorithms.
numerical — monitoring such fields will be done with numerical-oriented algorithms.
categorical — monitoring such fields will be done with categorical-oriented algorithms.

The example below shows how a contract can be defined on the top level.

name: "infer"
inputs:
  input_field_1:
    shape: [-1, 1]
    type: string
    profile: text
  input_field_2:
    shape: [200, 200]
    type: int32
    profile: categorical
outputs: 
  output_field_1:
    shape: scalar
    type: int32 
    profile: numerical

Metadata object

metadata object can represent any arbitrary information specified by the user. The structure of the object is not strictly defined. The only constraint is that the object must have a key-value structure, where a value can only be of a simple data type (string, number, boolean).

The example below shows, how metadata can be defined.

metadata:
  experiment: "demo"
  environment: "kubernetes"

The example below shows a complete definition of a sample model.

kind: "Model"
name: "sample_model"
training-data: "s3://bucket/train.csv" | "/temp/file.csv"
runtime: "hydrosphere/serving-runtime-python-3.6:$released_version$"
install-command: "sudo apt install jq && pip install -r requirements.txt" 
payload: 
  - "./*"
contract:
  name: "infer"
  inputs:
    input_field_1:
      shape: [-1, 1]
      type: string
      profile: text
    input_field_2:
      shape: [-1, 1]
      type: int32
      profile: numerical
  outputs: 
    output_field_1:
      shape: scalar
      type: int32 
      profile: numerical
metadata:
  experiment: "demo"
  environment: "kubernetes"

kind: Application

The application definition must contain one of the following fields:

singular: An object, defining a single-model application;
pipeline: A list of objects, defining an application as a pipeline of models.

Singular object

singular object represents an application consisting only of one model. The object must contain the following fields:

model: A string, defining a model version. It is expected to be in the form model-name:model-version.

The example below shows how a singular application can be defined.

kind: "Application"
name: "sample_application"
singular:
  model: "sample_model:1"

Pipeline object

pipeline represents a list of stages, representing models.

stage object must contain the following fields:

model: A string defining a model version. It is expected to be in the form model-name:model-version.

stage object can contain the following fields:

weight: A number defining the weight of the model. All models' weights in a stage must add up to 100.

The example below shows how a pipeline application can be defined.

kind: Application
name: sample-claims-app
pipeline:
  - - model: "claims-preprocessing:1"
  - - model: "claims-model:1"
      weight: 80
    - model: "claims-model:2"
      weight: 20

In this application, 100% of the traffic will be forwarded to the claims-preprocessing:1 model version and the output will be fed into claims-model. 80% of the traffic will go to the claims-model:1 model version, 20% of the traffic will go to the claims-model:2 model version.

kind: DeploymentConfiguration

The DeploymentConfiguration resource definition can contain the following fields:

hpa: An object defining HorizontalPodAutoscalerSpec
container: An object defining settings applied on a container level
deployment: An object defining settings applied on a deployment level
pod: An object defining settings applied on a pod level

HPA object

The hpa object closely resembles the Kubernetes HorizontalPodAutoscalerSpec object

The hpa object must contain:

minReplicas : minReplicas is the lower limit for the number of replicas to which the autoscaler can scale down.
maxReplicas : integer, upper limit for the number of pods that can be set by the autoscaler; cannot be smaller than minReplicas.
cpuUtilization : integer from 1 to 100, target average CPU utilization (represented as a percentage of requested CPU) over all the pods; if not specified the default autoscaling policy will be used.

Container object

The container object can contain:

resources : object with limits and requests fields. Closely resembles the k8s ResourceRequirements object
env : object with string keys and string values which is used to set environment variables.

Pod object

The hpa object is similar to the Kubernetes PodSpec object.

The pod object can contain

nodeSelector : selector which must be true for the pod to fit on a node. Selector which must match a node's labels for the pod to be scheduled on that node. More info.
affinity : pod's scheduling constraints. Represented by an Affinity object.
tolerations : array of Tolerations.

Deployment object

The deployment object must contain:

replicaCount : integer, number of desired pods. This is a pointer to distinguish between explicit zero and not specified. Defaults to 1.

Example

The example below shows how a deployment configuration can be defined.

kind: DeploymentConfiguration
name: cool-deployment-config
hpa:
  minReplicas: 2
  maxReplicas: 10
  cpuUtilization: 80
deployment:
  replicaCount: 4
container:
  resources:
    limits:
      cpu: 500m
      memory: 4G
    requests:
      cpu: 250m
      memory: 2G
  env:
    foo: bar
pod:
  nodeSelector:
    im: a map
    foo: bar
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: exp1
            operator: Exists
          matchFields:
          - key: fields1
            operator: Exists
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: exp2
            operator: NotIn
            values:
            - aaaa
            - bvzv
            - czxc
          matchFields:
          - key: fields3
            operator: NotIn
            values:
            - aaa
            - cccc
            - zxcc
        weight: 100
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: value
            operator: Exists
          - key: key
            operator: NotIn
            values:
            - a
            - b
        namespaces:
        - namespace1
        topologyKey: top
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              key: a
            matchExpressions:
            - key: key1
              operator: In
              values:
              - a
              - b
            - key: value2
              operator: NotIn
              values:
              - b
          namespaces:
          - namespace2
          topologyKey: topo_valur
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: value
            operator: Exists
          - key: key2
            operator: NotIn
            values:
            - a
            - b
          - key: key3
            operator: DoesNotExist
        namespaces:
        - namespace1
        topologyKey: top
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              key: a
            matchExpressions:
            - key: key
              operator: In
              values:
              - a
              - b
            - key: key2
              operator: NotIn
              values:
              - b
          namespaces:
          - namespace2
          topologyKey: toptop
  tolerations:
  - effect: PreferNoSchedule
    key: equalToleration
    tolerationSeconds: 30
    operator: Equal
    value: kek
  - key: equalToleration
    operator: Exists
    effect: PreferNoSchedule
    tolerationSeconds: 30

Develop runtimes

Sometimes our runtime images are not flexible enough. In that case, you might want to implement one yourself.

The key things you need to know to write your own runtime are:

How to implement a predefined gRPC service for a dedicated language
How to our contracts' protobufs work to describe entry points, such as inputs and outputs
How to create your own Docker image and publish it to an open registry

Generate GRPC code

There are different approaches to generating client and server gRPC code in different languages. Let's have a look at how to do that in Python.

First, let's clone our protos repository and prepare a folder for the generated code:

$ git clone https://github.com/Hydrospheredata/hydro-serving-protos
$ mkdir runtime

To generate the gRPC code we need to install additional packages:

$ pip install grpcio-tools googleapis-common-protos

Our custom runtime will require contracts and tf protobuf messages. Let's generate them:

$ python -m grpc_tools.protoc --proto_path=./hydro-serving-protos/src/ --python_out=./runtime/ --grpc_python_out=./runtime/ $(find ./hydro-serving-protos/src/hydro_serving_grpc/contract/ -type f -name '*.proto')
$ python -m grpc_tools.protoc --proto_path=./hydro-serving-protos/src/ --python_out=./runtime/ --grpc_python_out=./runtime/ $(find ./hydro-serving-protos/src/hydro_serving_grpc/tf/ -type f -name '*.proto')
$ cd runtime
$ find ./hydro_serving_grpc -type d -exec touch {}/__init__.py \;

The structure of the runtime should now be as follows:

runtime
└── hydro_serving_grpc
    ├── __init__.py
    ├── contract
    │   ├── __init__.py
    │   ├── model_contract_pb2.py
    │   ├── model_contract_pb2_grpc.py
    │   ├── model_field_pb2.py
    │   ├── model_field_pb2_grpc.py
    │   ├── model_signature_pb2.py
    │   └── model_signature_pb2_grpc.py
    └── tf
        ├── __init__.py
        ├── api
        │   ├── __init__.py
        │   ├── model_pb2.py
        │   ├── model_pb2_grpc.py
        │   ├── predict_pb2.py
        │   ├── predict_pb2_grpc.py
        │   ├── prediction_service_pb2.py
        │   └── prediction_service_pb2_grpc.py
        ├── tensor_pb2.py
        ├── tensor_pb2_grpc.py
        ├── tensor_shape_pb2.py
        ├── tensor_shape_pb2_grpc.py
        ├── types_pb2.py
        └── types_pb2_grpc.py

Implement Service

Now that we have everything set up, let's implement a runtime. Create a runtime.py file and put in the following code:

from hydro_serving_grpc.tf.api.predict_pb2 import PredictRequest, PredictResponse
from hydro_serving_grpc.tf.api.prediction_service_pb2_grpc import PredictionServiceServicer, add_PredictionServiceServicer_to_server
from hydro_serving_grpc.tf.types_pb2 import *
from hydro_serving_grpc.tf.tensor_pb2 import TensorProto
from hydro_serving_grpc.contract.model_contract_pb2 import ModelContract
from concurrent import futures

import os
import time
import grpc
import logging
import importlib


class RuntimeService(PredictionServiceServicer):
    def __init__(self, model_path, contract):
        self.contract = contract
        self.model_path = model_path
        self.logger = logging.getLogger(self.__class__.__name__)

    def Predict(self, request, context):
        self.logger.info(f"Received inference request: {request}")

        module = importlib.import_module("func_main")
        executable = getattr(module, self.contract.predict.signature_name)
        result = executable(**request.inputs)

        if not isinstance(result, hs.PredictResponse):
            self.logger.warning(f"Type of a result ({result}) is not `PredictResponse`")
            context.set_code(grpc.StatusCode.OUT_OF_RANGE)
            context.set_details(f"Type of a result ({result}) is not `PredictResponse`")
            return PredictResponse()
        return result


class RuntimeManager:
    def __init__(self, model_path, port):
        self.logger = logging.getLogger(self.__class__.__name__)
        self.port = port
        self.model_path = model_path
        self.server = None

        with open(os.path.join(model_path, 'contract.protobin')) as file:
            contract = ModelContract.ParseFromString(file.read())
        self.servicer = RuntimeService(os.path.join(self.model_path, 'files'), contract)

    def start(self):
        self.logger.info(f"Starting PythonRuntime at {self.port}")
        self.server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
        add_PredictionServiceServicer_to_server(self.servicer, self.server)
        self.server.add_insecure_port(f'[::]:{self.port}')
        self.server.start()

    def stop(self, code=0):
        self.logger.info(f"Stopping PythonRuntime at {self.port}")
        self.server.stop(code)
© 2020 GitHub, Inc.

Let's quickly review what we have here. RuntimeManager simply manages our service, i.e. starts it, stops it, and holds all necessary data. RuntimeService is a service that actually implements thePredict(PredictRequest) RPC function.

The model will be stored inside the /model directory in the Docker container. The structure of /model is a follows:

model
├── contract.protobin
└── files
    ├── ...
    └── ...

Thecontract.protobin file will be created by the Manager service. It contains a binary representation of the ModelContract message.

files directory contains all files of your model.

To run this service let's create an another file main.py.

from runtime import RuntimeManager

import os
import time
import logging

logging.basicConfig(level=logging.INFO)

if __name__ == '__main__':
    runtime = RuntimeManager('/model', port=int(os.getenv('APP_PORT', "9090")))
    runtime.start()

    try:
        while True:
            time.sleep(60 * 60 * 24)
    except KeyboardInterrupt:
        runtime.stop()

Publish Runtime

Before we can use the runtime, we have to package it into a container.

To add requirements for installing dependencies, create a requirements.txt file and put inside:

grpcio==1.12.1 
googleapis-common-protos==1.5.3

Create a Dockerfile to build our image:

FROM python:3.6.5 

ADD . /app
RUN pip install -r /app/requirements.txt

ENV APP_PORT=9090

VOLUME /model 
WORKDIR /app

CMD ["python", "main.py"]

APP_PORT is an environment variable used by Hydrosphere. When Hydrosphere invokes Predict method, it does so via the defined port.

The structure of the runtime folder should now look like this:

runtime
├── Dockerfile
├── hydro_serving_grpc
│   ├── __init__.py
│   ├── contract
│   │   ├── __init__.py
│   │   ├── model_contract_pb2.py
│   │   ├── model_contract_pb2_grpc.py
│   │   ├── model_field_pb2.py
│   │   ├── model_field_pb2_grpc.py
│   │   ├── model_signature_pb2.py
│   │   └── model_signature_pb2_grpc.py
│   └── tf
│       ├── __init__.py
│       ├── api
│       │   ├── __init__.py
│       │   ├── model_pb2.py
│       │   ├── model_pb2_grpc.py
│       │   ├── predict_pb2.py
│       │   ├── predict_pb2_grpc.py
│       │   ├── prediction_service_pb2.py
│       │   └── prediction_service_pb2_grpc.py
│       ├── tensor_pb2.py
│       ├── tensor_pb2_grpc.py
│       ├── tensor_shape_pb2.py
│       ├── tensor_shape_pb2_grpc.py
│       ├── types_pb2.py
│       └── types_pb2_grpc.py
├── main.py
├── requirements.txt
└── runtime.py

Build and push the Docker image:

$ docker build -t {username}/python-runtime-example
$ docker push {username}/python-runtime-example

Remember that the registry has to be accessible to the Hydrosphere platform so it can pull the runtime whenever it has to run a model with this runtime.

That's it. You have just created a simple runtime that you can use in your own projects. It is an almost identical version of our python runtime implementation. You can always look up details there.

Use private pip repositories

To use private pip repository you must add customized pip.conf file pointing to your custom PyPI repository.

For example, your custom pip.conf file can look like this:

[global]
timeout = 60
index-url = http://pypi.python.org/simple/

If you need to specify the certificate to use during pip install you want to specify the path to it in a pip.conf file e.g.

[global]
timeout = 60
index-url = http://pypi.python.org/simple/
cert = /model/files/cert.pem

You can tell pip to use this pip.conffile in the install-command field inside serving.yaml:

kind: Model
name: linear_regression
runtime: "hydrosphere/serving-runtime-python-3.7:$released_version$"
install-command: "PIP_CONFIG_FILE=pip.conf pip install -r requirements.txt"
payload:
  - "src/"
  - "requirements.txt"
  - "pip.conf"  # location of your pip.conf
  - "cert.pem"  # location of your certificate. It'll be available under /model/files/cert.pem
  - "model.h5"
contract:
  name: infer
  inputs:
    x:
      shape: [-1, 2]
      type: double
  outputs:
    y:
      shape: [-1]
      type: double