`hs`) along with the Python SDK (`hydrosdk`) installed on your local machine. If you don't have them yet, please follow these guides first:
`hs cluster` in your terminal. This command shows the name and server address of the cluster you're currently using. If it shows that you're not using a local cluster, you can configure one with the following commands:
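As an illustration, registering and switching to a local cluster looks roughly like this; the cluster name `local` is an arbitrary choice and the server address must match your deployment (both are assumptions, not values from this guide):

```shell
# Register a cluster under a chosen name (here: local) and switch to it.
hs cluster add --name local --server http://localhost
hs cluster use local
```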
`data/` folder. Next, you need to set up your working environment using the following packages:
`int64`) for `OrdinalEncoder` so that categorical descriptors become integers after transformation. Transforming the class column is usually not necessary. We can also remove rows that contain question marks in some samples. Once the preprocessing is complete, you can delete the DataFrame (
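As a self-contained sketch of these preprocessing steps, assuming `pandas` and `scikit-learn` are installed (the column names below are stand-ins, not the guide's exact ones):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Toy stand-in for the census data; the real data is read from the data/ folder.
df = pd.DataFrame({
    "workclass": ["Private", "?", "State-gov", "Private"],
    "education": ["Bachelors", "HS-grad", "Bachelors", "?"],
    "income": ["<=50K", ">50K", "<=50K", ">50K"],
})

# Drop rows that contain question marks in any column.
df = df[~(df == "?").any(axis=1)].reset_index(drop=True)

# Encode categorical descriptors as int64; the class column stays as text.
categorical = ["workclass", "education"]
encoder = OrdinalEncoder(dtype=np.int64)
df[categorical] = encoder.fit_transform(df[categorical])

print(df.dtypes)
```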
`fit()` method in our case. After the training step, we can save the model in the `model/model` folder. Training data can be saved as a `csv` file, but don't forget to pass `index=False` to skip the index column and avoid confusion when reading it back again.
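A minimal sketch of the saving step, assuming `joblib` and `scikit-learn` are available; the estimator, file names, and a temporary directory standing in for the project root are all illustrative:

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Tiny stand-in for the encoded training data.
X = pd.DataFrame({"age": [25, 38, 52, 46], "education": [1, 2, 2, 0]})
y = pd.Series(["<=50K", ">50K", ">50K", "<=50K"], name="income")

clf = LogisticRegression().fit(X, y)

root = tempfile.mkdtemp()  # stands in for your project root
os.makedirs(os.path.join(root, "model", "model"))

# Persist the fitted estimator into the model/model folder.
model_path = os.path.join(root, "model", "model", "model.joblib")
joblib.dump(clf, model_path)

# Persist training data without the index column.
train_path = os.path.join(root, "train.csv")
pd.concat([X, y], axis=1).to_csv(train_path, index=False)
```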
`func_main.py` and store it in the `src` folder inside the directory where your model is stored. Your directory structure should look like this:
`func_main.py` should be as follows:
`cols` we preserve the column names as a list, sorted by their order of appearance in the DataFrame.
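A minimal sketch of such a script; the stub model replaces the real `joblib.load` call so the example stays self-contained, and the column names are illustrative:

```python
import pandas as pd

# In the deployed container you would load the fitted model instead, e.g.:
#   from joblib import load
#   clf = load("/model/files/model.joblib")
# Here a trivial stand-in keeps the sketch runnable.
class _StubModel:
    def predict(self, X):
        return [">50K"] * len(X)

clf = _StubModel()

# Column names in their order of appearance in the training DataFrame,
# so incoming keyword arguments are arranged consistently.
cols = ["age", "workclass", "education"]

def predict(**kwargs):
    # Build a single-row frame with columns in the training order.
    X = pd.DataFrame([[kwargs[c] for c in cols]], columns=cols)
    result = clf.predict(X)
    # The key must match the output name declared in the signature (e.g. "y").
    return {"y": result[0]}
```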
`func_main.py`. You need to create a `requirements.txt` file in the folder with your model and add the following libraries to it:
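As a rough illustration only (the entries below are assumptions, not the guide's verbatim list), a model like this one typically depends on packages along these lines:

```
numpy
pandas
scikit-learn
joblib
```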
`SignatureBuilder`. A signature contains information about which method inside `func_main.py` should be called, as well as the shapes and types of its inputs and outputs. You can use `X.dtypes` to check which type of data you have for each column. We use `int64` fields for all our independent variables after transformation. Our class variable (`income`) initially consists of two classes with text names instead of numbers, which means it should be defined as a string (`str`) in the signature. In addition, you can specify the type of profiling for each variable using `ProfilingType`, so Hydrosphere knows what this variable is about and can analyze it accordingly. For this purpose, we can create a dictionary with our variables as keys and their profiling types as values. Alternatively, you can describe them one by one as a parameter in the input. Finally, we can complete our signature by assigning our output variable with the `with_output` method, giving it a name (e.g. `y`), type, shape, and profiling type. Afterwards, we can build our signature by the
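Putting these steps together, a sketch with `hydrosdk` could look like the following; the builder and method names come from this guide, but exact import paths and argument conventions may differ between SDK versions, and the columns are illustrative:

```python
from hydrosdk import ProfilingType, SignatureBuilder

# Map each input column to its profiling type (columns are illustrative).
profiles = {"age": ProfilingType.NUMERICAL, "workclass": ProfilingType.CATEGORICAL}

signature = SignatureBuilder("predict")  # the method to call inside func_main.py
for name, profile in profiles.items():
    # All independent variables are int64 scalars after encoding.
    signature = signature.with_input(name, "int64", "scalar", profile)

# The class labels are text, so the output is declared as a string.
signature = signature.with_output("y", "str", "scalar", ProfilingType.TEXT).build()
```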
`path` variable to define the root model folder and `payload` to point out the paths to all files that we need to upload. At this point, we can combine all our efforts using the `ModelVersionBuilder` object, which describes our models and other objects associated with them before the upload step. It has different methods responsible for assigning and uploading different components. For example, we can:
`with_training_data()`. Please note that training data is required if you want to use services such as Data Drift, Automatic Outlier Detection, and Data Visualization.
Once the `ModelVersionBuilder` is prepared, we can apply the `upload` method to upload it.
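As a sketch, the assembly and upload could look like this; it assumes a `Cluster` object pointing at your instance, the `signature` built earlier, and illustrative file paths. The method names follow this guide, but check your `hydrosdk` version's reference, since the builder API has changed between releases:

```python
from hydrosdk import Cluster, ModelVersionBuilder

cluster = Cluster("http://localhost")

path = "model/"  # root model folder
payload = ["src/func_main.py", "requirements.txt", "model/model.joblib"]

builder = (
    ModelVersionBuilder("adult-classifier", path)
    .with_payload(payload)            # files to upload
    .with_signature(signature)        # built with SignatureBuilder earlier
    .with_training_data("train.csv")  # needed for drift/outlier/visualization services
)

model_version = builder.upload(cluster)  # push everything to the platform
```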
`ModelVersion` helps to check whether our model was successfully uploaded to the platform by looking it up.
`ModelVersions` with monitoring and other benefits. For that purpose, we can apply `ExecutionStageBuilder`, which describes the model pipeline for an application. In turn, applications provide `Predictor` objects, which should be used for data inference. Don't pay much attention to the `weight` parameter; it is needed for A/B testing.
`predict` method, which we can use to send our data to the model. We can try to make predictions for our test set, which has been converted beforehand to a list of dictionaries. You can check the results using the name we gave to the output of the signature, and save them in any format you prefer. Before making a prediction, don't forget to pause briefly so that all necessary loading can finish.
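End to end, serving a prediction might be sketched as follows; the class and method names follow this guide, while the constructor arguments and the sample row are assumptions to verify against your SDK version:

```python
import time

from hydrosdk.application import ApplicationBuilder, ExecutionStageBuilder

# One stage with a single model variant; weight=100 routes all traffic to it.
stage = ExecutionStageBuilder().with_model_variant(model_version, weight=100).build()
app = ApplicationBuilder("adult-app").with_stage(stage).build(cluster)

time.sleep(5)  # small pause so all loading can finish

predictor = app.predictor()
row = {"age": 37, "workclass": 1}  # one test sample as a dictionary
result = predictor.predict(row)
print(result["y"])  # key matches the output name from the signature
```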
`http://localhost`. Here you can find all your models. Click on a model to view basic information about it: versions, build logs, created applications, the model's environments, and other services associated with deployed models.
`metric` suffix at the end of its name. This is your automatically created monitoring model for outlier detection. Learn more about the Automatic Outlier Detection feature here.
`serving.yaml` file. You should get the following file structure:
`hs apply -f serving.yaml`. To monitor your model, you can use the Hydrosphere UI as shown previously.
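For reference, a `serving.yaml` for a model like this one generally follows the shape below; the runtime image tag, file names, and field values are assumptions to adapt to your own setup:

```yaml
kind: Model
name: adult-classifier
# Pick a runtime image tag that matches your platform version.
runtime: "hydrosphere/serving-runtime-python-3.7:<tag>"
install-command: pip install -r requirements.txt
payload:
  - src/
  - requirements.txt
  - model/
contract:
  name: predict          # method inside func_main.py
  inputs:
    age:
      shape: scalar
      type: int64
      profile: numerical
  outputs:
    y:
      shape: scalar
      type: string
      profile: text
```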