A/B Analysis for a Recommendation Model
Estimated completion time: 14 min.

Overview

In this tutorial, you will learn how to retrospectively compare the behavior of two different models.
By the end of this tutorial you will know how to:
  • Set up an A/B application
  • Analyze production data

Prerequisites

Set Up an A/B Application

Prepare a model for uploading

requirements.txt
    lightfm==1.15
    numpy~=1.18
    joblib~=0.15
    tqdm~=4.62.0
Install the dependencies in your local environment.
    pip install -r requirements.txt
train_model.py
    import sys

    import joblib
    from lightfm import LightFM
    from lightfm.datasets import fetch_movielens

    if __name__ == "__main__":
        no_components = int(sys.argv[1])
        print(f"Number of components is set to {no_components}")

        # Load the MovieLens 100k dataset. Only five
        # star ratings are treated as positive.
        data = fetch_movielens(min_rating=5.0)

        # Instantiate and train the model
        model = LightFM(no_components=no_components, loss='warp')
        model.fit(data['train'], epochs=30, num_threads=2)

        # Save the model
        joblib.dump(model, "model.joblib")
src/func_main.py
    import joblib
    import numpy as np
    from lightfm import LightFM

    # Load model once
    model: LightFM = joblib.load("/model/files/model.joblib")

    # Get all item ids
    item_ids = np.arange(0, 1682)


    def get_top_rank_item(user_id):
        # Calculate scores per item id
        y = model.predict(user_ids=[user_id], item_ids=item_ids)

        # Pick top 3 (indices of the highest scores, best first)
        top_3 = y.argsort()[:-4:-1]

        # Return {'top_1': ..., 'top_2': ..., 'top_3': ...}
        return {f"top_{i + 1}": item_id for i, item_id in enumerate(top_3)}
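The slice `[:-4:-1]` used to pick the top 3 items can look cryptic. The following minimal sketch uses made-up scores rather than real model output to show that `argsort` followed by this slice yields the indices of the three highest scores, best first. The `scores` array and the helper name `top_k_indices` are assumptions made for this illustration and are not part of the tutorial's code.

```python
import numpy as np


def top_k_indices(scores, k=3):
    """Return indices of the k highest scores, best first (hypothetical helper)."""
    # argsort sorts ascending; the reversed slice walks back from the end
    # and stops after k elements, which equals [:-4:-1] when k == 3.
    return np.argsort(scores)[:-(k + 1):-1]


# Made-up scores for five items; not real LightFM output.
scores = np.array([0.1, 2.5, -0.3, 1.7, 0.9])
top_3 = top_k_indices(scores)
result = {f"top_{i + 1}": int(item_id) for i, item_id in enumerate(top_3)}
print(result)
```

Here item 1 has the highest score, followed by items 3 and 4, so those are the ids returned under `top_1`, `top_2`, and `top_3`.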
serving.yaml
    kind: Model
    name: movie_rec
    runtime: hydrosphere/serving-runtime-python-3.7:3.0.0
    install-command: sudo apt install --yes gcc && pip install -r requirements.txt
    payload:
      - src/
      - requirements.txt
      - model.joblib
    contract:
      name: get_top_rank_item
      inputs:
        user_id:
          shape: scalar
          type: int64
          profile: numerical
      outputs:
        top_1:
          shape: scalar
          type: int64
          profile: numerical
        top_2:
          shape: scalar
          type: int64
          profile: numerical
        top_3:
          shape: scalar
          type: int64
          profile: numerical
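The contract declares three integer scalar outputs, and the dictionary returned by get_top_rank_item has to use exactly those names. As a rough local sanity check before uploading, something like the sketch below could be used. The helper `check_outputs` and the example values are hypothetical; this is not how the platform itself validates contracts.

```python
import numbers

# Output names copied from the contract in serving.yaml.
EXPECTED_OUTPUTS = ("top_1", "top_2", "top_3")


def check_outputs(result):
    """Raise ValueError if result does not match the declared outputs (hypothetical helper)."""
    missing = [name for name in EXPECTED_OUTPUTS if name not in result]
    extra = [name for name in result if name not in EXPECTED_OUTPUTS]
    if missing or extra:
        raise ValueError(f"missing={missing}, unexpected={extra}")
    for name in EXPECTED_OUTPUTS:
        # Each output is declared as a scalar integer in the contract.
        if not isinstance(result[name], numbers.Integral):
            raise ValueError(f"{name} is not an integer scalar: {result[name]!r}")
    return True


# Hand-written example values, not real model output.
ok = check_outputs({"top_1": 49, "top_2": 99, "top_3": 180})
print(ok)
```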

Upload Model A

We train our model with 5 components and upload it as movie_rec:v1.
    python train_model.py 5
    hs apply -f serving.yaml

Upload Model B

Next, we train a new version of the model with 20 components and upload it as movie_rec:v2.
    python train_model.py 20
    hs apply -f serving.yaml
We can check that we have multiple versions of our model by running:
    hs model list

Create an Application

To create an A/B deployment, we need to create an Application with a single execution stage consisting of two model variants. These model variants are our Model A and Model B, respectively.
The following code will create such an application:
    from hydrosdk import ModelVersion, Cluster
    from hydrosdk.application import ApplicationBuilder, ExecutionStageBuilder

    cluster = Cluster('http://localhost')

    model_a = ModelVersion.find(cluster, "movie_rec", 1)
    model_b = ModelVersion.find(cluster, "movie_rec", 2)

    stage_builder = ExecutionStageBuilder()
    stage = (
        stage_builder
        .with_model_variant(model_version=model_a, weight=50)
        .with_model_variant(model_version=model_b, weight=50)
        .build()
    )

    app = ApplicationBuilder("movie-ab-app").with_stage(stage).build(cluster)
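Each variant is given a weight of 50, so roughly half of the requests should reach each model. As a rough illustration of what such a split looks like over 2000 requests, the sketch below simulates weighted routing locally with `random.choices`. It does not reproduce how the platform routes requests internally; the variant labels and the `route` helper are assumptions made for the example.

```python
import random
from collections import Counter

# Variant labels and weights mirror the two model variants in the stage.
VARIANTS = ["movie_rec:v1", "movie_rec:v2"]
WEIGHTS = [50, 50]


def route(rng):
    """Pick one variant at random, proportionally to its weight (hypothetical helper)."""
    return rng.choices(VARIANTS, weights=WEIGHTS, k=1)[0]


rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = Counter(route(rng) for _ in range(2000))
for variant, n in sorted(counts.items()):
    print(f"{variant}: {n} requests ({n / 2000:.1%})")
```

The split is only approximately even for any finite number of requests, which is worth keeping in mind when comparing per-variant statistics later.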

Invoking movie-ab-app

We'll simulate production data flow by repeatedly asking our model for recommendations.
    import numpy as np
    from hydrosdk import Cluster, Application
    from tqdm.auto import tqdm

    cluster = Cluster("http://localhost", grpc_address="localhost:9090")

    app = Application.find(cluster, "movie-ab-app")
    predictor = app.predictor()

    user_ids = np.arange(0, 943)

    for uid in tqdm(np.random.choice(user_ids, 2000, replace=True)):
        result = predictor.predict({"user_id": uid})
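The invocation loop overwrites `result` on every iteration, so nothing is kept on the client side. If you also want a local record of the responses, a minimal sketch could look like the following. To stay runnable without a cluster it uses a stand-in `FakePredictor` class (an assumption for this example) in place of `app.predictor()`, and it assumes each response is a dict with the `top_1`, `top_2`, and `top_3` keys declared in the contract.

```python
import random
from collections import Counter


class FakePredictor:
    """Stand-in for app.predictor(); returns random item ids (illustrative only)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def predict(self, inputs):
        # Draw three distinct item ids from the 1682 items used in this tutorial.
        top = self._rng.sample(range(1682), 3)
        return {f"top_{i + 1}": item_id for i, item_id in enumerate(top)}


predictor = FakePredictor()
rng = random.Random(1)

results = []
for _ in range(200):
    uid = rng.randrange(943)  # 943 user ids, as in the invocation loop
    response = predictor.predict({"user_id": uid})
    results.append({"user_id": uid, **response})

# Count how often each item was the top recommendation.
top_1_counts = Counter(r["top_1"] for r in results)
print(top_1_counts.most_common(5))
```

With the real predictor, the same loop body would apply; only the `FakePredictor` construction would be replaced by the `app.predictor()` call shown earlier.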