import sys

import joblib
from lightfm import LightFM
from lightfm.datasets import fetch_movielens

if __name__ == "__main__":
    no_components = int(sys.argv[1])
    print(f"Number of components is set to {no_components}")

    # Load the MovieLens 100k dataset. Only five
    # star ratings are treated as positive.
    data = fetch_movielens(min_rating=5.0)

    # Instantiate and train the model
    model = LightFM(no_components=no_components, loss='warp')
    model.fit(data['train'], epochs=30, num_threads=2)

    # Save the model
    joblib.dump(model, "model.joblib")
src/func_main.py
import joblib
import numpy as np
from lightfm import LightFM

# Load model once
model: LightFM = joblib.load("/model/files/model.joblib")

# Get all item ids
item_ids = np.arange(0, 1682)

def get_top_rank_item(user_id):
    # Calculate scores per item id
    y = model.predict(user_ids=[user_id], item_ids=item_ids)

    # Pick top 3
    top_3 = y.argsort()[:-4:-1]

    # Return {'top_1': ..., 'top_2': ..., 'top_3': ...}
    return dict([(f"top_{i + 1}", item_id) for i, item_id in enumerate(top_3)])
We train and upload our model with 5 components as movie_rec:v1
python train_model.py 5
hs upload
Upload Model B
Next, we train and upload a new version of our original model with 20 components as movie_rec:v2
python train_model.py 20
hs upload
We can check that we have multiple versions of our model by running:
hs model list
Create an Application
To create an A/B deployment we need to create an Application with a single execution stage consisting of two model variants. These model variants are our Model A and Model B, respectively.
The following code will create such an application:
Each request-response pair is stored in S3 (or in MinIO, if deployed locally) as Parquet files. We'll use the fastparquet package to read these files and the s3fs package to connect to S3.
import fastparquet as fp
import s3fs

s3 = s3fs.S3FileSystem(
    client_kwargs={'endpoint_url': 'http://localhost:9000'},
    key='minio',
    secret='minio123',
)

# The data is stored in the `feature-lake` bucket by default.
# Let's print the files in this folder
s3.ls("feature-lake/")
The only entry in the feature-lake folder is ['feature-lake/movie_rec']. Data in S3 is laid out under the following path: feature-lake/MODEL_NAME/MODEL_VERSION/YEAR/MONTH/DAY/*.parquet
# We fetch all parquet files with glob
version_1_paths = s3.glob("feature-lake/movie_rec/1/*/*/*/*.parquet")
version_2_paths = s3.glob("feature-lake/movie_rec/2/*/*/*/*.parquet")

myopen = s3.open

# Use s3fs as the filesystem to read parquet files into a pandas dataframe
fp_obj = fp.ParquetFile(version_1_paths, open_with=myopen)
df_1 = fp_obj.to_pandas()

fp_obj = fp.ParquetFile(version_2_paths, open_with=myopen)
df_2 = fp_obj.to_pandas()
Now that we have loaded the data, we can start analyzing it.
Compare production data with new labeled data
To compare differences between model versions we'll use two metrics:
Latency - we compare the time delay between the request received and the response produced.
Mean Top-3 Hit Rate - we compare recommendations to the items the user has actually rated. Each recommended item that matches a rated one increases the hit rate by 1. Doing this over the complete test set gives the mean hit rate.
Latencies
Let's calculate the 95th percentile of our latency distributions per model version and plot them. Latencies are stored in the _hs_latency column in our dataframes.
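The percentile itself can be computed with NumPy. A minimal sketch (the `p95` helper is ours, not part of any library; the `_hs_latency` column name is taken from the text above):

```python
import numpy as np

def p95(latencies):
    # 95th percentile of a latency sample (milliseconds in our case)
    return np.percentile(np.asarray(latencies), 95)

# Applied to the dataframes loaded earlier:
# p95_v1 = p95(df_1["_hs_latency"])
# p95_v2 = p95(df_2["_hs_latency"])
```

With the default linear interpolation, `np.percentile` returns a value between the two observations that straddle the 95% rank.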
In our case, the output was 13.0ms against 12.0ms. Results may differ.
Furthermore, we can visualize our data. To plot latency distribution we'll use the Matplotlib library.
import matplotlib.pyplot as plt

# Resize the canvas
plt.gcf().set_size_inches(10, 5)

# Plot latency histograms (`density` replaces the deprecated `normed` argument)
plt.hist(latency_v1, range=(0, 20), density=True, bins=20, alpha=0.6, label="Latency Model v1")
plt.hist(latency_v2, range=(0, 20), density=True, bins=20, alpha=0.6, label="Latency Model v2")

# Plot previously computed percentiles
plt.vlines(p95_v1, 0, 0.1, color="#1f77b4", label="95th percentile for model version 1")
plt.vlines(p95_v2, 0, 0.1, color="#ff7f0e", label="95th percentile for model version 2")

plt.legend()
plt.title("Latency Comparison between v1 and v2")
Mean Top-3 Hit Rate
Next, we'll calculate hit rates. To do so, we need new labeled data. For recommender systems, this data usually becomes available after a user has clicked, watched, liked, or rated an item we recommended. We'll use the test part of MovieLens as labeled data.
To measure how well our models recommend movies we'll use a hit rate metric: out of the 3 movies recommended to a user, it counts how many the user actually watched and rated 4 or 5.
from lightfm.datasets import fetch_movielens

test_data = fetch_movielens(min_rating=5.0)['test']
test_data = test_data.toarray()

# Dict with model version as key and mean hit rate as value
mean_hit_rate = {}

for version, df in {"v1": df_1, "v2": df_2}.items():
    # Dict with user id as key and hit rate as value
    hit_rates = {}
    for x in df.itertuples():
        hit_rates[x.user_id] = 0
        for top_x in ("top_1", "top_2", "top_3"):
            hit_rates[x.user_id] += test_data[x.user_id, getattr(x, top_x)] >= 4
    mean_hit_rate[version] = round(sum(hit_rates.values()) / len(hit_rates), 3)
In our case the mean_hit_rate variable is {'v1': 0.137, 'v2': 0.141}, which means that the second model version is better in terms of hit rate.
You have successfully completed the tutorial! 🚀
Now you know how to read and analyze automatically stored data.