Distributions

Item Distribution

Calculates item popularity in recommendations using 10 popularity bins.

replay.distributions.item_distribution(log, recommendations, k)

Calculate item distribution in log and recommendations.

Parameters

log (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – historical DataFrame used to calculate popularity
recommendations (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – model recommendations
k (int) – length of a recommendation list

Return type

DataFrame

Returns

DataFrame with results
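
As a rough illustration of what such a calculation involves, here is a minimal pure-Python sketch. It is not the library's implementation. The tuple-based inputs, the `user_count` column name, and the helper `to_bins` are assumptions made for this example; only `rec_count` and the number of bins (10) come from the text above.

```python
from collections import defaultdict


def item_distribution_sketch(log, recommendations, k):
    """Count, per item, the distinct users in the log and in the top-k recommendations.

    log: iterable of (user, item) pairs (assumed layout).
    recommendations: iterable of (user, item, relevance) triples (assumed layout).
    """
    # Popularity in the log: distinct users who interacted with each item.
    log_users = defaultdict(set)
    for user, item in log:
        log_users[item].add(user)

    # Keep only the k highest-scored recommendations for each user.
    per_user = defaultdict(list)
    for user, item, relevance in recommendations:
        per_user[user].append((relevance, item))

    rec_users = defaultdict(set)
    for user, scored in per_user.items():
        top_k = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
        for _, item in top_k:
            rec_users[item].add(user)

    # One row per item, ordered from least to most popular in the log.
    rows = [
        {
            "item": item,
            "user_count": len(log_users.get(item, ())),
            "rec_count": len(rec_users.get(item, ())),
        }
        for item in set(log_users) | set(rec_users)
    ]
    rows.sort(key=lambda row: (row["user_count"], str(row["item"])))
    return rows


def to_bins(rows, n_bins=10, col="rec_count"):
    """Split popularity-sorted rows into n_bins nearly equal groups and sum col in each."""
    totals = [0] * n_bins
    n = len(rows)
    for pos, row in enumerate(rows):
        totals[pos * n_bins // n] += row[col]
    return totals
```

Summing `rec_count` per bin shows how much of the recommendation volume goes to the most popular items compared with the long tail.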

You can plot the result. Here is an example for the MovieLens log.

replay.distributions.plot_item_dist(item_dist, palette='magma', col='rec_count')

Show the results of the item_distribution method.

Parameters

Returns

plot

Metric.user_distribution(log, recommendations, ground_truth, k)

Get the mean value of the metric for all users with the same number of ratings.

Parameters

log (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – history DataFrame to calculate number of ratings per user
recommendations (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – prediction DataFrame
ground_truth (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – test data
k (Union[Iterable[int], int]) – depth cut-off

Return type

DataFrame

Returns

pandas DataFrame
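
The grouping described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code. The per-user metric is a simple hit rate chosen for the example, the inputs are plain tuples rather than DataFrames, and the output keys `count` and `value` are hypothetical names.

```python
from collections import Counter, defaultdict


def user_distribution_sketch(log, recommendations, ground_truth, k):
    """Average a per-user metric over users with the same number of ratings.

    log, ground_truth: iterables of (user, item) pairs (assumed layout).
    recommendations: iterable of (user, item, relevance) triples (assumed layout).
    """
    # Number of ratings each user has in the history log.
    ratings_per_user = Counter(user for user, _item in log)

    # Relevant items per user, taken from the test data.
    relevant = defaultdict(set)
    for user, item in ground_truth:
        relevant[user].add(item)

    # Collect scored recommendations per user.
    scored = defaultdict(list)
    for user, item, relevance in recommendations:
        scored[user].append((relevance, item))

    # Per-user metric (assumed here: hit rate at k), grouped by rating count.
    by_count = defaultdict(list)
    for user, pairs in scored.items():
        top_k = sorted(pairs, key=lambda pair: pair[0], reverse=True)[:k]
        user_relevant = relevant.get(user, set())
        hit = 1.0 if any(item in user_relevant for _, item in top_k) else 0.0
        by_count[ratings_per_user[user]].append(hit)

    # One row per distinct rating count, with the mean metric value.
    return [
        {"count": count, "value": sum(values) / len(values)}
        for count, values in sorted(by_count.items())
    ]
```

Each output row pairs a rating count with the mean metric value over all users who have exactly that many ratings in the log.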

Plotting this result shows how the mean metric value changes with the number of user ratings.

replay.distributions.plot_user_dist(user_dist, window=1, title='')

Plot mean metric value by the number of user ratings

Parameters

Returns

plot object