Distributions

Item Distribution

Calculates item popularity in recommendations using 10 popularity bins.

replay.distributions.item_distribution(log, recommendations, k)

Calculate item distribution in log and recommendations.

Parameters
  • log (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – historical DataFrame used to calculate popularity

  • recommendations (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – model recommendations

  • k (int) – length of a recommendation list

Return type

DataFrame

Returns

DataFrame with results
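
To make the idea concrete, here is a minimal plain-Python sketch of the kind of per-item counts such a distribution is built from. It is not RePlay's implementation and does not call the library. The helper name `item_counts`, the input layouts, and the `user_count` key are assumptions made for illustration; `rec_count` is the default column name that `plot_item_dist` uses.

```python
from collections import Counter


def item_counts(log, recommendations, k):
    """Count, per item, distinct users in the log and appearances in top-k recs.

    log: iterable of (user, item) interactions (hypothetical layout).
    recommendations: dict mapping user -> ranked item list, best first
        (hypothetical layout).
    """
    # Popularity: number of distinct users who interacted with each item.
    user_count = Counter(item for _, item in set(log))
    # Exposure: how often each item appears within the first k recommendations.
    rec_count = Counter(
        item for ranked in recommendations.values() for item in ranked[:k]
    )
    items = set(user_count) | set(rec_count)
    rows = [
        {"item": i, "user_count": user_count[i], "rec_count": rec_count[i]}
        for i in items
    ]
    # Least popular items first; ties broken by item id for determinism.
    return sorted(rows, key=lambda r: (r["user_count"], r["item"]))


log = [(1, "a"), (2, "a"), (3, "a"), (1, "b"), (2, "b"), (3, "c")]
recs = {1: ["c", "a"], 2: ["c", "a"], 3: ["b", "a"]}
dist = item_counts(log, recs, k=1)
```

In this toy data the least popular item "c" is recommended most often, which is the kind of pattern the distribution is meant to expose.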

You can plot the result of item_distribution. Here is an example for the MovieLens log.

[Figure: item distribution plot for the MovieLens log (item_pop.jpg)]

replay.distributions.plot_item_dist(item_dist, palette='magma', col='rec_count')

Show the results of the item_distribution method.

Parameters
  • item_dist (DataFrame) – output of item_distribution as a pandas DataFrame

  • palette (str) – colour scheme for seaborn

  • col (str) – column to use for a plot

Returns

plot
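
The "10 popularity bins" mentioned above can be illustrated with a small plain-Python sketch. It assumes bins are formed by splitting items, ordered from least to most popular, into groups of roughly equal size; the library may bin differently. The helper name `bin_by_popularity` and the row layout are hypothetical.

```python
def bin_by_popularity(rows, col="rec_count", n_bins=10):
    """Sum `col` over n_bins groups of items ordered from least to most popular.

    rows: list of dicts with "user_count" and `col` keys (hypothetical layout).
    """
    ordered = sorted(rows, key=lambda r: r["user_count"])
    totals = [0] * n_bins
    for rank, row in enumerate(ordered):
        # Map the item's popularity rank to a bin index in [0, n_bins - 1].
        totals[rank * n_bins // len(ordered)] += row[col]
    return totals


# 20 hypothetical items: item i was seen by i users and recommended i % 3 times.
rows = [{"item": i, "user_count": i, "rec_count": i % 3} for i in range(20)]
binned = bin_by_popularity(rows)
```

Each entry of `binned` is the total of the chosen column for one popularity bin, which is the quantity a bar chart of the distribution would display.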

User Distribution

Metric.user_distribution(log, recommendations, ground_truth, k)

Get the mean value of the metric for all users with the same number of ratings.

Parameters
  • log (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – history DataFrame used to calculate the number of ratings per user

  • recommendations (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – prediction DataFrame

  • ground_truth (Union[pandas.DataFrame, pyspark.sql.DataFrame]) – test data

  • k (Union[Iterable[int], int]) – depth cut-off

Return type

DataFrame

Returns

pandas DataFrame
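
The grouping described above can be sketched in plain Python as follows. This is an illustration of the idea only, not the library's code: it takes already computed per-user metric values instead of recommendations and ground truth, and the helper name `mean_metric_by_activity` and the input layouts are hypothetical.

```python
from collections import Counter, defaultdict


def mean_metric_by_activity(log, metric_per_user):
    """Average a per-user metric over users with the same number of ratings.

    log: iterable of (user, item) interactions (hypothetical layout).
    metric_per_user: dict mapping user -> metric value at some cut-off k
        (hypothetical layout).
    """
    ratings_per_user = Counter(user for user, _ in log)
    grouped = defaultdict(list)
    for user, value in metric_per_user.items():
        grouped[ratings_per_user[user]].append(value)
    # One (number_of_ratings, mean_metric) pair per group, ordered by activity.
    return [(n, sum(vals) / len(vals)) for n, vals in sorted(grouped.items())]


log = [(1, "a"), (1, "b"), (2, "a"), (2, "c"), (3, "a")]
hit_rate = {1: 1.0, 2: 0.0, 3: 1.0}
user_dist = mean_metric_by_activity(log, hit_rate)
```

Here users 1 and 2 both have two ratings, so their metric values are averaged into a single point, while user 3 forms a group of their own.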

If you plot the result of user_distribution, you get something like:

[Figure: mean metric value by number of user ratings (user_dist.jpg)]

replay.distributions.plot_user_dist(user_dist, window=1, title='')

Plot mean metric value by the number of user ratings

Parameters
  • user_dist (DataFrame) – output of user_distribution method for a metric

  • window (int) – the number of closest values to average for smoothing

  • title (str) – plot title

Returns

plot object
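
One way to read the `window` parameter is as a moving average over neighbouring points. The sketch below uses a trailing window, which is an assumption and may differ from how the library picks the "closest" values; the helper name `smooth` is hypothetical.

```python
def smooth(values, window=1):
    """Trailing moving average; window=1 returns the values unchanged."""
    if window < 1:
        raise ValueError("window must be a positive integer")
    smoothed = []
    for i in range(len(values)):
        # Average the current value with up to window - 1 preceding values.
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


raw = [1.0, 0.0, 1.0, 0.0]
unsmoothed = smooth(raw, window=1)
smoothed = smooth(raw, window=2)
```

A larger window trades detail for a steadier curve, which helps when few users share the same number of ratings and the per-group means are noisy.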