Distributions
Item Distribution
Calculates item popularity in recommendations using 10 popularity bins.
- replay.distributions.item_distribution(log, recommendations, k)
- Calculate item distribution in log and recommendations.
- Parameters
- log (Union[DataFrame, DataFrame]) – historical DataFrame used to calculate popularity
- recommendations (Union[DataFrame, DataFrame]) – model recommendations
- k (int) – length of a recommendation list
 
- Return type
- DataFrame
- Returns
- DataFrame with results 
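
As a rough illustration of the kind of table such a function can produce, the sketch below counts, for every item, how many distinct users interacted with it in the log and how many received it in their top-k recommendations. It is a pandas-only approximation, not the library's implementation: the function name item_distribution_sketch, the column names user_idx, item_idx, relevance and user_count, and the toy data are all assumptions made for this example. Only rec_count is taken from the default col argument of plot_item_dist.

```python
import pandas as pd


def item_distribution_sketch(log, recommendations, k):
    """Illustrative pandas-only approximation; not the library's code."""
    # Popularity in the history: distinct users who interacted with each item.
    user_count = (
        log.groupby("item_idx")["user_idx"].nunique().rename("user_count").reset_index()
    )
    # Keep the k highest-relevance recommendations for every user.
    top_k = (
        recommendations.sort_values("relevance", ascending=False)
        .groupby("user_idx")
        .head(k)
    )
    rec_count = (
        top_k.groupby("item_idx")["user_idx"].nunique().rename("rec_count").reset_index()
    )
    # Outer join so that items missing from either side get a zero count.
    result = user_count.merge(rec_count, on="item_idx", how="outer")
    counts = ["user_count", "rec_count"]
    result[counts] = result[counts].fillna(0).astype(int)
    return result.sort_values(["user_count", "item_idx"]).reset_index(drop=True)


# Toy data with assumed (hypothetical) column names.
log = pd.DataFrame({
    "user_idx": [1, 2, 3, 1, 2, 3],
    "item_idx": ["a", "a", "a", "b", "b", "c"],
})
recommendations = pd.DataFrame({
    "user_idx": [1, 1, 2, 2, 3, 3],
    "item_idx": ["a", "b", "a", "c", "a", "b"],
    "relevance": [0.9, 0.5, 0.8, 0.1, 0.7, 0.6],
})
item_dist = item_distribution_sketch(log, recommendations, k=1)
```

In this toy case every user's top recommendation is the most popular item, so all of the recommendation count lands on a single row, which is the kind of popularity skew the distribution is meant to expose.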
 
You can plot the result. Here is an example for the MovieLens log.
 
- replay.distributions.plot_item_dist(item_dist, palette='magma', col='rec_count')
- Show the results of the item_distribution method.
- Parameters
- item_dist (DataFrame) – pd.DataFrame
- palette (str) – colour scheme for seaborn
- col (str) – column to use for a plot
 
- Returns
- plot 
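
The grouping into 10 popularity bins mentioned at the top of this section can be pictured with the sketch below. It is a minimal version under stated assumptions, not the library's plotting code: the helper name bin_by_popularity, the user_count column, the equal-sized rank bins and the toy data are all hypothetical. Only the default column name rec_count comes from the signature above.

```python
import numpy as np
import pandas as pd


def bin_by_popularity(item_dist, n_bins=10, col="rec_count"):
    """Sum `col` over n_bins equal-sized groups of items ordered by popularity."""
    # Order items from least to most popular (assumed column: user_count).
    ordered = item_dist.sort_values("user_count", kind="stable").reset_index(drop=True)
    # Map each item's rank to a bin index in 0 .. n_bins - 1.
    ordered["bin"] = np.arange(len(ordered)) * n_bins // len(ordered)
    return ordered.groupby("bin")[col].sum().reset_index()


# Toy data: 20 items, only the two most popular ones are ever recommended.
item_dist = pd.DataFrame({
    "item_idx": range(20),
    "user_count": range(1, 21),
    "rec_count": [0] * 18 + [50, 50],
})
binned = bin_by_popularity(item_dist)
```

A table like binned can then be passed to any bar-plot routine, with the bin index on the horizontal axis and the summed count on the vertical axis.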
 
User Distribution
- Metric.user_distribution(log, recommendations, ground_truth, k)
- Get mean value of metric for all users with the same number of ratings.
- Parameters
- log (Union[DataFrame, DataFrame]) – history DataFrame to calculate number of ratings per user
- recommendations (Union[DataFrame, DataFrame]) – prediction DataFrame
- ground_truth (Union[DataFrame, DataFrame]) – test data
- k (Union[Iterable[int], int]) – depth cut-off
 
- Return type
- DataFrame
- Returns
- pandas DataFrame 
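
The idea of averaging a metric over users with the same number of ratings can be sketched as follows. This is an illustration only and does not call the library: it assumes the per-user metric values are already available in a DataFrame, and the function name user_distribution_sketch and the column names user_idx, count and value are hypothetical.

```python
import pandas as pd


def user_distribution_sketch(log, per_user_metric):
    """Mean metric value for each distinct number of ratings per user."""
    # Number of ratings each user has in the history.
    n_ratings = log.groupby("user_idx").size().rename("count").reset_index()
    # Attach the rating count to each user's metric value.
    merged = per_user_metric.merge(n_ratings, on="user_idx", how="inner")
    # Average the metric over users that share the same rating count.
    return merged.groupby("count", as_index=False)["value"].mean()


# Toy data: users 1 and 2 have two ratings each, user 3 has one.
log = pd.DataFrame({
    "user_idx": [1, 1, 2, 2, 3],
    "item_idx": ["a", "b", "a", "c", "a"],
})
per_user_metric = pd.DataFrame({
    "user_idx": [1, 2, 3],
    "value": [1.0, 0.5, 0.0],
})
user_dist = user_distribution_sketch(log, per_user_metric)
```

Here the two users with two ratings are averaged into one row and the single-rating user forms a row of its own.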
 
If you plot this, you get the mean metric value as a function of the number of ratings per user.
 
- replay.distributions.plot_user_dist(user_dist, window=1, title='')
- Plot mean metric value by the number of user ratings - Parameters
- user_dist (DataFrame) – output of the user_distribution method for a metric
- window (int) – the number of closest values to average for smoothing
- title (str) – plot title
 
- Returns
- plot object
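
One plausible reading of the window parameter is a centred moving average over neighbouring rows. The sketch below shows that interpretation with pandas; whether the library smooths in exactly this way is an assumption, and the helper name smooth and the column names count, value and smoothed are hypothetical.

```python
import pandas as pd


def smooth(user_dist, window=1):
    """Add a centred moving average of `value` over `window` neighbouring rows."""
    smoothed = user_dist.copy()
    # min_periods=1 lets the edges average over the neighbours that exist.
    smoothed["smoothed"] = (
        smoothed["value"].rolling(window, center=True, min_periods=1).mean()
    )
    return smoothed


# Toy data with assumed column names.
user_dist = pd.DataFrame({"count": [1, 2, 3], "value": [0.0, 3.0, 6.0]})
unsmoothed = smooth(user_dist, window=1)
averaged = smooth(user_dist, window=3)
```

With window=1 each value is averaged only with itself, so the curve is unchanged, which matches the default in the signature above.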