Distributions
Item Distribution
Calculates item popularity in recommendations using 10 popularity bins.
- replay.distributions.item_distribution(log, recommendations, k)
Calculate item distribution in log and recommendations.
- Parameters
log (Union[DataFrame, DataFrame]) – historical DataFrame used to calculate popularity
recommendations (Union[DataFrame, DataFrame]) – model recommendations
k (int) – length of a recommendation list
- Return type
DataFrame
- Returns
DataFrame with results
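To make the idea concrete, here is a minimal pandas sketch of the kind of per-item counting described above. It is an illustration, not the library's implementation: the function name `item_distribution_sketch` and the column names `user_id`, `item_id` and `relevance` are assumptions chosen for the example, while `user_count` and `rec_count` follow the `col='rec_count'` default of the plotting function.

```python
import pandas as pd


def item_distribution_sketch(log, recommendations, k):
    """Count distinct users per item in the log and in the top-k recommendations.

    Column names (user_id, item_id, relevance) are assumed for this sketch.
    """
    # Popularity: number of distinct users who interacted with each item.
    user_count = (
        log.groupby("item_id")["user_id"].nunique().reset_index(name="user_count")
    )

    # Keep only the k highest-scored recommendations for every user.
    top_k = (
        recommendations.sort_values("relevance", ascending=False)
        .groupby("user_id")
        .head(k)
    )
    rec_count = (
        top_k.groupby("item_id")["user_id"].nunique().reset_index(name="rec_count")
    )

    # Outer join so that items missing on either side get a zero count.
    result = user_count.merge(rec_count, on="item_id", how="outer").fillna(0)
    result = result.astype({"user_count": int, "rec_count": int})
    return result.sort_values(["user_count", "item_id"]).reset_index(drop=True)


log = pd.DataFrame(
    {"user_id": [1, 1, 2, 2, 3, 3], "item_id": [10, 20, 10, 30, 10, 20]}
)
recs = pd.DataFrame(
    {
        "user_id": [1, 1, 2, 2, 3, 3],
        "item_id": [30, 40, 20, 40, 30, 10],
        "relevance": [0.9, 0.1, 0.8, 0.2, 0.7, 0.3],
    }
)
dist = item_distribution_sketch(log, recs, k=1)
print(dist)
```

In this toy data the least popular item (30) is recommended most often, which is the kind of skew the distribution is meant to reveal.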
You can plot the result of item_distribution. Here is an example for the MovieLens log.
- replay.distributions.plot_item_dist(item_dist, palette='magma', col='rec_count')
Show the results of the item_distribution method.
- Parameters
item_dist (DataFrame) – pd.DataFrame
palette (str) – colour scheme for seaborn
col (str) – column to use for a plot
- Returns
plot
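The section mentions 10 popularity bins without spelling out how they are formed. One plausible reading, sketched below, is to order items by popularity, split them into ten equal-sized groups, and sum the chosen column within each group. This is an assumption for illustration only: the helper `bin_item_distribution` is hypothetical, and the `user_count` column name is not confirmed by this page.

```python
import numpy as np
import pandas as pd


def bin_item_distribution(item_dist, col="rec_count", n_bins=10):
    """Sum `col` over n_bins equal-sized groups of items ordered by popularity."""
    # Least popular items first; the column name user_count is an assumption.
    ordered = item_dist.sort_values("user_count").reset_index(drop=True)
    # Map row position 0..n-1 to a bin label 0..n_bins-1.
    bins = np.arange(len(ordered)) * n_bins // len(ordered)
    return ordered.groupby(bins)[col].sum()


# Toy input: 20 items with strictly increasing popularity.
item_dist = pd.DataFrame(
    {
        "item_id": range(20),
        "user_count": range(1, 21),
        "rec_count": range(20),
    }
)
binned = bin_item_distribution(item_dist)
print(binned)
```

A Series like `binned` could then be handed to any bar-plotting routine, with one bar per popularity bin.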
User Distribution
- Metric.user_distribution(log, recommendations, ground_truth, k)
Get mean value of metric for all users with the same number of ratings.
- Parameters
log (Union[DataFrame, DataFrame]) – history DataFrame to calculate number of ratings per user
recommendations (Union[DataFrame, DataFrame]) – prediction DataFrame
ground_truth (Union[DataFrame, DataFrame]) – test data
k (Union[Iterable[int], int]) – depth cut-off
- Return type
DataFrame
- Returns
pandas DataFrame
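The grouping step described above can be sketched in pandas as follows. This is a simplified illustration under stated assumptions: it starts from metric values that have already been computed per user (the real method takes recommendations and ground truth instead), and the function name `user_distribution_sketch` and the column names `user_id`, `count` and `value` are hypothetical.

```python
import pandas as pd


def user_distribution_sketch(log, user_metric):
    """Average a per-user metric over users with the same number of ratings.

    `user_metric` is assumed to hold one row per user with columns
    user_id and value.
    """
    # Number of ratings each user has in the history.
    counts = log.groupby("user_id").size().reset_index(name="count")
    # Users without a metric value are dropped by the inner join.
    merged = counts.merge(user_metric, on="user_id", how="inner")
    # Mean metric value for every distinct rating count.
    return merged.groupby("count", as_index=False)["value"].mean()


log = pd.DataFrame({"user_id": [1, 1, 2, 2, 3], "item_id": [10, 20, 10, 30, 20]})
user_metric = pd.DataFrame({"user_id": [1, 2, 3], "value": [1.0, 0.5, 0.2]})
user_dist = user_distribution_sketch(log, user_metric)
print(user_dist)
```

Here users 1 and 2 both have two ratings, so their metric values are averaged into a single row, while user 3 forms a row of their own.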
The result of user_distribution can be plotted with plot_user_dist:
- replay.distributions.plot_user_dist(user_dist, window=1, title='')
Plot mean metric value by the number of user ratings
- Parameters
user_dist (DataFrame) – output of user_distribution method for a metric
window (int) – the number of closest values to average for smoothing
title (str) – plot title
- Returns
plot object
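The effect of the window parameter can be illustrated with a rolling mean. The sketch below is one possible interpretation, not the library's code: this page does not say whether the window is centred or how the edges are treated, so the centred window and `min_periods=1` are assumptions, as are the helper name `smooth` and the `count` and `value` column names.

```python
import pandas as pd


def smooth(user_dist, window=1):
    """Average each value with its neighbours inside a centred window."""
    # min_periods=1 keeps the edge points, averaging over fewer neighbours.
    return user_dist["value"].rolling(window, center=True, min_periods=1).mean()


user_dist = pd.DataFrame(
    {"count": [1, 2, 3, 4, 5], "value": [0.1, 0.5, 0.3, 0.7, 0.4]}
)
print(smooth(user_dist, window=1))  # window=1 leaves the values unchanged
print(smooth(user_dist, window=3))  # each point averaged with both neighbours
```

Larger windows trade detail for a smoother curve, which helps when only a few users share a given number of ratings.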