Scenarios

Scenarios are pipelines that chain several models and processing steps to produce recommendations

Fallback

class replay.scenarios.Fallback(main_model, fallback_model=<replay.models.pop_rec.PopRec object>, threshold=0)

Fill missing recommendations using a fallback model. Behaves like a recommender and has the same interface.

__init__(main_model, fallback_model=<replay.models.pop_rec.PopRec object>, threshold=0)

Create recommendations with main_model and fill the missing ones with fallback_model. The relevance of fallback_model recommendations is decreased to keep main recommendations on top.

Parameters
  • main_model (BaseRecommender) – initialized model

  • fallback_model (BaseRecommender) – initialized model

  • threshold (int) – number of interactions by which users are divided into cold and hot
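The filling logic can be sketched in plain Python (a simplified illustration under assumed data shapes, not the library's actual implementation; fill_with_fallback is a hypothetical helper): fallback relevance is shifted below the main model's minimum so that main recommendations always stay on top.

```python
def fill_with_fallback(main_recs, fallback_recs, k):
    """main_recs / fallback_recs: dicts user -> list of (item, relevance),
    each sorted by relevance in descending order."""
    result = {}
    for user in set(main_recs) | set(fallback_recs):
        main = main_recs.get(user, [])
        fallback = fallback_recs.get(user, [])
        # shift fallback relevance strictly below the worst main relevance
        if main and fallback:
            shift = min(r for _, r in main) - max(r for _, r in fallback) - 1e-9
        else:
            shift = 0.0
        seen = {item for item, _ in main}
        filled = list(main)
        for item, rel in fallback:
            if len(filled) >= k:
                break
            if item not in seen:  # only fill slots the main model left empty
                filled.append((item, rel + shift))
        result[user] = filled[:k]
    return result
```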

optimize(train, test, user_features=None, item_features=None, param_borders=None, criterion=<replay.metrics.ndcg.NDCG object>, k=10, budget=10, new_study=True)

Searches best parameters with optuna.

Parameters
  • train (Union[DataFrame, DataFrame]) – train data

  • test (Union[DataFrame, DataFrame]) – test data

  • user_features (Union[DataFrame, DataFrame, None]) – user features

  • item_features (Union[DataFrame, DataFrame, None]) – item features

  • param_borders (Optional[Dict[str, Dict[str, List[Any]]]]) – a dictionary with keys main and fallback containing dictionaries with search grid, where key is the parameter name and value is the range of possible values {param: [low, high]}.

  • criterion (Metric) – metric to use for optimization

  • k (int) – recommendation list length

  • budget (int) – number of points to try

  • new_study (bool) – if True, start a new optuna study; otherwise keep searching with the previous one

Return type

Tuple[Dict[str, Any]]

Returns

tuple of dictionaries with the best parameters for the main and fallback models
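A hedged example of the param_borders structure described above (the parameter names rank and alpha are placeholders; substitute the hyperparameters of your actual main and fallback models):

```python
# Search grids for Fallback.optimize: keys "main" and "fallback" map
# parameter names to [low, high] ranges. Names below are illustrative only.
param_borders = {
    "main": {"rank": [10, 200]},        # grid for the main model
    "fallback": {"alpha": [0.1, 100]},  # grid for the fallback model
}
```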

Two Stage Scenario

class replay.scenarios.TwoStagesScenario(train_splitter=<replay.splitters.user_log_splitter.UserSplitter object>, first_level_models=<replay.models.als.ALSWrap object>, fallback_model=<replay.models.pop_rec.PopRec object>, use_first_level_models_feat=False, second_model_params=None, second_model_config_path=None, num_negatives=100, negatives_type='first_level', use_generated_features=False, user_cat_features_list=None, item_cat_features_list=None, custom_features_processor=None, seed=123)

train:

  1. take the input log and split it into first_level_train and second_level_train; the default splitter splits each user’s data 50/50

  2. train first_level_models on first_level_train

  3. create negative examples to train second stage model using one of:

    • wrong recommendations from first stage

    • random examples

    use num_negatives to specify number of negatives per user

  4. augment the dataset with features:

    • get first-level recommendations for positive examples from second_level_train and for the generated negative examples

    • add user and item features

    • generate statistical and pair features

  5. train TabularAutoML from LightAutoML

inference:

  1. take log

  2. generate candidates; their number can be specified with num_candidates

  3. add features as in train

  4. get recommendations
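The negative-example step of training can be sketched as follows (a simplified, hypothetical illustration; sample_negatives is not a library function): with negatives_type='first_level', items recommended by the first stage but absent from second_level_train act as negatives; with 'random', negatives are drawn from all items the user has not interacted with.

```python
import random

def sample_negatives(first_level_recs, positives, all_items,
                     num_negatives, negatives_type, seed=123):
    """first_level_recs: items recommended to a user by the first stage;
    positives: items the user actually interacted with in second_level_train."""
    rng = random.Random(seed)
    if negatives_type == "first_level":
        # recommended but not interacted with -> "wrong" recommendations
        candidates = [i for i in first_level_recs if i not in positives]
    elif negatives_type == "random":
        candidates = [i for i in all_items if i not in positives]
    else:
        raise ValueError("negatives_type must be 'first_level' or 'random'")
    return rng.sample(candidates, min(num_negatives, len(candidates)))
```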

__init__(train_splitter=<replay.splitters.user_log_splitter.UserSplitter object>, first_level_models=<replay.models.als.ALSWrap object>, fallback_model=<replay.models.pop_rec.PopRec object>, use_first_level_models_feat=False, second_model_params=None, second_model_config_path=None, num_negatives=100, negatives_type='first_level', use_generated_features=False, user_cat_features_list=None, item_cat_features_list=None, custom_features_processor=None, seed=123)
Parameters
  • train_splitter (Splitter) – splitter to get first_level_train and second_level_train. Default is random 50% split.

  • first_level_models (Union[List[BaseRecommender], BaseRecommender]) – model or a list of models

  • fallback_model (Optional[BaseRecommender]) – model used to fill missing recommendations of the first-level models

  • use_first_level_models_feat (Union[List[bool], bool]) – flag or a list of flags to use features created by first level models

  • second_model_params (Union[Dict, str, None]) – TabularAutoML parameters

  • second_model_config_path (Optional[str]) – path to config file for TabularAutoML

  • num_negatives (int) – number of negative examples used during train

  • negatives_type (str) – negative example generation strategy, ``random`` or ``first_level`` (most relevant examples from the first level)

  • use_generated_features (bool) – flag to use generated features to train second level

  • user_cat_features_list (Optional[List]) – list of user categorical features

  • item_cat_features_list (Optional[List]) – list of item categorical features

  • custom_features_processor (Optional[HistoryBasedFeaturesProcessor]) – you can pass a custom feature processor

  • seed (int) – random seed
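How a single use_first_level_models_feat flag relates to a list of first-level models can be sketched like this (an assumption about the pairing behavior, not library code; normalize_feat_flags is a hypothetical helper): a bare bool applies to every model, while a list must supply one flag per model.

```python
def normalize_feat_flags(first_level_models, use_first_level_models_feat):
    """Pair each first-level model with its use-features flag."""
    models = (first_level_models if isinstance(first_level_models, list)
              else [first_level_models])
    flags = use_first_level_models_feat
    if isinstance(flags, bool):
        flags = [flags] * len(models)  # broadcast a single flag to all models
    if len(flags) != len(models):
        raise ValueError("one flag per first-level model is required")
    return list(zip(models, flags))
```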

fit(log, user_features=None, item_features=None)

Fit a recommendation model

Parameters
  • log (DataFrame) – historical log of interactions [user_idx, item_idx, timestamp, relevance]

  • user_features (Optional[DataFrame]) – user features [user_idx, timestamp] + feature columns

  • item_features (Optional[DataFrame]) – item features [item_idx, timestamp] + feature columns

Return type

None

optimize(train, test, user_features=None, item_features=None, param_borders=None, criterion=<replay.metrics.precision.Precision object>, k=10, budget=10, new_study=True)

Optimize first level models with optuna.

Parameters
  • train (Union[DataFrame, DataFrame]) – train DataFrame [user_id, item_id, timestamp, relevance]

  • test (Union[DataFrame, DataFrame]) – test DataFrame [user_id, item_id, timestamp, relevance]

  • user_features (Union[DataFrame, DataFrame, None]) – user features [user_id, timestamp] + feature columns

  • item_features (Union[DataFrame, DataFrame, None]) – item features [item_id] + feature columns

  • param_borders (Optional[List[Dict[str, List[Any]]]]) – list with param grids for first level models and a fallback model. Empty dict skips optimization for that model. Param grid is a dict {param: [low, high]}.

  • criterion (Metric) – metric to optimize

  • k (int) – length of a recommendation list

  • budget (int) – number of points to train each model

  • new_study (bool) – if True, start a new optuna study; otherwise keep searching with the previous one

Return type

Tuple[List[Dict[str, Any]], Optional[Dict[str, Any]]]

Returns

list of dicts of parameters
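For example, a param_borders list for two first-level models plus a fallback model might look like this (parameter names are placeholders, and placing the fallback grid last is an assumed reading of the parameter description above):

```python
# One grid per first-level model, then one for the fallback model.
# An empty dict skips optimization for the corresponding model.
param_borders = [
    {"rank": [8, 128]},  # grid for the first first-level model
    {},                  # skip optimization for the second first-level model
    {},                  # skip optimization for the fallback model
]
```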

predict(log, k, users=None, items=None, user_features=None, item_features=None, filter_seen_items=True)

Get recommendations

Parameters
  • log (DataFrame) – historical log of interactions [user_idx, item_idx, timestamp, relevance]

  • k (int) – number of recommendations for each user

  • users (Union[DataFrame, Iterable, None]) – users to create recommendations for: a dataframe containing [user_idx] or an array-like; if None, recommend to all users from log

  • items (Union[DataFrame, Iterable, None]) – candidate items for recommendations: a dataframe containing [item_idx] or an array-like; if None, take all items from log. If it contains new items, their relevance will be 0.

  • user_features (Optional[DataFrame]) – user features [user_idx, timestamp] + feature columns

  • item_features (Optional[DataFrame]) – item features [item_idx, timestamp] + feature columns

  • filter_seen_items (bool) – flag to remove seen items from recommendations based on log.

Return type

DataFrame

Returns

recommendation dataframe [user_idx, item_idx, relevance]