Scenarios
Scenarios combine several models and steps into a single recommendation pipeline.
Fallback
- class replay.scenarios.Fallback(main_model, fallback_model=<replay.models.pop_rec.PopRec object>, threshold=0)
Fill missing recommendations using a fallback model. Behaves like a recommender and has the same interface.
- __init__(main_model, fallback_model=<replay.models.pop_rec.PopRec object>, threshold=0)
Create recommendations with main_model and fill the missing ones with fallback_model. The relevance of fallback_model is decreased to keep the main recommendations on top.
- Parameters
  - main_model (BaseRecommender) – initialized model
  - fallback_model (BaseRecommender) – initialized model
  - threshold (int) – number of interactions by which users are divided into cold and hot
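The merging rule described above can be sketched in plain Python. This is a hypothetical illustration of the idea, not RePlay's actual implementation: fallback items are appended with their relevance shifted below the main model's minimum so they never outrank main recommendations.

```python
# Sketch of the Fallback merging rule (illustrative, not RePlay source code).

def merge_with_fallback(main_recs, fallback_recs, k):
    """main_recs / fallback_recs: {item: relevance} for one user; returns top-k items."""
    merged = dict(main_recs)
    if main_recs and fallback_recs:
        # Shift fallback scores below the worst main score so main items stay on top.
        shift = min(main_recs.values()) - max(fallback_recs.values()) - 1.0
    else:
        shift = 0.0
    for item, rel in fallback_recs.items():
        merged.setdefault(item, rel + shift)
    return sorted(merged, key=merged.get, reverse=True)[:k]

main = {"a": 0.9, "b": 0.5}
fallback = {"b": 3.0, "c": 2.0, "d": 1.0}
print(merge_with_fallback(main, fallback, 4))  # ['a', 'b', 'c', 'd']
```

Note that even though the fallback model scores item "b" highly, the main model's score wins, and fallback-only items "c" and "d" only fill the remaining slots.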
- optimize(train, test, user_features=None, item_features=None, param_borders=None, criterion=<replay.metrics.ndcg.NDCG object>, k=10, budget=10, new_study=True)
Searches best parameters with optuna.
- Parameters
  - train (Union[DataFrame, DataFrame]) – train data
  - test (Union[DataFrame, DataFrame]) – test data
  - user_features (Union[DataFrame, DataFrame, None]) – user features
  - item_features (Union[DataFrame, DataFrame, None]) – item features
  - param_borders (Optional[Dict[str, Dict[str, List[Any]]]]) – a dictionary with keys main and fallback containing dictionaries with search grids, where the key is the parameter name and the value is the range of possible values {param: [low, high]}
  - criterion (Metric) – metric to use for optimization
  - k (int) – recommendation list length
  - budget (int) – number of points to try
  - new_study (bool) – keep searching with the previous study or start a new one
- Return type
  Tuple[Dict[str, Any]]
- Returns
  tuple of dictionaries with best parameters
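The shape of param_borders can be shown with a small example. The parameter names below ("rank", "alpha") are hypothetical placeholders, not actual model parameters:

```python
# Illustrative shape of the param_borders argument for Fallback.optimize:
# a dict with "main" and "fallback" keys, each mapping a (hypothetical)
# parameter name to its [low, high] search range.
param_borders = {
    "main": {"rank": [8, 128], "alpha": [0.01, 1.0]},
    "fallback": {},  # empty grid: nothing to search for this model
}

def check_borders(borders):
    """Validate that every grid entry is a [low, high] pair."""
    for model_key, grid in borders.items():
        for name, rng in grid.items():
            assert len(rng) == 2 and rng[0] <= rng[1], (model_key, name)
    return True

print(check_borders(param_borders))  # True
```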
Two Stage Scenario
- class replay.scenarios.TwoStagesScenario(train_splitter=<replay.splitters.user_log_splitter.UserSplitter object>, first_level_models=<replay.models.als.ALSWrap object>, fallback_model=<replay.models.pop_rec.PopRec object>, use_first_level_models_feat=False, second_model_params=None, second_model_config_path=None, num_negatives=100, negatives_type='first_level', use_generated_features=False, user_cat_features_list=None, item_cat_features_list=None, custom_features_processor=None, seed=123)
train:
- take input log and split it into first_level_train and second_level_train; the default splitter splits each user’s data 50/50
- train first_stage_models on first_stage_train
- create negative examples to train the second stage model using one of:
  - wrong recommendations from the first stage
  - random examples
  use num_negatives to specify the number of negatives per user
- augment the dataset with features:
  - get first level recommendations for positive examples from second_level_train and for the generated negative examples
  - add user and item features
  - generate statistical and pair features
- train TabularAutoML from LightAutoML

inference:
- take log
- generate candidates, their number can be specified with num_candidates
- add features as in train
- get recommendations
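The two-stage idea above can be sketched as a toy pipeline. This is a conceptual illustration with stand-in scoring functions, not RePlay code: a cheap first-stage model proposes candidates, and a second-stage model re-scores them to produce the final top-k.

```python
# Toy sketch of a two-stage recommender (stand-ins, not RePlay internals).
import random

def first_stage(user, items, num_candidates):
    # Stand-in for a first-level model (e.g. ALS or PopRec): cheap candidate generation.
    rng = random.Random(user)
    return rng.sample(items, num_candidates)

def second_stage_score(user, item):
    # Stand-in for the TabularAutoML reranker trained on positives and negatives.
    return (user * 31 + item * 17) % 100

def recommend(user, items, num_candidates, k):
    candidates = first_stage(user, items, num_candidates)
    ranked = sorted(candidates, key=lambda i: second_stage_score(user, i), reverse=True)
    return ranked[:k]

recs = recommend(user=1, items=list(range(1000)), num_candidates=100, k=10)
print(len(recs))  # 10
```

The point of the split is cost: the expensive second-stage model only scores num_candidates items per user instead of the full catalog.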
- __init__(train_splitter=<replay.splitters.user_log_splitter.UserSplitter object>, first_level_models=<replay.models.als.ALSWrap object>, fallback_model=<replay.models.pop_rec.PopRec object>, use_first_level_models_feat=False, second_model_params=None, second_model_config_path=None, num_negatives=100, negatives_type='first_level', use_generated_features=False, user_cat_features_list=None, item_cat_features_list=None, custom_features_processor=None, seed=123)
- Parameters
  - train_splitter (Splitter) – splitter to get first_level_train and second_level_train. Default is a random 50% split.
  - first_level_models (Union[List[BaseRecommender], BaseRecommender]) – model or a list of models
  - fallback_model (Optional[BaseRecommender]) – model used to fill missing recommendations at first level models
  - use_first_level_models_feat (Union[List[bool], bool]) – flag or a list of flags to use features created by first level models
  - second_model_params (Union[Dict, str, None]) – TabularAutoML parameters
  - second_model_config_path (Optional[str]) – path to config file for TabularAutoML
  - num_negatives (int) – number of negative examples used during train
  - negatives_type (str) – negative examples creation strategy, random or most relevant examples from first_level
  - use_generated_features (bool) – flag to use generated features to train the second level
  - user_cat_features_list (Optional[List]) – list of user categorical features
  - item_cat_features_list (Optional[List]) – list of item categorical features
  - custom_features_processor (Optional[HistoryBasedFeaturesProcessor]) – you can pass a custom feature processor
  - seed (int) – random seed
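The random negatives strategy (negatives_type="random") can be sketched as follows. This is a minimal illustration, assuming per-user interaction sets, not RePlay's actual sampler:

```python
# Sketch of random negative sampling: for each user, draw num_negatives
# items the user has not interacted with (illustrative only).
import random

def sample_negatives(user_items, all_items, num_negatives, seed=123):
    rng = random.Random(seed)
    negatives = {}
    for user, seen in user_items.items():
        pool = [i for i in all_items if i not in seen]
        negatives[user] = rng.sample(pool, min(num_negatives, len(pool)))
    return negatives

user_items = {"u1": {1, 2}, "u2": {3}}
negs = sample_negatives(user_items, all_items=range(10), num_negatives=3)
print(all(i not in user_items[u] for u, items in negs.items() for i in items))  # True
```

With negatives_type="first_level" the negatives would instead be taken from the most relevant wrong recommendations of the first-level models, which gives the reranker harder examples.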
- fit(log, user_features=None, item_features=None)
Fit a recommendation model
- Parameters
  - log (DataFrame) – historical log of interactions [user_idx, item_idx, timestamp, relevance]
  - user_features (Optional[DataFrame]) – user features [user_idx, timestamp] + feature columns
  - item_features (Optional[DataFrame]) – item features [item_idx, timestamp] + feature columns
- Return type
  None
- optimize(train, test, user_features=None, item_features=None, param_borders=None, criterion=<replay.metrics.precision.Precision object>, k=10, budget=10, new_study=True)
Optimize first level models with optuna.
- Parameters
  - train (Union[DataFrame, DataFrame]) – train DataFrame [user_id, item_id, timestamp, relevance]
  - test (Union[DataFrame, DataFrame]) – test DataFrame [user_id, item_id, timestamp, relevance]
  - user_features (Union[DataFrame, DataFrame, None]) – user features [user_id, timestamp] + feature columns
  - item_features (Union[DataFrame, DataFrame, None]) – item features [item_id] + feature columns
  - param_borders (Optional[List[Dict[str, List[Any]]]]) – list with param grids for first level models and a fallback model. An empty dict skips optimization for that model. A param grid is a dict {param: [low, high]}.
  - criterion (Metric) – metric to optimize
  - k (int) – length of a recommendation list
  - budget (int) – number of points to try for each model
  - new_study (bool) – keep searching with the previous study or start a new one
- Return type
  Tuple[List[Dict[str, Any]], Optional[Dict[str, Any]]]
- Returns
  list of dicts of parameters
- predict(log, k, users=None, items=None, user_features=None, item_features=None, filter_seen_items=True)
Get recommendations
- Parameters
  - log (DataFrame) – historical log of interactions [user_idx, item_idx, timestamp, relevance]
  - k (int) – number of recommendations for each user
  - users (Union[DataFrame, Iterable, None]) – users to create recommendations for: a dataframe containing [user_idx] or array-like; if None, recommend to all users from log
  - items (Union[DataFrame, Iterable, None]) – candidate items for recommendations: a dataframe containing [item_idx] or array-like; if None, take all items from log. If it contains new items, their relevance will be 0.
  - user_features (Optional[DataFrame]) – user features [user_idx, timestamp] + feature columns
  - item_features (Optional[DataFrame]) – item features [item_idx, timestamp] + feature columns
  - filter_seen_items (bool) – flag to remove seen items from recommendations based on log
- Return type
  DataFrame
- Returns
  recommendation dataframe [user_idx, item_idx, relevance]
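What filter_seen_items does can be shown conceptually with tuples instead of DataFrames. This is an illustrative sketch, not RePlay's implementation: any (user, item) pair already present in the log is dropped from the recommendations.

```python
# Conceptual sketch of filter_seen_items (illustrative only).
def filter_seen(recs, log):
    """recs and log are lists of (user_idx, item_idx, relevance) tuples."""
    seen = {(u, i) for u, i, *_ in log}
    return [r for r in recs if (r[0], r[1]) not in seen]

log = [(0, 10, 1.0), (0, 11, 1.0)]
recs = [(0, 10, 0.9), (0, 12, 0.8), (1, 10, 0.7)]
print(filter_seen(recs, log))  # [(0, 12, 0.8), (1, 10, 0.7)]
```

User 0 has already interacted with item 10, so that pair is removed; user 1 has not, so (1, 10) is kept.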