Settings
Spark session
This library uses session_handler.State
to provide universal access to the same session for all modules.
A default session is created automatically and can be accessed via the session
attribute:
from replay.session_handler import State
State().session
There is also a helper function that provides basic settings for creating a Spark session:
- replay.session_handler.get_spark_session(spark_memory=None, shuffle_partitions=None)
Get default SparkSession
- Parameters
  - spark_memory (Optional[int]) – GB of memory allocated for Spark; 70% of RAM by default.
  - shuffle_partitions (Optional[int]) – number of partitions for Spark; triple CPU count by default.
- Return type
  SparkSession
You can pass any Spark session to State
to make it available throughout the library:
from replay.session_handler import State, get_spark_session

session = get_spark_session(2)  # 2 GB of memory for Spark
State(session)
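Both parameters can also be set explicitly. A minimal sketch with illustrative values (4 GB of memory and 16 shuffle partitions here are examples, not recommendations):

from replay.session_handler import State, get_spark_session

# illustrative values; defaults are 70% of RAM and triple the CPU count
session = get_spark_session(spark_memory=4, shuffle_partitions=16)
State(session)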
- class replay.session_handler.State(session=None, device=None)
All modules look for the Spark session via this class. You can put your own session here.
Other parameters are stored here too, such as the default device for pytorch (CPU/CUDA).
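As a sketch of how the device parameter might be used (this assumes pytorch is installed; the device selection below is an illustrative example, not a library default):

import torch
from replay.session_handler import State

# pick the GPU if one is available, otherwise fall back to CPU (illustrative choice)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
State(device=device)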
Logging
The logger name is replay. The default level is logging.INFO.
import logging
logger = logging.getLogger("replay")
logger.setLevel(logging.DEBUG)
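To see where the log output goes, you can attach a handler; the snippet below is plain standard-library logging and assumes nothing RePlay-specific:

import logging

# send the library's log records to the console with timestamps
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s"))
logging.getLogger("replay").addHandler(handler)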