Profiling and Logging
The Profiler class is responsible for profiling code and logging. Its main use case is within the feature view.
How it Works
To make the Profiler work on a method, add the track decorator to it. The decorator modifies the method call so that the function and its arguments are passed to the run method of the Profiler.
from seshat.profiler import track


@track
def example_method(*args, **kwargs):
    ...
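To make the forwarding idea concrete, here is a minimal sketch of how such a decorator could hand a call over to a run method. SketchProfiler and track_sketch are hypothetical stand-ins written for this illustration; they are not the seshat implementation, whose internals are not shown in this document.

```python
import functools


class SketchProfiler:
    """Hypothetical stand-in for Profiler; not the seshat implementation."""

    calls = []  # names of the functions that were routed through run()

    @classmethod
    def run(cls, func, *args, **kwargs):
        # A real profiler would start timers and write log events here.
        cls.calls.append(func.__name__)
        return func(*args, **kwargs)


def track_sketch(func):
    """Replace func with a wrapper that forwards every call to SketchProfiler.run."""

    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        return SketchProfiler.run(func, *args, **kwargs)

    return wrapper


@track_sketch
def add(a, b):
    return a + b


result = add(2, 3)
```

The caller still sees the original return value; the only difference is that the call passes through run first, which is where timing and logging can be attached.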
Default Patching
The Profiler has a class method, track_default_methods. By default, this method tracks some important methods within the project. These include:
- The __call__ method of transformers
- The fetch, save, insert, update, and create_table methods of the Source
- The convert method of SFrame
To achieve this, it searches through all files, finds the modules and their methods, and adds the track decorator to them.
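The patching step can be sketched as wrapping selected methods on a class in place. This sketch covers only that last step, not the file and module discovery described above. patch_methods, track_sketch, and DemoSource are hypothetical names introduced for illustration; the real track_default_methods may work differently.

```python
import functools

tracked_calls = []  # qualified names of the patched methods that were called


def track_sketch(func):
    """Hypothetical tracking decorator that only records the call."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tracked_calls.append(func.__qualname__)
        return func(*args, **kwargs)

    wrapper._is_tracked = True  # marker used to avoid wrapping twice
    return wrapper


def patch_methods(cls, method_names):
    """Wrap the named plain methods defined directly on cls."""
    for name in method_names:
        original = cls.__dict__.get(name)
        # Skip names the class does not define and methods already wrapped.
        if original is None or getattr(original, "_is_tracked", False):
            continue
        setattr(cls, name, track_sketch(original))


class DemoSource:
    def fetch(self):
        return "rows"

    def save(self, rows):
        return len(rows)


# "insert" is not defined on DemoSource, so it is silently skipped.
patch_methods(DemoSource, ["fetch", "save", "insert"])
fetched = DemoSource().fetch()
```

Patching the class rather than individual instances means every existing and future instance is tracked without any change to the calling code.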
Setup Profiler
To set up the profiler, the setup method must be called. This method first ensures that the log directory exists. Then, if default patching is enabled, the patching is executed. After that, Python logging is configured, and the logger is set on the Profiler.
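The directory and logging steps of such a setup can be sketched with the standard library alone. The function name setup_logging_sketch, the logger name, the file name event.log, and the message format are assumptions made for this illustration. The format is chosen to resemble the sample event logs later in this document, and the patching step is left out.

```python
import logging
import os
import tempfile


def setup_logging_sketch(log_dir, log_level=logging.INFO, show_in_console=False):
    """Create the log directory and return a configured logger."""
    # Step 1: make sure the directory for logs exists.
    os.makedirs(log_dir, exist_ok=True)

    # Step 2: configure Python logging.
    logger = logging.getLogger("profiler_sketch")
    logger.setLevel(log_level)
    logger.handlers.clear()  # avoid duplicate handlers on repeated setup
    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")

    file_handler = logging.FileHandler(os.path.join(log_dir, "event.log"))
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)

    if show_in_console:
        console_handler = logging.StreamHandler()
        console_handler.setFormatter(formatter)
        logger.addHandler(console_handler)

    # Step 3: the caller keeps the logger, much as setup stores it on the Profiler.
    return logger


log_dir = os.path.join(tempfile.mkdtemp(), "logs")
logger = setup_logging_sketch(log_dir)
logger.info(">>> start demo")
```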
Configuration
The Profiler has a ProfileConfig dataclass. This dataclass has these fields:
log_level
log_dir
show_in_console
default_tracking
mem_profile_conf
cprofile_conf
mem_profile_conf is itself another dataclass (MemProfileConfig) that holds the memory profiling configuration. It has these fields:
log_path
enable
cprofile_conf is another field of ProfileConfig. This field is also a dataclass (CProfileConfig) with these fields:
log_path
enable
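The nesting of these three dataclasses can be sketched as follows. The field names come from the lists above, but the types and default values are assumptions, and the classes carry a Sketch suffix to make clear that they are not the definitions shipped in seshat.profiler.

```python
import logging
from dataclasses import dataclass, field


@dataclass
class MemProfileConfigSketch:
    log_path: str = "memory.txt"  # assumed default
    enable: bool = False          # assumed default


@dataclass
class CProfileConfigSketch:
    log_path: str = "cprofile.txt"  # assumed default
    enable: bool = False            # assumed default


@dataclass
class ProfileConfigSketch:
    log_level: int = logging.INFO
    log_dir: str = "./logs"
    show_in_console: bool = False
    default_tracking: bool = True
    # default_factory gives every config its own nested instance.
    mem_profile_conf: MemProfileConfigSketch = field(
        default_factory=MemProfileConfigSketch
    )
    cprofile_conf: CProfileConfigSketch = field(
        default_factory=CProfileConfigSketch
    )


# Enable only cProfile logging and keep every other assumed default.
config = ProfileConfigSketch(cprofile_conf=CProfileConfigSketch(enable=True))
```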
Event Logging
Setting up the profiler enables event logging for each tracked method. These logs contain the following information:
- Alert when the method starts
- Alert when the method finishes
- Time spent and memory changes
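A wrapper that emits these three pieces of information might look like the sketch below. It measures memory with the standard tracemalloc module, which counts Python allocations only; how the Profiler itself measures memory is not stated here, so treat that choice as an assumption. log_events and ListHandler are hypothetical names.

```python
import functools
import logging
import time
import tracemalloc


class ListHandler(logging.Handler):
    """Collects messages in memory so the sketch is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("event_sketch")
logger.setLevel(logging.INFO)
handler = ListHandler()
logger.addHandler(handler)


def log_events(func):
    """Log a start event, then a finish event with time spent and memory change."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started_here = not tracemalloc.is_tracing()
        if started_here:
            tracemalloc.start()
        logger.info(">>> start %s", func.__qualname__)
        mem_before, _peak = tracemalloc.get_traced_memory()
        time_before = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether the method returned or raised.
            elapsed = time.perf_counter() - time_before
            mem_after, _peak = tracemalloc.get_traced_memory()
            change_mb = (mem_after - mem_before) / (1024 * 1024)
            logger.info(
                ">>> finish %s: memory change %+.6f MB, time spent %.6f seconds",
                func.__qualname__, change_mb, elapsed,
            )
            if started_here:
                tracemalloc.stop()

    return wrapper


@log_events
def build_list(n):
    return list(range(n))


data = build_list(100_000)
```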
CProfile Logging
CProfile logging shows how many times each method is executed and provides various timing information about the execution of the method.
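Python's standard cProfile and pstats modules produce this kind of report. The sketch below profiles a single call and captures the report as text. profile_call is a hypothetical helper, and how the Profiler writes its own cProfile log is not shown in this document.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return its result plus a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and print at most ten entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def square(x):
    return x * x


def sum_of_squares(n):
    return sum(square(i) for i in range(n))


result, report = profile_call(sum_of_squares, 1000)
```

In the report, the ncalls column gives the number of calls, tottime is the time spent in the function itself, and cumtime also includes the time spent in everything it called.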
Memory Profile Logging
The memory profile shows the change in memory usage for each line of the tracked methods.
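A rough per-line view of memory can be obtained with the standard tracemalloc module by comparing two snapshots grouped by line number. This is a swapped-in technique for illustration, not necessarily what the Profiler uses: it attributes memory that is still allocated after the call to the line that allocated it, and it sees Python allocations only. line_memory_diff is a hypothetical helper.

```python
import tracemalloc


def line_memory_diff(func, *args, **kwargs):
    """Return func's result and per-line allocation differences across the call."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    result = func(*args, **kwargs)  # keep the result alive for the second snapshot
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # Each entry describes one source line and how much its live memory changed.
    return result, after.compare_to(before, "lineno")


def build_rows(n):
    rows = list(range(n))
    return rows


rows, diffs = line_memory_diff(build_rows, 50_000)

# Show the three lines whose live allocations grew the most.
for diff in sorted(diffs, key=lambda d: d.size_diff, reverse=True)[:3]:
    frame = diff.traceback[0]
    print(f"{frame.filename}:{frame.lineno} {diff.size_diff / 1024:+.1f} KiB")
```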
Example Configuration
Here is an example of how to configure the profiler:
import logging

from seshat.profiler import Profiler, ProfileConfig, MemProfileConfig, CProfileConfig

# Configuration
profile_config = ProfileConfig(
    log_level=logging.INFO,
    log_dir="./logs",
    show_in_console=True,
    default_tracking=True,
    mem_profile_conf=MemProfileConfig(log_path="memory.txt", enable=True),
    cprofile_conf=CProfileConfig(log_path="cprofile.txt", enable=True),
)

# Setup
Profiler.setup(profile_config)
Example of Event Logs
When profiling is enabled and show_in_console is set to True, the event logs are printed to the console.
2024-06-18 13:01:08,283 - INFO - >>> start LocalSource.fetch:
- /path_to_seshat_sdk/seshat/sdk-seshat-python/seshat/source/local/base.py:24
2024-06-18 13:01:08,625 - INFO - >>> finish LocalSource.fetch:
- Memory Changing: +78.21875 MB
- Time Spent in method itself: 0.000024 seconds
- Cumulative Time Spent: 0.128086 seconds
- /path_to_seshat_sdk/seshat/sdk-seshat-python/seshat/source/local/base.py:24
2024-06-18 13:01:08,625 - INFO - >>> start Pipeline.__call__:
- /path_to_seshat_sdk/seshat/sdk-seshat-python/seshat/transformer/pipeline/base.py:40
2024-06-18 13:01:08,836 - INFO - >>> finish Pipeline.__call__:
- Memory Changing: +0.015625 MB
- Time Spent in method itself: 0.000002 seconds
- Cumulative Time Spent: 0.000015 seconds
- /path_to_seshat_sdk/seshat/sdk-seshat-python/seshat/transformer/pipeline/base.py:40
By understanding how to configure and use the Profiler, you can effectively monitor and optimize the performance of your code, ensuring efficient and resource-aware execution.