matrice.actionTracker module
- class matrice.actionTracker.ActionTracker(action_id=None)
Bases:
object
Tracks and manages the status, actions, and related data of a model’s lifecycle, including training, evaluation, and deployment processes.
The ActionTracker is responsible for tracking various stages of an action (e.g., model training, evaluation, or deployment), logging details, fetching configuration parameters, downloading model checkpoints, and handling error logging. It interacts with the backend system to retrieve and update action statuses.
- Parameters:
action_id (str, optional) – The unique identifier of the action to be tracked. If not provided, the class will initialize without an active action. The action_id is typically linked to specific activities such as model training, evaluation, or deployment.
- rpc
A Remote Procedure Call (RPC) client for interacting with the backend API.
- Type:
RPCClient
- action_id
The ObjectId representing the action being tracked. This is used for retrieving action details from the backend.
- Type:
bson.ObjectId
- action_id_str
The string representation of the action_id.
- Type:
str
- action_doc
The detailed document containing information about the action, including its status, type, and related model details.
- Type:
dict
- action_type
The type of action being tracked, such as ‘model_train’, ‘model_eval’, or ‘deploy_add’.
- Type:
str
- _idModel
The ObjectId of the model associated with the current action.
- Type:
bson.ObjectId
- _idModel_str
The string representation of _idModel.
- Type:
str
- session
A session object that manages the user session and ensures that API requests are authorized.
- Type:
Examples
>>> tracker = ActionTracker(action_id="60f5f5bfb5a1c2a123456789")
>>> tracker.get_job_params()
>>> tracker.update_status("training", "in_progress", "Model training started")
>>> tracker.log_epoch_results(1, [{'loss': 0.25, 'accuracy': 0.92}])
- __init__(action_id=None)
Initializes the ActionTracker instance and retrieves details related to the specified action ID.
This constructor fetches the action document, which contains metadata about the action, including the model’s ID. If no action_id is provided, the tracker is initialized without an action.
- Parameters:
action_id (str, optional) – The unique identifier of the action to track. If not provided, the instance is initialized without an action.
- Raises:
ConnectionError – If there is an error retrieving action details from the backend.
SystemExit – If there is a critical error during initialization, causing the system to terminate.
Examples
>>> tracker = ActionTracker(action_id="60f5f5bfb5a1c2a123456789")
>>> print(tracker.action_type)  # Outputs the action type, e.g., "model_train"
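Because the constructor can raise SystemExit as well as ConnectionError, a caller that needs to survive a failed initialization has to catch both. SystemExit derives from BaseException, so a plain `except Exception` does not catch it. The following is a minimal sketch of that pattern; `call_guarded` and `failing_factory` are hypothetical helpers written for illustration and are not part of matrice.

```python
def call_guarded(factory, *args, **kwargs):
    """Call factory and return (result, error) instead of letting the process exit."""
    try:
        return factory(*args, **kwargs), None
    except (ConnectionError, SystemExit) as exc:
        # SystemExit is a BaseException, so it must be named explicitly.
        return None, exc


def failing_factory(action_id=None):
    # Hypothetical stand-in for a constructor that terminates on a critical error.
    raise SystemExit(1)


result, error = call_guarded(failing_factory, action_id="60f5f5bfb5a1c2a123456789")
print(result, type(error).__name__)
```

With the real class, `factory` would presumably be `ActionTracker` itself. Whether swallowing the exit is appropriate depends on the calling application.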
- add_index_to_category(indexToCat)
Adds an index-to-category mapping to the model.
This function is used to establish a relationship between numerical indices and their corresponding categorical labels for the model. This mapping is essential for interpreting the model’s output, particularly when the model is designed to classify input data into distinct categories.
When to Use:
- This function is typically called after the model has been trained but before deploying the model for inference. It ensures that the indices output by the model during predictions can be accurately translated to human-readable category labels.
- It is also useful when there are changes in the class labels or when initializing a new model.
- Parameters:
indexToCat (dict) – A dictionary mapping integer indices to category names. For example, {0: 'cat', 1: 'dog', 2: 'bird'} indicates that index 0 corresponds to 'cat', index 1 to 'dog', and index 2 to 'bird'.
- Raises:
Exception – If an error occurs while adding the mapping; the error details are logged and the process exits.
Examples
>>> index_mapping = {0: 'cat', 1: 'dog', 2: 'bird'}
>>> tracker.add_index_to_category(index_mapping)
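To illustrate why the mapping matters at inference time, the sketch below translates predicted class indices into labels with a plain dictionary of the same shape. `decode_predictions` and its `unknown` fallback are hypothetical and are not part of the ActionTracker API.

```python
def decode_predictions(predicted_indices, index_to_category, unknown="unknown"):
    """Translate predicted class indices into category labels."""
    # Indices missing from the mapping fall back to the `unknown` label.
    return [index_to_category.get(int(i), unknown) for i in predicted_indices]


index_mapping = {0: "cat", 1: "dog", 2: "bird"}
labels = decode_predictions([2, 0, 5], index_mapping)
print(labels)  # ['bird', 'cat', 'unknown']
```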
- download_model(model_path, model_type='trained')
Downloads a model from the backend system.
- Parameters:
model_path (str) – The path to save the downloaded model. The file will be saved at this location after downloading.
model_type (str, optional) – The type of the model (“trained” or “exported”). Defaults to “trained”.
- Returns:
True if the download was successful, False otherwise. The function will log an error and exit if an exception occurs during the download process.
- Return type:
bool
Examples
>>> success = tracker.download_model("path/to/save/model.pth")
>>> if success:
...     print("Model downloaded successfully!")
... else:
...     print("Model download failed.")
- get_checkpoint_path(model_config)
Determines the checkpoint path for the model based on the configuration provided.
This function checks if the model’s checkpoint should be retrieved from a pre-trained source or a specific model ID. It also handles downloading the model if necessary.
- Parameters:
model_config (dict) – A dictionary containing the configuration parameters for the model, such as checkpoint_type and model_checkpoint.
- Returns:
A tuple containing:
- The absolute path of the model checkpoint if found.
- A boolean indicating whether the model is pre-trained.
- Return type:
tuple
- Raises:
FileNotFoundError – If the model checkpoint cannot be downloaded or located.
ConnectionError – If there is an issue communicating with the model’s API.
Examples
>>> config = {"checkpoint_type": "model_id", "model_checkpoint": "12345abcde"}
>>> checkpoint_path, is_pretrained = tracker.get_checkpoint_path(config)
>>> print(checkpoint_path, is_pretrained)
- get_index_to_category(is_exported=False)
Fetches the index-to-category mapping for the model.
This function retrieves the current mapping of indices to categories from the backend system. This is crucial for understanding the model’s predictions, as it allows users to decode the model outputs back into meaningful category labels.
When to Use:
- This function is often called before making predictions with the model to ensure that the index-to-category mapping is up to date and correctly reflects the model’s configuration.
- It can also be used after exporting a model to validate that the expected mappings are correctly stored and accessible.
- Parameters:
is_exported (bool, optional) – A flag indicating whether to fetch the mapping for an exported model. Defaults to False. If True, the mapping is retrieved based on the export ID.
- Returns:
The index-to-category mapping as a dictionary, where keys are indices and values are the corresponding category names.
- Return type:
dict
- Raises:
Exception – If an error occurs during retrieval; the error details are logged and the process exits.
Examples
>>> mapping = tracker.get_index_to_category()
>>> print(mapping)
{0: 'cat', 1: 'dog', 2: 'bird'}
>>> exported_mapping = tracker.get_index_to_category(is_exported=True)
>>> print(exported_mapping)
{0: 'cat', 1: 'dog'}
- get_job_params()
Fetches the parameters for the job associated with the current action.
This method retrieves the parameters required to perform a specific action, such as model training or evaluation. The parameters are returned as a dot-accessible dictionary (__dotdict) for convenience.
- Returns:
A dot-accessible dictionary containing the job parameters.
- Return type:
__dotdict
- Raises:
KeyError – If the job parameters cannot be found in the action document.
SystemExit – If the job parameters cannot be retrieved and the system needs to terminate.
Examples
>>> job_params = tracker.get_job_params()
>>> print(job_params.learning_rate)  # Accessing parameters using dot notation
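The library's own __dotdict implementation is not shown in this reference. As a rough illustration of what "dot-accessible dictionary" means, here is a minimal sketch of one common way to build such a type. The `DotDict` class and the parameter names `learning_rate` and `epochs` are assumptions made for the example.

```python
class DotDict(dict):
    """Dictionary whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


job_params = DotDict({"learning_rate": 0.001, "epochs": 10})
print(job_params.learning_rate, job_params["epochs"])
```

Raising AttributeError for missing keys keeps `hasattr` and `getattr` with a default working as expected.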
- log_epoch_results(epoch, epoch_result_list)
Logs the results of an epoch during model training or evaluation.
This method records various metrics (like loss and accuracy) for a specific epoch. It updates the action status and logs the results for tracking purposes.
- Parameters:
epoch (int) – The epoch number for which the results are being logged.
epoch_result_list (list of dict) – A list of dictionaries containing the metric results for the epoch.
- Return type:
None
- Raises:
ValueError – If the epoch number is invalid.
Examples
>>> tracker.log_epoch_results(1, [{'loss': 0.25, 'accuracy': 0.92}])
- round_metrics(epoch_result_list)
Rounds the metrics in the epoch results to 4 decimal places.
- Parameters:
epoch_result_list (list) – A list of result dictionaries for the epoch. Each dictionary contains:
"metricValue" (float): The value of the metric to be rounded.
- Returns:
The updated list of epoch results with rounded metrics. Each metric value is rounded to four decimal places, with special handling for invalid values (NaN or infinity).
- Return type:
list
Examples
>>> results = [{'metricValue': 0.123456}, {'metricValue': float('inf')}, {'metricValue': None}]
>>> rounded_results = tracker.round_metrics(results)
>>> print(rounded_results)
[{'metricValue': 0.1235}, {'metricValue': 0}, {'metricValue': 0.0001}]
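The rounding rule can be sketched independently of the library. The version below rounds each "metricValue" to four decimal places and replaces None, NaN, and infinity with a fallback. The function name `round_metric_values` and the fallback value of 0 are assumptions; the library's actual replacement values may differ, as the documented example output suggests.

```python
import math


def round_metric_values(epoch_result_list, ndigits=4, fallback=0):
    """Round each "metricValue" and replace unusable values with `fallback`."""
    rounded = []
    for result in epoch_result_list:
        value = result.get("metricValue")
        if not isinstance(value, (int, float)) or math.isnan(value) or math.isinf(value):
            # Assumption: None, NaN, and infinity all collapse to `fallback`.
            value = fallback
        else:
            value = round(value, ndigits)
        # Copy the dict so the caller's input is left unmodified.
        rounded.append({**result, "metricValue": value})
    return rounded


results = [{"metricValue": 0.123456}, {"metricValue": float("inf")}, {"metricValue": None}]
print(round_metric_values(results))
```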
- save_evaluation_results(list_of_result_dicts)
Saves the evaluation results for a model.
- Parameters:
list_of_result_dicts (list) – A list of dictionaries containing the evaluation results. Each dictionary should include relevant metrics and their values for the model’s performance.
- Raises:
Exception – Logs an error and exits if an exception occurs during the saving process.
Examples
>>> evaluation_results = [
...     {"metric": "accuracy", "value": 0.95},
...     {"metric": "loss", "value": 0.05},
... ]
>>> tracker.save_evaluation_results(evaluation_results)
- update_status(stepCode, status, status_description)
Updates the status of the tracked action in the backend system.
This method allows changing the action’s status, such as from “in progress” to “completed” or “error”. It logs the provided message with the updated status.
- Parameters:
stepCode (str) – The code identifying the step or action being tracked (e.g., “training”, “evaluation”).
status (str) – The new status to set for the action (e.g., “in_progress”, “completed”, “error”).
status_description (str) – A message providing context about the status update.
- Return type:
None
Examples
>>> tracker.update_status("training", "completed", "Training completed successfully")
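One way to keep status updates consistent is to wrap each step in a context manager that reports "in_progress" on entry and "completed" or "error" on exit. This is a sketch under stated assumptions: `tracked_step` and `RecordingTracker` are hypothetical, and the stand-in only mimics the `update_status` signature documented above rather than contacting a backend.

```python
from contextlib import contextmanager


@contextmanager
def tracked_step(tracker, step_code, description):
    """Report in_progress on entry, then completed or error on exit."""
    tracker.update_status(step_code, "in_progress", f"{description} started")
    try:
        yield
    except Exception as exc:
        tracker.update_status(step_code, "error", f"{description} failed: {exc}")
        raise
    else:
        tracker.update_status(step_code, "completed", f"{description} finished")


class RecordingTracker:
    """Hypothetical stand-in that records calls instead of contacting a backend."""

    def __init__(self):
        self.calls = []

    def update_status(self, stepCode, status, status_description):
        self.calls.append((stepCode, status, status_description))


tracker = RecordingTracker()
with tracked_step(tracker, "training", "Model training"):
    pass  # the training loop would run here
print([status for _, status, _ in tracker.calls])
```

The exception is re-raised after the "error" update so that the caller still sees the original failure.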
- upload_checkpoint(checkpoint_path, model_type='trained')
Uploads a model checkpoint to the backend system.
- Parameters:
checkpoint_path (str) – The file path of the checkpoint to upload. This should point to a valid model checkpoint file.
model_type (str, optional) – The type of the model (“trained” or “exported”). Defaults to “trained”, which refers to a model that has been trained but not yet exported.
- Returns:
True if the upload was successful, False otherwise. The function will log an error and exit if an exception occurs during the upload process.
- Return type:
bool
Examples
>>> success = tracker.upload_checkpoint("path/to/checkpoint.pth")
>>> if success:
...     print("Checkpoint uploaded successfully!")
... else:
...     print("Checkpoint upload failed.")
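Since the method reports failure through a boolean return value, a caller may want to retry a few times before giving up. The helper below is a generic sketch, not part of matrice: `retry_until_true` is a hypothetical name, and the example drives it with a fake operation instead of a real upload.

```python
import time


def retry_until_true(operation, attempts=3, delay_seconds=0.0):
    """Call operation() until it returns True or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if operation():
            return True
        if attempt < attempts:
            # Wait before the next try; skipped after the final attempt.
            time.sleep(delay_seconds)
    return False


# Fake operation that fails twice and then succeeds.
outcomes = iter([False, False, True])
succeeded = retry_until_true(lambda: next(outcomes), attempts=3)
print(succeeded)
```

With a real tracker, `operation` could be something like `lambda: tracker.upload_checkpoint("path/to/checkpoint.pth")`. This only covers a False return; as documented above, an exception during the upload causes the function to log an error and exit.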