matrice.experiment module#

Module for interacting with the backend API to manage experiments.

class matrice.experiment.Experiment(session, experiment_id='', experiment_name='')[source]#

Bases: object

A class to manage experiment-related operations within a project.

Initialize a new experiment instance.

Parameters:
  • session (Session) – The session object that manages the connection to the server.

  • experiment_id (str, optional) – The ID of the experiment (default is an empty string).

  • experiment_name (str, optional) – The name of the experiment (default is an empty string).

Example

>>> session = Session(account_number="account_number")
>>> experiment = Experiment(session=session, experiment_id="exp123", experiment_name="My Experiment")
__init__(session, experiment_id='', experiment_name='')[source]#

Initialize the object with session details and experiment-specific information.

This constructor sets up the required project and session details, initializes models for training, and fetches experiment details if available.

If an experiment_id or experiment_name is provided, the experiment data is fetched from the server using the get_details method, and attributes such as dataset information, primary metric, and model inputs/outputs are set from the response.

Parameters:
  • session (object) – The session object containing project ID, project name, and the RPC client.

  • experiment_id (str, optional) – The ID of the experiment to fetch details for. Defaults to an empty string, in which case the lookup falls back to experiment_name.

  • experiment_name (str, optional) – The name of the experiment to fetch details for. Used for the lookup only when experiment_id is not given. Defaults to an empty string.

project_id#

The project ID associated with the current session.

Type:

str

project_name#

The project name associated with the current session.

Type:

str

session#

The session object used to make API calls.

Type:

object

rpc#

The RPC client for making HTTP requests.

Type:

object

models_for_training#

A list to store models that are initialized for training.

Type:

list

experiment_id#

The ID of the experiment. Set based on the provided or fetched experiment data.

Type:

str

experiment_name#

The name of the experiment. Set based on the provided or fetched experiment data.

Type:

str

experiment_data#

The full data of the experiment as fetched from the API.

Type:

dict

dataset_id#

The ID of the dataset associated with the experiment.

Type:

str

dataset_name#

The name of the dataset associated with the experiment.

Type:

str

dataset_version#

The version of the dataset used in the experiment.

Type:

str

primary_metric#

The primary metric used to evaluate the model’s performance in the experiment.

Type:

str

model_inputs#

A list of inputs used by the model in the experiment.

Type:

list

model_outputs#

A list of outputs generated by the model in the experiment.

Type:

list

target_runtime#

The runtime environment for the model in the experiment.

Type:

str

Return type:

None

Example

>>> session = Session(account_number="account_number")
>>> exp = Experiment(session, experiment_id="exp123", experiment_name="My Experiment")
>>> print(exp.experiment_id)  # Output: exp123
>>> print(exp.dataset_name)  # Output: Sample Dataset

Notes

If there is an error fetching the experiment details, a message will be printed to the console.
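
Because a failed lookup is reported on the console rather than raised, calling code may want to confirm that details were actually loaded. The following is a minimal sketch, assuming (this reference does not confirm it) that experiment_data stays empty or unset when the fetch fails. has_details and the SimpleNamespace stand-ins are hypothetical and are not part of matrice.

```python
from types import SimpleNamespace


def has_details(experiment):
    """Return True if the object appears to hold fetched experiment data.

    Assumption: experiment_data is missing, None, or empty after a failed fetch.
    """
    return bool(getattr(experiment, "experiment_data", None))


# Stand-ins for Experiment instances; no server connection is involved.
loaded = SimpleNamespace(experiment_data={"name": "My Experiment"})
failed = SimpleNamespace(experiment_data=None)

print(has_details(loaded))
print(has_details(failed))
```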

add_models_for_training(models, model_configs, compute_alias='')[source]#

Add models to the training queue for the experiment.

This method prepares and sends model configurations to the backend for training. It supports both single model and batch model submissions.

Parameters:
  • models (ModelArch or list of ModelArch) – A single model instance or a list of model instances to be trained.

  • model_configs (dict or list of dict) – Configuration dictionary or list of dictionaries containing model settings. Each dictionary should include:

      • is_autoML (bool) – Flag for AutoML usage.

      • tuning_type (str) – Type of model tuning.

      • model_checkpoint (str) – Model checkpoint information.

      • checkpoint_type (str) – Type of checkpoint.

      • action_config (dict) – Configuration for model actions.

      • model_config (dict) – Model-specific configuration.

  • compute_alias (str, optional) – Alias for the compute resource to use for training (default is an empty string).
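
Since each configuration dictionary is expected to carry six keys, a quick local check before submission can catch omissions early. This is a sketch only: missing_config_keys is a hypothetical helper, and whether the library validates these keys itself is not stated in this reference.

```python
# Keys listed in the model_configs parameter description.
REQUIRED_CONFIG_KEYS = (
    "is_autoML",
    "tuning_type",
    "model_checkpoint",
    "checkpoint_type",
    "action_config",
    "model_config",
)


def missing_config_keys(config):
    """Return the documented keys absent from config, in documented order."""
    return [key for key in REQUIRED_CONFIG_KEYS if key not in config]


# An incomplete configuration, used only to show the check.
partial_config = {
    "is_autoML": True,
    "tuning_type": "auto",
    "model_checkpoint": "predefined",
}
print(missing_config_keys(partial_config))
```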

Returns:

A tuple containing three elements:

  • API response (dict) – The raw response from the API.

  • error_message (str or None) – Error message if an error occurred, None otherwise.

  • status_message (str) – Status message indicating success or failure.

Return type:

tuple

Notes

The method accumulates model configurations in self.models_for_training and sends them as a batch to the backend. The list is cleared after submission.
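
The accumulate, send, and clear behaviour described in this note can be pictured with a small stand-in. TrainingQueue, its send callable, and the payload dictionaries are all hypothetical; the real request format and endpoint are not shown in this reference.

```python
class TrainingQueue:
    """Hypothetical stand-in illustrating accumulate, submit, clear."""

    def __init__(self, send):
        self.models_for_training = []
        self._send = send  # callable that receives the batch payload

    def add(self, payloads):
        # Accept one payload or a list, mirroring single and batch submission.
        if not isinstance(payloads, list):
            payloads = [payloads]
        self.models_for_training.extend(payloads)

    def submit(self):
        # Send a copy so that clearing the queue does not alter what was sent.
        response = self._send(list(self.models_for_training))
        self.models_for_training.clear()
        return response


sent_batches = []


def fake_send(batch):
    """Record the batch instead of calling a backend."""
    sent_batches.append(batch)
    return {"count": len(batch)}


queue = TrainingQueue(send=fake_send)
queue.add({"model_key": "resnet50"})
queue.add([{"model_key": "model_a"}, {"model_key": "model_b"}])
response = queue.submit()
print(response, queue.models_for_training)
```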

Example

>>> model = ModelArch(session, model_key="resnet50")
>>> config = {
...     "is_autoML": True,
...     "tuning_type": "auto",
...     "model_checkpoint": "predefined",
...     "checkpoint_type": "auto",
...     "action_config": {},
...     "model_config": {}
... }
>>> resp, err, msg = experiment.add_models_for_training(model, config, "GPU-A100")
>>> if err:
...     print(f"Error: {err}")
... else:
...     print(f"Success: {msg}")
get_best_model()[source]#

Retrieve the model with the highest test score from the experiment.

Returns the best-performing model based on test score, as determined during the most recent call to list_models(). Call list_models() first to populate the best-model data.

Returns:

A tuple containing four elements:

  • best_model (Model or None) – Model instance with the highest test score.

  • best_model_test_score (float or None) – Test score of the best model.

  • error_message (str or None) – Error message if an error occurred, None otherwise.

  • status_message (str) – Status message indicating success or failure.

Return type:

tuple

Example

>>> best_model, test_score, err, msg = experiment.get_best_model()
>>> if err:
...     print(f"Error: {err}")
... elif best_model:
...     print(f"Best Model: {best_model.name}, Score: {test_score}")
... else:
...     print("No models found")
get_details()[source]#

Retrieve details of the experiment based on the experiment ID or name.

This method fetches experiment details by ID if available; otherwise, it attempts to fetch by name. Raises a ValueError if neither identifier is provided.

Returns:

A tuple containing experiment details, error message (if any), and a status message.

Return type:

tuple

Raises:

ValueError – If neither ‘experiment_id’ nor ‘experiment_name’ is provided.
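
The lookup precedence described for this method (ID first, then name, otherwise ValueError) can be sketched as follows. choose_lookup is a hypothetical function written for illustration, not the library's implementation, and the error message text is an assumption.

```python
def choose_lookup(experiment_id="", experiment_name=""):
    """Return ("id", value) or ("name", value), preferring the ID."""
    if experiment_id:
        return ("id", experiment_id)
    if experiment_name:
        return ("name", experiment_name)
    # Neither identifier was supplied, matching the documented ValueError case.
    raise ValueError("Either experiment_id or experiment_name must be provided.")


print(choose_lookup(experiment_id="exp123", experiment_name="My Experiment"))
print(choose_lookup(experiment_name="My Experiment"))
```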

Example

>>> details, err, msg = experiment.get_details()
>>> if err:
...     print(f"Error: {err}")
... else:
...     print("Experiment Details:", details)
list_models()[source]#

Fetch and list all models associated with the current experiment.

Retrieves models from the backend and updates the experiment’s best model tracking. The best model is determined by the highest test score among all models.

Returns:

A tuple containing five elements:

  • models (list) – List of Model instances containing model information.

  • status_list (list) – List of model status strings corresponding to each model.

  • response (dict) – Raw API response.

  • error_message (str or None) – Error message if an error occurred, None otherwise.

  • status_message (str) – Status message indicating success or failure.
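
Since status_list holds one status string per model, a quick tally gives an overview of the experiment. A minimal sketch using the standard library follows. count_statuses is a hypothetical helper, and the status strings are invented for illustration, because the actual status values are not listed in this reference.

```python
from collections import Counter


def count_statuses(status_list):
    """Return a dict mapping each status string to its number of occurrences."""
    return dict(Counter(status_list))


# Invented status strings standing in for a real status_list.
statuses = ["trained", "queued", "trained"]
print(count_statuses(statuses))
```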

Return type:

tuple

Notes

This method updates two instance variables:

  • self.best_model – Stores the Model instance with the highest test score.

  • self.best_model_test_score – Stores the highest test score found.
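
Selecting the entry with the highest test score, as described here, amounts to a maximum over (model, score) pairs. The sketch below is illustrative only: pick_best is hypothetical, and how the library treats missing scores or ties is not stated in this reference. In this sketch, None scores are skipped and the first of any tied entries wins.

```python
def pick_best(scored_models):
    """Return (model, score) with the highest score, or (None, None) if empty.

    scored_models: iterable of (model, test_score) pairs; None scores are skipped.
    """
    candidates = [(model, score) for model, score in scored_models if score is not None]
    if not candidates:
        return None, None
    return max(candidates, key=lambda pair: pair[1])


# Invented names and scores used only to exercise the helper.
example_scores = [("model-a", 0.80), ("model-b", 0.93), ("model-c", None)]
print(pick_best(example_scores))
```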

Example

>>> models, status_list, response, err, msg = experiment.list_models()
>>> if err:
...     print(f"Error: {err}")
... else:
...     for model, status in zip(models, status_list):
...         print(f"Model: {model.name}, Status: {status}")
refresh()[source]#

Refresh the instance by reinstantiating it with the previous values.
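
Refreshing by re-instantiation usually means re-running the constructor with the arguments the object already stores, so that any state derived at construction time is reloaded. The following sketch shows that general pattern with a hypothetical Resource class and an in-memory store. It is not the Experiment implementation.

```python
class Resource:
    """Hypothetical class showing refresh by re-initialization."""

    def __init__(self, source, resource_id):
        self.source = source
        self.resource_id = resource_id
        # Snapshot taken at construction time; it goes stale if source changes.
        self.details = dict(source[resource_id])

    def refresh(self):
        # Re-run __init__ with the stored arguments to reload derived state.
        self.__init__(self.source, self.resource_id)


store = {"exp123": {"status": "running"}}
resource = Resource(store, "exp123")
store["exp123"]["status"] = "stopped"

stale = resource.details["status"]  # still the construction-time snapshot
resource.refresh()
fresh = resource.details["status"]  # reloaded from the store
print(stale, fresh)
```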

stop_training()[source]#

Stop the training process for the experiment.

This method attempts to halt ongoing training for the experiment by calling the backend to stop further progress.

Returns:

A tuple containing three elements:

  • API response (dict) – The raw response from the API.

  • error_message (str or None) – Error message if an error occurred, None otherwise.

  • status_message (str) – A status message indicating success or failure.
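
The methods in this class report failures through an error_message element rather than by raising. Callers who prefer exceptions could wrap the result as sketched below. unwrap is a hypothetical helper, and it assumes only that the tuple ends with (error_message, status_message), as documented for stop_training(), add_models_for_training(), get_best_model(), and list_models().

```python
def unwrap(result):
    """Return the leading values of a result tuple, raising on error.

    Assumes the tuple ends with (error_message, status_message).
    """
    *values, error_message, _status_message = result
    if error_message:
        raise RuntimeError(error_message)
    # A single leading value is returned bare; several are returned as a tuple.
    return values[0] if len(values) == 1 else tuple(values)


# Tuples shaped like the documented return values; the contents are invented.
print(unwrap(({"ok": True}, None, "Training stopped")))
print(unwrap(("model-b", 0.93, None, "ok")))
```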

Return type:

tuple

Example

>>> resp, err, msg = experiment.stop_training()
>>> if err:
...     print(f"Error: {err}")
... else:
...     print(f"Training stopped: {resp}")