matrice.deployment module#

class matrice.deployment.Deployment(session, deployment_id=None, deployment_name=None)[source]#

Bases: object

Class to manage deployment-related operations within a project.

The Deployment class initializes with a given session and deployment details, allowing users to access and manage the deployment attributes such as status, configuration, and associated project information.

Parameters:
  • session (object) – The session object containing project and RPC information.

  • deployment_id (str, optional) – The ID of the deployment to manage. Default is None.

  • deployment_name (str, optional) – The name of the deployment. Default is None.

session#

The session object for RPC communication.

Type:

object

rpc#

The RPC interface for backend API communication.

Type:

object

project_id#

The project ID associated with the deployment.

Type:

str

deployment_id#

The unique ID of the deployment.

Type:

str

deployment_name#

Name of the deployment.

Type:

str

model_id#

ID of the model associated with the deployment.

Type:

str

user_id#

User ID of the deployment owner.

Type:

str

user_name#

Username of the deployment owner.

Type:

str

action_id#

ID of the action associated with the deployment.

Type:

str

auth_keys#

List of authorization keys for the deployment.

Type:

list

runtime_framework#

Framework used for the runtime of the model in the deployment.

Type:

str

model_input#

Input format expected by the model.

Type:

dict

model_type#

Type of model deployed (e.g., classification, detection).

Type:

str

model_output#

Output format of the deployed model.

Type:

dict

deployment_type#

Type of deployment (e.g., real-time, batch).

Type:

str

suggested_classes#

Suggested classes for classification models.

Type:

list

running_instances#

List of currently running instances.

Type:

list

auto_shutdown#

Whether the deployment has auto-shutdown enabled.

Type:

bool

auto_scale#

Whether the deployment is configured for auto-scaling.

Type:

bool

gpu_required#

Whether GPU is required for the deployment.

Type:

bool

status#

Current status of the deployment.

Type:

str

hibernation_threshold#

Threshold for auto-hibernation in minutes.

Type:

int

image_store_confidence_threshold#

Confidence threshold for storing images.

Type:

float

image_store_count_threshold#

Count threshold for storing images.

Type:

int

images_stored_count#

Number of images currently stored.

Type:

int

bucket_alias#

Alias for the storage bucket associated with the deployment.

Type:

str

credential_alias#

Alias for credentials used in the deployment.

Type:

str

created_at#

Timestamp when the deployment was created.

Type:

str

updated_at#

Timestamp when the deployment was last updated.

Type:

str

compute_alias#

Alias of the compute resource associated with the deployment.

Type:

str

is_optimized#

Indicates whether the deployment is optimized.

Type:

bool

status_cards#

List of status cards related to the deployment.

Type:

list

total_deployments#

Total number of deployments in the project.

Type:

int or None

active_deployments#

Number of active deployments in the project.

Type:

int or None

total_running_instances_count#

Total count of running instances in the project.

Type:

int or None

hibernated_deployments#

Number of hibernated deployments.

Type:

int or None

error_deployments#

Number of deployments with errors.

Type:

int or None

Example

>>> session = Session(account_number="account_number")
>>> deployment = Deployment(session=session, deployment_id="deployment_id", deployment_name="MyDeployment")
__init__(session, deployment_id=None, deployment_name=None)[source]#
create_auth_key(expiry_days)[source]#

Create a new authentication key for the deployment, valid for the specified number of days. The deployment_id and project_id must be set during initialization.

Parameters:

expiry_days (int) – The number of days before the authentication key expires.

Returns:

A tuple containing three elements:

  • dict: The API response with details of the created authentication key, including keys such as:
    • authKey (str): The newly created authentication key.

    • expiryDate (str): Expiration date of the key.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Examples

>>> from pprint import pprint
>>> auth_key, err, msg = deployment.create_auth_key(30)
>>> if err:
...     pprint(err)
... else:
...     pprint(auth_key)
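The methods on this page that return a tuple all use the same (response, error, message) shape, so the error check in the example above can be factored into a helper. The following is a minimal sketch, assuming that a truthy error element always signals failure; `unwrap` and `DeploymentError` are hypothetical names and are not part of the matrice package.

```python
class DeploymentError(RuntimeError):
    """Hypothetical exception type; not part of the matrice package."""


def unwrap(result):
    """Return the response from a (response, error, message) tuple.

    Raises DeploymentError when the error element is truthy.
    """
    response, error, message = result
    if error:
        raise DeploymentError(f"{message}: {error}")
    return response


# Stand-in tuple shaped like the documented return value (values are made up).
ok = ({"authKey": "example-key", "expiryDate": "2024-03-01"}, None, "Success")
print(unwrap(ok)["authKey"])
```

With a real instance, the call would look like `unwrap(deployment.create_auth_key(30))`.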
create_dataset(dataset_name, is_unlabeled, source, source_url, is_public, dataset_description='', version_description='')[source]#

Create a new dataset from a deployment. Only zip files are supported for upload, and the deployment ID must be set for this operation.

Parameters:
  • dataset_name (str) – The name of the new dataset.

  • is_unlabeled (bool) – Indicates whether the dataset is unlabeled.

  • source (str) – The source of the dataset (e.g., “aws”).

  • source_url (str) – The URL of the dataset to be created.

  • is_public (bool) – Specifies if the dataset is public.

  • dataset_description (str, optional) – A description for the dataset. Default is an empty string.

  • version_description (str, optional) – A description for this version of the dataset. Default is an empty string.

Returns:

A tuple containing three elements:

  • dict: The API response with details of the dataset creation, structured as:
    • datasetId (str): ID of the created dataset.

    • status (str): Status of the dataset creation request.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Example

>>> from pprint import pprint
>>> resp, err, msg = deployment.create_dataset(
...     dataset_name="New Dataset",
...     is_unlabeled=False,
...     source="aws",
...     source_url="https://example.com/dataset.zip",
...     is_public=True,
...     dataset_description="Dataset description",
...     version_description="Version description"
... )
>>> if err:
...     pprint(err)
... else:
...     pprint(resp)
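Because only zip files are supported, it can help to check source_url before calling create_dataset. The sketch below inspects only the URL path; how the backend itself validates the file is not documented here, so treat this as a client-side convenience. `looks_like_zip` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def looks_like_zip(source_url):
    """Return True if the URL path ends in .zip (case-insensitive).

    Query strings and fragments are ignored, so signed URLs still pass.
    """
    path = urlparse(source_url).path
    return path.lower().endswith(".zip")


print(looks_like_zip("https://example.com/dataset.zip"))
print(looks_like_zip("https://example.com/dataset.tar.gz"))
```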
delete()[source]#

Delete the specified deployment.

This method deletes the deployment identified by deployment_id from the backend system.

Returns:

A tuple containing three elements:

  • dict: The API response confirming the deletion.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Raises:

SystemExit – If deployment_id is not set.

Examples

>>> from pprint import pprint
>>> delete, err, msg = deployment.delete()
>>> if err:
...     pprint(err)
... else:
...     pprint(delete)
delete_auth_key(auth_key)[source]#

Delete a specified authentication key for the current deployment. The deployment_id must be set during initialization.

Parameters:

auth_key (str) – The authentication key to be deleted.

Returns:

A tuple containing three elements:

  • dict: The API response indicating the result of the delete operation.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Raises:

SystemExit – If deployment_id is not set.

Examples

>>> from pprint import pprint
>>> delete_auth_key, err, msg = deployment.delete_auth_key("abcd1234")
>>> if err:
...     pprint(err)
... else:
...     pprint(delete_auth_key)
get_deployment_server(model_train_id, model_type)[source]#

Fetch information about the deployment server for a specific model.

Parameters:
  • model_train_id (str) – The ID of the model training instance.

  • model_type (str) – The type of model (e.g., ‘trained’, ‘exported’).

Returns:

A tuple containing three elements:

  • dict: The API response with details of the deployment server.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Examples

>>> from pprint import pprint
>>> deployment_server, err, msg = deployment.get_deployment_server("train123", "trained")
>>> if err:
...     pprint(err)
... else:
...     pprint(deployment_server)
get_prediction(image_path, auth_key)[source]#

Fetch model predictions for a given image using a deployment.

This method sends an image to the deployment for prediction. Either deployment_id or deployment_name must be provided in the instance to locate the deployment.

Parameters:
  • image_path (str) – The path to the image for prediction.

  • auth_key (str) – The authentication key required for authorizing the prediction request.

Returns:

A tuple containing three elements:

  • dict: The API response with the prediction results, structured as:
    • predictions (list of dict): Each entry contains:
      • class (str): The predicted class label.

      • confidence (float): Confidence score of the prediction.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the prediction request.

Return type:

tuple

Raises:

ValueError – If auth_key is not provided or if neither deployment_id nor deployment_name is set.

Examples

>>> from pprint import pprint
>>> result, error, message = deployment.get_prediction(
...     image_path="/path/to/image.jpg",
...     auth_key="auth123"
... )
>>> if error:
...     pprint(error)
... else:
...     pprint(result)
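Given the response structure documented above, selecting the most confident prediction takes only a few lines. This is a sketch that assumes the response matches that structure exactly (a `predictions` list whose entries carry `class` and `confidence`); `top_prediction` is a hypothetical helper, and the sample data is made up.

```python
def top_prediction(response):
    """Return the highest-confidence entry from response["predictions"].

    Returns None when the response is empty or has no predictions.
    """
    predictions = (response or {}).get("predictions") or []
    if not predictions:
        return None
    return max(predictions, key=lambda p: p["confidence"])


# Made-up sample shaped like the documented response.
sample = {
    "predictions": [
        {"class": "cat", "confidence": 0.91},
        {"class": "dog", "confidence": 0.07},
    ]
}
best = top_prediction(sample)
print(best["class"], best["confidence"])
```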
refresh()[source]#

Refresh the instance by reinstantiating it with the previous values.

rename(updated_name)[source]#

Update the deployment name for the current deployment.

Parameters:

updated_name (str) – The new name for the deployment.

Returns:

A tuple containing three elements:

  • dict: The API response with details of the rename operation.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Raises:

SystemExit – If deployment_id is not set.

Examples

>>> from pprint import pprint
>>> deployment = Deployment(session, deployment_id="1234")
>>> rename, err, msg = deployment.rename("NewDeploymentName")
>>> if err:
...     pprint(err)
... else:
...     pprint(rename)
request_count_monitor(start_date, end_date, granularity='second')[source]#

Monitor the count of requests within a specified time range and granularity for the current deployment.

Parameters:
  • start_date (str) – The start date of the monitoring period in ISO format (e.g., “YYYY-MM-DDTHH:MM:SSZ”).

  • end_date (str) – The end date of the monitoring period in ISO format.

  • granularity (str, optional) – The time granularity for the request count (e.g., “second”, “minute”). Default is “second”.

Returns:

A tuple containing three elements:

  • dict: The API response with the request count details, structured as:
    • counts (list of dict): Each entry contains:
      • timestamp (str): The timestamp of the request count.

      • count (int): The number of requests at that timestamp.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Examples

>>> from pprint import pprint
>>> start = "2024-01-28T18:30:00.000Z"
>>> end = "2024-02-29T10:11:27.000Z"
>>> count_monitor, error, message = deployment.request_count_monitor(start, end)
>>> if error:
...     pprint(error)
... else:
...     pprint(count_monitor)
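The start_date and end_date strings can be generated from datetime objects rather than typed by hand. The sketch below mirrors the millisecond format used in the example above; whether the backend requires the milliseconds is not stated on this page, so that detail is an assumption. `to_iso_z` and `last_hours` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def to_iso_z(dt):
    """Format a timezone-aware datetime as UTC, e.g. 2024-01-28T18:30:00.000Z."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = dt.astimezone(timezone.utc)
    # strftime has no millisecond directive, so append them manually.
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def last_hours(hours, now=None):
    """Return (start, end) strings covering the most recent `hours` hours."""
    end = now or datetime.now(timezone.utc)
    return to_iso_z(end - timedelta(hours=hours)), to_iso_z(end)


start, end = last_hours(24)
print(start, end)
```

The resulting pair can be passed as `deployment.request_count_monitor(start, end)`.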
request_latency_monitor(start_date, end_date, granularity='second')[source]#

Monitor the request latency within a specified time range and granularity for the current deployment.

Parameters:
  • start_date (str) – The start date of the monitoring period in ISO format (e.g., “YYYY-MM-DDTHH:MM:SSZ”).

  • end_date (str) – The end date of the monitoring period in ISO format.

  • granularity (str, optional) – The time granularity for latency tracking (e.g., “second”, “minute”). Default is “second”.

Returns:

A tuple containing three elements:

  • dict: The API response with latency details, structured as:
    • latencies (list of dict): Each entry contains:
      • timestamp (str): The timestamp of the latency record.

      • avg_latency (float): The average latency in seconds for the requests at that timestamp.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Examples

>>> from pprint import pprint
>>> start = "2024-01-28T18:30:00.000Z"
>>> end = "2024-02-29T10:11:27.000Z"
>>> result, error, message = deployment.request_latency_monitor(start, end)
>>> if error:
...     pprint(error)
... else:
...     pprint(result)
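The latency response can be reduced to a couple of summary figures. This is a rough sketch that assumes the response matches the structure documented above; `summarize_latencies` is a hypothetical helper, and the sample values are made up. Note that the mean is an unweighted average of the per-timestamp averages, not a true per-request mean, because the response as documented does not include request counts.

```python
def summarize_latencies(response):
    """Return (mean_of_averages, worst_entry) for response["latencies"].

    Returns (None, None) when there are no latency records.
    """
    entries = (response or {}).get("latencies") or []
    if not entries:
        return None, None
    mean = sum(e["avg_latency"] for e in entries) / len(entries)
    worst = max(entries, key=lambda e: e["avg_latency"])
    return mean, worst


# Made-up sample shaped like the documented response.
sample = {
    "latencies": [
        {"timestamp": "2024-01-28T18:30:00.000Z", "avg_latency": 0.25},
        {"timestamp": "2024-01-28T18:30:01.000Z", "avg_latency": 0.75},
    ]
}
mean, worst = summarize_latencies(sample)
print(mean, worst["timestamp"])
```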
request_total_monitor()[source]#

Monitor the total number of requests for the current deployment.

This method checks the total request count for a deployment by its deployment_id. If deployment_id is not set, it attempts to fetch it using deployment_name.

Returns:

A tuple containing three elements:

  • dict: The API response with the total request count.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Raises:

SystemExit – If neither deployment_id nor deployment_name is set.

Examples

>>> from pprint import pprint
>>> monitor, error, message = deployment.request_total_monitor()
>>> if error:
...     pprint(error)
... else:
...     pprint(monitor)
wakeup_deployment_server()[source]#

Wake up the deployment server associated with the current deployment. The deployment_id must be set during initialization.

Returns:

A tuple containing three elements:

  • dict: The API response with details of the wake-up operation.

  • str or None: Error message if an error occurred, otherwise None.

  • str: Status message indicating success or failure of the API call.

Return type:

tuple

Raises:

SystemExit – If deployment_id is not set.

Examples

>>> from pprint import pprint
>>> wakeup, err, msg = deployment.wakeup_deployment_server()
>>> if err:
...     pprint(err)
... else:
...     pprint(wakeup)
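After a wake-up request, a caller may want to wait until the deployment reports a particular status. This page does not say how long a wake-up takes or which status strings the backend uses, so the following is only a generic polling sketch: `wait_for_status` is a hypothetical helper, and the target status is left for the caller to supply.

```python
import time


def wait_for_status(get_status, target, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll get_status() until it returns `target` or `timeout` seconds pass.

    Returns True if the target status was seen, False on timeout.
    `sleep` is injectable so the loop can be tested without real delays.
    """
    deadline = time.monotonic() + timeout
    while True:
        if get_status() == target:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Stand-in status source; the status names here are invented for the demo.
statuses = iter(["starting", "starting", "ready"])
print(wait_for_status(lambda: next(statuses), "ready", sleep=lambda s: None))
```

With a real instance, `get_status` could be a function that calls `deployment.refresh()` and then returns `deployment.status`, assuming refresh() updates the attributes on the same object.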