matrice.deployment module#
- class matrice.deployment.Deployment(session, deployment_id=None, deployment_name=None)[source]#
Bases:
object
Class to manage deployment-related operations within a project.
The Deployment class initializes with a given session and deployment details, allowing users to access and manage the deployment attributes such as status, configuration, and associated project information.
- Parameters:
session (object) – The session object containing project and RPC information.
deployment_id (str, optional) – The ID of the deployment to manage. Default is None.
deployment_name (str, optional) – The name of the deployment. Default is None.
- session#
The session object for RPC communication.
- Type:
object
- rpc#
The RPC interface for backend API communication.
- Type:
object
- project_id#
The project ID associated with the deployment.
- Type:
str
- deployment_id#
The unique ID of the deployment.
- Type:
str
- deployment_name#
Name of the deployment.
- Type:
str
- model_id#
ID of the model associated with the deployment.
- Type:
str
- user_id#
User ID of the deployment owner.
- Type:
str
- user_name#
Username of the deployment owner.
- Type:
str
- action_id#
ID of the action associated with the deployment.
- Type:
str
- auth_keys#
List of authorization keys for the deployment.
- Type:
list
- runtime_framework#
Framework used for the runtime of the model in the deployment.
- Type:
str
- model_input#
Input format expected by the model.
- Type:
dict
- model_type#
Type of model deployed (e.g., classification, detection).
- Type:
str
- model_output#
Output format of the deployed model.
- Type:
dict
- deployment_type#
Type of deployment (e.g., real-time, batch).
- Type:
str
- suggested_classes#
Suggested classes for classification models.
- Type:
list
- running_instances#
List of currently running instances.
- Type:
list
- auto_shutdown#
Whether the deployment has auto-shutdown enabled.
- Type:
bool
- auto_scale#
Whether the deployment is configured for auto-scaling.
- Type:
bool
- gpu_required#
Whether GPU is required for the deployment.
- Type:
bool
- status#
Current status of the deployment.
- Type:
str
- hibernation_threshold#
Threshold for auto-hibernation in minutes.
- Type:
int
- image_store_confidence_threshold#
Confidence threshold for storing images.
- Type:
float
- image_store_count_threshold#
Count threshold for storing images.
- Type:
int
- images_stored_count#
Number of images currently stored.
- Type:
int
- bucket_alias#
Alias for the storage bucket associated with the deployment.
- Type:
str
- credential_alias#
Alias for credentials used in the deployment.
- Type:
str
- created_at#
Timestamp when the deployment was created.
- Type:
str
- updated_at#
Timestamp when the deployment was last updated.
- Type:
str
- compute_alias#
Alias of the compute resource associated with the deployment.
- Type:
str
- is_optimized#
Indicates whether the deployment is optimized.
- Type:
bool
- status_cards#
List of status cards related to the deployment.
- Type:
list
- total_deployments#
Total number of deployments in the project.
- Type:
int or None
- active_deployments#
Number of active deployments in the project.
- Type:
int or None
- total_running_instances_count#
Total count of running instances in the project.
- Type:
int or None
- hibernated_deployments#
Number of hibernated deployments.
- Type:
int or None
- error_deployments#
Number of deployments with errors.
- Type:
int or None
Example
>>> session = Session(account_number="account_number")
>>> deployment = Deployment(session=session, deployment_id="deployment_id", deployment_name="MyDeployment")
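Because the class exposes its state as plain attributes, a compact snapshot for logging can be built with `getattr`. The following is a minimal sketch: the helper `summarize`, the `SUMMARY_FIELDS` tuple, and the `"active"` status value are hypothetical, and a `SimpleNamespace` stands in for a real `Deployment` so the snippet runs without a backend.

```python
from types import SimpleNamespace

# Attribute names taken from the list above; the selection is arbitrary.
SUMMARY_FIELDS = (
    "deployment_id",
    "deployment_name",
    "status",
    "gpu_required",
    "auto_scale",
    "auto_shutdown",
)


def summarize(deployment, fields=SUMMARY_FIELDS):
    """Return a dict of selected attributes, using None for any that are missing."""
    return {name: getattr(deployment, name, None) for name in fields}


# Stand-in object; the values (including the "active" status) are made up.
stub = SimpleNamespace(
    deployment_id="deployment_id",
    deployment_name="MyDeployment",
    status="active",
    gpu_required=True,
)

print(summarize(stub))
```

Attributes that the stand-in does not define, such as `auto_scale`, appear as `None` rather than raising `AttributeError`.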
- create_auth_key(expiry_days)[source]#
Create a new authentication key for the deployment, valid for the specified number of days. The deployment_id and project_id must be set during initialization.
- Parameters:
expiry_days (int) – The number of days before the authentication key expires.
- Returns:
A tuple containing three elements:
- dict: The API response with details of the created authentication key, including keys such as:
  - authKey (str): The newly created authentication key.
  - expiryDate (str): Expiration date of the key.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
Examples
>>> from pprint import pprint
>>> auth_key, err, msg = deployment.create_auth_key(30)
>>> if err:
...     pprint(err)
... else:
...     pprint(auth_key)
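The methods on this page are documented as returning a three-element tuple of response, error, and status message. A small helper can turn that convention into ordinary exception handling. This is a sketch only: `unwrap` and `DeploymentAPIError` are hypothetical names that are not part of the library, and the tuples below are stand-ins shaped like the documented return value.

```python
class DeploymentAPIError(RuntimeError):
    """Raised when a (response, error, message) tuple carries an error."""


def unwrap(result):
    """Return the response from a (response, error, message) tuple.

    Raises DeploymentAPIError if the error slot is set.
    """
    response, error, message = result
    if error:
        raise DeploymentAPIError(f"{message}: {error}")
    return response


# Stand-in tuples; the key and date are made up for illustration.
ok = ({"authKey": "abcd1234", "expiryDate": "2024-03-01"}, None, "Success")
failed = (None, "deployment not found", "Failure")

print(unwrap(ok)["authKey"])
try:
    unwrap(failed)
except DeploymentAPIError as exc:
    print(exc)
```

With a real instance, the call would read `unwrap(deployment.create_auth_key(30))`.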
- create_dataset(dataset_name, is_unlabeled, source, source_url, is_public, dataset_description='', version_description='')[source]#
Create a new dataset from a deployment. Only zip files are supported for upload, and the deployment ID must be set for this operation.
- Parameters:
dataset_name (str) – The name of the new dataset.
is_unlabeled (bool) – Indicates whether the dataset is unlabeled.
source (str) – The source of the dataset (e.g., “aws”).
source_url (str) – The URL of the dataset to be created.
is_public (bool) – Specifies if the dataset is public.
dataset_description (str, optional) – A description for the dataset. Default is an empty string.
version_description (str, optional) – A description for this version of the dataset. Default is an empty string.
- Returns:
A tuple containing three elements:
- dict: The API response with details of the dataset creation, structured as:
  - datasetId (str): ID of the created dataset.
  - status (str): Status of the dataset creation request.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
Example
>>> from pprint import pprint
>>> resp, err, msg = deployment.create_dataset(
...     dataset_name="New Dataset",
...     is_unlabeled=False,
...     source="aws",
...     source_url="https://example.com/dataset.zip",
...     is_public=True,
...     dataset_description="Dataset description",
...     version_description="Version description"
... )
>>> if err:
...     pprint(err)
... else:
...     pprint(resp)
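Since only zip files are supported, it can be useful to check `source_url` locally before sending the request. The sketch below assumes that a `.zip` suffix on the URL path is a reasonable proxy; how the backend actually validates the file is not stated here, and `looks_like_zip_url` is a hypothetical helper.

```python
from urllib.parse import urlparse


def looks_like_zip_url(source_url):
    """Return True if the URL's path ends with '.zip' (case-insensitive).

    Query strings and fragments are ignored, so signed URLs still pass.
    """
    path = urlparse(source_url).path
    return path.lower().endswith(".zip")


print(looks_like_zip_url("https://example.com/dataset.zip"))
print(looks_like_zip_url("https://example.com/dataset.tar.gz"))
```

A caller might skip `create_dataset` and report the problem early when this check fails.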
- delete()[source]#
Delete the deployment identified by deployment_id from the backend system.
- Returns:
A tuple containing three elements:
- dict: The API response confirming the deletion.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
- Raises:
SystemExit – If deployment_id is not set.
Examples
>>> from pprint import pprint
>>> delete, err, msg = deployment.delete()
>>> if err:
...     pprint(err)
... else:
...     pprint(delete)
- delete_auth_key(auth_key)[source]#
Delete a specified authentication key for the current deployment. The deployment_id must be set during initialization.
- Parameters:
auth_key (str) – The authentication key to be deleted.
- Returns:
A tuple containing three elements:
- dict: The API response indicating the result of the delete operation.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
- Raises:
SystemExit – If deployment_id is not set.
Examples
>>> from pprint import pprint
>>> delete_auth_key, err, msg = deployment.delete_auth_key("abcd1234")
>>> if err:
...     pprint(err)
... else:
...     pprint(delete_auth_key)
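`create_auth_key` and `delete_auth_key` can be combined to rotate a key: create the replacement first, then remove the old one. The sketch below is one possible ordering, not a documented workflow. `rotate_auth_key` and `FakeDeployment` are hypothetical; the fake class only mimics the documented tuple shape and the `authKey` field so the example runs without a backend.

```python
def rotate_auth_key(deployment, old_key, expiry_days=30):
    """Create a replacement key, then delete the old one.

    Returns the new key string. Raises RuntimeError if either call
    reports an error in the second tuple slot.
    """
    created, err, _msg = deployment.create_auth_key(expiry_days)
    if err:
        raise RuntimeError(f"create_auth_key failed: {err}")
    new_key = created["authKey"]  # field name as listed under Returns

    _resp, err, _msg = deployment.delete_auth_key(old_key)
    if err:
        raise RuntimeError(f"delete_auth_key failed: {err}")
    return new_key


class FakeDeployment:
    """Hypothetical in-memory stand-in for a Deployment instance."""

    def __init__(self):
        self.keys = {"old-key"}

    def create_auth_key(self, expiry_days):
        self.keys.add("new-key")
        return {"authKey": "new-key", "expiryDate": "n/a"}, None, "Success"

    def delete_auth_key(self, auth_key):
        if auth_key not in self.keys:
            return None, "unknown key", "Failure"
        self.keys.remove(auth_key)
        return {}, None, "Success"


fake = FakeDeployment()
print(rotate_auth_key(fake, "old-key"))
print(sorted(fake.keys))
```

Creating before deleting means a failure in the first call leaves the old key untouched.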
- get_deployment_server(model_train_id, model_type)[source]#
Fetch information about the deployment server for a specific model.
- Parameters:
model_train_id (str) – The ID of the model training instance.
model_type (str) – The type of model (e.g., ‘trained’, ‘exported’).
- Returns:
A tuple containing three elements:
- dict: The API response with details of the deployment server.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
Examples
>>> from pprint import pprint
>>> deployment_server, err, msg = deployment.get_deployment_server("train123", "trained")
>>> if err:
...     pprint(err)
... else:
...     pprint(deployment_server)
- get_prediction(image_path, auth_key)[source]#
Fetch model predictions for a given image using a deployment.
This method sends an image to the deployment for prediction. Either deployment_id or deployment_name must be set on the instance so that the deployment can be located.
- Parameters:
image_path (str) – The path to the image for prediction.
auth_key (str) – The authentication key required for authorizing the prediction request.
- Returns:
A tuple containing three elements:
- dict: The API response with the prediction results, structured as:
  - predictions (list of dict): Each entry contains:
    - class (str): The predicted class label.
    - confidence (float): Confidence score of the prediction.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the prediction request.
- Return type:
tuple
- Raises:
ValueError – If auth_key is not provided or if neither deployment_id nor deployment_name is set.
Examples
>>> from pprint import pprint
>>> result, error, message = deployment.get_prediction(
...     image_path="/path/to/image.jpg",
...     auth_key="auth123"
... )
>>> if error:
...     pprint(error)
... else:
...     pprint(result)
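Given the response structure described under Returns, picking the most confident prediction is a short post-processing step. The sketch below assumes the response really carries a `predictions` list whose entries have `class` and `confidence` keys; `top_prediction` is a hypothetical helper and the sample data is made up.

```python
def top_prediction(response, min_confidence=0.0):
    """Return the highest-confidence entry, or None if nothing qualifies."""
    predictions = response.get("predictions", [])
    eligible = [p for p in predictions if p["confidence"] >= min_confidence]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p["confidence"])


# Made-up response shaped like the documented structure.
sample = {
    "predictions": [
        {"class": "cat", "confidence": 0.91},
        {"class": "dog", "confidence": 0.07},
    ]
}

best = top_prediction(sample)
print(best["class"], best["confidence"])
```

Passing `min_confidence` lets the caller treat low-confidence results as "no prediction".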
- rename(updated_name)[source]#
Update the deployment name for the current deployment.
- Parameters:
updated_name (str) – The new name for the deployment.
- Returns:
A tuple containing three elements:
- dict: The API response with details of the rename operation.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
- Raises:
SystemExit – If deployment_id is not set.
Examples
>>> from pprint import pprint
>>> deployment = Deployment(session, deployment_id="1234")
>>> rename, err, msg = deployment.rename("NewDeploymentName")
>>> if err:
...     pprint(err)
... else:
...     pprint(rename)
- request_count_monitor(start_date, end_date, granularity='second')[source]#
Monitor the count of requests within a specified time range and granularity for the current deployment.
- Parameters:
start_date (str) – The start date of the monitoring period in ISO format (e.g., “YYYY-MM-DDTHH:MM:SSZ”).
end_date (str) – The end date of the monitoring period in ISO format.
granularity (str, optional) – The time granularity for the request count (e.g., “second”, “minute”). Default is “second”.
- Returns:
A tuple containing three elements:
- dict: The API response with the request count details, structured as:
  - counts (list of dict): Each entry contains:
    - timestamp (str): The timestamp of the request count.
    - count (int): The number of requests at that timestamp.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
Examples
>>> from pprint import pprint
>>> start = "2024-01-28T18:30:00.000Z"
>>> end = "2024-02-29T10:11:27.000Z"
>>> count_monitor, error, message = deployment.request_count_monitor(start, end)
>>> if error:
...     pprint(error)
... else:
...     pprint(count_monitor)
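The `start_date` and `end_date` strings can be generated from `datetime` objects instead of being typed by hand, and the documented `counts` list can be totalled afterwards. This is a sketch under assumptions: the millisecond precision mirrors the example above rather than a stated requirement, and `to_iso_utc`, `last_hours_window`, and `total_requests` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone


def to_iso_utc(moment):
    """Format a timezone-aware datetime like '2024-01-28T18:30:00.000Z'."""
    moment = moment.astimezone(timezone.utc)
    millis = moment.microsecond // 1000
    return moment.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


def last_hours_window(hours, now=None):
    """Return (start, end) ISO strings covering the last `hours` hours."""
    now = now or datetime.now(timezone.utc)
    return to_iso_utc(now - timedelta(hours=hours)), to_iso_utc(now)


def total_requests(response):
    """Sum the `count` field over the documented `counts` list."""
    return sum(entry["count"] for entry in response.get("counts", []))


fixed_now = datetime(2024, 2, 29, 10, 11, 27, tzinfo=timezone.utc)
start, end = last_hours_window(24, now=fixed_now)
print(start, end)
```

The resulting pair can be passed directly as `deployment.request_count_monitor(start, end)`.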
- request_latency_monitor(start_date, end_date, granularity='second')[source]#
Monitor the request latency within a specified time range and granularity for the current deployment.
- Parameters:
start_date (str) – The start date of the monitoring period in ISO format (e.g., “YYYY-MM-DDTHH:MM:SSZ”).
end_date (str) – The end date of the monitoring period in ISO format.
granularity (str, optional) – The time granularity for latency tracking (e.g., “second”, “minute”). Default is “second”.
- Returns:
A tuple containing three elements:
- dict: The API response with latency details, structured as:
  - latencies (list of dict): Each entry contains:
    - timestamp (str): The timestamp of the latency record.
    - avg_latency (float): The average latency in seconds for the requests at that timestamp.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
Examples
>>> from pprint import pprint
>>> start = "2024-01-28T18:30:00.000Z"
>>> end = "2024-02-29T10:11:27.000Z"
>>> result, error, message = deployment.request_latency_monitor(start, end)
>>> if error:
...     pprint(error)
... else:
...     pprint(result)
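The `latencies` list described under Returns can be reduced to a few summary figures. The sketch below assumes each entry carries an `avg_latency` value in seconds, as documented; `summarize_latency` is a hypothetical helper and the sample data is made up. Note that the mean is taken over buckets and is not weighted by request volume, since the response does not include per-bucket counts.

```python
def summarize_latency(response):
    """Return mean, max, and sample count of avg_latency, or None if empty."""
    values = [entry["avg_latency"] for entry in response.get("latencies", [])]
    if not values:
        return None
    return {
        "mean": sum(values) / len(values),  # unweighted mean over buckets
        "max": max(values),
        "samples": len(values),
    }


# Made-up response shaped like the documented structure.
sample = {
    "latencies": [
        {"timestamp": "2024-01-28T18:30:00.000Z", "avg_latency": 0.25},
        {"timestamp": "2024-01-28T18:30:01.000Z", "avg_latency": 0.75},
    ]
}

print(summarize_latency(sample))
```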
- request_total_monitor()[source]#
Monitor the total number of requests for the current deployment.
This method checks the total request count for a deployment by its deployment_id. If deployment_id is not set, it attempts to fetch it using deployment_name.
- Returns:
A tuple containing three elements:
- dict: The API response with the total request count.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
- Raises:
SystemExit – If neither deployment_id nor deployment_name is set.
Examples
>>> from pprint import pprint
>>> monitor, error, message = deployment.request_total_monitor()
>>> if error:
...     pprint(error)
... else:
...     pprint(monitor)
- wakeup_deployment_server()[source]#
Wake up the deployment server associated with the current deployment. The deployment_id must be set during initialization.
- Returns:
A tuple containing three elements:
- dict: The API response with details of the wake-up operation.
- str or None: Error message if an error occurred, otherwise None.
- str: Status message indicating success or failure of the API call.
- Return type:
tuple
- Raises:
SystemExit – If deployment_id is not set.
Examples
>>> from pprint import pprint
>>> wakeup, err, msg = deployment.wakeup_deployment_server()
>>> if err:
...     pprint(err)
... else:
...     pprint(wakeup)
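A wake-up request may fail transiently, so a caller might want to retry it a few times. The sketch below assumes that repeating the call is safe, which this page does not confirm. `call_with_retry` and `FlakyServer` are hypothetical, and the response payloads are made up to match the documented tuple shape.

```python
import time


def call_with_retry(call, attempts=3, delay_seconds=1.0):
    """Invoke a tuple-returning callable until its error slot is empty.

    Returns the last (response, error, message) tuple, whether or not
    it succeeded.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = None
    for attempt in range(1, attempts + 1):
        result = call()
        if not result[1]:  # second slot is the error message
            break
        if attempt < attempts:
            time.sleep(delay_seconds)
    return result


class FlakyServer:
    """Hypothetical stand-in that fails a fixed number of times first."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def wakeup_deployment_server(self):
        self.calls += 1
        if self.calls <= self.failures:
            return None, "server not ready", "Failure"
        return {"status": "waking"}, None, "Success"


server = FlakyServer(failures=2)
response, error, message = call_with_retry(
    server.wakeup_deployment_server, attempts=3, delay_seconds=0
)
print(server.calls, error, message)
```

With a real instance, the first argument would be `deployment.wakeup_deployment_server`, passed without parentheses.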