airflow.providers.amazon.aws.operators.sagemaker_unified_studio
This module contains the Amazon SageMaker Unified Studio Notebook operator.
Classes
SageMakerNotebookOperator: Provides artifact execution functionality for SageMaker Unified Studio Workflows.
Module Contents
- class airflow.providers.amazon.aws.operators.sagemaker_unified_studio.SageMakerNotebookOperator(task_id, input_config, domain_id=None, project_id=None, output_config=None, domain_region=None, compute=None, termination_condition=None, tags=None, wait_for_completion=True, waiter_delay=10, waiter_max_attempts=1440, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)
Bases: airflow.providers.common.compat.sdk.BaseOperator
Provides artifact execution functionality for SageMaker Unified Studio Workflows.
- Examples:
    from airflow.providers.amazon.aws.operators.sagemaker_unified_studio import SageMakerNotebookOperator

    notebook_operator = SageMakerNotebookOperator(
        task_id="notebook_task",
        domain_id="dzd-example123456",
        project_id="example123456",
        input_config={"input_path": "path/to/notebook.ipynb", "input_params": ""},
        output_config={"output_format": "ipynb"},
        domain_region="us-east-1",
        wait_for_completion=True,
        waiter_delay=10,
        waiter_max_attempts=1440,
    )
- Parameters:
task_id (str) – A unique, meaningful id for the task.
input_config (dict) – Configuration for the input file. Specify the input path as a relative path; it is resolved automatically to an absolute path within the user's home directory in the IDE. Input params should be a dict. Example: {'input_path': 'folder/input/notebook.ipynb', 'input_params': {'key': 'value'}}
output_config (dict | None) – Configuration for the output format. It should include an output_formats entry that controls the format of the notebook execution output. Example: {"output_formats": ["NOTEBOOK"]}
domain_id (str | None) – The domain ID for Amazon SageMaker Unified Studio. Optional - if not provided, the SDK will attempt to resolve it from the environment.
project_id (str | None) – The project ID for Amazon SageMaker Unified Studio. Optional - if not provided, the SDK will attempt to resolve it from the environment.
domain_region (str | None) – The AWS region for the domain. If not provided, the default AWS region will be used.
compute (dict | None) – Compute configuration to use for the artifact execution. Required if the execution runs on remote compute. Example:
    {
        "instance_type": "ml.c5.xlarge",
        "image_details": {
            "image_name": "sagemaker-distribution-prod",
            "image_version": "3",
            "ecr_uri": "123456123456.dkr.ecr.us-west-2.amazonaws.com/ImageName:latest",
        },
    }
termination_condition (dict | None) – Conditions to match to terminate the remote execution. Example:
    {"MaxRuntimeInSeconds": 3600}
tags (dict | None) – Tags to be associated with the remote execution runs. Example:
    {"md_analytics": "logs"}
wait_for_completion (bool) – Indicates whether to wait for the notebook execution to complete. If True, wait for completion; if False, don't wait.
waiter_delay (int) – Interval in seconds to check the notebook execution status.
waiter_max_attempts (int) – Maximum number of status checks to attempt before returning FAILED.
deferrable (bool) – If True, the operator will wait asynchronously for the job to complete. This implies waiting for completion. This mode requires the aiobotocore module to be installed. (default: the default_deferrable option in the [operators] configuration section, falling back to False)
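As a rough guide to how waiter_delay and waiter_max_attempts interact: assuming one status check per waiter_delay seconds, the total polling window is about waiter_delay × waiter_max_attempts, which for the defaults is 10 × 1440 = 14400 seconds (4 hours). The sketch below uses hypothetical helper names (polling_budget_seconds, attempts_to_cover) that are not part of the provider. It only illustrates one way to size waiter_max_attempts so that the polling window covers a MaxRuntimeInSeconds limit.

```python
import math


def polling_budget_seconds(waiter_delay: int = 10, waiter_max_attempts: int = 1440) -> int:
    """Approximate upper bound on the polling window, in seconds (hypothetical helper)."""
    return waiter_delay * waiter_max_attempts


def attempts_to_cover(max_runtime_seconds: int, waiter_delay: int = 10) -> int:
    """Smallest attempt count whose polling window covers max_runtime_seconds."""
    if waiter_delay <= 0:
        raise ValueError("waiter_delay must be positive")
    return math.ceil(max_runtime_seconds / waiter_delay)


# Defaults from the operator signature: 10 s * 1440 attempts.
print(polling_budget_seconds())     # 14400
# Attempts needed to cover termination_condition={"MaxRuntimeInSeconds": 3600}.
print(attempts_to_cover(3600, 10))  # 360
print(attempts_to_cover(3600, 7))   # 515
```

Rounding up with math.ceil keeps the polling window at least as long as the runtime limit when the delay does not divide it evenly.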
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon SageMaker Unified Studio Workflow
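The parameter shapes described above can be assembled as plain dictionaries before they are passed to the operator. The following is a minimal sketch, not the provider's own code: build_notebook_kwargs is a hypothetical helper, and its checks (relative input path, dict-typed input params) simply mirror the parameter descriptions. The task id, notebook path, and parameter values are placeholders.

```python
from __future__ import annotations

from pathlib import PurePosixPath
from typing import Any


def build_notebook_kwargs(
    task_id: str,
    input_path: str,
    input_params: dict[str, Any] | None = None,
    compute: dict[str, Any] | None = None,
    max_runtime_seconds: int | None = None,
    tags: dict[str, str] | None = None,
) -> dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for SageMakerNotebookOperator."""
    # The parameter description asks for a relative input path.
    if PurePosixPath(input_path).is_absolute():
        raise ValueError("input_path should be a relative path")
    # The parameter description asks for input params as a dict.
    if input_params is not None and not isinstance(input_params, dict):
        raise TypeError("input_params should be a dict")

    kwargs: dict[str, Any] = {
        "task_id": task_id,
        "input_config": {
            "input_path": input_path,
            "input_params": dict(input_params or {}),
        },
        "output_config": {"output_formats": ["NOTEBOOK"]},
    }
    # Optional settings are included only when supplied.
    if compute is not None:
        kwargs["compute"] = compute
    if max_runtime_seconds is not None:
        kwargs["termination_condition"] = {"MaxRuntimeInSeconds": max_runtime_seconds}
    if tags is not None:
        kwargs["tags"] = tags
    return kwargs


kwargs = build_notebook_kwargs(
    task_id="notebook_task",
    input_path="folder/input/notebook.ipynb",
    input_params={"key": "value"},
    max_runtime_seconds=3600,
    tags={"md_analytics": "logs"},
)
print(sorted(kwargs))

# With the Amazon provider installed, the result can be unpacked into the operator:
# from airflow.providers.amazon.aws.operators.sagemaker_unified_studio import SageMakerNotebookOperator
# notebook_task = SageMakerNotebookOperator(**kwargs)
```

Leaving out domain_id, project_id, and domain_region relies on the fallback behaviour described for those parameters; pass them explicitly if the environment does not provide them.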