airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook
This module contains the Amazon SageMaker Unified Studio Notebook Run hook.
Attributes
Classes
SageMakerUnifiedStudioNotebookHook – Interact with SageMaker Unified Studio Workflows for asynchronous notebook execution.
Module Contents
- airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook.TWELVE_HOURS_IN_MINUTES = 720
- airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook.MIN_BOTOCORE_VERSION = '1.43.1'
- airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook.NOTEBOOK_RUN_SUCCESS_STATES
- airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook.NOTEBOOK_RUN_IN_PROGRESS_STATES
- airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook.NOTEBOOK_RUN_FAILURE_STATES
- airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook.NOTEBOOK_OUTPUT_KEY_PREFIX = 'NOTEBOOK_OUTPUT'
- class airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook.SageMakerUnifiedStudioNotebookHook(*args, **kwargs)
Bases: airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook
Interact with SageMaker Unified Studio Workflows for asynchronous notebook execution.
This hook provides a wrapper around the DataZone StartNotebookRun / GetNotebookRun APIs.
- Examples:
    from airflow.providers.amazon.aws.hooks.sagemaker_unified_studio_notebook import (
        SageMakerUnifiedStudioNotebookHook,
    )

    hook = SageMakerUnifiedStudioNotebookHook(aws_conn_id="my_aws_conn")
Additional arguments (such as aws_conn_id or region_name) may be specified and are passed down to the underlying AwsBaseHook.
- property conn
Get the underlying boto3 DataZone client, optionally with a custom endpoint URL.
- start_notebook_run(notebook_identifier, domain_identifier, owning_project_identifier, client_token=None, notebook_parameters=None, compute_configuration=None, timeout_configuration=None, workflow_name=None)
Start an asynchronous notebook run via the DataZone StartNotebookRun API.
- Parameters:
notebook_identifier (str) – The ID of the notebook to execute.
domain_identifier (str) – The ID of the DataZone domain containing the notebook.
owning_project_identifier (str) – The ID of the DataZone project containing the notebook.
client_token (str | None) – Idempotency token. Auto-generated if not provided.
notebook_parameters (dict | None) – Parameters to pass to the notebook.
compute_configuration (dict | None) – Compute config (e.g. instanceType).
timeout_configuration (dict | None) – Timeout settings (runTimeoutInMinutes).
workflow_name (str | None) – Name of the workflow (DAG) that triggered this run.
- Returns:
The StartNotebookRun API response dict.
- Return type:
dict
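As a rough illustration of how the optional arguments fit together, the sketch below assembles keyword arguments for start_notebook_run. The helper name build_start_kwargs, the placeholder IDs, and the use of uuid4 for the idempotency token are assumptions made for this example; the hook may generate its token differently. The instanceType and runTimeoutInMinutes keys are the ones named in the parameter descriptions above.

```python
import uuid


def build_start_kwargs(
    notebook_identifier,
    domain_identifier,
    owning_project_identifier,
    client_token=None,
    instance_type=None,
    run_timeout_minutes=None,
):
    """Hypothetical helper: assemble keyword arguments for start_notebook_run."""
    kwargs = {
        "notebook_identifier": notebook_identifier,
        "domain_identifier": domain_identifier,
        "owning_project_identifier": owning_project_identifier,
        # Assumption: a random UUID stands in for the auto-generated idempotency token.
        "client_token": client_token or str(uuid.uuid4()),
    }
    # Optional configuration dicts are only included when a value was supplied.
    if instance_type is not None:
        kwargs["compute_configuration"] = {"instanceType": instance_type}
    if run_timeout_minutes is not None:
        kwargs["timeout_configuration"] = {"runTimeoutInMinutes": run_timeout_minutes}
    return kwargs


kwargs = build_start_kwargs(
    notebook_identifier="nb-123",  # placeholder IDs, not real resources
    domain_identifier="dzd-abc",
    owning_project_identifier="prj-xyz",
    run_timeout_minutes=60,
)
```

The resulting dict could then be passed as hook.start_notebook_run(**kwargs). Reusing the same client_token on a retry is the usual way to keep such a request idempotent.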
- get_notebook_run(notebook_run_id, domain_identifier)
Get the status of a notebook run via the DataZone GetNotebookRun API.
- wait_for_notebook_run(notebook_run_id, domain_identifier, waiter_delay=10, timeout_configuration=None)
Poll GetNotebookRun until the run reaches a terminal state.
- Parameters:
notebook_run_id (str) – The ID of the notebook run to monitor.
domain_identifier (str) – The ID of the DataZone domain.
waiter_delay (int) – Interval in seconds to poll the notebook run status.
timeout_configuration (dict | None) – Timeout settings for the notebook execution. When provided, the maximum number of poll attempts is derived from runTimeoutInMinutes * 60 / waiter_delay. If omitted, the timeout defaults to 12 hours (TWELVE_HOURS_IN_MINUTES = 720).
- Returns:
A dict with Status and NotebookRunId on success.
- Raises:
RuntimeError – If the run fails or times out.
- Return type:
dict
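The polling behaviour described for wait_for_notebook_run can be sketched without any AWS dependency. This is a minimal illustration, not the hook's actual implementation: the function names, the state names in SUCCESS_STATES and FAILURE_STATES, and the rounding of the attempt count are all assumptions, and a stubbed status source stands in for GetNotebookRun.

```python
import time

TWELVE_HOURS_IN_MINUTES = 720  # module-level default documented above

# Assumption: placeholder state names. The real values live in the module's
# NOTEBOOK_RUN_*_STATES constants, whose contents are not shown on this page.
SUCCESS_STATES = {"SUCCEEDED"}
FAILURE_STATES = {"FAILED", "STOPPED"}


def max_poll_attempts(waiter_delay=10, timeout_configuration=None):
    """Derive the attempt budget as runTimeoutInMinutes * 60 / waiter_delay."""
    config = timeout_configuration or {}
    minutes = config.get("runTimeoutInMinutes", TWELVE_HOURS_IN_MINUTES)
    # Assumption: round down to a whole number of attempts.
    return int(minutes * 60 / waiter_delay)


def wait_for_run(get_status, notebook_run_id, waiter_delay=10,
                 timeout_configuration=None, sleep=time.sleep):
    """Poll get_status() until a terminal state or the attempt budget is spent."""
    for _ in range(max_poll_attempts(waiter_delay, timeout_configuration)):
        status = get_status()
        if status in SUCCESS_STATES:
            return {"Status": status, "NotebookRunId": notebook_run_id}
        if status in FAILURE_STATES:
            raise RuntimeError(f"Notebook run {notebook_run_id} ended in state {status}")
        sleep(waiter_delay)
    raise RuntimeError(f"Notebook run {notebook_run_id} timed out")


# Stubbed status source: two in-progress polls, then success.
statuses = iter(["IN_PROGRESS", "IN_PROGRESS", "SUCCEEDED"])
result = wait_for_run(lambda: next(statuses), "run-1", sleep=lambda _: None)
```

With the default waiter_delay of 10 seconds and no timeout_configuration, the formula gives 720 * 60 / 10 = 4320 attempts.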
- get_project_s3_path(project_id)
Construct the S3 path for a SageMaker Unified Studio project bucket.
- get_notebook_outputs(notebook_identifier, notebook_run_id, owning_project_identifier)
Read notebook output artifacts from the S3 project bucket.
After a notebook run completes, the SDK writes output variables as a JSON file to a well-known S3 location within the project bucket. This method reads that file and returns the parsed key-value pairs.
- Parameters:
- Returns:
A dict of notebook output key-value pairs. Returns an empty dict if no outputs were written or the file cannot be parsed.
- Return type:
dict
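The "empty dict on missing or unparseable output" contract described for get_notebook_outputs can be illustrated with a small parsing helper. This sketch covers only the parsing step and leaves out the S3 read. The name parse_notebook_outputs and the choice to treat non-object JSON as "no outputs" are assumptions for the example, not the hook's confirmed behaviour.

```python
import json


def parse_notebook_outputs(raw):
    """Hypothetical helper: parse the raw JSON file body into a dict, or {} on any problem."""
    # No file content at all: nothing was written by the notebook run.
    if not raw:
        return {}
    try:
        parsed = json.loads(raw)
    except (ValueError, UnicodeDecodeError):
        # Malformed JSON or undecodable bytes: fall back to an empty result.
        return {}
    # Assumption: anything other than a JSON object counts as "no outputs".
    return parsed if isinstance(parsed, dict) else {}


outputs = parse_notebook_outputs(b'{"row_count": 42, "status": "ok"}')
```

Returning an empty dict rather than raising lets downstream tasks treat missing outputs and unreadable outputs the same way.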