airflow.providers.amazon.aws.operators.sagemaker
Module Contents
Classes
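SageMakerBaseOperator | This is the base operator for all SageMaker operators.
SageMakerProcessingOperator | Initiate a SageMaker processing job.
SageMakerEndpointConfigOperator | Create a SageMaker endpoint config.
SageMakerEndpointOperator | Create a SageMaker endpoint.
SageMakerTransformOperator | Initiate a SageMaker transform job.
SageMakerTuningOperator | Initiate a SageMaker hyperparameter tuning job.
SageMakerModelOperator | Create a SageMaker model.
SageMakerTrainingOperator | Initiate a SageMaker training job.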
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerBaseOperator(*, config, aws_conn_id='aws_default', **kwargs)
Bases:
airflow.models.BaseOperator
This is the base operator for all SageMaker operators.
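- Parameters
config (dict) -- The configuration passed to the SageMaker operation. Each subclass documents the keys it expects.
aws_conn_id (str) -- The AWS connection ID to use. Defaults to 'aws_default'.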
- parse_integer(self, config, field)
Recursive method for parsing string fields holding integer values to integers.
- parse_config_integers(self)
Parse the integer fields of the config to integers. This is needed when the config is rendered by Jinja, because every rendered field is then a str.
- expand_role(self)
Placeholder for calling boto3's expand_role, which expands an IAM role name into an ARN.
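The integer-parsing helpers matter when config is templated: Jinja renders every value as a string, while the SageMaker API expects integers for fields such as instance counts. The following is a minimal standalone sketch of that idea, not the provider's source. The function name parse_integer_field and the sample config are hypothetical, and each field is assumed to be given as a list of nested keys, as in the integer_fields attributes listed for the tuning and training operators.

```python
def parse_integer_field(config, field):
    """Convert the value at the nested key path `field` to int, in place.

    `config` may be a dict or a list of dicts; lists are walked element by
    element. Key paths that are absent from `config` are skipped.
    """
    if isinstance(config, list):
        for item in config:
            parse_integer_field(item, field)
        return
    head, tail = field[0], field[1:]
    if head not in config:
        return
    if tail:
        # More keys remain in the path: descend one level.
        parse_integer_field(config[head], tail)
    else:
        # Last key in the path: convert the string to an integer.
        config[head] = int(config[head])


# Hypothetical config as it might look after Jinja rendering: all values are str.
rendered = {
    "ResourceConfig": {
        "InstanceCount": "2",
        "VolumeSizeInGB": "30",
        "InstanceType": "ml.m5.large",
    },
}
for path in (["ResourceConfig", "InstanceCount"], ["ResourceConfig", "VolumeSizeInGB"]):
    parse_integer_field(rendered, path)
```

After the loop, the two listed fields hold integers, while InstanceType is left untouched because it is not in the list of integer paths.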
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator(*, config, aws_conn_id, wait_for_completion=True, print_log=True, check_interval=30, max_ingestion_time=None, action_if_job_exists='increment', **kwargs)
Bases:
SageMakerBaseOperator
Initiate a SageMaker processing job.
This operator returns the ARN of the processing job created in Amazon SageMaker.
- Parameters
config (dict) --
The configuration necessary to start a processing job (templated).
For details of the configuration parameter see
SageMaker.Client.create_processing_job()
aws_conn_id (str) -- The AWS connection ID to use.
wait_for_completion (bool) -- Whether the operator should wait until the processing job finishes.
print_log (bool) -- Whether the operator should print the CloudWatch log during processing.
check_interval (int) -- If wait_for_completion is True, the interval, in seconds, at which the operator checks the status of the processing job.
max_ingestion_time (Optional[int]) -- If wait_for_completion is True, the operation fails if the processing job doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
action_if_job_exists (str) -- Behaviour if the job name already exists. Possible options are "increment" (default) and "fail".
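As a hedged illustration of how these parameters fit together, the sketch below builds a small processing config and wraps the operator construction in a factory. The helper names make_processing_config and make_processing_task, the task_id, the instance settings, and the argument values are all hypothetical; the dictionary keys follow the request shape of SageMaker.Client.create_processing_job() and show only a subset of what that call accepts.

```python
def make_processing_config(job_name, image_uri, role_arn):
    """Return a minimal, illustrative config for a processing job."""
    return {
        "ProcessingJobName": job_name,
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",
                "VolumeSizeInGB": 30,
            }
        },
        "AppSpecification": {"ImageUri": image_uri},
        "RoleArn": role_arn,
    }


def make_processing_task(config):
    """Build the operator; call this inside a DAG definition."""
    # Imported lazily so the config helper above can be used without Airflow.
    from airflow.providers.amazon.aws.operators.sagemaker import (
        SageMakerProcessingOperator,
    )

    return SageMakerProcessingOperator(
        task_id="run_processing",       # hypothetical task id
        config=config,
        aws_conn_id="aws_default",
        wait_for_completion=True,       # block until the job finishes
        check_interval=30,              # poll the job status every 30 seconds
        max_ingestion_time=3600,        # fail if the job runs longer than an hour
        action_if_job_exists="fail",    # do not rename the job on a name clash
    )


# Placeholder values; replace them with a real image URI and role ARN.
processing_config = make_processing_config(
    "demo-processing", "<ecr-image-uri>", "<role-arn>"
)
```

Choosing "fail" instead of the default "increment" is a design decision for pipelines where a duplicate job name should surface as an error rather than be silently renamed.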
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerEndpointConfigOperator(*, config, **kwargs)
Bases:
SageMakerBaseOperator
Create a SageMaker endpoint config.
This operator returns the ARN of the endpoint config created in Amazon SageMaker.
- Parameters
config (dict) --
The configuration necessary to create an endpoint config.
For details of the configuration parameter see
SageMaker.Client.create_endpoint_config()
aws_conn_id (str) -- The AWS connection ID to use.
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerEndpointOperator(*, config, wait_for_completion=True, check_interval=30, max_ingestion_time=None, operation='create', **kwargs)
Bases:
SageMakerBaseOperator
Create a SageMaker endpoint.
This operator returns the ARN of the endpoint created in Amazon SageMaker.
- Parameters
config (dict) --
The configuration necessary to create an endpoint.
If you need to create a SageMaker endpoint based on an existing SageMaker model and an existing SageMaker endpoint config:
config = endpoint_configuration
If you need to create all of SageMaker model, SageMaker endpoint-config and SageMaker endpoint:
config = { 'Model': model_configuration, 'EndpointConfig': endpoint_config_configuration, 'Endpoint': endpoint_configuration }
For details of the configuration parameter of model_configuration see
SageMaker.Client.create_model()
For details of the configuration parameter of endpoint_config_configuration see
SageMaker.Client.create_endpoint_config()
For details of the configuration parameter of endpoint_configuration see
SageMaker.Client.create_endpoint()
aws_conn_id (str) -- The AWS connection ID to use.
wait_for_completion (bool) -- Whether the operator should wait until the endpoint creation finishes.
check_interval (int) -- If wait_for_completion is True, the interval, in seconds, that the operator waits between polls of the endpoint creation status.
max_ingestion_time (Optional[int]) -- If wait_for_completion is True, the operation fails if the endpoint creation doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
operation (str) -- Whether to create an endpoint or update an endpoint. Must be either 'create' or 'update'.
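The two accepted shapes of config can be sketched side by side. All resource names, the instance type, and the angle-bracket placeholders below are hypothetical; the keys follow the request shapes of create_model(), create_endpoint_config(), and create_endpoint(), and only a minimal subset is shown.

```python
# Configuration for each of the three underlying API calls.
model_configuration = {
    "ModelName": "demo-model",
    "PrimaryContainer": {
        "Image": "<ecr-image-uri>",
        "ModelDataUrl": "s3://<bucket>/model.tar.gz",
    },
    "ExecutionRoleArn": "<role-arn>",
}
endpoint_config_configuration = {
    "EndpointConfigName": "demo-endpoint-config",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": model_configuration["ModelName"],
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.large",
        }
    ],
}
endpoint_configuration = {
    "EndpointName": "demo-endpoint",
    "EndpointConfigName": endpoint_config_configuration["EndpointConfigName"],
}

# Shape 1: the model and endpoint config already exist in SageMaker.
config_existing = endpoint_configuration

# Shape 2: create the model, the endpoint config, and the endpoint together.
config_full = {
    "Model": model_configuration,
    "EndpointConfig": endpoint_config_configuration,
    "Endpoint": endpoint_configuration,
}


def make_endpoint_task(config, operation="create"):
    """Build the operator; call this inside a DAG definition."""
    # Imported lazily so the dictionaries above can be built without Airflow.
    from airflow.providers.amazon.aws.operators.sagemaker import (
        SageMakerEndpointOperator,
    )

    return SageMakerEndpointOperator(
        task_id="deploy_endpoint",  # hypothetical task id
        config=config,
        operation=operation,        # 'create' or 'update'
        wait_for_completion=True,
        check_interval=30,
    )
```

Reusing the names across the three dictionaries, as done here, keeps the endpoint config pointing at the model that is created in the same task.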
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator(*, config, wait_for_completion=True, check_interval=30, max_ingestion_time=None, **kwargs)
Bases:
SageMakerBaseOperator
Initiate a SageMaker transform job.
This operator returns the ARN of the model created in Amazon SageMaker.
- Parameters
config (dict) --
The configuration necessary to start a transform job (templated).
If you need to create a SageMaker transform job based on an existing SageMaker model:
config = transform_config
If you need to create both SageMaker model and SageMaker Transform job:
config = { 'Model': model_config, 'Transform': transform_config }
For details of the configuration parameter of transform_config see
SageMaker.Client.create_transform_job()
For details of the configuration parameter of model_config see
SageMaker.Client.create_model()
aws_conn_id (str) -- The AWS connection ID to use.
wait_for_completion (bool) -- Set to True to wait until the transform job finishes.
check_interval (int) -- If wait_for_completion is True, the interval, in seconds, at which the operator checks the status of the transform job.
max_ingestion_time (Optional[int]) -- If wait_for_completion is True, the operation fails if the transform job doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
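The choice between the two config shapes can be captured in a small helper. This is a sketch under stated assumptions: build_transform_operator_config is a hypothetical name, the ModelName consistency check is an added sanity check rather than documented operator behaviour, and the sample values are placeholders. The keys follow create_transform_job() and create_model().

```python
def build_transform_operator_config(transform_config, model_config=None):
    """Return the config in the shape the transform operator documents.

    Without `model_config`, the transform config is passed through as is.
    With it, both are wrapped under the 'Model' and 'Transform' keys.
    """
    if model_config is None:
        return transform_config
    # Added sanity check (an assumption of this sketch): the transform job
    # should reference the model that is created alongside it.
    if transform_config.get("ModelName") != model_config.get("ModelName"):
        raise ValueError("transform_config and model_config name different models")
    return {"Model": model_config, "Transform": transform_config}


# Placeholder configurations; only a minimal subset of keys is shown.
transform_config = {
    "TransformJobName": "demo-transform",
    "ModelName": "demo-model",
    "TransformInput": {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://<bucket>/input/",
            }
        }
    },
    "TransformOutput": {"S3OutputPath": "s3://<bucket>/output/"},
    "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
}
model_config = {
    "ModelName": "demo-model",
    "PrimaryContainer": {"Image": "<ecr-image-uri>"},
    "ExecutionRoleArn": "<role-arn>",
}

config_existing_model = build_transform_operator_config(transform_config)
config_with_model = build_transform_operator_config(transform_config, model_config)
```

Either result can then be passed as the config argument of SageMakerTransformOperator.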
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerTuningOperator(*, config, wait_for_completion=True, check_interval=30, max_ingestion_time=None, **kwargs)
Bases:
SageMakerBaseOperator
Initiate a SageMaker hyperparameter tuning job.
This operator returns the ARN of the tuning job created in Amazon SageMaker.
- Parameters
config (dict) --
The configuration necessary to start a tuning job (templated).
For details of the configuration parameter see
SageMaker.Client.create_hyper_parameter_tuning_job()
aws_conn_id (str) -- The AWS connection ID to use.
wait_for_completion (bool) -- Set to True to wait until the tuning job finishes.
check_interval (int) -- If wait_for_completion is True, the interval, in seconds, at which the operator checks the status of the tuning job.
max_ingestion_time (Optional[int]) -- If wait_for_completion is True, the operation fails if the tuning job doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
- integer_fields = [['HyperParameterTuningJobConfig', 'ResourceLimits', 'MaxNumberOfTrainingJobs'],...
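Because config is templated and MaxNumberOfTrainingJobs appears in integer_fields, that value can plausibly be supplied through a Jinja expression and still reach the API as an integer. The sketch below illustrates the idea. The config is deliberately abbreviated and is not a complete create_hyper_parameter_tuning_job() request, and the task_id, job name, and the params entry max_jobs are hypothetical.

```python
# Abbreviated tuning config: a real request needs further keys, such as the
# training job definition and parameter ranges.
tuning_config = {
    "HyperParameterTuningJobName": "demo-tuning",
    "HyperParameterTuningJobConfig": {
        "Strategy": "Bayesian",
        "ResourceLimits": {
            # Rendered by Jinja to a str such as "10". This path is listed in
            # integer_fields, so it is expected to be converted to an int.
            "MaxNumberOfTrainingJobs": "{{ params.max_jobs }}",
            "MaxParallelTrainingJobs": 2,
        },
    },
}


def make_tuning_task(config, max_jobs=10):
    """Build the operator; call this inside a DAG definition."""
    # Imported lazily so the dictionary above can be built without Airflow.
    from airflow.providers.amazon.aws.operators.sagemaker import (
        SageMakerTuningOperator,
    )

    return SageMakerTuningOperator(
        task_id="tune_model",            # hypothetical task id
        config=config,
        params={"max_jobs": max_jobs},   # made available to the template
        wait_for_completion=True,
        check_interval=30,
    )
```

Fields that do not appear in integer_fields keep whatever type the template produces, so numeric values outside that list are safer written as literals.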
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerModelOperator(*, config, **kwargs)
Bases:
SageMakerBaseOperator
Create a SageMaker model.
This operator returns the ARN of the model created in Amazon SageMaker.
- Parameters
config (dict) --
The configuration necessary to create a model.
For details of the configuration parameter see
SageMaker.Client.create_model()
aws_conn_id (str) -- The AWS connection ID to use.
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator(*, config, wait_for_completion=True, print_log=True, check_interval=30, max_ingestion_time=None, check_if_job_exists=True, action_if_job_exists='increment', **kwargs)
Bases:
SageMakerBaseOperator
Initiate a SageMaker training job.
This operator returns the ARN of the training job created in Amazon SageMaker.
- Parameters
config (dict) --
The configuration necessary to start a training job (templated).
For details of the configuration parameter see
SageMaker.Client.create_training_job()
aws_conn_id (str) -- The AWS connection ID to use.
wait_for_completion (bool) -- Whether the operator should wait until the training job finishes.
print_log (bool) -- Whether the operator should print the CloudWatch log during training.
check_interval (int) -- If wait_for_completion is True, the interval, in seconds, at which the operator checks the status of the training job.
max_ingestion_time (Optional[int]) -- If wait_for_completion is True, the operation fails if the training job doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
check_if_job_exists (bool) -- If set to true, then the operator will check whether a training job already exists for the name in the config.
action_if_job_exists (str) -- Behaviour if the job name already exists. Possible options are "increment" (default) and "fail". This is only relevant if check_if
- integer_fields = [['ResourceConfig', 'InstanceCount'], ['ResourceConfig', 'VolumeSizeInGB'],...
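One way to keep job names from colliding between runs is to template the name itself, since config is templated. The sketch below assumes a daily schedule and uses Airflow's ds_nodash template variable. The job name prefix, task_id, instance settings, and placeholders are hypothetical, and the keys show a minimal subset of create_training_job().

```python
training_config = {
    # Rendered per run, for example to "demo-training-" followed by the date.
    "TrainingJobName": "demo-training-{{ ds_nodash }}",
    "AlgorithmSpecification": {
        "TrainingImage": "<ecr-image-uri>",
        "TrainingInputMode": "File",
    },
    "RoleArn": "<role-arn>",
    "OutputDataConfig": {"S3OutputPath": "s3://<bucket>/output/"},
    "ResourceConfig": {
        "InstanceCount": 1,
        "InstanceType": "ml.m5.xlarge",
        "VolumeSizeInGB": 30,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}


def make_training_task(config):
    """Build the operator; call this inside a DAG definition."""
    # Imported lazily so the dictionary above can be built without Airflow.
    from airflow.providers.amazon.aws.operators.sagemaker import (
        SageMakerTrainingOperator,
    )

    return SageMakerTrainingOperator(
        task_id="train_model",          # hypothetical task id
        config=config,
        wait_for_completion=True,
        print_log=True,                 # stream the CloudWatch log while training
        check_interval=30,
        max_ingestion_time=None,        # never time out while waiting
        check_if_job_exists=True,
        action_if_job_exists="fail",    # a name clash is treated as an error
    )
```

With a name that is unique per run, "fail" acts as a guard against accidental reruns. The default "increment" suits cases where reruns under a modified name are acceptable.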