airflow.providers.amazon.aws.operators.sagemaker
Module Contents
Classes
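- SageMakerBaseOperator – This is the base operator for all SageMaker operators.
- SageMakerProcessingOperator – Use Amazon SageMaker Processing to analyze data and evaluate machine learning models on Amazon SageMaker.
- SageMakerEndpointConfigOperator – Creates an endpoint configuration that Amazon SageMaker hosting services uses to deploy models.
- SageMakerEndpointOperator – When you create a serverless endpoint, SageMaker provisions and manages the compute resources for you.
- SageMakerTransformOperator – Starts a transform job. A transform job uses a trained model to get inferences on a dataset and saves these results to an Amazon S3 location that you specify.
- SageMakerTuningOperator – Starts a hyperparameter tuning job. A hyperparameter tuning job finds the best version of a model by running many training jobs on your dataset using the algorithm you choose and values for hyperparameters within ranges that you specify.
- SageMakerModelOperator – Creates a model in Amazon SageMaker. In the request, you name the model and describe a primary container.
- SageMakerTrainingOperator – Starts a model training job. After training completes, Amazon SageMaker saves the resulting model artifacts to an Amazon S3 location that you specify.
- SageMakerDeleteModelOperator – Deletes a SageMaker model.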
Attributes
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerBaseOperator(*, config, **kwargs)
Bases:
airflow.models.BaseOperator
This is the base operator for all SageMaker operators.
- Parameters
config (dict) – The configuration necessary to start a training job (templated)
- parse_integer(config, field)
Recursive method for parsing string fields holding integer values to integers.
- parse_config_integers()
Parse the integer fields to ints in case the config is rendered by Jinja and all fields are str.
- expand_role()
Placeholder for calling boto3’s expand_role, which expands an IAM role name into an ARN.
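The recursive parsing described for parse_integer can be pictured with a small standalone sketch. This is an illustration only, not the provider's code: the function name parse_integer_field, the path-as-list convention, and the sample config are assumptions made for the example.

```python
from typing import Any, Sequence


def parse_integer_field(config: Any, path: Sequence[str]) -> None:
    """Convert the value found at `path` inside `config` to int, in place.

    Hypothetical helper: lists met along the way are traversed element by
    element, so one path can cover every item of a list of dicts.
    """
    if isinstance(config, list):
        for item in config:
            parse_integer_field(item, path)
        return
    if not isinstance(config, dict) or not path:
        return
    head, tail = path[0], path[1:]
    if head not in config:
        return  # optional field that is absent: nothing to convert
    if tail:
        parse_integer_field(config[head], tail)
    else:
        config[head] = int(config[head])


# A config as it might look after Jinja rendering: every value is a str.
rendered = {
    "ResourceConfig": {"InstanceCount": "2", "VolumeSizeInGB": "30"},
    "InputDataConfig": [
        {"ChannelName": "train", "ShuffleConfig": {"Seed": "7"}},
        {"ChannelName": "validation"},
    ],
}

integer_paths = [
    ["ResourceConfig", "InstanceCount"],
    ["ResourceConfig", "VolumeSizeInGB"],
    ["InputDataConfig", "ShuffleConfig", "Seed"],
]
for path in integer_paths:
    parse_integer_field(rendered, path)
```

Listing the integer paths explicitly keeps the conversion predictable: strings that merely look numeric, such as a job name made of digits, are left alone.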
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator(*, config, aws_conn_id=DEFAULT_CONN_ID, wait_for_completion=True, print_log=True, check_interval=CHECK_INTERVAL_SECOND, max_ingestion_time=None, action_if_job_exists='increment', **kwargs)
Bases:
SageMakerBaseOperator
Use Amazon SageMaker Processing to analyze data and evaluate machine learning models on Amazon SageMaker. With Processing, you can use a simplified, managed experience on SageMaker to run your data processing workloads, such as feature engineering, data validation, model evaluation, and model interpretation.
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon SageMaker processing job
- Parameters
config (dict) – The configuration necessary to start a processing job (templated). For details of the configuration parameter see
SageMaker.Client.create_processing_job()
aws_conn_id (str) – The AWS connection ID to use.
wait_for_completion (bool) – Whether the operator should wait until the processing job finishes.
print_log (bool) – Whether the operator should print the CloudWatch log during processing.
check_interval (int) – If wait_for_completion is True, the time interval, in seconds, at which the operator checks the status of the processing job.
max_ingestion_time (int | None) – If wait_for_completion is True, the operation fails if the processing job doesn’t finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
action_if_job_exists (str) – Behaviour if the job name already exists. Possible options are “increment” (default) and “fail”.
- Return Dict
Returns the ARN of the processing job created in Amazon SageMaker.
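The way the waiting parameters relate to each other can be pictured as a polling loop. The following is a minimal sketch of those semantics, not the provider's implementation: wait_for_job, get_status, and the injectable sleep are hypothetical names, and the sketch raises TimeoutError where the operator would fail the task. The status strings are the ones SageMaker reports for processing jobs.

```python
import time
from typing import Callable, Optional


def wait_for_job(
    get_status: Callable[[], str],
    check_interval: int,
    max_ingestion_time: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() every check_interval seconds until the job stops running.

    Returns the final status, or raises TimeoutError once the accumulated
    wait reaches max_ingestion_time (None means wait indefinitely).
    """
    waited = 0
    while True:
        status = get_status()
        if status not in ("InProgress", "Stopping"):
            return status
        if max_ingestion_time is not None and waited >= max_ingestion_time:
            raise TimeoutError(f"job still {status} after {waited} seconds")
        sleep(check_interval)
        waited += check_interval


# Simulated job that finishes on the third status check; sleep is stubbed out
# so the example returns immediately.
fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final_status = wait_for_job(
    get_status=lambda: next(fake_statuses),
    check_interval=30,
    max_ingestion_time=600,
    sleep=lambda seconds: None,
)
```

Passing the sleep function in makes the loop easy to exercise in a test without real delays.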
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerEndpointConfigOperator(*, config, aws_conn_id=DEFAULT_CONN_ID, **kwargs)
Bases:
SageMakerBaseOperator
Creates an endpoint configuration that Amazon SageMaker hosting services uses to deploy models. In the configuration, you identify one or more models, created using the CreateModel API, to deploy and the resources that you want Amazon SageMaker to provision.
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon SageMaker endpoint config job
- Parameters
config (dict) –
The configuration necessary to create an endpoint config.
For details of the configuration parameter see
SageMaker.Client.create_endpoint_config()
aws_conn_id (str) – The AWS connection ID to use.
- Return Dict
Returns the ARN of the endpoint config created in Amazon SageMaker.
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerEndpointOperator(*, config, aws_conn_id=DEFAULT_CONN_ID, wait_for_completion=True, check_interval=CHECK_INTERVAL_SECOND, max_ingestion_time=None, operation='create', **kwargs)
Bases:
SageMakerBaseOperator
When you create a serverless endpoint, SageMaker provisions and manages the compute resources for you. Then, you can make inference requests to the endpoint and receive model predictions in response. SageMaker scales the compute resources up and down as needed to handle your request traffic.
Requires an Endpoint Config.
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon SageMaker endpoint job
- Parameters
config (dict) –
The configuration necessary to create an endpoint.
If you need to create a SageMaker endpoint based on an existing SageMaker model and an existing SageMaker endpoint config:
config = endpoint_configuration
If you need to create a SageMaker model, a SageMaker endpoint config, and a SageMaker endpoint:
config = { 'Model': model_configuration, 'EndpointConfig': endpoint_config_configuration, 'Endpoint': endpoint_configuration }
For details of the configuration parameter of model_configuration see
SageMaker.Client.create_model()
For details of the configuration parameter of endpoint_config_configuration see
SageMaker.Client.create_endpoint_config()
For details of the configuration parameter of endpoint_configuration see
SageMaker.Client.create_endpoint()
wait_for_completion (bool) – Whether the operator should wait until the endpoint creation finishes.
check_interval (int) – If wait_for_completion is True, the time interval, in seconds, that this operation waits before polling the status of the endpoint creation.
max_ingestion_time (int | None) – If wait_for_completion is True, this operation fails if the endpoint creation doesn’t finish within max_ingestion_time seconds. If you set this parameter to None, it never times out.
operation (str) – Whether to create an endpoint or update an endpoint. Must be either ‘create’ or ‘update’.
aws_conn_id (str) – The AWS connection ID to use.
- Return Dict
Returns the ARN of the endpoint created in Amazon SageMaker.
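As an illustration of the combined config shape described above, the sketch below builds the three nested request dicts and passes them to the operator. The helper names, resource names, instance type, and timing values are placeholders chosen for the example; the keys inside each nested dict follow the boto3 create_model, create_endpoint_config, and create_endpoint requests.

```python
def build_endpoint_config(name: str, image_uri: str, model_data_url: str, role_arn: str) -> dict:
    """Hypothetical helper: assemble the Model / EndpointConfig / Endpoint config."""
    model = {
        "ModelName": f"{name}-model",
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
        "ExecutionRoleArn": role_arn,
    }
    endpoint_config = {
        "EndpointConfigName": f"{name}-config",
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model["ModelName"],
                "InitialInstanceCount": 1,
                "InstanceType": "ml.m5.large",  # placeholder instance type
            }
        ],
    }
    endpoint = {
        "EndpointName": name,
        "EndpointConfigName": endpoint_config["EndpointConfigName"],
    }
    return {"Model": model, "EndpointConfig": endpoint_config, "Endpoint": endpoint}


def make_endpoint_task(config: dict):
    """Create the task; call this inside a DAG definition."""
    # Imported lazily so the dict-building code above runs without Airflow installed.
    from airflow.providers.amazon.aws.operators.sagemaker import SageMakerEndpointOperator

    return SageMakerEndpointOperator(
        task_id="deploy_endpoint",
        config=config,
        operation="create",
        wait_for_completion=True,
        check_interval=30,        # seconds between status checks (example value)
        max_ingestion_time=1800,  # give up after 30 minutes (example value)
    )
```

Deriving the model and endpoint config names from one base name keeps the cross-references between the three requests consistent.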
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator(*, config, aws_conn_id=DEFAULT_CONN_ID, wait_for_completion=True, check_interval=CHECK_INTERVAL_SECOND, max_ingestion_time=None, check_if_job_exists=True, action_if_job_exists='increment', **kwargs)
Bases:
SageMakerBaseOperator
Starts a transform job. A transform job uses a trained model to get inferences on a dataset and saves these results to an Amazon S3 location that you specify.
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon SageMaker transform job
- Parameters
config (dict) –
The configuration necessary to start a transform job (templated).
If you need to create a SageMaker transform job based on an existing SageMaker model:
config = transform_config
If you need to create both a SageMaker model and a SageMaker transform job:
config = { 'Model': model_config, 'Transform': transform_config }
For details of the configuration parameter of transform_config see
SageMaker.Client.create_transform_job()
For details of the configuration parameter of model_config see
SageMaker.Client.create_model()
aws_conn_id (str) – The AWS connection ID to use.
wait_for_completion (bool) – Set to True to wait until the transform job finishes.
check_interval (int) – If wait_for_completion is True, the time interval, in seconds, that this operation waits to check the status of the transform job.
max_ingestion_time (int | None) – If wait_for_completion is True, the operation fails if the transform job doesn’t finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
check_if_job_exists (bool) – If set to true, then the operator will check whether a transform job already exists for the name in the config.
action_if_job_exists (str) – Behaviour if the job name already exists. Possible options are “increment” (default) and “fail”. This is only relevant if check_if_job_exists is True.
- Return Dict
Returns the ARN of the model created in Amazon SageMaker.
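The two accepted config shapes for the transform operator might look as follows. This is a sketch with placeholder values: the bucket, job, model, role, and image names are invented for the example, while the keys follow the boto3 create_transform_job and create_model requests.

```python
# Shape 1: the model already exists, so the config is the transform job request itself.
transform_config = {
    "TransformJobName": "example-transform",
    "ModelName": "example-model",
    "TransformInput": {
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://example-bucket/input/",
            }
        },
        "ContentType": "text/csv",
    },
    "TransformOutput": {"S3OutputPath": "s3://example-bucket/predictions/"},
    "TransformResources": {"InstanceCount": 1, "InstanceType": "ml.m5.large"},
}

# Shape 2: create the model first, then run the transform job against it.
model_config = {
    "ModelName": transform_config["ModelName"],
    "PrimaryContainer": {
        "Image": "<account>.dkr.ecr.<region>.amazonaws.com/<image>:latest",  # placeholder
        "ModelDataUrl": "s3://example-bucket/model/model.tar.gz",
    },
    "ExecutionRoleArn": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
}
combined_config = {"Model": model_config, "Transform": transform_config}
```

Either transform_config or combined_config would be passed as the config argument, depending on whether the model has to be created as part of the task.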
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerTuningOperator(*, config, aws_conn_id=DEFAULT_CONN_ID, wait_for_completion=True, check_interval=CHECK_INTERVAL_SECOND, max_ingestion_time=None, **kwargs)
Bases:
SageMakerBaseOperator
Starts a hyperparameter tuning job. A hyperparameter tuning job finds the best version of a model by running many training jobs on your dataset using the algorithm you choose and values for hyperparameters within ranges that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by an objective metric that you choose.
See also
For more information on how to use this operator, take a look at the guide: Start a hyperparameter tuning job
- Parameters
config (dict) –
The configuration necessary to start a tuning job (templated).
For details of the configuration parameter see
SageMaker.Client.create_hyper_parameter_tuning_job()
aws_conn_id (str) – The AWS connection ID to use.
wait_for_completion (bool) – Set to True to wait until the tuning job finishes.
check_interval (int) – If wait_for_completion is True, the time interval, in seconds, that this operation waits to check the status of the tuning job.
max_ingestion_time (int | None) – If wait_for_completion is True, the operation fails if the tuning job doesn’t finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
- Return Dict
Returns the ARN of the tuning job created in Amazon SageMaker.
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerModelOperator(*, config, aws_conn_id=DEFAULT_CONN_ID, **kwargs)
Bases:
SageMakerBaseOperator
Creates a model in Amazon SageMaker. In the request, you name the model and describe a primary container. For the primary container, you specify the Docker image that contains inference code, artifacts (from prior training), and a custom environment map that the inference code uses when you deploy the model for predictions.
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon SageMaker model
- Parameters
config (dict) –
The configuration necessary to create a model.
For details of the configuration parameter see
SageMaker.Client.create_model()
aws_conn_id (str) – The AWS connection ID to use.
- Return Dict
Returns the ARN of the model created in Amazon SageMaker.
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator(*, config, aws_conn_id=DEFAULT_CONN_ID, wait_for_completion=True, print_log=True, check_interval=CHECK_INTERVAL_SECOND, max_ingestion_time=None, check_if_job_exists=True, action_if_job_exists='increment', **kwargs)
Bases:
SageMakerBaseOperator
Starts a model training job. After training completes, Amazon SageMaker saves the resulting model artifacts to an Amazon S3 location that you specify.
See also
For more information on how to use this operator, take a look at the guide: Create an Amazon SageMaker training job
- Parameters
config (dict) –
The configuration necessary to start a training job (templated).
For details of the configuration parameter see
SageMaker.Client.create_training_job()
aws_conn_id (str) – The AWS connection ID to use.
wait_for_completion (bool) – Whether the operator should wait until the training job finishes.
print_log (bool) – Whether the operator should print the CloudWatch log during training.
check_interval (int) – If wait_for_completion is True, the time interval, in seconds, at which the operator checks the status of the training job.
max_ingestion_time (int | None) – If wait_for_completion is True, the operation fails if the training job doesn’t finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
check_if_job_exists (bool) – If set to true, then the operator will check whether a training job already exists for the name in the config.
action_if_job_exists (str) – Behaviour if the job name already exists. Possible options are “increment” (default) and “fail”. This is only relevant if check_if_job_exists is True.
- Return Dict
Returns the ARN of the training job created in Amazon SageMaker.
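Because config is templated, values can be filled in by Jinja at run time. The sketch below shows a training config that uses two standard Airflow template expressions. The job name, bucket, role, image, and the instance_count param are placeholders invented for the example; the config keys follow the boto3 create_training_job request. A rendered value such as InstanceCount arrives as a string, which is the situation parse_config_integers is described as handling.

```python
training_config = {
    # ds_nodash is the logical date as YYYYMMDD, which gives each run its own job name.
    "TrainingJobName": "example-training-{{ ds_nodash }}",
    "AlgorithmSpecification": {
        "TrainingImage": "<account>.dkr.ecr.<region>.amazonaws.com/<image>:latest",  # placeholder
        "TrainingInputMode": "File",
    },
    "RoleArn": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://example-bucket/train/",
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
        }
    ],
    "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/output/"},
    "ResourceConfig": {
        # Rendered by Jinja, so it is a str until it is parsed back to an int.
        "InstanceCount": "{{ params.instance_count }}",
        "InstanceType": "ml.m5.large",
        "VolumeSizeInGB": 30,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}


def make_training_task(config: dict):
    """Create the task; call this inside a DAG definition."""
    # Imported lazily so the config above can be inspected without Airflow installed.
    from airflow.providers.amazon.aws.operators.sagemaker import SageMakerTrainingOperator

    return SageMakerTrainingOperator(
        task_id="train_model",
        config=config,
        params={"instance_count": 1},  # hypothetical param consumed by the template
        wait_for_completion=True,
        print_log=True,
        check_if_job_exists=True,
        action_if_job_exists="fail",
    )
```

With action_if_job_exists set to "fail", a rerun for the same date stops instead of starting a second job under a modified name; "increment" is the documented default.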
- class airflow.providers.amazon.aws.operators.sagemaker.SageMakerDeleteModelOperator(*, config, aws_conn_id=DEFAULT_CONN_ID, **kwargs)
Bases:
SageMakerBaseOperator
Deletes a SageMaker model.
See also
For more information on how to use this operator, take a look at the guide: Delete an Amazon SageMaker model
- Parameters
config (dict) – The configuration necessary to delete the model. For details of the configuration parameter see
SageMaker.Client.delete_model()
aws_conn_id (str) – The AWS connection ID to use.
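A minimal sketch of a delete task follows, assuming the config carries only the ModelName key that the boto3 delete_model request expects. The helper names and the task_id are placeholders.

```python
def delete_model_config(model_name: str) -> dict:
    """Hypothetical helper: build the request dict for deleting one model."""
    return {"ModelName": model_name}


def make_delete_model_task(model_name: str):
    """Create the task; call this inside a DAG definition."""
    # Imported lazily so the helper above runs without Airflow installed.
    from airflow.providers.amazon.aws.operators.sagemaker import SageMakerDeleteModelOperator

    return SageMakerDeleteModelOperator(
        task_id="delete_model",
        config=delete_model_config(model_name),
    )
```

A task like this is typically placed at the end of a DAG so that models created for a one-off transform or test do not accumulate.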