airflow.providers.amazon.aws.operators.sagemaker

Module Contents

Classes

SageMakerBaseOperator

This is the base operator for all SageMaker operators.

SageMakerProcessingOperator

Initiate a SageMaker processing job.

SageMakerEndpointConfigOperator

Create a SageMaker endpoint config.

SageMakerEndpointOperator

Create a SageMaker endpoint.

SageMakerTransformOperator

Initiate a SageMaker transform job.

SageMakerTuningOperator

Initiate a SageMaker hyperparameter tuning job.

SageMakerModelOperator

Create a SageMaker model.

SageMakerTrainingOperator

Initiate a SageMaker training job.

class airflow.providers.amazon.aws.operators.sagemaker.SageMakerBaseOperator(*, config: dict, aws_conn_id: str = 'aws_default', **kwargs)[source]

Bases: airflow.models.BaseOperator

This is the base operator for all SageMaker operators.

Parameters
  • config (dict) -- The configuration necessary to start a training job (templated)

  • aws_conn_id (str) -- The AWS connection ID to use.

template_fields :Sequence[str] = ['config'][source]
template_ext :Sequence[str] = [][source]
template_fields_renderers[source]
ui_color = #ededed[source]
integer_fields = [][source]
parse_integer(self, config, field)[source]

Recursive method for parsing string fields holding integer values to integers.

parse_config_integers(self)[source]

Parse the integer fields of the training config to integers, in case the config is rendered by Jinja and all fields are strings.
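
The recursive conversion described above can be illustrated with a minimal standalone sketch. This is not the provider's implementation: the free function below and the sample config are assumptions made for illustration, and the key path mirrors the documented integer_fields entry ['ProductionVariants', 'InitialInstanceCount'].

```python
from typing import Any, List


def parse_integer(config: Any, field: List[str]) -> None:
    """Convert the value found at the key path ``field`` to int, in place."""
    if isinstance(config, list):
        # A list of sub-configs: apply the same key path to every item.
        for item in config:
            parse_integer(item, field)
        return
    if not isinstance(config, dict) or not field:
        return
    head, tail = field[0], field[1:]
    if head not in config:
        # Optional keys may be absent; leave the config untouched.
        return
    if tail:
        parse_integer(config[head], tail)
    else:
        config[head] = int(config[head])


# Hypothetical config whose integer values arrived as strings after rendering.
config = {
    "EndpointConfigName": "demo-config",
    "ProductionVariants": [
        {"VariantName": "v1", "InitialInstanceCount": "2"},
        {"VariantName": "v2", "InitialInstanceCount": "1"},
    ],
}

integer_fields = [["ProductionVariants", "InitialInstanceCount"]]
for path in integer_fields:
    parse_integer(config, path)
```

Each entry in integer_fields is treated as a path of keys, so a single entry can reach into every element of a nested list.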

expand_role(self)[source]

Placeholder for calling boto3's expand_role, which expands an IAM role name into an ARN.

preprocess_config(self)[source]

Process the config into a usable form.

abstract execute(self, context: airflow.utils.context.Context)[source]

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering Jinja templates.

Refer to get_template_context for more context.

hook(self)[source]

Return SageMakerHook

class airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator(*, config: dict, aws_conn_id: str, wait_for_completion: bool = True, print_log: bool = True, check_interval: int = 30, max_ingestion_time: Optional[int] = None, action_if_job_exists: str = 'increment', **kwargs)[source]

Bases: SageMakerBaseOperator

Initiate a SageMaker processing job.

This operator returns the ARN of the processing job created in Amazon SageMaker.

Parameters
  • config (dict) --

    The configuration necessary to start a processing job (templated).

    For details of the configuration parameter see SageMaker.Client.create_processing_job()

  • aws_conn_id (str) -- The AWS connection ID to use.

  • wait_for_completion (bool) -- Whether the operator should wait until the processing job finishes.

  • print_log (bool) -- Whether the operator should print the CloudWatch log during processing.

  • check_interval (int) -- If wait_for_completion is True, the interval, in seconds, at which the operator checks the status of the processing job.

  • max_ingestion_time (int) -- If wait_for_completion is True, the operation fails if the processing job doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.

  • action_if_job_exists (str) -- Behaviour if the job name already exists. Possible options are "increment" (default) and "fail".
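
As a rough sketch of how the parameters above fit together, the example below builds a small processing config and passes it to the operator. The helper names, task_id, job name, role ARN, image URI, and instance settings are hypothetical placeholders; the config keys follow SageMaker.Client.create_processing_job(). The operator import is deferred so that the config helper can be exercised without Airflow installed.

```python
def build_processing_config(job_name: str, role_arn: str, image_uri: str) -> dict:
    """Return a minimal config shaped like create_processing_job() arguments."""
    return {
        "ProcessingJobName": job_name,
        "RoleArn": role_arn,
        "AppSpecification": {"ImageUri": image_uri},
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",
                "VolumeSizeInGB": 30,
            }
        },
    }


def make_processing_task(config: dict):
    """Create the operator; requires the Amazon provider package to be installed."""
    from airflow.providers.amazon.aws.operators.sagemaker import (
        SageMakerProcessingOperator,
    )

    return SageMakerProcessingOperator(
        task_id="run_processing",        # hypothetical task id
        config=config,
        aws_conn_id="aws_default",
        wait_for_completion=True,
        print_log=True,
        check_interval=60,               # poll the job status once a minute
        max_ingestion_time=3600,         # give up after one hour
        action_if_job_exists="fail",
    )


# Placeholder values, for illustration only.
processing_config = build_processing_config(
    job_name="demo-processing-job",
    role_arn="arn:aws:iam::123456789012:role/demo-sagemaker-role",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/demo-image:latest",
)
```

Calling make_processing_task(processing_config) inside a DAG definition would attach the task to that DAG.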

expand_role(self) -> None[source]

Placeholder for calling boto3's expand_role, which expands an IAM role name into an ARN.

execute(self, context: airflow.utils.context.Context) -> dict[source]

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.amazon.aws.operators.sagemaker.SageMakerEndpointConfigOperator(*, config: dict, **kwargs)[source]

Bases: SageMakerBaseOperator

Create a SageMaker endpoint config.

This operator returns the ARN of the endpoint config created in Amazon SageMaker.

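Parameters
  • config (dict) --

    The configuration necessary to create an endpoint config.

    For details of the configuration parameter see SageMaker.Client.create_endpoint_config()

  • aws_conn_id (str) -- The AWS connection ID to use.
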
integer_fields = [['ProductionVariants', 'InitialInstanceCount']][source]
execute(self, context: airflow.utils.context.Context) -> dict[source]

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.amazon.aws.operators.sagemaker.SageMakerEndpointOperator(*, config: dict, wait_for_completion: bool = True, check_interval: int = 30, max_ingestion_time: Optional[int] = None, operation: str = 'create', **kwargs)[source]

Bases: SageMakerBaseOperator

Create a SageMaker endpoint.

This operator returns the ARN of the endpoint created in Amazon SageMaker.

Parameters
  • config (dict) --

    The configuration necessary to create an endpoint.

    If you need to create a SageMaker endpoint based on an existing SageMaker model and an existing SageMaker endpoint config:

    config = endpoint_configuration

    If you need to create a SageMaker model, a SageMaker endpoint config, and a SageMaker endpoint:

    config = {
        'Model': model_configuration,
        'EndpointConfig': endpoint_config_configuration,
        'Endpoint': endpoint_configuration
    }
    

    For details of the configuration parameter of model_configuration see SageMaker.Client.create_model()

    For details of the configuration parameter of endpoint_config_configuration see SageMaker.Client.create_endpoint_config()

    For details of the configuration parameter of endpoint_configuration see SageMaker.Client.create_endpoint()

  • aws_conn_id (str) -- The AWS connection ID to use.

  • wait_for_completion (bool) -- Whether the operator should wait until the endpoint creation finishes.

  • check_interval (int) -- If wait_for_completion is True, the interval, in seconds, that this operation waits before polling the status of the endpoint creation.

  • max_ingestion_time (int) -- If wait_for_completion is True, this operation fails if the endpoint creation doesn't finish within max_ingestion_time seconds. If you set this parameter to None, it never times out.

  • operation (str) -- Whether to create an endpoint or update an endpoint. Must be either 'create' or 'update'.
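
The combined form of config described above can be sketched as follows. The helper name, resource names, instance type, and placeholder arguments are assumptions for illustration; the inner keys follow SageMaker.Client.create_model(), create_endpoint_config(), and create_endpoint().

```python
def build_endpoint_bundle(
    name: str, image_uri: str, model_data_url: str, role_arn: str
) -> dict:
    """Return a config that asks for a model, an endpoint config, and an endpoint."""
    model_configuration = {
        "ModelName": name,
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
        "ExecutionRoleArn": role_arn,
    }
    endpoint_config_configuration = {
        "EndpointConfigName": f"{name}-config",
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": name,  # must match the model created above
                "InitialInstanceCount": 1,
                "InstanceType": "ml.m5.large",
            }
        ],
    }
    endpoint_configuration = {
        "EndpointName": f"{name}-endpoint",
        "EndpointConfigName": f"{name}-config",  # must match the endpoint config
    }
    return {
        "Model": model_configuration,
        "EndpointConfig": endpoint_config_configuration,
        "Endpoint": endpoint_configuration,
    }


# Placeholder values, for illustration only.
bundle = build_endpoint_bundle(
    name="demo-model",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/demo-image:latest",
    model_data_url="s3://demo-bucket/model/model.tar.gz",
    role_arn="arn:aws:iam::123456789012:role/demo-sagemaker-role",
)

# The endpoint-only form is just the innermost endpoint configuration.
endpoint_only = bundle["Endpoint"]
```

Either bundle or endpoint_only could then be passed as config to SageMakerEndpointOperator, depending on whether the model and endpoint config already exist.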

create_integer_fields(self) -> None[source]

Set fields which should be cast to integers.

expand_role(self) -> None[source]

Placeholder for calling boto3's expand_role, which expands an IAM role name into an ARN.

execute(self, context: airflow.utils.context.Context) -> dict[source]

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator(*, config: dict, wait_for_completion: bool = True, check_interval: int = 30, max_ingestion_time: Optional[int] = None, **kwargs)[source]

Bases: SageMakerBaseOperator

Initiate a SageMaker transform job.

This operator returns the ARN of the model created in Amazon SageMaker.

Parameters
  • config (dict) --

    The configuration necessary to start a transform job (templated).

    If you need to create a SageMaker transform job based on an existing SageMaker model:

    config = transform_config
    

    If you need to create both a SageMaker model and a SageMaker transform job:

    config = {
        'Model': model_config,
        'Transform': transform_config
    }
    

    For details of the configuration parameter of transform_config see SageMaker.Client.create_transform_job()

    For details of the configuration parameter of model_config see SageMaker.Client.create_model()

  • aws_conn_id (str) -- The AWS connection ID to use.

  • wait_for_completion (bool) -- Set to True to wait until the transform job finishes.

  • check_interval (int) -- If wait_for_completion is True, the interval, in seconds, that this operation waits between checks of the transform job status.

  • max_ingestion_time (int) -- If wait_for_completion is True, the operation fails if the transform job doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.
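
The two accepted shapes of config for a transform job can be sketched as below. The helper name, job and model names, S3 locations, and instance settings are hypothetical; the keys follow SageMaker.Client.create_transform_job() and SageMaker.Client.create_model().

```python
def build_transform_config(
    job_name: str, model_name: str, input_s3_uri: str, output_s3_path: str
) -> dict:
    """Return a minimal config shaped like create_transform_job() arguments."""
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            }
        },
        "TransformOutput": {"S3OutputPath": output_s3_path},
        "TransformResources": {"InstanceType": "ml.m5.large", "InstanceCount": 1},
    }


# Form 1: the model already exists, so the transform config is passed directly.
transform_config = build_transform_config(
    job_name="demo-transform-job",
    model_name="demo-model",
    input_s3_uri="s3://demo-bucket/input/",
    output_s3_path="s3://demo-bucket/output/",
)

# Form 2: create the model first, then run the transform job against it.
model_config = {
    "ModelName": "demo-model",
    "PrimaryContainer": {
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/demo-image:latest",
        "ModelDataUrl": "s3://demo-bucket/model/model.tar.gz",
    },
    "ExecutionRoleArn": "arn:aws:iam::123456789012:role/demo-sagemaker-role",
}
combined_config = {"Model": model_config, "Transform": transform_config}
```

In the combined form, the ModelName inside the transform config should match the name of the model being created.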

create_integer_fields(self) -> None[source]

Set fields which should be cast to integers.

expand_role(self) -> None[source]

Placeholder for calling boto3's expand_role, which expands an IAM role name into an ARN.

execute(self, context: airflow.utils.context.Context) -> dict[source]

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.amazon.aws.operators.sagemaker.SageMakerTuningOperator(*, config: dict, wait_for_completion: bool = True, check_interval: int = 30, max_ingestion_time: Optional[int] = None, **kwargs)[source]

Bases: SageMakerBaseOperator

Initiate a SageMaker hyperparameter tuning job.

This operator returns the ARN of the tuning job created in Amazon SageMaker.

Parameters
  • config (dict) --

    The configuration necessary to start a tuning job (templated).

    For details of the configuration parameter see SageMaker.Client.create_hyper_parameter_tuning_job()

  • aws_conn_id (str) -- The AWS connection ID to use.

  • wait_for_completion (bool) -- Set to True to wait until the tuning job finishes.

  • check_interval (int) -- If wait_for_completion is True, the interval, in seconds, that this operation waits between checks of the tuning job status.

  • max_ingestion_time (int) -- If wait_for_completion is True, the operation fails if the tuning job doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.

integer_fields = [['HyperParameterTuningJobConfig', 'ResourceLimits', 'MaxNumberOfTrainingJobs'],...[source]
expand_role(self) -> None[source]

Placeholder for calling boto3's expand_role, which expands an IAM role name into an ARN.

execute(self, context: airflow.utils.context.Context) -> dict[source]

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.amazon.aws.operators.sagemaker.SageMakerModelOperator(*, config, **kwargs)[source]

Bases: SageMakerBaseOperator

Create a SageMaker model.

This operator returns the ARN of the model created in Amazon SageMaker.

Parameters
  • config (dict) --

    The configuration necessary to create a model.

    For details of the configuration parameter see SageMaker.Client.create_model()

  • aws_conn_id (str) -- The AWS connection ID to use.

expand_role(self) -> None[source]

Placeholder for calling boto3's expand_role, which expands an IAM role name into an ARN.

execute(self, context: airflow.utils.context.Context) -> dict[source]

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator(*, config: dict, wait_for_completion: bool = True, print_log: bool = True, check_interval: int = 30, max_ingestion_time: Optional[int] = None, check_if_job_exists: bool = True, action_if_job_exists: str = 'increment', **kwargs)[source]

Bases: SageMakerBaseOperator

Initiate a SageMaker training job.

This operator returns the ARN of the training job created in Amazon SageMaker.

Parameters
  • config (dict) --

    The configuration necessary to start a training job (templated).

    For details of the configuration parameter see SageMaker.Client.create_training_job()

  • aws_conn_id (str) -- The AWS connection ID to use.

  • wait_for_completion (bool) -- Whether the operator should wait until the training job finishes.

  • print_log (bool) -- Whether the operator should print the CloudWatch log during training.

  • check_interval (int) -- If wait_for_completion is True, the interval, in seconds, at which the operator checks the status of the training job.

  • max_ingestion_time (int) -- If wait_for_completion is True, the operation fails if the training job doesn't finish within max_ingestion_time seconds. If you set this parameter to None, the operation does not time out.

  • check_if_job_exists (bool) -- If set to true, then the operator will check whether a training job already exists for the name in the config.

  • action_if_job_exists -- Behaviour if the job name already exists. Possible options are "increment" (default) and "fail". This is only relevant if check_if
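
A possible way to combine these parameters is sketched below. The helper names, task_id, job name, ARN, image URI, S3 path, and instance settings are hypothetical placeholders; the config keys follow SageMaker.Client.create_training_job(). The operator import is deferred so that the config helper runs without Airflow installed.

```python
def build_training_config(
    job_name: str, role_arn: str, image_uri: str, output_s3_path: str
) -> dict:
    """Return a minimal config shaped like create_training_job() arguments."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": output_s3_path},
        "ResourceConfig": {
            "InstanceCount": 1,
            "InstanceType": "ml.m5.xlarge",
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def make_training_task(config: dict):
    """Create the operator; requires the Amazon provider package to be installed."""
    from airflow.providers.amazon.aws.operators.sagemaker import (
        SageMakerTrainingOperator,
    )

    return SageMakerTrainingOperator(
        task_id="train_model",           # hypothetical task id
        config=config,
        wait_for_completion=True,
        print_log=True,
        check_interval=30,
        max_ingestion_time=None,         # never time out while waiting
        check_if_job_exists=True,
        action_if_job_exists="increment",
    )


# Placeholder values, for illustration only.
training_config = build_training_config(
    job_name="demo-training-job",
    role_arn="arn:aws:iam::123456789012:role/demo-sagemaker-role",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/demo-image:latest",
    output_s3_path="s3://demo-bucket/training-output/",
)
```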

integer_fields = [['ResourceConfig', 'InstanceCount'], ['ResourceConfig', 'VolumeSizeInGB'],...[source]
expand_role(self) -> None[source]

Placeholder for calling boto3's expand_role, which expands an IAM role name into an ARN.

execute(self, context: airflow.utils.context.Context) -> dict[source]

This is the main method to derive when creating an operator. Context is the same dictionary used as when rendering Jinja templates.

Refer to get_template_context for more context.
