Amazon SageMaker Operators
Prerequisite Tasks
To use these operators, you must do a few things:
Create the necessary resources using the AWS Console or AWS CLI.
Install API libraries via pip.
pip install 'apache-airflow[amazon]'
Detailed information is available in the Installation documentation.
Overview
The Airflow integration with Amazon SageMaker provides several operators to create and interact with SageMaker jobs.
Purpose
This example DAG, example_sagemaker.py, uses SageMakerProcessingOperator, SageMakerTrainingOperator, SageMakerModelOperator, SageMakerTransformOperator, and SageMakerDeleteModelOperator to create a SageMaker processing job, run a training job, generate the model artifacts in S3, create the model, run SageMaker batch inference, and finally delete the model from SageMaker.
Defining tasks
In the following code we define SageMaker processing, training, model creation, and batch transform tasks, and then delete the model.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.sagemaker import (
    SageMakerDeleteModelOperator,
    SageMakerModelOperator,
    SageMakerProcessingOperator,
    SageMakerTrainingOperator,
    SageMakerTransformOperator,
)

# The SAGEMAKER_*_CONFIG dictionaries and MODEL_NAME are assumed to be
# defined elsewhere in the DAG file (see the sketch below).
with DAG(
    "sample_sagemaker_dag",
    schedule_interval=None,
    start_date=datetime(2022, 2, 21),
    catchup=False,
) as dag:
    # Preprocess the raw input data.
    sagemaker_processing_task = SageMakerProcessingOperator(
        config=SAGEMAKER_PROCESSING_JOB_CONFIG,
        aws_conn_id="aws_default",
        task_id="sagemaker_preprocessing_task",
    )
    # Train on the processed data; the training job writes model artifacts to S3.
    training_task = SageMakerTrainingOperator(
        config=SAGEMAKER_TRAINING_JOB_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_training_task"
    )
    # Register a SageMaker model from the training artifacts.
    model_create_task = SageMakerModelOperator(
        config=SAGEMAKER_CREATE_MODEL_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_create_model_task"
    )
    # Run batch inference (a transform job) against the model.
    inference_task = SageMakerTransformOperator(
        config=SAGEMAKER_INFERENCE_CONFIG, aws_conn_id="aws_default", task_id="sagemaker_inference_task"
    )
    # Clean up: remove the model from SageMaker.
    model_delete_task = SageMakerDeleteModelOperator(
        task_id="sagemaker_delete_model_task", config={"ModelName": MODEL_NAME}, aws_conn_id="aws_default"
    )

    sagemaker_processing_task >> training_task >> model_create_task >> inference_task >> model_delete_task
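The config dictionary passed to each operator mirrors the request payload of the corresponding boto3 SageMaker API call (create_processing_job, create_training_job, create_model, and create_transform_job, respectively). As a minimal, illustrative sketch, SAGEMAKER_PROCESSING_JOB_CONFIG and MODEL_NAME might look like the following; every name, ARN, image URI, and instance type here is a placeholder assumption, not a value from the example DAG.

# Illustrative placeholders only -- substitute resources from your own account.
MODEL_NAME = "sample-sagemaker-model"

SAGEMAKER_PROCESSING_JOB_CONFIG = {
    "ProcessingJobName": "sample-processing-job",
    # IAM role that SageMaker assumes to run the job (placeholder ARN).
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    "ProcessingResources": {
        "ClusterConfig": {
            "InstanceCount": 1,
            "InstanceType": "ml.m5.large",
            "VolumeSizeInGB": 5,
        }
    },
    "AppSpecification": {
        # Container image that performs the preprocessing (placeholder URI).
        "ImageUri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/sample-processor:latest",
    },
}

Note that by default these operators block until the SageMaker job finishes before the task is marked successful; where supported, this behavior can be tuned with parameters such as wait_for_completion and check_interval.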