S3 To Redshift Transfer Operator

Overview

The S3ToRedshiftOperator copies data from an S3 bucket into a Redshift table.

The example DAG provided below showcases the S3ToRedshiftOperator in action.

  • example_s3_to_redshift.py

example_s3_to_redshift.py

Purpose

This is a basic example DAG that uses the S3ToRedshiftOperator to copy data from an S3 bucket into a Redshift table.

Environment variables

This example relies on the following variables, which can be passed via OS environment variables.

airflow/providers/amazon/aws/example_dags/example_s3_to_redshift.py

from os import getenv

S3_BUCKET = getenv("S3_BUCKET", "test-bucket")
S3_KEY = getenv("S3_KEY", "key")
REDSHIFT_TABLE = getenv("REDSHIFT_TABLE", "test_table")

You need to set at least the S3_BUCKET.

Copy S3 key into Redshift table

In the following code, we copy the S3 key s3://{S3_BUCKET}/{S3_KEY}/{REDSHIFT_TABLE} into the Redshift table PUBLIC.{REDSHIFT_TABLE}.

airflow/providers/amazon/aws/example_dags/example_s3_to_redshift.py

    task_transfer_s3_to_redshift = S3ToRedshiftOperator(
        s3_bucket=S3_BUCKET,
        s3_key=S3_KEY,
        schema="PUBLIC",
        table=REDSHIFT_TABLE,
        copy_options=['csv'],
        task_id='transfer_s3_to_redshift',
    )
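
To see how this task fits into a complete DAG file, a minimal self-contained sketch is shown below. The DAG id, start date, and schedule are illustrative assumptions rather than values taken from the example DAG itself; the operator import path and arguments mirror the snippet above.

    from datetime import datetime
    from os import getenv

    from airflow import DAG
    from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator

    S3_BUCKET = getenv("S3_BUCKET", "test-bucket")
    S3_KEY = getenv("S3_KEY", "key")
    REDSHIFT_TABLE = getenv("REDSHIFT_TABLE", "test_table")

    # Illustrative DAG id, start date, and schedule; adjust to your environment.
    with DAG(
        dag_id="example_s3_to_redshift_sketch",
        start_date=datetime(2021, 1, 1),
        schedule_interval=None,
        tags=["example"],
    ) as dag:
        task_transfer_s3_to_redshift = S3ToRedshiftOperator(
            s3_bucket=S3_BUCKET,
            s3_key=S3_KEY,
            schema="PUBLIC",
            table=REDSHIFT_TABLE,
            copy_options=["csv"],
            task_id="transfer_s3_to_redshift",
        )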

You can find more information about the COPY command used here in the Amazon Redshift COPY documentation (https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html).
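
For orientation, the statement the operator runs against Redshift has roughly the shape sketched below for the values in this example. The helper function is hypothetical, the credentials clause supplied by the operator is omitted, and the exact SQL can differ between provider versions, so treat this only as an approximation of the generated query.

    # Hypothetical helper approximating the COPY statement built by the operator
    # for this example; the credentials clause added by the operator is omitted
    # and the exact SQL may differ between provider versions.
    def build_copy_query(s3_bucket: str, s3_key: str, table: str, schema: str = "PUBLIC") -> str:
        return (
            f"COPY {schema}.{table}\n"
            f"FROM 's3://{s3_bucket}/{s3_key}/{table}'\n"
            "csv;"
        )

    # For the defaults above this prints a COPY from
    # s3://test-bucket/key/test_table into PUBLIC.test_table.
    print(build_copy_query("test-bucket", "key", "test_table"))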
