S3 To Redshift Transfer Operator¶
Overview¶
The S3ToRedshiftOperator copies data from an S3 bucket into a Redshift table.
The example DAG provided showcases the S3ToRedshiftOperator in action.
example_s3_to_redshift.py¶
Purpose¶
This is a basic example DAG that uses S3ToRedshiftOperator to copy data from an S3 bucket into a Redshift table.
Environment variables¶
This example relies on the following variables, which can be passed via OS environment variables.
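from os import getenv

# Each variable falls back to its default when the environment variable is unset.
S3_BUCKET = getenv("S3_BUCKET", "test-bucket")
S3_KEY = getenv("S3_KEY", "key")
REDSHIFT_TABLE = getenv("REDSHIFT_TABLE", "test_table")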
You need to set at least S3_BUCKET.
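As a rough illustration of how these defaults resolve, the sketch below reads the same three variables from a mapping instead of the real process environment. The helper name read_settings and the bucket name my-data-bucket are hypothetical and are not part of the example DAG.

```python
import os


def read_settings(environ=None):
    """Resolve the three example variables, using the same defaults as the example DAG."""
    # Use the real process environment unless a mapping is passed in (handy for testing).
    env = os.environ if environ is None else environ
    return {
        "S3_BUCKET": env.get("S3_BUCKET", "test-bucket"),
        "S3_KEY": env.get("S3_KEY", "key"),
        "REDSHIFT_TABLE": env.get("REDSHIFT_TABLE", "test_table"),
    }


# Only S3_BUCKET is set here; the other two fall back to their defaults.
print(read_settings({"S3_BUCKET": "my-data-bucket"}))
```

With only S3_BUCKET set, S3_KEY and REDSHIFT_TABLE keep the defaults "key" and "test_table".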
Copy S3 key into Redshift table¶
In the following code we are copying the S3 key s3://{S3_BUCKET}/{S3_KEY}/{REDSHIFT_TABLE} into the Redshift table PUBLIC.{REDSHIFT_TABLE}.
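from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator

# Define this task inside a DAG; S3_BUCKET, S3_KEY and REDSHIFT_TABLE come from the variables above.
task_transfer_s3_to_redshift = S3ToRedshiftOperator(
    s3_bucket=S3_BUCKET,
    s3_key=S3_KEY,
    schema="PUBLIC",
    table=REDSHIFT_TABLE,
    copy_options=['csv'],  # extra options passed to the COPY command
    task_id='transfer_s3_to_redshift',
)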
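To make the source path and target table described above concrete, here is a minimal sketch that assembles both strings, plus an approximate COPY statement, from the task's arguments. It does not reproduce the operator's actual implementation: the helper describe_transfer is hypothetical, the authorization clause is left as a placeholder, and the exact statement the operator issues may differ between provider versions.

```python
def describe_transfer(s3_bucket, s3_key, schema, table, copy_options):
    """Return the source path, the target table, and an approximate COPY statement."""
    # Source path as described in the text: bucket, then key, then table name.
    source = f"s3://{s3_bucket}/{s3_key}/{table}"
    # Target table qualified with its schema.
    target = f"{schema}.{table}"
    # COPY options are joined with spaces after the authorization clause.
    options = " ".join(copy_options)
    # "<authorization>" is a placeholder, not valid SQL; real credentials are not shown here.
    statement = f"COPY {target} FROM '{source}' <authorization> {options};"
    return source, target, statement


source, target, statement = describe_transfer(
    "test-bucket", "key", "PUBLIC", "test_table", ["csv"]
)
print(source)     # s3://test-bucket/key/test_table
print(target)     # PUBLIC.test_table
print(statement)
```

With the default variable values, the source path resolves to s3://test-bucket/key/test_table and the target to PUBLIC.test_table.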
You can find more information about the COPY command used here.