Amazon S3 to Amazon Redshift

Use the S3ToRedshiftOperator transfer to copy data from an Amazon Simple Storage Service (S3) file into an Amazon Redshift table.

Prerequisite Tasks

To use these operators, you must do a few things:

Operators

Amazon S3 To Amazon Redshift transfer operator

This operator loads data from Amazon S3 to an existing Amazon Redshift table.

For more information about this operator, see: S3ToRedshiftOperator

Example usage:

tests/system/amazon/aws/example_redshift_s3_transfers.py

transfer_s3_to_redshift = S3ToRedshiftOperator(
    task_id="transfer_s3_to_redshift",
    redshift_data_api_kwargs={
        "database": DB_NAME,
        "cluster_identifier": redshift_cluster_identifier,
        "db_user": DB_LOGIN,
        "wait_for_completion": True,
    },
    s3_bucket=bucket_name,
    s3_key=S3_KEY_2,
    schema="PUBLIC",
    table=REDSHIFT_TABLE,
    copy_options=["csv"],
)

Example of ingesting multiple keys under a common prefix:

tests/system/amazon/aws/example_redshift_s3_transfers.py

transfer_s3_to_redshift_multiple = S3ToRedshiftOperator(
    task_id="transfer_s3_to_redshift_multiple",
    redshift_data_api_kwargs={
        "database": DB_NAME,
        "cluster_identifier": redshift_cluster_identifier,
        "db_user": DB_LOGIN,
        "wait_for_completion": True,
    },
    s3_bucket=bucket_name,
    s3_key=S3_KEY_PREFIX,
    schema="PUBLIC",
    table=REDSHIFT_TABLE,
    copy_options=["csv"],
)

You can find more information about the COPY command used here.
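Under the hood, the operator issues a Redshift COPY statement assembled from the task parameters. The sketch below illustrates how `schema`, `table`, `s3_bucket`, `s3_key`, and `copy_options` might combine into such a statement; the function name is hypothetical, and the real operator resolves credentials from the configured AWS connection rather than inlining them:

```python
def build_copy_statement(schema, table, s3_bucket, s3_key, copy_options):
    """Illustrative only: compose a Redshift COPY statement from
    S3ToRedshiftOperator-style parameters (credentials elided)."""
    options = "\n".join(copy_options)
    return (
        f"COPY {schema}.{table}\n"
        f"FROM 's3://{s3_bucket}/{s3_key}'\n"
        f"credentials '...'\n"  # real operator injects IAM role or access keys here
        f"{options};"
    )

# A prefix as s3_key makes COPY load every matching object under it.
print(build_copy_statement("PUBLIC", "my_table", "my-bucket", "data.csv",
                           ["csv", "IGNOREHEADER 1"]))
```

Passing a prefix (as in the multiple-keys example above) works because Redshift's COPY treats the S3 path as a prefix and loads all objects that match it.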