Amazon Redshift Operators

Amazon Redshift manages all the work of setting up, operating, and scaling a data warehouse: provisioning capacity, monitoring and backing up the cluster, and applying patches and upgrades to the Amazon Redshift engine. You can focus on using your data to acquire new insights for your business and customers.

Airflow provides an operator to execute queries against an Amazon Redshift cluster.

Prerequisite Tasks

To use these operators, you must do a few things:

Redshift SQL

This operator executes a SQL query against an Amazon Redshift cluster.

Execute a SQL query

airflow/providers/amazon/aws/example_dags/example_redshift_sql.py

task_select_data = RedshiftSQLOperator(
    task_id='task_get_all_table_data',
    sql="""CREATE TABLE more_fruit AS SELECT * FROM fruit;""",
)

Execute a SQL query with parameters

The sql argument of RedshiftSQLOperator is templated, so values supplied through the params attribute can be passed into SQL statements dynamically, as with {{ params.color }} in the example below.

airflow/providers/amazon/aws/example_dags/example_redshift_sql.py

task_select_filtered_data = RedshiftSQLOperator(
    task_id='task_get_filtered_table_data',
    sql="""CREATE TABLE filtered_fruit AS SELECT * FROM fruit WHERE color = '{{ params.color }}';""",
    params={'color': 'Red'},
)
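
To see what the two statements above produce without access to a Redshift cluster, the following minimal sketch runs them against an in-memory SQLite database. SQLite is only a stand-in chosen for illustration, the contents of the fruit table are hypothetical, and the f-string merely imitates the templating step that substitutes params.color into the SQL text; none of this is part of the Airflow example DAG.

```python
import sqlite3

# Assumption: SQLite stands in for the Redshift cluster, and the rows
# below are hypothetical sample data for the fruit table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruit (name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO fruit VALUES (?, ?)",
    [("apple", "Red"), ("banana", "Yellow"), ("cherry", "Red")],
)

# Same statement as the first task: copy every row into a new table.
conn.execute("CREATE TABLE more_fruit AS SELECT * FROM fruit")

# Same statement as the second task once the template is rendered:
# the value from params={'color': 'Red'} ends up in the SQL text.
color = "Red"
rendered_sql = (
    "CREATE TABLE filtered_fruit AS "
    f"SELECT * FROM fruit WHERE color = '{color}'"
)
conn.execute(rendered_sql)

more_fruit_count = conn.execute("SELECT COUNT(*) FROM more_fruit").fetchone()[0]
filtered_names = [
    row[0] for row in conn.execute("SELECT name FROM filtered_fruit ORDER BY name")
]
print(more_fruit_count, filtered_names)
```

With this sample data, more_fruit holds all three rows, while filtered_fruit holds only the rows whose color is Red.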

