`airflow.providers.amazon.aws.operators.redshift_data`

Module Contents

Classes

RedshiftDataOperator

Executes SQL statements against an Amazon Redshift cluster using the Redshift Data API.

class airflow.providers.amazon.aws.operators.redshift_data.RedshiftDataOperator(sql, database=None, cluster_identifier=None, db_user=None, parameters=None, secret_arn=None, statement_name=None, with_event=False, wait_for_completion=True, poll_interval=10, return_sql_result=False, workgroup_name=None, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), session_id=None, session_keep_alive_seconds=None, **kwargs)

Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.redshift_data.RedshiftDataHook]

Executes SQL statements against an Amazon Redshift cluster using the Redshift Data API.

See also: For more information on how to use this operator, take a look at the guide: Execute a statement on an Amazon Redshift cluster

Parameters

database (str | None) – the name of the database
sql (str | list) – the SQL statement or list of SQL statements to run
cluster_identifier (str | None) – unique identifier of a cluster
db_user (str | None) – the database username
parameters (list | None) – the parameters for the SQL statement
secret_arn (str | None) – the name or ARN of the secret that enables db access
statement_name (str | None) – the name of the SQL statement
with_event (bool) – indicates whether to send an event to EventBridge
wait_for_completion (bool) – whether to wait for the statement to finish: if True, wait; if False, return without waiting
poll_interval (int) – how often in seconds to check the query status
return_sql_result (bool) – if True, return the result of the SQL statement; if False (default), return the statement ID
workgroup_name (str | None) – name of the Redshift Serverless workgroup. Mutually exclusive with cluster_identifier. Specify this parameter to query Redshift Serverless. More info: https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-serverless.html
session_id (str | None) – the session identifier of the query
session_keep_alive_seconds (int | None) – duration in seconds to keep the session alive after the query finishes. A session can be kept alive for at most 24 hours.
aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty, the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, the default boto3 configuration is used (and must be maintained on each worker node).
region_name – AWS region_name. If not specified then the default boto3 behaviour is used.
verify – Whether to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
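
To make the parameter list above concrete, here is a minimal sketch of how the keyword arguments might be assembled for a provisioned cluster versus a Redshift Serverless workgroup. The helper names `redshift_data_task_kwargs` and `make_task`, the cluster and workgroup names, the database name and the SQL are all hypothetical; only the operator's import path and parameter names come from this page. The check for both `cluster_identifier` and `workgroup_name` simply mirrors the documented "mutually exclusive" constraint and is not claimed to be how the operator itself validates its input.

```python
from __future__ import annotations


def redshift_data_task_kwargs(
    sql: str | list[str],
    database: str,
    *,
    cluster_identifier: str | None = None,
    workgroup_name: str | None = None,
    **extra,
) -> dict:
    """Hypothetical helper: collect keyword arguments for RedshiftDataOperator."""
    # workgroup_name is documented as mutually exclusive with cluster_identifier.
    if cluster_identifier and workgroup_name:
        raise ValueError("Pass cluster_identifier or workgroup_name, not both")
    kwargs = {"sql": sql, "database": database, **extra}
    if cluster_identifier:
        kwargs["cluster_identifier"] = cluster_identifier
    if workgroup_name:
        kwargs["workgroup_name"] = workgroup_name
    return kwargs


def make_task(task_id: str, kwargs: dict):
    """Hypothetical helper: build the operator from the collected arguments."""
    # Imported lazily so the sketch can be run without the Amazon provider installed.
    from airflow.providers.amazon.aws.operators.redshift_data import RedshiftDataOperator

    return RedshiftDataOperator(task_id=task_id, **kwargs)


# Provisioned cluster: wait for the statement and return its result.
provisioned = redshift_data_task_kwargs(
    "SELECT 1;",
    "dev",
    cluster_identifier="example-cluster",
    db_user="awsuser",
    poll_interval=5,
    return_sql_result=True,
)

# Redshift Serverless: submit a list of statements without waiting.
serverless = redshift_data_task_kwargs(
    ["CREATE TABLE IF NOT EXISTS t (id INT);", "INSERT INTO t VALUES (1);"],
    "dev",
    workgroup_name="example-workgroup",
    wait_for_completion=False,
)
```

Inside a DAG definition, `make_task("run_query", provisioned)` would then create the task. Keeping the argument assembly separate from the operator call makes the cluster/workgroup choice easy to check before a DAG is parsed.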

aws_hook_class

template_fields

template_ext = ('.sql',)

template_fields_renderers

execute(context)

Execute a statement against Amazon Redshift.
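
The `wait_for_completion` and `poll_interval` parameters describe a submit-then-poll pattern. The sketch below illustrates that pattern only and is not the operator's actual implementation. `FakeRedshiftDataClient` and `wait_for_statement` are hypothetical names; the fake imitates the `describe_statement(Id=...)` call and the `Status` values of the boto3 `redshift-data` client so the example runs without AWS access.

```python
import time

# Terminal statuses that indicate the statement did not succeed.
FAILURE_STATES = {"FAILED", "ABORTED"}


class FakeRedshiftDataClient:
    """Hypothetical stand-in for a boto3 'redshift-data' client."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self._last = None

    def describe_statement(self, Id):
        # Return the next scripted status, repeating the last one when exhausted.
        self._last = next(self._statuses, self._last)
        return {"Id": Id, "Status": self._last}


def wait_for_statement(client, statement_id, poll_interval=10, sleep=time.sleep):
    """Poll the statement status every poll_interval seconds until it ends."""
    while True:
        status = client.describe_statement(Id=statement_id)["Status"]
        if status == "FINISHED":
            return status
        if status in FAILURE_STATES:
            raise RuntimeError(f"Statement {statement_id} ended with status {status}")
        sleep(poll_interval)


client = FakeRedshiftDataClient(["SUBMITTED", "STARTED", "FINISHED"])
waits = []  # record the requested sleeps instead of actually sleeping
final_status = wait_for_statement(
    client, "example-statement-id", poll_interval=10, sleep=waits.append
)
print(final_status, waits)  # prints: FINISHED [10, 10]
```

Injecting `sleep` keeps the loop testable; with `wait_for_completion=False` the equivalent behaviour would be to skip the loop and return right after submission.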

execute_complete(context, event=None)

get_sql_results(statement_id, return_sql_result)

Retrieve either the result of the SQL query, or the statement ID(s).

Parameters

statement_id (str) – Statement ID of the running queries
return_sql_result (bool) – True if results should be returned
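
The "result or statement ID(s)" behaviour can be sketched as follows. This is an illustration under assumptions, not the method's source: `collect_results` and `FakeRedshiftDataClient` are hypothetical names, and the fake mimics the boto3 `redshift-data` calls `describe_statement` (whose response lists `SubStatements` when several statements were submitted together) and `get_statement_result`.

```python
class FakeRedshiftDataClient:
    """Hypothetical stand-in for a boto3 'redshift-data' client."""

    def __init__(self, sub_statement_ids=None):
        self.sub_statement_ids = sub_statement_ids or []

    def describe_statement(self, Id):
        response = {"Id": Id, "Status": "FINISHED"}
        if self.sub_statement_ids:
            # Batches report one entry per submitted statement.
            response["SubStatements"] = [{"Id": sid} for sid in self.sub_statement_ids]
        return response

    def get_statement_result(self, Id):
        return {"Records": [[{"stringValue": f"row from {Id}"}]]}


def collect_results(client, statement_id, return_sql_result):
    """Return query results if requested, otherwise the statement ID(s)."""
    description = client.describe_statement(Id=statement_id)
    sub_statements = description.get("SubStatements")
    # A single statement has no sub-statements, so fall back to its own ID.
    statement_ids = (
        [sub["Id"] for sub in sub_statements] if sub_statements else [statement_id]
    )
    if return_sql_result:
        return [client.get_statement_result(Id=sid) for sid in statement_ids]
    return statement_ids


single = FakeRedshiftDataClient()
batch = FakeRedshiftDataClient(sub_statement_ids=["abc:1", "abc:2"])

ids_single = collect_results(single, "abc", return_sql_result=False)
ids_batch = collect_results(batch, "abc", return_sql_result=False)
rows_batch = collect_results(batch, "abc", return_sql_result=True)
```

Here `ids_single` holds the one statement ID, `ids_batch` holds one ID per sub-statement, and `rows_batch` holds one result payload per sub-statement.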

on_kill()

Cancel the submitted Redshift query.
