airflow.providers.amazon.aws.hooks.redshift_data

Attributes

`FINISHED_STATE`
`FAILED_STATE`
`ABORTED_STATE`
`FAILURE_STATES`
`RUNNING_STATES`

Exceptions

`RedshiftDataQueryFailedError`	Raise an error that redshift data query failed.
`RedshiftDataQueryAbortedError`	Raise an error that redshift data query was aborted.

Classes

`QueryExecutionOutput`	Describes the output of a query execution.
`RedshiftDataHook`	Interact with Amazon Redshift Data API.

Module Contents

airflow.providers.amazon.aws.hooks.redshift_data.FINISHED_STATE = 'FINISHED'

airflow.providers.amazon.aws.hooks.redshift_data.FAILED_STATE = 'FAILED'

airflow.providers.amazon.aws.hooks.redshift_data.ABORTED_STATE = 'ABORTED'

airflow.providers.amazon.aws.hooks.redshift_data.FAILURE_STATES

airflow.providers.amazon.aws.hooks.redshift_data.RUNNING_STATES
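The listing above does not show the contents of FAILURE_STATES and RUNNING_STATES. The sketch below illustrates how such constants might be used to classify a statement status. It defines local stand-ins rather than importing the module, and the set contents (FAILED plus ABORTED, and the PICKED, STARTED, SUBMITTED statuses reported by the Redshift Data API) are assumptions. classify_status is a hypothetical helper, not part of the hook.

```python
# Local stand-ins for the module-level constants documented above.
FINISHED_STATE = "FINISHED"
FAILED_STATE = "FAILED"
ABORTED_STATE = "ABORTED"

# Assumed contents: this page does not list the members of these sets.
FAILURE_STATES = {FAILED_STATE, ABORTED_STATE}
RUNNING_STATES = {"PICKED", "STARTED", "SUBMITTED"}


def classify_status(status: str) -> str:
    """Hypothetical helper: map a statement status to a coarse category."""
    if status == FINISHED_STATE:
        return "finished"
    if status in FAILURE_STATES:
        return "failed"
    if status in RUNNING_STATES:
        return "running"
    return "unknown"
```

A caller could branch on the returned category instead of comparing raw status strings in several places.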

class airflow.providers.amazon.aws.hooks.redshift_data.QueryExecutionOutput

Describes the output of a query execution.

statement_id: str

session_id: str | None

exception airflow.providers.amazon.aws.hooks.redshift_data.RedshiftDataQueryFailedError

Bases: ValueError

Raise an error that redshift data query failed.

exception airflow.providers.amazon.aws.hooks.redshift_data.RedshiftDataQueryAbortedError

Bases: ValueError

Raise an error that redshift data query was aborted.

class airflow.providers.amazon.aws.hooks.redshift_data.RedshiftDataHook(*args, **kwargs)

Bases: airflow.providers.amazon.aws.hooks.base_aws.AwsGenericHook[mypy_boto3_redshift_data.RedshiftDataAPIServiceClient]

Interact with Amazon Redshift Data API.

Provide a thin wrapper around boto3.client("redshift-data").

Additional arguments (such as aws_conn_id) may be specified and are passed down to the underlying AwsBaseHook.


execute_query(sql, database=None, cluster_identifier=None, db_user=None, parameters=None, secret_arn=None, statement_name=None, with_event=False, wait_for_completion=True, poll_interval=10, workgroup_name=None, session_id=None, session_keep_alive_seconds=None)

Execute a statement against Amazon Redshift.

Parameters:

sql (str | list[str]) – the SQL statement or list of SQL statements to run
database (str | None) – the name of the database
cluster_identifier (str | None) – unique identifier of a cluster
db_user (str | None) – the database username
parameters (collections.abc.Iterable | None) – the parameters for the SQL statement
secret_arn (str | None) – the name or ARN of the secret that enables db access
statement_name (str | None) – the name of the SQL statement
with_event (bool) – whether to send an event to EventBridge
wait_for_completion (bool) – whether to wait for a result
poll_interval (int) – how often in seconds to check the query status
workgroup_name (str | None) – name of the Redshift Serverless workgroup. Mutually exclusive with cluster_identifier. Specify this parameter to query Redshift Serverless. More info https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-serverless.html
session_id (str | None) – the session identifier of the query
session_keep_alive_seconds (int | None) – duration in seconds to keep the session alive after the query finishes. The maximum time a session can keep alive is 24 hours

Returns:

the statement_id (str, the UUID of the statement) and the session_id, wrapped in a QueryExecutionOutput

Return type:

QueryExecutionOutput
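The parameter list above notes that workgroup_name and cluster_identifier are mutually exclusive. A minimal sketch of how a caller might enforce that before calling execute_query follows. build_target_kwargs and submit are hypothetical helpers, not part of the hook, and the hook argument is duck-typed so the sketch does not depend on an Airflow installation.

```python
from typing import Any, Optional


def build_target_kwargs(
    cluster_identifier: Optional[str] = None,
    workgroup_name: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: pick the target arguments for execute_query."""
    # The docs above state that the two targets are mutually exclusive.
    if cluster_identifier is not None and workgroup_name is not None:
        raise ValueError("cluster_identifier and workgroup_name are mutually exclusive")
    if workgroup_name is not None:
        return {"workgroup_name": workgroup_name}
    if cluster_identifier is not None:
        return {"cluster_identifier": cluster_identifier}
    return {}


def submit(hook: Any, sql: str, database: str, **target: Optional[str]) -> str:
    """Hypothetical helper: run sql via hook.execute_query, return the statement id."""
    output = hook.execute_query(
        sql=sql,
        database=database,
        wait_for_completion=True,
        poll_interval=10,
        **build_target_kwargs(**target),
    )
    # execute_query is documented to return a QueryExecutionOutput,
    # which carries statement_id and session_id.
    return output.statement_id
```

With a real hook, this would presumably be called as submit(RedshiftDataHook(), "SELECT 1", "dev", workgroup_name="my-workgroup"), where the database and workgroup names are placeholders.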

wait_for_results(statement_id, poll_interval)

check_query_is_finished(statement_id)

Check whether the query has finished; raise an exception if it failed.
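The check-then-poll pattern behind check_query_is_finished and wait_for_results can be sketched as follows. This is an illustration under stated assumptions, not the hook's actual implementation: describe stands in for a call such as boto3's describe_statement (whose response carries Status and Error keys), the exception classes are local stand-ins for the two exceptions documented above, and the sleep parameter is a hypothetical hook added to make the loop testable.

```python
import time
from typing import Any, Callable


class QueryFailedError(ValueError):
    """Local stand-in for RedshiftDataQueryFailedError."""


class QueryAbortedError(ValueError):
    """Local stand-in for RedshiftDataQueryAbortedError."""


def check_finished(describe: Callable[[str], dict[str, Any]], statement_id: str) -> bool:
    """Return True if finished, False if still running, raise on failure."""
    resp = describe(statement_id)
    status = resp["Status"]
    if status == "FINISHED":
        return True
    if status == "FAILED":
        raise QueryFailedError(f"Statement {statement_id} failed: {resp.get('Error')}")
    if status == "ABORTED":
        raise QueryAbortedError(f"Statement {statement_id} was aborted")
    return False


def wait_until_finished(
    describe: Callable[[str], dict[str, Any]],
    statement_id: str,
    poll_interval: float = 10,
    sleep: Callable[[float], None] = time.sleep,  # assumption: injectable for tests
) -> str:
    """Poll check_finished every poll_interval seconds until it returns True."""
    while not check_finished(describe, statement_id):
        sleep(poll_interval)
    return statement_id
```

Keeping the single check separate from the loop lets the same check be reused by a synchronous wait and by other pollers.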

parse_statement_response(resp)

Parse the response of describe_statement.

get_table_primary_key(table, database, schema='public', cluster_identifier=None, workgroup_name=None, db_user=None, secret_arn=None, statement_name=None, with_event=False, wait_for_completion=True, poll_interval=10)

Return the table primary key.

Copied from RedshiftSQLHook.get_table_primary_key()

Parameters:

table (str) – Name of the target table
database (str) – the name of the database
schema (str | None) – Name of the target schema, public by default
cluster_identifier (str | None) – unique identifier of a cluster
workgroup_name (str | None) – name of the Redshift Serverless workgroup. Mutually exclusive with cluster_identifier. Specify this parameter to query Redshift Serverless. More info https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-serverless.html
db_user (str | None) – the database username
secret_arn (str | None) – the name or ARN of the secret that enables db access
statement_name (str | None) – the name of the SQL statement
with_event (bool) – indicates whether to send an event to EventBridge
wait_for_completion (bool) – indicates whether to wait for a result, if True wait, if False don’t wait
poll_interval (int) – how often in seconds to check the query status

Returns:

Primary key columns list

Return type:

list[str] | None
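Because the return type is list[str] | None, callers need to handle a table with no primary key. A minimal sketch of one way to do that is shown below. primary_key_or_fail is a hypothetical helper, and the hook argument is duck-typed, so any object exposing get_table_primary_key with the documented keyword arguments would work.

```python
from typing import Any, Optional


def primary_key_or_fail(
    hook: Any,
    table: str,
    database: str,
    schema: str = "public",
    **target: Optional[str],
) -> list[str]:
    """Hypothetical helper: return the primary key columns or raise if there are none."""
    keys = hook.get_table_primary_key(
        table=table,
        database=database,
        schema=schema,
        **target,  # e.g. cluster_identifier=... or workgroup_name=...
    )
    # The documented return type allows None, so treat None and [] alike.
    if not keys:
        raise ValueError(f"No primary key found for {schema}.{table}")
    return list(keys)
```

Raising early gives a clearer error than letting a None propagate into later code that builds a statement from the key columns.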

async is_still_running(statement_id)

Async function to check whether the query is still running.

Parameters:

statement_id (str) – the UUID of the statement

async check_query_is_finished_async(statement_id)

Async function to check whether the statement has finished.

It takes a statement_id, opens an async connection to the Redshift Data API to look up the query status for that statement_id, and returns the status.

Parameters:

statement_id (str) – the UUID of the statement
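The async methods lend themselves to a non-blocking polling loop, for example inside a trigger. The sketch below shows one possible shape, assuming is_still_running is any coroutine function that takes a statement_id and returns a bool, as the method documented above appears to. poll_until_done is a hypothetical helper, not part of the hook.

```python
import asyncio
from typing import Awaitable, Callable


async def poll_until_done(
    is_still_running: Callable[[str], Awaitable[bool]],
    statement_id: str,
    poll_interval: float = 10.0,
) -> int:
    """Hypothetical helper: await until the statement stops running.

    Returns the number of times the loop slept before the statement
    was reported as no longer running.
    """
    sleeps = 0
    while await is_still_running(statement_id):
        sleeps += 1
        # asyncio.sleep yields control so other coroutines can run meanwhile.
        await asyncio.sleep(poll_interval)
    return sleeps
```

Once the loop exits, a caller would typically follow up with check_query_is_finished_async to learn whether the statement finished or failed.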