airflow.providers.amazon.aws.operators.athena¶
Module Contents¶
Classes¶
| An operator that submits a Trino/Presto query to Amazon Athena. | 
- class airflow.providers.amazon.aws.operators.athena.AthenaOperator(*, query, database, output_location=None, client_request_token=None, workgroup='primary', query_execution_context=None, result_configuration=None, sleep_time=30, max_polling_attempts=None, log_query=True, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), catalog='AwsDataCatalog', **kwargs)[source]¶
- Bases: - airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[- airflow.providers.amazon.aws.hooks.athena.AthenaHook]- An operator that submits a Trino/Presto query to Amazon Athena. - Note - if the task is killed while it runs, it’ll cancel the athena query that was launched, EXCEPT if running in deferrable mode. - See also - For more information on how to use this operator, take a look at the guide: Run a query in Amazon Athena - Parameters
- query (str) – Trino/Presto query to be run on Amazon Athena. (templated) 
- database (str) – Database to select. (templated) 
- catalog (str) – Catalog to select. (templated) 
- output_location (str | None) – s3 path to write the query results into. (templated) To run the query, you must specify the query results location using one of the ways: either for individual queries using either this setting (client-side), or in the workgroup, using WorkGroupConfiguration. If none of them is set, Athena issues an error that no output location is provided 
- client_request_token (str | None) – Unique token created by user to avoid multiple executions of same query 
- workgroup (str) – Athena workgroup in which query will be run. (templated) 
- query_execution_context (dict[str, str] | None) – Context in which query need to be run 
- result_configuration (dict[str, Any] | None) – Dict with path to store results in and config related to encryption 
- sleep_time (int) – Time (in seconds) to wait between two consecutive calls to check query status on Athena 
- max_polling_attempts (int | None) – Number of times to poll for query state before function exits To limit task execution time, use execution_timeout. 
- log_query (bool) – Whether to log athena query and other execution params when it’s executed. Defaults to True. 
- aws_conn_id – The Airflow connection used for AWS credentials. If this is - Noneor empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).
- region_name – AWS region_name. If not specified then the default boto3 behaviour is used. 
- verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html 
- botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html