airflow.providers.amazon.aws.hooks.athena

This module contains the AWS Athena hook.

Module Contents

Classes

AthenaHook

Interact with AWS Athena to run queries, poll their status, and return query results.

AWSAthenaHook

This hook is deprecated.

class airflow.providers.amazon.aws.hooks.athena.AthenaHook(*args, sleep_time=30, **kwargs)[source]

Bases: airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook

Interact with AWS Athena to run queries, poll their status, and return query results.

Additional arguments (such as aws_conn_id) may be specified and are passed down to the underlying AwsBaseHook.

See also

AwsBaseHook

Parameters

sleep_time (int) – Time, in seconds, to wait between two consecutive query status checks on Athena

INTERMEDIATE_STATES = ['QUEUED', 'RUNNING'][source]
FAILURE_STATES = ['FAILED', 'CANCELLED'][source]
SUCCESS_STATES = ['SUCCEEDED'][source]
TERMINAL_STATES = ['SUCCEEDED', 'FAILED', 'CANCELLED'][source]
run_query(self, query, query_context, result_configuration, client_request_token=None, workgroup='primary')[source]

Run a Presto query on Athena with the provided configuration and return the submitted query_execution_id.

Parameters
  • query (str) – Presto query to run

  • query_context (Dict[str, str]) – Context in which the query needs to be run

  • result_configuration (Dict[str, Any]) – Dict with the path to store results in and the encryption-related configuration

  • client_request_token (Optional[str]) – Unique token created by the user to avoid multiple executions of the same query

  • workgroup (str) – Athena workgroup name; defaults to 'primary' when not specified

Returns

The query_execution_id of the submitted query

Return type

str
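
To illustrate how the run_query arguments fit together, here is a minimal sketch. It assumes `hook` is an AthenaHook instance and that the two dicts follow the shape of the Athena StartQueryExecution request (`Database` and `OutputLocation` keys). The helper name `submit_query` is hypothetical and not part of the provider package.

```python
from typing import Any, Dict


def submit_query(hook: Any, sql: str, database: str, output_location: str,
                 workgroup: str = "primary") -> str:
    """Submit `sql` through an AthenaHook-like object and return the execution ID.

    Hypothetical helper for illustration only.
    """
    # Assumed key names, following the Athena StartQueryExecution request shape.
    query_context: Dict[str, str] = {"Database": database}
    result_configuration: Dict[str, Any] = {"OutputLocation": output_location}
    return hook.run_query(
        query=sql,
        query_context=query_context,
        result_configuration=result_configuration,
        workgroup=workgroup,
    )
```

With a real hook, `database` would name an existing Athena database and `output_location` an S3 URI that the connection is allowed to write to.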

check_query_status(self, query_execution_id)[source]

Fetch the status of a submitted Athena query. Returns None or one of the valid query states.

Parameters

query_execution_id (str) – ID of the submitted Athena query

Returns

One of the valid query states, or None

Return type

Optional[str]

get_state_change_reason(self, query_execution_id)[source]

Fetch the reason for a state change (e.g. an error message). Returns None or the reason string.

Parameters

query_execution_id (str) – ID of the submitted Athena query

Returns

The reason string, or None

Return type

Optional[str]
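
The status and reason lookups are often combined when reporting a failed query. The following is a minimal sketch of that pattern, assuming `hook` is an AthenaHook instance. The helper name `failure_message` and the message wording are hypothetical.

```python
from typing import Any, Optional

# Mirrors AthenaHook.FAILURE_STATES as documented above.
FAILURE_STATES = ["FAILED", "CANCELLED"]


def failure_message(hook: Any, query_execution_id: str) -> Optional[str]:
    """Return a readable message if the query failed or was cancelled, else None.

    Hypothetical helper for illustration only.
    """
    state = hook.check_query_status(query_execution_id)
    if state not in FAILURE_STATES:
        return None
    # The reason may itself be None, so fall back to a placeholder.
    reason = hook.get_state_change_reason(query_execution_id)
    return (
        f"Athena query {query_execution_id} ended in state {state}: "
        f"{reason or 'no reason reported'}"
    )
```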

get_query_results(self, query_execution_id, next_token_id=None, max_results=1000)[source]

Fetch the results of a submitted Athena query. Returns None if the query is in an intermediate, failed, or cancelled state; otherwise returns a dict of the query output.

Parameters
  • query_execution_id (str) – ID of the submitted Athena query

  • next_token_id (Optional[str]) – The token that specifies where to start pagination.

  • max_results (int) – The maximum number of results (rows) to return in this request.

Returns

Dict of the query output, or None

Return type

Optional[dict]
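
The returned dict can be flattened into plain rows. The sketch below assumes the dict has the shape of the Athena GetQueryResults response (`ResultSet`, `ResultSetMetadata`, `ColumnInfo`, `Rows`, `Data`, `VarCharValue`) and that the first row of the first page repeats the column names, as is typical for SELECT queries. The helper name `rows_as_dicts` is hypothetical.

```python
from typing import Any, Dict, List, Optional


def rows_as_dicts(result: Optional[Dict[str, Any]],
                  skip_header: bool = True) -> List[Dict[str, Optional[str]]]:
    """Convert a get_query_results() dict into a list of {column: value} dicts.

    Hypothetical helper; the response shape is an assumption based on the
    Athena GetQueryResults API.
    """
    if result is None:
        # The hook returns None for unfinished, failed, or cancelled queries.
        return []
    result_set = result["ResultSet"]
    columns = [col["Name"] for col in result_set["ResultSetMetadata"]["ColumnInfo"]]
    rows = result_set["Rows"]
    if skip_header:
        # Assumption: the first row holds the column headers.
        rows = rows[1:]
    return [
        # NULL cells carry no VarCharValue key, so .get() yields None for them.
        {name: cell.get("VarCharValue") for name, cell in zip(columns, row["Data"])}
        for row in rows
    ]
```

Set `skip_header=False` for pages after the first, or for statements whose output has no header row.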

get_query_results_paginator(self, query_execution_id, max_items=None, page_size=None, starting_token=None)[source]

Fetch the results of a submitted Athena query. Returns None if the query is in an intermediate, failed, or cancelled state; otherwise returns a paginator to iterate through pages of results. To get all results at once, call build_full_result() on the returned PageIterator.

Parameters
  • query_execution_id (str) – ID of the submitted Athena query

  • max_items (Optional[int]) – The total number of items to return.

  • page_size (Optional[int]) – The size of each page.

  • starting_token (Optional[str]) – A token to specify where to start paginating.

Returns

PageIterator over the result pages, or None

Return type

Optional[botocore.paginate.PageIterator]
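
As an alternative to build_full_result(), the pages can be consumed one at a time. This is a minimal sketch, assuming `hook` is an AthenaHook instance and that each page has the shape of the Athena GetQueryResults response. The helper name `iter_result_rows` and the default page size are hypothetical.

```python
from typing import Any, Dict, Iterator


def iter_result_rows(hook: Any, query_execution_id: str,
                     page_size: int = 500) -> Iterator[Dict[str, Any]]:
    """Yield raw result rows page by page from an AthenaHook-like object.

    Hypothetical helper for illustration only.
    """
    pages = hook.get_query_results_paginator(query_execution_id, page_size=page_size)
    if pages is None:
        # No paginator is returned for unfinished, failed, or cancelled queries.
        return
    for page in pages:
        # Assumed page shape: {"ResultSet": {"Rows": [...]}}.
        yield from page["ResultSet"]["Rows"]
```

Iterating page by page keeps only one page in memory at a time, whereas build_full_result() merges every page into a single response.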

poll_query_status(self, query_execution_id, max_tries=None)[source]

Poll the status of a submitted Athena query until the query reaches a final state. Returns one of the final states.

Parameters
  • query_execution_id (str) – ID of the submitted Athena query

  • max_tries (Optional[int]) – Number of times to poll for the query state before the function exits

Returns

str

Return type

Optional[str]
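
The polling behaviour described above can be pictured as a loop over the state constants. The following is an illustrative sketch only, not the hook's actual implementation. The function name `poll_until_terminal`, the `check_status` callable, and the injectable `sleep` parameter are hypothetical.

```python
import time
from typing import Callable, Optional

# Mirrors AthenaHook.TERMINAL_STATES as documented above.
TERMINAL_STATES = ["SUCCEEDED", "FAILED", "CANCELLED"]


def poll_until_terminal(check_status: Callable[[], Optional[str]],
                        sleep_time: float = 30,
                        max_tries: Optional[int] = None,
                        sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Call `check_status` until it reports a terminal state or tries run out.

    Illustrative sketch; returns the last state seen, which may be
    non-terminal (or None) when `max_tries` is exhausted.
    """
    try_number = 1
    while True:
        state = check_status()
        if state in TERMINAL_STATES:
            return state
        if max_tries is not None and try_number >= max_tries:
            return state
        try_number += 1
        # Wait between consecutive status checks, like the hook's sleep_time.
        sleep(sleep_time)
```

Passing `sleep` as a parameter is a design choice of this sketch that makes the loop easy to exercise without real waiting.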

get_output_location(self, query_execution_id)[source]

Get the output location of the query results in S3 URI format.

Parameters

query_execution_id (str) – ID of the submitted Athena query

Returns

The S3 URI of the query results

Return type

str
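
Because the location comes back as an S3 URI, callers often need the bucket and key separately. Here is a minimal sketch using only the standard library. The helper name `split_s3_uri` and the example URI in the usage note are hypothetical.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an S3 URI such as the one from get_output_location() into (bucket, key).

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The bucket is the network location; the key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")
```

For example, `split_s3_uri("s3://my-bucket/results/abc.csv")` yields the bucket `my-bucket` and the key `results/abc.csv`.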

stop_query(self, query_execution_id)[source]

Cancel the submitted Athena query.

Parameters

query_execution_id (str) – ID of the submitted Athena query

Returns

dict

Return type

Dict
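
One common use of stop_query is to cancel a query that has not finished within a polling budget. The sketch below shows that pattern, assuming `hook` is an AthenaHook instance. The helper name `wait_or_cancel` and its return convention are hypothetical.

```python
from typing import Any, Optional, Tuple

# Mirrors AthenaHook.TERMINAL_STATES as documented above.
TERMINAL_STATES = ["SUCCEEDED", "FAILED", "CANCELLED"]


def wait_or_cancel(hook: Any, query_execution_id: str,
                   max_tries: int) -> Tuple[Optional[str], bool]:
    """Poll up to `max_tries` times, then cancel the query if it is still unfinished.

    Hypothetical helper; returns (last state seen, whether a cancel was requested).
    """
    state = hook.poll_query_status(query_execution_id, max_tries=max_tries)
    if state in TERMINAL_STATES:
        return state, False
    # Still queued, running, or unknown: ask Athena to stop the query.
    hook.stop_query(query_execution_id)
    return state, True
```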

class airflow.providers.amazon.aws.hooks.athena.AWSAthenaHook(*args, **kwargs)[source]

Bases: AthenaHook

This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.athena.AthenaHook.
