airflow.providers.amazon.aws.hooks.glue_crawler

Module Contents

Classes

GlueCrawlerHook

Interacts with AWS Glue Crawler.

AwsGlueCrawlerHook

This hook is deprecated.

class airflow.providers.amazon.aws.hooks.glue_crawler.GlueCrawlerHook(*args, **kwargs)

Bases: airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook

Interacts with AWS Glue Crawler.

Additional arguments (such as aws_conn_id) may be specified and are passed down to the underlying AwsBaseHook.

See also

AwsBaseHook
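A minimal sketch of constructing the hook. It assumes the Amazon provider package is installed and that an Airflow connection named `aws_default` is configured. The helper name `make_glue_crawler_hook` is hypothetical and not part of the provider. The import sits inside the function so that defining the helper does not itself require Airflow.

```python
def make_glue_crawler_hook(aws_conn_id="aws_default"):
    """Build a GlueCrawlerHook bound to the given Airflow connection ID."""
    # Imported lazily so this module can be loaded without Airflow installed.
    from airflow.providers.amazon.aws.hooks.glue_crawler import GlueCrawlerHook

    # aws_conn_id is passed down to the underlying AwsBaseHook.
    return GlueCrawlerHook(aws_conn_id=aws_conn_id)
```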

glue_client(self)

Returns

AWS Glue client

has_crawler(self, crawler_name) -> bool

Checks whether the crawler already exists.

Parameters

crawler_name (str) -- Name of the crawler, unique within the AWS account

Returns

True if the crawler already exists, False otherwise.

get_crawler(self, crawler_name: str) -> dict

Gets the crawler's configuration.

Parameters

crawler_name (str) -- Name of the crawler, unique within the AWS account

Returns

Nested dictionary of crawler configurations
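As an illustration of reading the returned dictionary, the sketch below pulls out the crawler's state. It assumes the dictionary mirrors the `Crawler` object of the AWS Glue GetCrawler response and therefore carries a `State` key (for example `READY` or `RUNNING`). That layout is an assumption, not something this page states. The helper name `crawler_state` is hypothetical, and `hook` is expected to be a GlueCrawlerHook instance.

```python
def crawler_state(hook, crawler_name):
    """Return the crawler's state string, or None if no "State" key is present."""
    config = hook.get_crawler(crawler_name)
    # "State" is assumed to follow the Glue GetCrawler response layout.
    return config.get("State")
```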

update_crawler(self, **crawler_kwargs) -> bool

Updates the crawler's configuration.

Parameters

crawler_kwargs (any) -- Keyword arguments that define the crawler's configuration

Returns

True if the crawler was updated, False otherwise.

create_crawler(self, **crawler_kwargs) -> str

Creates an AWS Glue crawler.

Parameters

crawler_kwargs (any) -- Keyword arguments that define the configuration used to create the crawler

Returns

Name of the crawler
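The `has_crawler`, `update_crawler`, and `create_crawler` methods can be combined into a create-or-update step. The sketch below is one possible way to do that, not the provider's own implementation. It assumes the configuration uses the keyword names of the AWS Glue CreateCrawler API (`Name`, `Role`, `DatabaseName`, `Targets`). The helper `ensure_crawler` and every value in `EXAMPLE_CONFIG` are hypothetical placeholders.

```python
# Hypothetical configuration; keys are assumed to follow the Glue CreateCrawler API.
EXAMPLE_CONFIG = {
    "Name": "example-crawler",
    "Role": "example-glue-role",
    "DatabaseName": "example_db",
    "Targets": {"S3Targets": [{"Path": "s3://example-bucket/data/"}]},
}


def ensure_crawler(hook, config):
    """Update the crawler if it exists, otherwise create it; return its name."""
    name = config["Name"]
    if hook.has_crawler(name):
        # Existing crawler: push the desired configuration.
        hook.update_crawler(**config)
    else:
        # No crawler with this name yet: create it.
        hook.create_crawler(**config)
    return name
```

With a real hook, `ensure_crawler(hook, EXAMPLE_CONFIG)` would leave a crawler with the given name in place whether or not it existed before the call.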

start_crawler(self, crawler_name: str) -> dict

Triggers the AWS Glue crawler.

Parameters

crawler_name (str) -- Name of the crawler, unique within the AWS account

Returns

Empty dictionary

wait_for_crawler_completion(self, crawler_name: str, poll_interval: int = 5) -> str

Waits until the Glue crawler completes and returns the status of the latest crawl run. Raises AirflowException if the crawler fails or is cancelled.

Parameters
  • crawler_name (str) -- Name of the crawler, unique within the AWS account

  • poll_interval (int) -- Time, in seconds, to wait between two consecutive status checks

Returns

Crawler's status
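A typical sequence is to trigger the crawler and then block until it finishes. The sketch below chains `start_crawler` and `wait_for_crawler_completion`. The helper name `run_crawler` is hypothetical, and `hook` is expected to be a GlueCrawlerHook instance. Any exception raised while waiting (for example when the crawl fails or is cancelled) is left to propagate to the caller.

```python
def run_crawler(hook, crawler_name, poll_interval=5):
    """Start the crawler, wait for it to finish, and return the final status."""
    hook.start_crawler(crawler_name)
    # Polls every `poll_interval` seconds until the crawl run completes.
    return hook.wait_for_crawler_completion(crawler_name, poll_interval=poll_interval)
```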

class airflow.providers.amazon.aws.hooks.glue_crawler.AwsGlueCrawlerHook(*args, **kwargs)

Bases: GlueCrawlerHook

This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.glue_crawler.GlueCrawlerHook.
