airflow.providers.amazon.aws.hooks.glue_crawler

Module Contents

Classes

GlueCrawlerHook -- Interacts with AWS Glue Crawler.

AwsGlueCrawlerHook -- This hook is deprecated.

class airflow.providers.amazon.aws.hooks.glue_crawler.GlueCrawlerHook(*args, **kwargs)[source]

Bases: airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook

Interacts with AWS Glue Crawler.

Additional arguments (such as aws_conn_id) may be specified and are passed down to the underlying AwsBaseHook.

See also

AwsBaseHook

glue_client(self)[source]

Returns

AWS Glue client

has_crawler(self, crawler_name)[source]

Checks if the crawler already exists

Parameters

crawler_name (str) -- unique crawler name per AWS account

Returns

True if the crawler already exists, False otherwise

Return type

bool

get_crawler(self, crawler_name)[source]

Gets crawler configurations

Parameters

crawler_name (str) -- unique crawler name per AWS account

Returns

Nested dictionary of crawler configurations

Return type

dict
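As a rough illustration of working with the returned dictionary, the sketch below pulls out a few fields. It assumes the dictionary mirrors the crawler entry of the boto3 Glue get_crawler response; the keys Name, State and LastCrawl/Status are assumptions not stated on this page, and summarize_crawler is a hypothetical helper, not part of the hook.

```python
def summarize_crawler(crawler):
    """Return a small summary of a crawler configuration dictionary.

    Assumption: `crawler` has the shape returned by
    GlueCrawlerHook.get_crawler(), with optional "Name", "State" and
    "LastCrawl" -> "Status" keys as in the boto3 Glue response.
    """
    # LastCrawl may be absent if the crawler has never run.
    last_crawl = crawler.get("LastCrawl", {})
    return {
        "name": crawler.get("Name"),
        "state": crawler.get("State"),
        "last_status": last_crawl.get("Status"),
    }
```

With a real hook this might be called as summarize_crawler(hook.get_crawler("my_crawler")), where "my_crawler" is a placeholder name.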

update_crawler(self, **crawler_kwargs)[source]

Updates crawler configurations

Parameters

crawler_kwargs -- Keyword args that define the configurations used for the crawler

Returns

True if the crawler was updated, False otherwise

Return type

bool

create_crawler(self, **crawler_kwargs)[source]

Creates an AWS Glue Crawler

Parameters

crawler_kwargs -- Keyword args that define the configurations used to create the crawler

Returns

Name of the crawler

Return type

str
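The has_crawler, update_crawler and create_crawler methods can be combined into a "create or update" step. The following is a minimal sketch under stated assumptions: ensure_crawler is a hypothetical helper, `hook` is any object exposing the three methods documented above, and the configuration is assumed to carry the crawler name under a "Name" key (as in the boto3 Glue API), which this page does not state explicitly.

```python
def ensure_crawler(hook, config):
    """Create the crawler if it does not exist, otherwise update it.

    `hook` is expected to behave like GlueCrawlerHook.
    `config` is a dict of crawler keyword arguments; the "Name" key
    is an assumption borrowed from the boto3 Glue API.
    """
    name = config["Name"]
    if hook.has_crawler(name):
        # Crawler already exists: push the (possibly changed) configuration.
        hook.update_crawler(**config)
    else:
        # Crawler is missing: create it from the same configuration.
        hook.create_crawler(**config)
    return name
```

Keeping the helper duck-typed makes it easy to exercise with a stub in unit tests before pointing it at a real AWS connection.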

start_crawler(self, crawler_name)[source]

Triggers the AWS Glue crawler

Parameters

crawler_name (str) -- unique crawler name per AWS account

Returns

Empty dictionary

Return type

dict

wait_for_crawler_completion(self, crawler_name, poll_interval=5)[source]

Waits until the Glue crawler completes and returns the status of its latest crawl run. Raises AirflowException if the crawler fails or is cancelled.

Parameters
  • crawler_name (str) -- unique crawler name per AWS account

  • poll_interval (int) -- Time (in seconds) to wait between two consecutive calls to check crawler status

Returns

Crawler's status

Return type

str
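A typical use is to trigger a crawler and then block until it finishes. The sketch below shows one way to do that; run_crawler and example_usage are hypothetical helpers, and the connection id "aws_default" and crawler name "my_crawler" are placeholder values, not taken from this page.

```python
def run_crawler(hook, crawler_name, poll_interval=5):
    """Start a crawler and wait for it to finish.

    `hook` is expected to behave like GlueCrawlerHook. Any
    AirflowException raised on failure or cancellation is left to
    propagate to the caller.
    """
    hook.start_crawler(crawler_name)
    # Blocks, checking the crawler status every `poll_interval` seconds.
    return hook.wait_for_crawler_completion(crawler_name, poll_interval=poll_interval)


def example_usage():
    """Illustrative only: needs Airflow, the Amazon provider and AWS credentials."""
    # Imported lazily so the helper above can be used and tested without Airflow.
    from airflow.providers.amazon.aws.hooks.glue_crawler import GlueCrawlerHook

    hook = GlueCrawlerHook(aws_conn_id="aws_default")  # placeholder connection id
    return run_crawler(hook, "my_crawler", poll_interval=10)  # placeholder crawler name
```

A larger poll_interval reduces the number of status calls made to AWS, at the cost of noticing completion later.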

class airflow.providers.amazon.aws.hooks.glue_crawler.AwsGlueCrawlerHook(*args, **kwargs)[source]

Bases: GlueCrawlerHook

This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.glue_crawler.GlueCrawlerHook.
