airflow.providers.amazon.aws.sensors.glue

Module Contents

Classes

GlueJobSensor

Waits for an AWS Glue job to reach one of these states: FAILED, STOPPED, SUCCEEDED.

GlueDataQualityRuleSetEvaluationRunSensor

Waits for an AWS Glue data quality ruleset evaluation run to reach one of these states: FAILED, STOPPED, STOPPING, TIMEOUT, SUCCEEDED.

GlueDataQualityRuleRecommendationRunSensor

Waits for an AWS Glue data quality recommendation run to reach one of these states: FAILED, STOPPED, STOPPING, TIMEOUT, SUCCEEDED.

class airflow.providers.amazon.aws.sensors.glue.GlueJobSensor(*, job_name, run_id, verbose=False, aws_conn_id='aws_default', **kwargs)

Bases: airflow.sensors.base.BaseSensorOperator

Waits for an AWS Glue job to reach any of the following states:

'FAILED', 'STOPPED', 'SUCCEEDED'

See also

For more information on how to use this sensor, take a look at the guide: Wait on an AWS Glue job state

Parameters
  • job_name (str) – The unique name of the AWS Glue job.

  • run_id (str) – The identifier of the AWS Glue job run to wait on.

  • verbose (bool) – If True, more Glue job run logs are shown in the Airflow task logs. (default: False)

template_fields: Sequence[str] = ('job_name', 'run_id')
hook()
poke(context)

Override when deriving this class.
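A minimal usage sketch, assuming the sensor is created like any other Airflow sensor task. Only job_name, run_id, verbose, and aws_conn_id come from the signature above; the task id, the job name, and the upstream task referenced in the Jinja template are hypothetical placeholders. The import is deferred into the function so the snippet also loads where Airflow is not installed.

```python
# Keyword arguments named in the documented signature; the values are
# hypothetical placeholders chosen for illustration.
GLUE_JOB_SENSOR_KWARGS = {
    "job_name": "example_glue_job",
    # run_id is listed in template_fields, so a Jinja expression is rendered
    # at runtime. The upstream task id "submit_glue_job" is an assumption.
    "run_id": "{{ ti.xcom_pull(task_ids='submit_glue_job') }}",
    "verbose": True,
    "aws_conn_id": "aws_default",
}


def build_glue_job_sensor(task_id: str = "wait_for_glue_job"):
    """Create the sensor task; intended to be called inside a DAG definition."""
    # Deferred import: requires the Amazon provider package at call time only.
    from airflow.providers.amazon.aws.sensors.glue import GlueJobSensor

    return GlueJobSensor(task_id=task_id, **GLUE_JOB_SENSOR_KWARGS)
```

Because job_name and run_id are template fields, both can be supplied as Jinja expressions rather than literal strings.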

class airflow.providers.amazon.aws.sensors.glue.GlueDataQualityRuleSetEvaluationRunSensor(*, evaluation_run_id, show_results=True, verify_result_status=True, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), poke_interval=120, max_retries=60, aws_conn_id='aws_default', **kwargs)

Bases: airflow.providers.amazon.aws.sensors.base_aws.AwsBaseSensor[airflow.providers.amazon.aws.hooks.glue.GlueDataQualityHook]

Waits for an AWS Glue data quality ruleset evaluation run to reach any of the following states:

'FAILED', 'STOPPED', 'STOPPING', 'TIMEOUT', 'SUCCEEDED'

See also

For more information on how to use this sensor, take a look at the guide: Wait on an AWS Glue Data Quality Evaluation Run

Parameters
  • evaluation_run_id (str) – The AWS Glue data quality ruleset evaluation run identifier.

  • verify_result_status (bool) – Validate the results of every rule in the ruleset evaluation run. If any rule's status is Fail or Error, an exception is raised. (default: True)

  • show_results (bool) – Display the results of every rule in the ruleset evaluation run. (default: True)

  • deferrable (bool) – If True, the sensor operates in deferrable mode. This mode requires the aiobotocore module to be installed. (default: False, but can be overridden in the config file by setting default_deferrable to True)

  • poke_interval (int) – Polling period, in seconds, between checks of the run status. (default: 120)

  • max_retries (int) – Maximum number of status checks before returning the current state. (default: 60)

  • aws_conn_id (str | None) – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).

  • region_name – AWS region_name. If not specified then the default boto3 behaviour is used.

  • verify – Whether to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html

  • botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
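To make the two polling defaults concrete, the sketch below computes an upper bound on the waiting time. It assumes that max_retries caps the number of status checks and that checks are spaced poke_interval seconds apart; the parameter descriptions suggest this reading but do not state it outright. The helper name max_wait is hypothetical.

```python
from datetime import timedelta


def max_wait(poke_interval: int = 120, max_retries: int = 60) -> timedelta:
    """Upper bound on total polling time under the assumption stated above."""
    return timedelta(seconds=poke_interval * max_retries)


# With the documented defaults: 120 s * 60 checks = 7200 s.
print(max_wait())  # 2:00:00
```

Under that assumption, a run expected to take longer than two hours would need a larger poke_interval or max_retries.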

SUCCESS_STATES = ('SUCCEEDED',)
FAILURE_STATES = ('FAILED', 'STOPPED', 'STOPPING', 'TIMEOUT')
aws_hook_class
template_fields: Sequence[str]
execute(context)

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

execute_complete(context, event=None)
poke(context)

Override when deriving this class.
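The sketch below illustrates how the documented SUCCESS_STATES and FAILURE_STATES tuples partition a run status, and what a verify_result_status check could look like. It is not the sensor's actual implementation: the function names are hypothetical, and the rule-result shape (dictionaries with "Name" and "Result" keys) is an assumption made for illustration.

```python
# State tuples copied from the class attributes documented above.
SUCCESS_STATES = ("SUCCEEDED",)
FAILURE_STATES = ("FAILED", "STOPPED", "STOPPING", "TIMEOUT")


def classify_run_state(status: str) -> str:
    """Map a run status to 'success', 'failure', or 'pending' (keep polling)."""
    if status in SUCCESS_STATES:
        return "success"
    if status in FAILURE_STATES:
        return "failure"
    return "pending"


def failing_rules(rule_results: list[dict]) -> list[str]:
    """Return names of rules whose result is Fail or Error.

    The {"Name": ..., "Result": ...} shape is a hypothetical stand-in for
    whatever the evaluation run actually returns.
    """
    return [
        rule["Name"]
        for rule in rule_results
        if rule["Result"].upper() in {"FAIL", "ERROR"}
    ]


print(classify_run_state("RUNNING"))  # pending
```

With verify_result_status=True, a non-empty list from a check like failing_rules would correspond to the case where the sensor raises an exception.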

class airflow.providers.amazon.aws.sensors.glue.GlueDataQualityRuleRecommendationRunSensor(*, recommendation_run_id, show_results=True, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), poke_interval=120, max_retries=60, aws_conn_id='aws_default', **kwargs)

Bases: airflow.providers.amazon.aws.sensors.base_aws.AwsBaseSensor[airflow.providers.amazon.aws.hooks.glue.GlueDataQualityHook]

Waits for an AWS Glue data quality recommendation run to reach any of the following states:

'FAILED', 'STOPPED', 'STOPPING', 'TIMEOUT', 'SUCCEEDED'

See also

For more information on how to use this sensor, take a look at the guide: Wait on an AWS Glue Data Quality Recommendation Run

Parameters
  • recommendation_run_id (str) – The AWS Glue data quality rule recommendation run identifier.

  • show_results (bool) – Display the recommended ruleset (a set of rules) when the recommendation run completes. (default: True)

  • deferrable (bool) – If True, the sensor operates in deferrable mode. This mode requires the aiobotocore module to be installed. (default: False, but can be overridden in the config file by setting default_deferrable to True)

  • poke_interval (int) – Polling period, in seconds, between checks of the run status. (default: 120)

  • max_retries (int) – Maximum number of status checks before returning the current state. (default: 60)

  • aws_conn_id (str | None) – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).

  • region_name – AWS region_name. If not specified then the default boto3 behaviour is used.

  • verify – Whether to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html

  • botocore_config – Configuration dictionary (key-values) for botocore client. See: https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html

SUCCESS_STATES = ('SUCCEEDED',)
FAILURE_STATES = ('FAILED', 'STOPPED', 'STOPPING', 'TIMEOUT')
aws_hook_class
template_fields: Sequence[str]
execute(context)

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

execute_complete(context, event=None)
poke(context)

Override when deriving this class.
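A minimal usage sketch for the recommendation run sensor, assuming it is created like any other Airflow sensor task. The keyword names come from the signature above; the task id and the run identifier value are hypothetical placeholders. The import is deferred into the function so the snippet also loads where Airflow is not installed.

```python
# Keyword arguments named in the documented signature. The run identifier is
# a hypothetical placeholder; the other values are the documented defaults.
RECOMMENDATION_SENSOR_KWARGS = {
    "recommendation_run_id": "dqrun-example-id",
    "show_results": True,
    "deferrable": False,
    "poke_interval": 120,
    "max_retries": 60,
    "aws_conn_id": "aws_default",
}


def build_recommendation_run_sensor(task_id: str = "wait_for_recommendation_run"):
    """Create the sensor task; intended to be called inside a DAG definition."""
    # Deferred import: requires the Amazon provider package at call time only.
    from airflow.providers.amazon.aws.sensors.glue import (
        GlueDataQualityRuleRecommendationRunSensor,
    )

    return GlueDataQualityRuleRecommendationRunSensor(
        task_id=task_id, **RECOMMENDATION_SENSOR_KWARGS
    )
```

Setting deferrable to True would additionally require the aiobotocore module, as noted in the parameter description.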
