airflow.providers.amazon.aws.triggers.emr

Module Contents

Classes

EmrAddStepsTrigger

Asynchronously poll the boto3 API and wait for the steps to finish executing.

EmrCreateJobFlowTrigger

Asynchronously poll the boto3 API and wait for the JobFlow to finish executing.

EmrTerminateJobFlowTrigger

Asynchronously poll the boto3 API and wait for the JobFlow to finish terminating.

EmrContainerTrigger

Poll for the status of an EMR container until it reaches a terminal state.

EmrStepSensorTrigger

Poll for the status of an EMR step until it reaches a terminal state.

class airflow.providers.amazon.aws.triggers.emr.EmrAddStepsTrigger(job_flow_id, step_ids, aws_conn_id, max_attempts, poll_interval)[source]

Bases: airflow.triggers.base.BaseTrigger

Asynchronously poll the boto3 API and wait for the steps to finish executing.

Parameters
  • job_flow_id (str) – The id of the job flow.

  • step_ids (list[str]) – The ids of the steps being waited upon.

  • poll_interval (int | None) – The amount of time in seconds to wait between attempts.

  • max_attempts (int | None) – The maximum number of attempts to be made.

  • aws_conn_id (str) – The Airflow connection used for AWS credentials.

serialize()[source]

Returns the information needed to reconstruct this Trigger.

Returns

Tuple of (class path, keyword arguments needed to re-instantiate).

Return type

tuple[str, dict[str, Any]]
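The round trip described above can be sketched as follows. This is a minimal illustration, not the provider's implementation: DemoStepsTrigger is a hypothetical stand-in that only mirrors the constructor arguments listed for EmrAddStepsTrigger, and the ids and connection name are made-up values.

```python
from typing import Any, Optional


class DemoStepsTrigger:
    """Hypothetical stand-in class; it is not part of the Amazon provider."""

    def __init__(
        self,
        job_flow_id: str,
        step_ids: list[str],
        aws_conn_id: str,
        max_attempts: Optional[int],
        poll_interval: Optional[int],
    ) -> None:
        self.job_flow_id = job_flow_id
        self.step_ids = step_ids
        self.aws_conn_id = aws_conn_id
        self.max_attempts = max_attempts
        self.poll_interval = poll_interval

    def serialize(self) -> tuple[str, dict[str, Any]]:
        # First element: dotted path that lets the class be located again.
        classpath = f"{type(self).__module__}.{type(self).__qualname__}"
        # Second element: keyword arguments that rebuild an equivalent instance.
        kwargs = {
            "job_flow_id": self.job_flow_id,
            "step_ids": self.step_ids,
            "aws_conn_id": self.aws_conn_id,
            "max_attempts": self.max_attempts,
            "poll_interval": self.poll_interval,
        }
        return classpath, kwargs


# Illustrative values only.
original = DemoStepsTrigger(
    job_flow_id="j-EXAMPLE",
    step_ids=["s-1", "s-2"],
    aws_conn_id="aws_default",
    max_attempts=10,
    poll_interval=30,
)
classpath, kwargs = original.serialize()
# Re-instantiating from the serialized kwargs yields an equivalent trigger.
rebuilt = DemoStepsTrigger(**kwargs)
```

Because every constructor argument appears in the returned dictionary, the rebuilt object serializes to the same tuple as the original.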

async run()[source]

Runs the trigger in an asynchronous context.

The trigger should yield an Event whenever it wants to fire off an event, and return None if it is finished. Single-event triggers should thus yield and then immediately return.

If it yields, it is likely that it will be resumed very quickly, but it may not be (e.g. if the workload is being moved to another triggerer process, or a multi-event trigger was being used for a single-event task defer).

In either case, Trigger classes should assume they will be persisted, and then rely on cleanup() being called when they are no longer needed.
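The single-event pattern described above (yield once, then finish) can be sketched with a plain async generator. This is an illustration under stated assumptions: run_once, is_done, and collect are hypothetical names, and a plain dict stands in for the event object that a real trigger would yield.

```python
import asyncio
from typing import Any, AsyncIterator, Callable


async def run_once(
    is_done: Callable[[], bool], poll_interval: float = 0.01
) -> AsyncIterator[dict[str, Any]]:
    # Poll without blocking the event loop while the work is still pending.
    while not is_done():
        await asyncio.sleep(poll_interval)
    # Fire the single event, then fall off the end so the generator finishes.
    yield {"status": "success"}


async def collect(events: AsyncIterator[dict[str, Any]]) -> list[dict[str, Any]]:
    # Drain the generator the way a consumer of the trigger might.
    return [event async for event in events]


# Hypothetical completion check: pending twice, then done.
answers = iter([False, False, True])
fired = asyncio.run(collect(run_once(lambda: next(answers))))
```

Here fired ends up holding exactly one event, which matches the "yield and then immediately return" behaviour expected of single-event triggers.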

class airflow.providers.amazon.aws.triggers.emr.EmrCreateJobFlowTrigger(job_flow_id, poll_interval=None, max_attempts=None, aws_conn_id=None, waiter_delay=30, waiter_max_attempts=60)[source]

Bases: airflow.providers.amazon.aws.triggers.base.AwsBaseWaiterTrigger

Asynchronously poll the boto3 API and wait for the JobFlow to finish executing.

Parameters
  • job_flow_id (str) – The id of the job flow to wait for.

  • waiter_delay (int) – The amount of time in seconds to wait between attempts.

  • waiter_max_attempts (int) – The maximum number of attempts to be made.

  • aws_conn_id (str | None) – The Airflow connection used for AWS credentials.
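To illustrate how waiter_delay and waiter_max_attempts bound the wait, here is a minimal polling sketch. It is not how the provider implements waiting: wait_for_state and fake_get_state are hypothetical helpers, and the state names are illustrative values. With the defaults shown in the signature (30 seconds, 60 attempts), a loop like this would give up after roughly 30 minutes.

```python
import asyncio
from typing import Awaitable, Callable, Collection


async def wait_for_state(
    get_state: Callable[[], Awaitable[str]],
    success_states: Collection[str],
    failure_states: Collection[str],
    waiter_delay: float = 30,
    waiter_max_attempts: int = 60,
) -> str:
    for attempt in range(1, waiter_max_attempts + 1):
        state = await get_state()
        if state in success_states:
            return state
        if state in failure_states:
            raise RuntimeError(f"reached failure state {state!r}")
        # Sleep between attempts, but not after the final one.
        if attempt < waiter_max_attempts:
            await asyncio.sleep(waiter_delay)
    raise TimeoutError(f"gave up after {waiter_max_attempts} attempts")


# Hypothetical sequence of observed states, for illustration only.
observed = iter(["STARTING", "BOOTSTRAPPING", "WAITING"])


async def fake_get_state() -> str:
    return next(observed)


final_state = asyncio.run(
    wait_for_state(
        fake_get_state,
        success_states={"WAITING"},
        failure_states={"TERMINATED_WITH_ERRORS"},
        waiter_delay=0,  # no real sleeping in the example
        waiter_max_attempts=5,
    )
)
```

The success state is reached on the third attempt, so the loop returns before the attempt limit is hit.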

hook()[source]

Override in subclasses to return the right hook.

class airflow.providers.amazon.aws.triggers.emr.EmrTerminateJobFlowTrigger(job_flow_id, poll_interval=None, max_attempts=None, aws_conn_id=None, waiter_delay=30, waiter_max_attempts=60)[source]

Bases: airflow.providers.amazon.aws.triggers.base.AwsBaseWaiterTrigger

Asynchronously poll the boto3 API and wait for the JobFlow to finish terminating.

Parameters
  • job_flow_id (str) – The id of the EMR job flow whose termination is being waited for.

  • waiter_delay (int) – The amount of time in seconds to wait between attempts.

  • waiter_max_attempts (int) – The maximum number of attempts to be made.

  • aws_conn_id (str | None) – The Airflow connection used for AWS credentials.

hook()[source]

Override in subclasses to return the right hook.

class airflow.providers.amazon.aws.triggers.emr.EmrContainerTrigger(virtual_cluster_id, job_id, aws_conn_id='aws_default', poll_interval=None, waiter_delay=30, waiter_max_attempts=600)[source]

Bases: airflow.providers.amazon.aws.triggers.base.AwsBaseWaiterTrigger

Poll for the status of an EMR container until it reaches a terminal state.

Parameters
  • virtual_cluster_id (str) – The id of the EMR virtual cluster.

  • job_id (str) – The id of the job whose state is checked.

  • aws_conn_id (str) – Reference to AWS connection id

  • waiter_delay (int) – polling period in seconds to check for the status

hook()[source]

Override in subclasses to return the right hook.

class airflow.providers.amazon.aws.triggers.emr.EmrStepSensorTrigger(job_flow_id, step_id, waiter_delay=30, waiter_max_attempts=60, aws_conn_id='aws_default')[source]

Bases: airflow.providers.amazon.aws.triggers.base.AwsBaseWaiterTrigger

Poll for the status of an EMR step until it reaches a terminal state.

Parameters
  • job_flow_id (str) – The id of the job flow that contains the step to check.

  • step_id (str) – The id of the step whose state is checked.

  • waiter_delay (int) – polling period in seconds to check for the status

  • waiter_max_attempts (int) – The maximum number of attempts to be made

  • aws_conn_id (str) – Reference to AWS connection id

hook()[source]

Override in subclasses to return the right hook.
