airflow.providers.amazon.aws.sensors.emr
Module Contents
Classes
EmrBaseSensor | Contains general sensor behavior for EMR.
EmrContainerSensor | Asks for the state of the job run until it reaches a failure state or success state.
EmrJobFlowSensor | Asks for the state of the EMR JobFlow (Cluster) until it reaches any of the target states.
EmrStepSensor | Asks for the state of the step until it reaches any of the target states.
- class airflow.providers.amazon.aws.sensors.emr.EmrBaseSensor(*, aws_conn_id: str = 'aws_default', **kwargs)
Bases:
airflow.sensors.base.BaseSensorOperator
Contains general sensor behavior for EMR.
- Subclasses should implement the following methods:
get_emr_response()
state_from_response()
failure_message_from_response()
Subclasses should set the target_states and failed_states fields.
- Parameters
aws_conn_id (str) -- AWS connection to use
- get_hook(self) -> airflow.providers.amazon.aws.hooks.emr.EmrHook
Get EmrHook
- poke(self, context: airflow.utils.context.Context)
Function that the sensors defined while deriving this class should override.
- abstract get_emr_response(self) -> Dict[str, Any]
Make an API call with boto3 and get response.
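The contract described above (three methods plus the target_states and failed_states fields) can be sketched without Airflow installed. This is a hypothetical stand-in, not the provider's implementation: the names SketchSensor and FakeClusterSensor, the canned response, and the way poke() combines the three methods are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional, Tuple


class SketchSensor(ABC):
    """Hypothetical stand-in for an EMR base sensor (not the Airflow class)."""

    # Subclasses are expected to set these two fields.
    target_states: Tuple[str, ...] = ()
    failed_states: Tuple[str, ...] = ()

    @abstractmethod
    def get_emr_response(self) -> Dict[str, Any]:
        """Return the raw API response (boto3 would supply this)."""

    @abstractmethod
    def state_from_response(self, response: Dict[str, Any]) -> str:
        """Extract the current state string from the response."""

    @abstractmethod
    def failure_message_from_response(self, response: Dict[str, Any]) -> Optional[str]:
        """Extract a human-readable failure reason, if any."""

    def poke(self) -> bool:
        # Assumed logic: succeed on a target state, raise on a failed state,
        # and otherwise report "not done yet" so the caller polls again.
        response = self.get_emr_response()
        state = self.state_from_response(response)
        if state in self.target_states:
            return True
        if state in self.failed_states:
            detail = self.failure_message_from_response(response) or "no details"
            raise RuntimeError(f"EMR job failed: {detail}")
        return False


class FakeClusterSensor(SketchSensor):
    """Illustrative subclass that returns a canned state instead of calling AWS."""

    target_states = ("TERMINATED",)
    failed_states = ("TERMINATED_WITH_ERRORS",)

    def __init__(self, canned_state: str) -> None:
        self.canned_state = canned_state

    def get_emr_response(self) -> Dict[str, Any]:
        return {"Cluster": {"Status": {"State": self.canned_state}}}

    def state_from_response(self, response: Dict[str, Any]) -> str:
        return response["Cluster"]["Status"]["State"]

    def failure_message_from_response(self, response: Dict[str, Any]) -> Optional[str]:
        return None
```

In this sketch, a state that is neither a target nor a failure makes poke() return False, which is what lets a scheduler keep polling.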
- class airflow.providers.amazon.aws.sensors.emr.EmrContainerSensor(*, virtual_cluster_id: str, job_id: str, max_retries: Optional[int] = None, aws_conn_id: str = 'aws_default', poll_interval: int = 10, **kwargs: Any)
Bases:
airflow.sensors.base.BaseSensorOperator
Asks for the state of the job run until it reaches a failure state or success state. If the job run fails, the task will fail.
- Parameters
job_id (str) -- job_id to check the state of
max_retries (int) -- Number of times to poll for the job run state before returning the current state, defaults to None
aws_conn_id (str) -- AWS connection to use, defaults to 'aws_default'
poll_interval (int) -- Time in seconds to wait between two consecutive calls to check the job run status, defaults to 10
- poke(self, context: airflow.utils.context.Context) -> bool
Function that the sensors defined while deriving this class should override.
- hook(self) -> airflow.providers.amazon.aws.hooks.emr.EmrContainerHook
Create and return an EmrContainerHook
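How max_retries and poll_interval interact can be illustrated with a small polling loop. This is a sketch under stated assumptions, not the hook's code: poll_job_state and fetch_state are hypothetical names, and the set of intermediate states is assumed from the EMR on EKS job run states.

```python
import time
from typing import Callable, Optional

# Assumed non-terminal job run states; anything else is treated as final.
INTERMEDIATE_STATES = ("PENDING", "SUBMITTED", "RUNNING")


def poll_job_state(
    fetch_state: Callable[[], str],
    max_retries: Optional[int] = None,
    poll_interval: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_state() until a final state or until max_retries polls were made.

    With max_retries=None the loop only stops on a final state.
    """
    attempts = 0
    while True:
        state = fetch_state()
        attempts += 1
        if state not in INTERMEDIATE_STATES:
            return state  # final state reached
        if max_retries is not None and attempts >= max_retries:
            return state  # give up and report the current state
        sleep(poll_interval)
```

Passing sleep as a parameter is a design choice for the sketch only: it lets the loop be exercised without actually waiting.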
- class airflow.providers.amazon.aws.sensors.emr.EmrJobFlowSensor(*, job_flow_id: str, target_states: Optional[Iterable[str]] = None, failed_states: Optional[Iterable[str]] = None, **kwargs)
Bases:
EmrBaseSensor
Asks for the state of the EMR JobFlow (Cluster) until it reaches any of the target states. If the job flow reaches a failed state, the sensor raises an error and the task fails.
With the default target states, the sensor waits for the cluster to be terminated. When target_states is set to ['RUNNING', 'WAITING'], the sensor waits until the job flow is ready (after the 'STARTING' and 'BOOTSTRAPPING' states).
- Parameters
job_flow_id (str) -- job_flow_id to check the state of
target_states (list[str]) -- the target states, sensor waits until job flow reaches any of these states
failed_states (list[str]) -- the failure states, sensor fails when job flow reaches any of these states
- get_emr_response(self) -> Dict[str, Any]
Make an API call with boto3 and get cluster-level details.
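As an illustration of working with cluster-level details, the sketch below reads the state out of a response shaped like the one boto3's describe_cluster returns. The helper names (cluster_state, cluster_failure_message, is_ready) and the message format are hypothetical; only the Cluster.Status.State and Cluster.Status.StateChangeReason keys come from the boto3 response shape.

```python
from typing import Any, Dict, Iterable, Optional


def cluster_state(response: Dict[str, Any]) -> str:
    """Return the cluster state, e.g. 'STARTING', 'RUNNING', 'WAITING'."""
    return response["Cluster"]["Status"]["State"]


def cluster_failure_message(response: Dict[str, Any]) -> Optional[str]:
    """Build a short failure message from StateChangeReason, if present."""
    reason = response["Cluster"]["Status"].get("StateChangeReason")
    if not reason:
        return None
    code = reason.get("Code", "No code")
    message = reason.get("Message", "Unknown")
    return f"code {code}: {message}"


def is_ready(
    response: Dict[str, Any],
    target_states: Iterable[str] = ("RUNNING", "WAITING"),
) -> bool:
    """True once the cluster is past 'STARTING' and 'BOOTSTRAPPING'."""
    return cluster_state(response) in target_states
```

The default passed to is_ready mirrors the ['RUNNING', 'WAITING'] setting described above, which waits for a usable cluster rather than for termination.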
- class airflow.providers.amazon.aws.sensors.emr.EmrStepSensor(*, job_flow_id: str, step_id: str, target_states: Optional[Iterable[str]] = None, failed_states: Optional[Iterable[str]] = None, **kwargs)
Bases:
EmrBaseSensor
Asks for the state of the step until it reaches any of the target states. If the step reaches a failed state, the sensor raises an error and the task fails.
With the default target states, the sensor waits for the step to be completed.
- Parameters
job_flow_id (str) -- job_flow_id which contains the step to check the state of
step_id (str) -- step to check the state of
target_states (list[str]) -- the target states, sensor waits until step reaches any of these states
failed_states (list[str]) -- the failure states, sensor fails when step reaches any of these states
- template_fields: Sequence[str] = ['job_flow_id', 'step_id', 'target_states', 'failed_states']
- get_emr_response(self) -> Dict[str, Any]
Make an API call with boto3 and get details about the cluster step.
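The way target_states and failed_states partition step states can be sketched against a response shaped like boto3's describe_step output. The function step_outcome and its return strings are hypothetical, and the default state tuples are assumptions chosen to match "waits for the step to be completed"; they are not confirmed by this page.

```python
from typing import Any, Dict, Iterable


def step_outcome(
    response: Dict[str, Any],
    target_states: Iterable[str] = ("COMPLETED",),  # assumed default
    failed_states: Iterable[str] = ("CANCELLED", "FAILED", "INTERRUPTED"),  # assumed default
) -> str:
    """Classify a describe_step-style response as done, failed, or waiting."""
    status = response["Step"]["Status"]
    state = status["State"]
    if state in target_states:
        return "done"
    if state in failed_states:
        # FailureDetails is optional in the response, so fall back gracefully.
        details = status.get("FailureDetails", {})
        return f"failed: {details.get('Reason', 'unknown reason')}"
    return "waiting"
```

A caller that wants to treat a cancelled step as acceptable could pass a wider target_states tuple and a narrower failed_states tuple.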