airflow.providers.amazon.aws.hooks.emr_containers
Module Contents
-
class airflow.providers.amazon.aws.hooks.emr_containers.EMRContainerHook(*args, virtual_cluster_id: str = None, **kwargs)
Bases: airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook
Interact with an AWS EMR virtual cluster to run jobs, poll them, and return their status.
Additional arguments (such as aws_conn_id) may be specified and are passed down to the underlying AwsBaseHook.
- Parameters
virtual_cluster_id (str) -- Cluster ID of the EMR on EKS virtual cluster
-
submit_job(self, name: str, execution_role_arn: str, release_label: str, job_driver: dict, configuration_overrides: Optional[dict] = None, client_request_token: Optional[str] = None)
Submit a job to the EMR Containers API and return the job ID. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr-containers.html#EMRContainers.Client.start_job_run
- Parameters
name (str) -- The name of the job run.
execution_role_arn (str) -- The IAM role ARN associated with the job run.
release_label (str) -- The Amazon EMR release version to use for the job run.
job_driver (dict) -- Job configuration details, e.g. the Spark job parameters.
configuration_overrides (dict) -- The configuration overrides for the job run, specifically either application configuration or monitoring configuration.
client_request_token (str) -- The client idempotency token of the job run request. Use this if you want to specify a unique ID to prevent two jobs from getting started.
- Returns
Job ID
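As an illustration of the parameters above, the following is a minimal sketch of assembling a job_driver dict and passing it to submit_job. The helper names build_job_driver and submit_spark_job are hypothetical, not part of the provider; the sparkSubmitJobDriver layout follows the boto3 start_job_run request shape linked above. The hook argument is assumed to be an EMRContainerHook (or any object exposing the same submit_job signature).

```python
from typing import Any, Dict, List, Optional


def build_job_driver(
    entry_point: str,
    entry_point_arguments: Optional[List[str]] = None,
    spark_submit_parameters: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a job_driver dict in the sparkSubmitJobDriver shape (hypothetical helper)."""
    spark_driver: Dict[str, Any] = {"entryPoint": entry_point}
    # Optional keys are only included when a value was supplied.
    if entry_point_arguments:
        spark_driver["entryPointArguments"] = list(entry_point_arguments)
    if spark_submit_parameters:
        spark_driver["sparkSubmitParameters"] = spark_submit_parameters
    return {"sparkSubmitJobDriver": spark_driver}


def submit_spark_job(
    hook: Any,
    name: str,
    execution_role_arn: str,
    release_label: str,
    entry_point: str,
    client_request_token: Optional[str] = None,
) -> str:
    """Submit a Spark job through the hook and return the job ID (hypothetical helper)."""
    return hook.submit_job(
        name=name,
        execution_role_arn=execution_role_arn,
        release_label=release_label,
        job_driver=build_job_driver(entry_point),
        client_request_token=client_request_token,
    )
```

With a real hook, the first argument would be something like EMRContainerHook(virtual_cluster_id=..., aws_conn_id=...), with values taken from your own environment. Passing a stable client_request_token is one way to keep a retried task from starting a second job run.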
-
get_job_failure_reason(self, job_id: str)
Fetch the reason for a job failure (e.g. error message). Returns None or the reason string.
- Parameters
job_id (str) -- ID of the submitted job run
- Returns
str
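Because get_job_failure_reason may return None, callers usually need a fallback when building an error message. A minimal sketch, assuming hook is an EMRContainerHook (or any object with the same method); the helper name format_failure is hypothetical:

```python
from typing import Any


def format_failure(hook: Any, job_id: str) -> str:
    """Return a readable failure message for a job run (hypothetical helper)."""
    reason = hook.get_job_failure_reason(job_id)
    if reason is None:
        # The hook documents None as a possible result, so handle it explicitly.
        return f"Job run {job_id} failed; no failure reason was returned."
    return f"Job run {job_id} failed: {reason}"
```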
-
check_query_status(self, job_id: str)
Fetch the status of a submitted job run. Returns None or one of the valid query states. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr-containers.html#EMRContainers.Client.describe_job_run
- Parameters
job_id (str) -- ID of the submitted job run
- Returns
str
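A common use of check_query_status is a polling loop that waits for a job run to finish. The sketch below is one possible way to write it, not the provider's own implementation. The helper name wait_for_job, the poll_interval and max_tries parameters, and the injectable sleep callable are all hypothetical; the set of terminal states is an assumption based on the job run states listed in the describe_job_run reference.

```python
import time
from typing import Any, Callable, Optional

# Assumption: states after which a job run no longer changes.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_job(
    hook: Any,
    job_id: str,
    poll_interval: float = 30.0,
    max_tries: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the job run reaches a terminal state or max_tries is exhausted.

    Returns the last observed state, which may be None or a non-terminal state
    if max_tries was reached first.
    """
    tries = 0
    while True:
        state = hook.check_query_status(job_id)
        if state in TERMINAL_STATES:
            return state
        tries += 1
        if max_tries is not None and tries >= max_tries:
            return state
        sleep(poll_interval)
```

If the returned state is FAILED, get_job_failure_reason can then be called with the same job_id to retrieve the error message.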