airflow.providers.amazon.aws.operators.batch
An Airflow operator for AWS Batch services
Module Contents
Classes
BatchOperator | Execute a job on AWS Batch
BatchCreateComputeEnvironmentOperator | Create an AWS Batch compute environment
- class airflow.providers.amazon.aws.operators.batch.BatchOperator(*, job_name, job_definition, job_queue, overrides, array_properties=None, parameters=None, job_id=None, waiters=None, max_retries=None, status_retries=None, aws_conn_id=None, region_name=None, tags=None, wait_for_completion=True, **kwargs)
Bases: airflow.models.BaseOperator
Execute a job on AWS Batch
See also
For more information on how to use this operator, take a look at the guide: Submit a new AWS Batch job
- Parameters
job_name (str) – the name for the job that will run on AWS Batch (templated)
job_definition (str) – the job definition name on AWS Batch
job_queue (str) – the queue name on AWS Batch
overrides (dict) – the containerOverrides parameter for boto3 (templated)
array_properties (dict | None) – the arrayProperties parameter for boto3
parameters (dict | None) – the parameters for boto3 (templated)
job_id (str | None) – the job ID, usually unknown (None) until the submit_job operation gets the jobId defined by AWS Batch
waiters (Any | None) – a BatchWaiters object (see note below); if None, polling is used with max_retries and status_retries.
max_retries (int | None) – exponential back-off retries, 4200 = 48 hours; polling is only used when waiters is None
status_retries (int | None) – number of HTTP retries to get job status, 10; polling is only used when waiters is None
aws_conn_id (str | None) – connection id of AWS credentials / region name. If None, the boto3 credential strategy will be used.
region_name (str | None) – region name to use in the AWS Hook. Overrides the region_name in the connection (if provided).
tags (dict | None) – collection of tags to apply to the AWS Batch job submission; if None, no tags are submitted
Note
Any custom waiters must return a waiter for these calls:

    waiter = waiters.get_waiter("JobExists")
    waiter = waiters.get_waiter("JobRunning")
    waiter = waiters.get_waiter("JobComplete")
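The note above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of the provider: `StubWaiters` and `missing_waiters` are hypothetical names used only for illustration, and the check covers nothing beyond the three `get_waiter` calls listed in the note. A real waiters object, such as BatchWaiters, does more than this, and the operator may rely on methods the sketch does not cover.

```python
# The three waiter names listed in the note above.
REQUIRED_WAITER_NAMES = ("JobExists", "JobRunning", "JobComplete")


class StubWaiters:
    """Hypothetical custom waiters object; it only knows a fixed set of names."""

    def __init__(self, names=REQUIRED_WAITER_NAMES):
        self._names = tuple(names)

    def get_waiter(self, waiter_name):
        if waiter_name not in self._names:
            raise ValueError(f"unknown waiter: {waiter_name}")
        # A real implementation would return a botocore-style waiter here.
        return object()


def missing_waiters(waiters):
    """Return the required waiter names that `waiters` cannot provide."""
    missing = []
    for name in REQUIRED_WAITER_NAMES:
        try:
            if waiters.get_waiter(name) is None:
                missing.append(name)
        except Exception:
            # Any failure to build the waiter counts as "not provided".
            missing.append(name)
    return missing
```

Running `missing_waiters(custom_waiters)` before passing the object as `waiters` gives an early signal if one of the three names is not supported.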
- template_fields: Sequence[str] = ['job_id', 'job_name', 'job_definition', 'job_queue', 'overrides', 'array_properties',...
- on_kill()
Override this method to clean up subprocesses when a task instance gets killed. Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.
- monitor_job(context)
Monitor an AWS Batch job. monitor_job can raise an exception, and an AirflowTaskTimeout can be raised if execution_timeout is given while creating the task. These exceptions should be handled in taskinstance.py rather than here, as was done previously.
- Raises
AirflowException
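As an illustration of how the BatchOperator parameters above fit together, here is a minimal sketch. Every name and value in it (the task id, job name, job definition, queue, tags and region) is a hypothetical placeholder, and the keys inside `overrides` follow the boto3 containerOverrides shape as an assumption about what a job might need. The import is deferred into the helper so the keyword arguments can be inspected without Airflow installed.

```python
# Hypothetical placeholder values; replace them with your own AWS Batch resources.
batch_operator_kwargs = {
    "task_id": "submit_batch_job",
    # job_name is templated, so a Jinja expression is rendered at run time.
    "job_name": "example-job-{{ ds_nodash }}",
    "job_definition": "example-job-definition",
    "job_queue": "example-job-queue",
    # Passed to boto3 as containerOverrides (templated).
    "overrides": {
        "command": ["echo", "hello"],
        "environment": [{"name": "STAGE", "value": "dev"}],
    },
    # Optional: tags applied to the job submission.
    "tags": {"team": "data"},
    # Optional: overrides the region configured in the connection, if any.
    "region_name": "us-east-1",
}


def build_submit_task(**extra):
    """Instantiate the operator; requires the Amazon provider to be installed."""
    from airflow.providers.amazon.aws.operators.batch import BatchOperator

    return BatchOperator(**batch_operator_kwargs, **extra)
```

Inside a DAG definition, `build_submit_task()` would return the task. Because `waiters` is left unset, the job is monitored by polling, as described for `max_retries` and `status_retries`.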
- class airflow.providers.amazon.aws.operators.batch.BatchCreateComputeEnvironmentOperator(compute_environment_name, environment_type, state, compute_resources, unmanaged_v_cpus=None, service_role=None, tags=None, max_retries=None, status_retries=None, aws_conn_id=None, region_name=None, **kwargs)
Bases: airflow.models.BaseOperator
Create an AWS Batch compute environment
See also
For more information on how to use this operator, take a look at the guide: Create an AWS Batch compute environment
- Parameters
compute_environment_name (str) – the name of the AWS Batch compute environment (templated)
environment_type (str) – the type of the compute-environment
state (str) – the state of the compute-environment
compute_resources (dict) – details about the resources managed by the compute-environment (templated). For more details, see https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/batch.html#Batch.Client.create_compute_environment
unmanaged_v_cpus (int | None) – the maximum number of vCPUs for an unmanaged compute environment. This parameter is only supported when the type parameter is set to UNMANAGED.
service_role (str | None) – the IAM role that allows Batch to make calls to other AWS services on your behalf (templated)
tags (dict | None) – the tags that you apply to the compute-environment to help you categorize and organize your resources
max_retries (int | None) – exponential back-off retries, 4200 = 48 hours; polling is only used when waiters is None
status_retries (int | None) – number of HTTP retries to get job status, 10; polling is only used when waiters is None
aws_conn_id (str | None) – connection id of AWS credentials / region name. If None, the boto3 credential strategy will be used.
region_name (str | None) – region name to use in the AWS Hook. Overrides the region_name in the connection (if provided).
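The parameters above could be combined as in the following minimal sketch. The environment name, subnet id, security group id and tags are hypothetical placeholders, and the keys inside `compute_resources` assume the computeResources shape accepted by the boto3 `create_compute_environment` call linked above. The import is deferred so the keyword arguments can be inspected without Airflow installed.

```python
# Hypothetical placeholder values; replace them with your own AWS resources.
compute_environment_kwargs = {
    "task_id": "create_compute_environment",
    "compute_environment_name": "example-compute-environment",
    "environment_type": "MANAGED",
    "state": "ENABLED",
    # Passed through to boto3 as computeResources (templated).
    "compute_resources": {
        "type": "FARGATE",
        "maxvCpus": 10,
        "subnets": ["subnet-0123456789abcdef0"],
        "securityGroupIds": ["sg-0123456789abcdef0"],
    },
    "tags": {"team": "data"},
}


def build_create_environment_task(**extra):
    """Instantiate the operator; requires the Amazon provider to be installed."""
    from airflow.providers.amazon.aws.operators.batch import (
        BatchCreateComputeEnvironmentOperator,
    )

    return BatchCreateComputeEnvironmentOperator(**compute_environment_kwargs, **extra)
```

`unmanaged_v_cpus` is left out on purpose, since it is only supported when the environment type is UNMANAGED.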