airflow.providers.amazon.aws.operators.batch¶
An Airflow operator for AWS Batch services
Module Contents¶
Classes¶
AwsBatchOperator -- Execute a job on AWS Batch
- class airflow.providers.amazon.aws.operators.batch.AwsBatchOperator(*, job_name: str, job_definition: str, job_queue: str, overrides: dict, array_properties: Optional[dict] = None, parameters: Optional[dict] = None, job_id: Optional[str] = None, waiters: Optional[Any] = None, max_retries: Optional[int] = None, status_retries: Optional[int] = None, aws_conn_id: Optional[str] = None, region_name: Optional[str] = None, tags: Optional[dict] = None, **kwargs)[source]¶
Bases: airflow.models.BaseOperator
Execute a job on AWS Batch
- Parameters
job_name (str) -- the name for the job that will run on AWS Batch (templated)
job_definition (str) -- the job definition name on AWS Batch
job_queue (str) -- the queue name on AWS Batch
overrides (dict) -- the containerOverrides parameter for boto3 (templated)
array_properties (Optional[dict]) -- the arrayProperties parameter for boto3
parameters (Optional[dict]) -- the parameters for boto3 (templated)
job_id (Optional[str]) -- the job ID, usually unknown (None) until the submit_job operation gets the jobId defined by AWS Batch
waiters (Optional[AwsBatchWaiters]) -- an AwsBatchWaiters object (see note below); if None, polling is used with max_retries and status_retries
max_retries (Optional[int]) -- number of exponential back-off retries while polling job status; the default of 4200 corresponds to roughly 48 hours; used only when waiters is None
status_retries (Optional[int]) -- number of HTTP retries to get job status (default: 10); used only when waiters is None
aws_conn_id (Optional[str]) -- connection ID for AWS credentials / region name; if None, the default boto3 credential strategy is used
region_name (Optional[str]) -- region name to use in the AWS Hook; overrides the region_name in the connection (if provided)
tags (Optional[dict]) -- collection of tags to apply to the AWS Batch job submission; if None, no tags are submitted
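A minimal usage sketch follows; the DAG id, job name, queue, and definition below are illustrative placeholders, not values defined by this module:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.batch import AwsBatchOperator

with DAG(
    dag_id="example_batch",  # hypothetical DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
) as dag:
    # Submit a containerized job; job_name, overrides, and parameters
    # are templated, so Jinja expressions are allowed in them.
    submit_batch_job = AwsBatchOperator(
        task_id="submit_batch_job",
        job_name="example-job-{{ ds_nodash }}",
        job_definition="example-job-definition",  # assumed to exist in AWS Batch
        job_queue="example-job-queue",            # assumed to exist in AWS Batch
        overrides={"command": ["echo", "hello from AWS Batch"]},
        aws_conn_id="aws_default",
        region_name="us-east-1",
    )
```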
Note
Any custom waiters must return a waiter for these calls:

```python
waiter = waiters.get_waiter("JobExists")
waiter = waiters.get_waiter("JobRunning")
waiter = waiters.get_waiter("JobComplete")
```
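As a sketch of that contract, a custom waiters object only needs a get_waiter method that returns a botocore waiter for each of those three names. The waiter configuration below is an illustrative assumption (only JobExists is shown; delay, maxAttempts, and the acceptor are not values from this module):

```python
import botocore.waiter

# Illustrative waiter config: "JobRunning" and "JobComplete" would be
# defined similarly against the DescribeJobs operation.
WAITER_CONFIG = {
    "version": 2,
    "waiters": {
        "JobExists": {
            "operation": "DescribeJobs",
            "delay": 2,
            "maxAttempts": 20,
            "acceptors": [
                {
                    "matcher": "path",
                    "argument": "length(jobs[]) > `0`",
                    "expected": True,
                    "state": "success",
                }
            ],
        },
    },
}


class CustomBatchWaiters:
    """Hypothetical waiters object satisfying the contract above."""

    def __init__(self, batch_client):
        self.client = batch_client
        self.model = botocore.waiter.WaiterModel(WAITER_CONFIG)

    def get_waiter(self, waiter_name):
        # e.g. get_waiter("JobExists").wait(jobs=[job_id])
        return botocore.waiter.create_waiter_with_client(
            waiter_name, self.model, self.client
        )
```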
- execute(self, context: airflow.utils.context.Context)[source]¶
Submit and monitor an AWS Batch job
- Raises
AirflowException
- on_kill(self)[source]¶
Override this method to clean up subprocesses when a task instance is killed. Any use of the threading, subprocess, or multiprocessing modules within an operator needs to be cleaned up, or it will leave ghost processes behind.
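For a remote service like AWS Batch, the analogous cleanup is cancelling the remote job rather than a local subprocess. A minimal sketch of such an override, assuming (purely for illustration) that self.hook.client is a boto3 "batch" client and that self.job_id was set by submit_job:

```python
def on_kill(self):
    # Sketch: cancel the remote Batch job when the task instance is killed.
    # self.hook.client being a boto3 "batch" client and self.job_id being
    # set are assumptions for illustration, not guaranteed attributes.
    if self.job_id:
        response = self.hook.client.terminate_job(
            jobId=self.job_id, reason="Task killed by the user"
        )
        self.log.info("Job %s terminated: %s", self.job_id, response)
```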
- submit_job(self, context: airflow.utils.context.Context)[source]¶
Submit an AWS Batch job
- Raises
AirflowException
- monitor_job(self, context: airflow.utils.context.Context)[source]¶
Monitor an AWS Batch job. monitor_job can raise an exception, and an AirflowTaskTimeout can be raised if execution_timeout is given while creating the task. These exceptions should be handled in taskinstance.py rather than here, as was done previously.
- Raises
AirflowException
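To bound the monitoring phase with an AirflowTaskTimeout, set execution_timeout (a standard BaseOperator argument) when creating the task; the task and job names below are illustrative placeholders:

```python
from datetime import timedelta

# If the Batch job has not finished within two hours, Airflow raises
# AirflowTaskTimeout for this task instance while monitor_job is polling.
bounded_batch_job = AwsBatchOperator(
    task_id="bounded_batch_job",
    job_name="example-job",
    job_definition="example-job-definition",
    job_queue="example-job-queue",
    overrides={},
    execution_timeout=timedelta(hours=2),
)
```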