airflow.providers.amazon.aws.operators.batch
AWS Batch services.
Classes
| BatchOperator | Execute a job on AWS Batch. |
| BatchCreateComputeEnvironmentOperator | Create an AWS Batch compute environment. |
Module Contents
- class airflow.providers.amazon.aws.operators.batch.BatchOperator(*, job_name, job_definition, job_queue, container_overrides=None, array_properties=None, ecs_properties_override=None, eks_properties_override=None, node_overrides=None, share_identifier=None, scheduling_priority_override=None, parameters=None, retry_strategy=None, job_id=None, waiters=None, max_retries=4200, status_retries=None, tags=None, wait_for_completion=True, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), poll_interval=30, awslogs_enabled=False, awslogs_fetch_interval=timedelta(seconds=30), submit_job_timeout=None, **kwargs)
- Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.batch_client.BatchClientHook]
- Execute a job on AWS Batch.
- See also: For more information on how to use this operator, take a look at the guide: Submit a new AWS Batch job
- Parameters:
- job_name (str) – the name for the job that will run on AWS Batch (templated) 
- job_definition (str) – the job definition name on AWS Batch 
- job_queue (str) – the queue name on AWS Batch 
- container_overrides (dict | None) – the containerOverrides parameter for boto3 (templated) 
- ecs_properties_override (dict | None) – the ecsPropertiesOverride parameter for boto3 (templated) 
- eks_properties_override (dict | None) – the eksPropertiesOverride parameter for boto3 (templated) 
- node_overrides (dict | None) – the nodeOverrides parameter for boto3 (templated) 
- share_identifier (str | None) – The share identifier for the job. Don’t specify this parameter if the job queue doesn’t have a scheduling policy. 
- scheduling_priority_override (int | None) – The scheduling priority for the job. Jobs with a higher scheduling priority are scheduled before jobs with a lower scheduling priority. This overrides any scheduling priority in the job definition 
- array_properties (dict | None) – the arrayProperties parameter for boto3 
- parameters (dict | None) – the parameters for boto3 (templated) 
- job_id (str | None) – the job ID, usually unknown (None) until the submit_job operation gets the jobId defined by AWS Batch 
- waiters (Any | None) – a BatchWaiters object (see note below); if None, polling is used with max_retries and status_retries.
- max_retries (int) – exponential back-off retries, 4200 = 48 hours; polling is only used when waiters is None
- status_retries (int | None) – number of HTTP retries to get job status (10 by default); polling is only used when waiters is None
- aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).
- region_name – AWS region_name. If not specified then the default boto3 behaviour is used. 
- verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html 
- tags (dict | None) – collection of tags to apply to the AWS Batch job submission; if None, no tags are submitted
- deferrable (bool) – Run operator in the deferrable mode. 
- awslogs_enabled (bool) – Specifies whether logs from CloudWatch should be printed (default: False). If it is an array job, only the logs of the first task will be printed.
- awslogs_fetch_interval (datetime.timedelta) – The interval at which CloudWatch logs are fetched (default: 30 seconds).
- poll_interval (int) – (Deferrable mode only) Time in seconds to wait between polling. 
- submit_job_timeout (int | None) – Execution timeout in seconds for submitted batch job. 
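The parameters above can be combined as in the following minimal sketch. The job definition name, queue name, task id, and tag values are hypothetical placeholders, and the keys inside `container_overrides` assume the `containerOverrides` structure accepted by boto3's `submit_job`. The helper `make_batch_task` is an illustrative name, not part of the provider; it imports the operator lazily so the snippet loads even where Airflow is not installed.

```python
from datetime import timedelta

# Keyword arguments for BatchOperator. All names and values are hypothetical.
BATCH_JOB_KWARGS = {
    "task_id": "submit_batch_job",
    # job_name is templated, so Jinja expressions are rendered at run time.
    "job_name": "nightly-report-{{ ds_nodash }}",
    "job_definition": "example-job-definition",
    "job_queue": "example-job-queue",
    # Assumed to follow boto3's containerOverrides shape (templated).
    "container_overrides": {
        "command": ["python", "report.py", "--date", "{{ ds }}"],
        "environment": [{"name": "STAGE", "value": "dev"}],
    },
    "tags": {"team": "data"},
    # Print CloudWatch logs, fetching them every 30 seconds.
    "awslogs_enabled": True,
    "awslogs_fetch_interval": timedelta(seconds=30),
    "wait_for_completion": True,
}


def make_batch_task():
    """Build the operator; intended to be called inside a DAG definition."""
    # Lazy import: requires the Amazon provider package to be installed.
    from airflow.providers.amazon.aws.operators.batch import BatchOperator

    return BatchOperator(**BATCH_JOB_KWARGS)
```

Keeping the arguments in a plain dictionary makes them easy to inspect or reuse across tasks; passing them directly to `BatchOperator(...)` works just as well.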
 
 - Note: Any custom waiters must return a waiter for these calls:

       waiter = waiters.get_waiter("JobExists")
       waiter = waiters.get_waiter("JobRunning")
       waiter = waiters.get_waiter("JobComplete")

 - template_fields: collections.abc.Sequence[str]
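The waiter requirement in the note can be checked before a custom object is handed to the operator. The sketch below is a plain duck-typing check and touches no Airflow code. `missing_waiter_names` and `StubWaiters` are hypothetical names introduced only for illustration, and the check covers only the three `get_waiter` calls listed in the note, not any other behaviour a waiters object may need.

```python
# Waiter names that the note says a custom waiters object must support.
REQUIRED_WAITER_NAMES = ("JobExists", "JobRunning", "JobComplete")


def missing_waiter_names(waiters) -> list:
    """Return the required waiter names that waiters.get_waiter cannot provide."""
    missing = []
    for name in REQUIRED_WAITER_NAMES:
        try:
            waiter = waiters.get_waiter(name)
        except Exception:  # any failure counts as "not provided"
            missing.append(name)
            continue
        if waiter is None:
            missing.append(name)
    return missing


class StubWaiters:
    """Hypothetical stand-in that only knows the waiter names it is given."""

    def __init__(self, known):
        self._known = set(known)

    def get_waiter(self, name):
        if name not in self._known:
            raise ValueError(f"unknown waiter: {name}")
        return object()  # placeholder for a real waiter


# An object lacking the "JobComplete" waiter fails the check.
incomplete = StubWaiters(["JobExists", "JobRunning"])
print(missing_waiter_names(incomplete))  # prints ['JobComplete']
```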
 
- class airflow.providers.amazon.aws.operators.batch.BatchCreateComputeEnvironmentOperator(compute_environment_name, environment_type, state, compute_resources, unmanaged_v_cpus=None, service_role=None, tags=None, poll_interval=30, max_retries=None, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)
- Bases: airflow.providers.amazon.aws.operators.base_aws.AwsBaseOperator[airflow.providers.amazon.aws.hooks.batch_client.BatchClientHook]
- Create an AWS Batch compute environment.
- See also: For more information on how to use this operator, take a look at the guide: Create an AWS Batch compute environment
- Parameters:
- compute_environment_name (str) – Name of the AWS batch compute environment (templated). 
- environment_type (str) – Type of the compute-environment. 
- state (str) – State of the compute-environment. 
- compute_resources (dict) – Details about the resources managed by the compute-environment (templated). More details: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/batch.html#Batch.Client.create_compute_environment 
- unmanaged_v_cpus (int | None) – Maximum number of vCPU for an unmanaged compute environment. This parameter is only supported when the type parameter is set to UNMANAGED.
- service_role (str | None) – IAM role that allows Batch to make calls to other AWS services on your behalf (templated). 
- tags (dict | None) – Tags that you apply to the compute-environment to help you categorize and organize your resources. 
- poll_interval (int) – How long to wait, in seconds, between two polls of the environment status. Only useful when deferrable is True.
- max_retries (int | None) – How many times to poll for the environment status. Only useful when deferrable is True. 
- aws_conn_id – The Airflow connection used for AWS credentials. If this is None or empty then the default boto3 behaviour is used. If running Airflow in a distributed manner and aws_conn_id is None or empty, then default boto3 configuration would be used (and must be maintained on each worker node).
- region_name – AWS region_name. If not specified then the default boto3 behaviour is used. 
- verify – Whether or not to verify SSL certificates. See: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html 
- deferrable (bool) – If True, the operator will wait asynchronously for the environment to be created. This mode requires aiobotocore module to be installed. (default: False) 
 
 - template_fields: collections.abc.Sequence[str]
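A managed compute environment could be requested along the following lines. This is a sketch under stated assumptions: the environment name, subnet and security group identifiers, and tag values are hypothetical placeholders, and the keys inside `compute_resources` assume the `computeResources` structure documented for boto3's `create_compute_environment`. The helper `make_compute_environment_task` is an illustrative name, not part of the provider.

```python
# Keyword arguments for BatchCreateComputeEnvironmentOperator.
# All names and identifiers below are hypothetical placeholders.
COMPUTE_ENV_KWARGS = {
    "task_id": "create_compute_environment",
    "compute_environment_name": "example-compute-environment",
    "environment_type": "MANAGED",
    "state": "ENABLED",
    # Assumed to follow boto3's computeResources shape (templated).
    "compute_resources": {
        "type": "FARGATE",
        "maxvCpus": 4,
        "subnets": ["subnet-EXAMPLE"],
        "securityGroupIds": ["sg-EXAMPLE"],
    },
    "tags": {"team": "data"},
    # poll_interval and max_retries only matter when deferrable is True.
    "poll_interval": 30,
    "max_retries": 20,
}


def make_compute_environment_task():
    """Build the operator; intended to be called inside a DAG definition."""
    # Lazy import: requires the Amazon provider package to be installed.
    from airflow.providers.amazon.aws.operators.batch import (
        BatchCreateComputeEnvironmentOperator,
    )

    return BatchCreateComputeEnvironmentOperator(**COMPUTE_ENV_KWARGS)
```

`unmanaged_v_cpus` is left out because, per the parameter description, it is only supported for the UNMANAGED type.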