airflow.providers.amazon.aws.hooks.batch_client

A client for AWS Batch services.

Module Contents
Classes

BatchProtocol -- A structured Protocol for boto3.client('batch') -> botocore.client.Batch.
BatchClientHook -- A client for AWS Batch services.
AwsBatchProtocol -- This class is deprecated.
AwsBatchClientHook -- This hook is deprecated.
- class airflow.providers.amazon.aws.hooks.batch_client.BatchProtocol[source]

Bases: airflow.typing_compat.Protocol

A structured Protocol for boto3.client('batch') -> botocore.client.Batch. This is used for type hints on BatchClient.client(); it covers only the subset of client methods required.
- describe_jobs(self, jobs)[source]¶
Get job descriptions from AWS Batch
- Parameters
jobs (List[str]) -- a list of JobId to describe
- Returns
an API response to describe jobs
- Return type
Dict
- get_waiter(self, waiterName)[source]¶
Get an AWS Batch service waiter
- Parameters
waiterName (str) -- The name of the waiter. The name should match the name (including the casing) of the key name in the waiter model file (typically this is CamelCasing).
- Returns
a waiter object for the named AWS Batch service
- Return type
botocore.waiter.Waiter
Note

AWS Batch might not have any waiters (until botocore PR-1307 is released).

    import boto3

    boto3.client("batch").waiter_names == []
- submit_job(self, jobName, jobQueue, jobDefinition, arrayProperties, parameters, containerOverrides, tags)[source]¶
Submit a Batch job
- Parameters
jobName (str) -- the name for the AWS Batch job
jobQueue (str) -- the queue name on AWS Batch
jobDefinition (str) -- the job definition name on AWS Batch
arrayProperties (Dict) -- the same parameter that boto3 will receive
parameters (Dict) -- the same parameter that boto3 will receive
containerOverrides (Dict) -- the same parameter that boto3 will receive
tags (Dict) -- the same parameter that boto3 will receive
- Returns
an API response
- Return type
Dict
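To illustrate the call shape, here is a minimal sketch of the keyword arguments submit_job expects. All names and values below are hypothetical; an actual submission requires a configured boto3 Batch client and AWS credentials.

```python
# Hypothetical job parameters; an actual call needs AWS credentials.
submit_kwargs = {
    "jobName": "example-job",
    "jobQueue": "example-queue",
    "jobDefinition": "example-definition:1",
    "arrayProperties": {},
    "parameters": {"input": "s3://example-bucket/key"},
    "containerOverrides": {"command": ["echo", "hello"]},
    "tags": {"team": "data"},
}
# With a real client: response = client.submit_job(**submit_kwargs)
```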
- class airflow.providers.amazon.aws.hooks.batch_client.BatchClientHook(*args, max_retries=None, status_retries=None, **kwargs)[source]¶
Bases:
airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook
A client for AWS Batch services.
- Parameters
max_retries (Optional[int]) -- exponential back-off retries, 4200 = 48 hours; polling is only used when waiters is None
status_retries (Optional[int]) -- number of HTTP retries to get job status, 10; polling is only used when waiters is None
Note
Several methods use a default random delay to check or poll for job status, i.e.

    random.uniform(DEFAULT_DELAY_MIN, DEFAULT_DELAY_MAX)

Using a random interval helps to avoid AWS API throttle limits when many concurrent tasks request job descriptions. To modify the global defaults for the range of jitter allowed when a random delay is used to check Batch job status, modify these defaults, e.g.:

    BatchClient.DEFAULT_DELAY_MIN = 0
    BatchClient.DEFAULT_DELAY_MAX = 5

When explicit delay values are used, a 1 second random jitter is applied to the delay (e.g. a delay of 0 sec will be a random.uniform(0, 1) delay). It is generally recommended that random jitter is added to API requests. A convenience method is provided for this, e.g. to get a random delay of 10 sec +/- 5 sec:

    delay = BatchClient.add_jitter(10, width=5, minima=0)
- property client(self)[source]¶
An AWS API client for Batch services.
- Returns
a boto3 'batch' client for the hook's .region_name
- Return type
Union[BatchProtocol, botocore.client.BaseClient]
- check_job_success(self, job_id)[source]

Check the final status of the Batch job; return True if the job 'SUCCEEDED', else raise an AirflowException.
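The success check can be sketched as a pure function over a parsed job description. This is a simplified illustration of the logic, not the hook's exact implementation (which raises AirflowException rather than RuntimeError).

```python
def check_job_success(job: dict) -> bool:
    """Return True if a parsed job description has status SUCCEEDED,
    otherwise raise an error describing why the job is not successful."""
    status = job.get("status")
    if status == "SUCCEEDED":
        return True
    if status == "FAILED":
        raise RuntimeError(f"AWS Batch job ({job.get('jobId')}) failed: {job}")
    if status in ("SUBMITTED", "PENDING", "RUNNABLE", "STARTING", "RUNNING"):
        raise RuntimeError(f"AWS Batch job ({job.get('jobId')}) is not complete: {job}")
    raise RuntimeError(f"AWS Batch job ({job.get('jobId')}) has unknown status: {job}")
```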
- poll_for_job_running(self, job_id, delay=None)[source]

Poll for job running. The statuses that indicate a job is running or already complete are: 'RUNNING'|'SUCCEEDED'|'FAILED'.

So the status transitions this will wait for are: 'SUBMITTED'>'PENDING'>'RUNNABLE'>'STARTING'>'RUNNING'|'SUCCEEDED'|'FAILED'.

The completed statuses are included for cases where the status changes too quickly for polling to detect a RUNNING status, i.e. a job that moves quickly from STARTING through RUNNING to a completed status (often a failure).
- poll_for_job_complete(self, job_id, delay=None)[source]

Poll for job completion. The statuses that indicate job completion are: 'SUCCEEDED'|'FAILED'.

So the status transitions this will wait for are: 'SUBMITTED'>'PENDING'>'RUNNABLE'>'STARTING'>'RUNNING'>'SUCCEEDED'|'FAILED'.
- poll_job_status(self, job_id, match_status)[source]¶
Poll for job status using an exponential back-off strategy (with max_retries).
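The polling strategy can be sketched generically. Here get_status and the retry bound are stand-ins for the hook's get_job_description call and max_retries setting; the real method sleeps between attempts, which this sketch only computes.

```python
import random


def poll_status(get_status, match_status, max_retries=10):
    """Poll get_status() until it returns a value in match_status,
    using capped exponential back-off with random jitter."""
    retries = 0
    while True:
        if get_status() in match_status:
            return True
        retries += 1
        if retries >= max_retries:
            raise RuntimeError(f"status polling exceeded {max_retries} retries")
        # Exponential back-off with jitter, capped at 600 seconds;
        # the real hook would time.sleep() for this interval.
        interval = min(600.0, 1 + (retries * 0.6) ** 2)
        _pause = random.uniform(interval / 3, interval)
```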
- get_job_description(self, job_id)[source]¶
Get job description (using status_retries).
- Parameters
job_id (str) -- a Batch job ID
- Returns
an API response for describe jobs
- Return type
Dict
- Raises
AirflowException
- static parse_job_description(job_id, response)[source]¶
Parse job description to extract description for job_id
- Parameters
job_id (str) -- a Batch job ID
response (Dict) -- an API response for describe jobs
- Returns
an API response to describe job_id
- Return type
Dict
- Raises
AirflowException
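The parsing step can be sketched as follows: pick the single entry matching job_id out of the "jobs" list in a describe_jobs response. This is a simplified illustration (the hook raises AirflowException rather than RuntimeError).

```python
def parse_job_description(job_id: str, response: dict) -> dict:
    """Return the single job description matching job_id from a
    describe_jobs API response; error if there is not exactly one match."""
    matching = [job for job in response.get("jobs", []) if job.get("jobId") == job_id]
    if len(matching) != 1:
        raise RuntimeError(f"AWS Batch job ({job_id}) description error: {response}")
    return matching[0]
```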
- static add_jitter(delay, width=1, minima=0)[source]¶
Use delay +/- width for random jitter
Adding jitter to status polling can help to avoid AWS Batch API limits for monitoring Batch jobs with a high concurrency in Airflow tasks.
- Parameters
delay (Union[int, float]) -- number of seconds to pause; delay is assumed to be a positive number
width (Union[int, float]) -- delay +/- width for random jitter; width is assumed to be a positive number
minima (Union[int, float]) -- minimum delay allowed; minima is assumed to be a non-negative number
- Returns
a non-negative delay of uniform(delay - width, delay + width), bounded below by minima
- Return type
float
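A minimal sketch of this jitter calculation, assuming the behavior documented above (a uniform draw from delay +/- width, bounded below by minima):

```python
import random


def add_jitter(delay, width=1, minima=0):
    """Return a random delay drawn uniformly from
    [max(minima, delay - width), delay + width]."""
    delay = abs(delay)
    width = abs(width)
    minima = abs(minima)
    lower = max(minima, delay - width)
    upper = delay + width
    return random.uniform(lower, upper)
```

For example, add_jitter(10, width=5, minima=0) yields a value between 5 and 15 seconds.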
- static delay(delay=None)[source]

Pause execution for delay seconds.

- Parameters
delay (Union[int, float, None]) -- a delay to pause execution using time.sleep(delay); a small 1 second jitter is applied to the delay.
Note
This method uses a default random delay, i.e.
random.uniform(DEFAULT_DELAY_MIN, DEFAULT_DELAY_MAX)
; using a random interval helps to avoid AWS API throttle limits when many concurrent tasks request job-descriptions.
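A sketch of this pause behavior under the assumptions stated above; DEFAULT_DELAY_MIN and DEFAULT_DELAY_MAX are assumed values for the class defaults, and the jitter shape mirrors the note about explicit delays.

```python
import random
import time

DEFAULT_DELAY_MIN = 1   # assumed default, in seconds
DEFAULT_DELAY_MAX = 10  # assumed default, in seconds


def pause(delay=None):
    """Sleep for a jittered delay; with no argument, sleep for a random
    interval between the default bounds."""
    if delay is None:
        delay = random.uniform(DEFAULT_DELAY_MIN, DEFAULT_DELAY_MAX)
    else:
        delay = abs(delay) + random.uniform(0, 1)  # ~1 second jitter
    time.sleep(delay)
```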
- static exponential_delay(tries)[source]

An exponential back-off delay, with random jitter. There is a maximum interval of 10 minutes (with random jitter between 3 and 10 minutes). This is used in the poll_for_job_status() method.

Examples of behavior:

    def exp(tries):
        max_interval = 600.0  # 10 minutes in seconds
        delay = 1 + pow(tries * 0.6, 2)
        delay = min(max_interval, delay)
        print(delay / 3, delay)

    for tries in range(10):
        exp(tries)

    # 0.33 1.0
    # 0.45 1.35
    # 0.81 2.44
    # 1.41 4.23
    # 2.25 6.76
    # 3.33 10.00
    # 4.65 13.95
    # 6.21 18.64
    # 8.01 24.04
    # 10.05 30.15
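The back-off computation illustrated above can be sketched as a runnable function; this is an approximation of the described behavior (capped interval with jitter between one third of the interval and the full interval), not necessarily the hook's exact implementation.

```python
import random


def exponential_delay(tries: int) -> float:
    """Exponential back-off capped at 10 minutes, with random jitter
    between one third of the interval and the full interval."""
    max_interval = 600.0  # 10 minutes in seconds
    interval = min(max_interval, 1 + pow(tries * 0.6, 2))
    return random.uniform(interval / 3, interval)
```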
- class airflow.providers.amazon.aws.hooks.batch_client.AwsBatchProtocol(*args, **kwargs)[source]

Bases: BatchProtocol, airflow.typing_compat.Protocol

This class is deprecated. Please use airflow.providers.amazon.aws.hooks.batch.BatchProtocol.
- class airflow.providers.amazon.aws.hooks.batch_client.AwsBatchClientHook(*args, **kwargs)[source]

Bases: BatchClientHook

This hook is deprecated. Please use airflow.providers.amazon.aws.hooks.batch.BatchClientHook.