airflow.providers.microsoft.azure.operators.batch

Module Contents

Classes

AzureBatchOperator

Executes a job on Azure Batch Service

class airflow.providers.microsoft.azure.operators.batch.AzureBatchOperator(*, batch_pool_id, batch_pool_vm_size, batch_job_id, batch_task_command_line, batch_task_id, vm_node_agent_sku_id, vm_publisher=None, vm_offer=None, sku_starts_with=None, vm_sku=None, vm_version=None, os_family=None, os_version=None, batch_pool_display_name=None, batch_job_display_name=None, batch_job_manager_task=None, batch_job_preparation_task=None, batch_job_release_task=None, batch_task_display_name=None, batch_task_container_settings=None, batch_start_task=None, batch_max_retries=3, batch_task_resource_files=None, batch_task_output_files=None, batch_task_user_identity=None, target_low_priority_nodes=None, target_dedicated_nodes=None, enable_auto_scale=False, auto_scale_formula=None, azure_batch_conn_id='azure_batch_default', use_latest_verified_vm_image_and_sku=False, timeout=25, should_delete_job=False, should_delete_pool=False, **kwargs)[source]

Bases: airflow.models.BaseOperator

Executes a job on Azure Batch Service.

Parameters
  • batch_pool_id (str) – A string that uniquely identifies the Pool within the Account.

  • batch_pool_vm_size (str) – The size of virtual machines in the Pool

  • batch_job_id (str) – A string that uniquely identifies the Job within the Account.

  • batch_task_command_line (str) – The command line of the Task

  • batch_task_id (str) – A string that uniquely identifies the task within the Job.

  • batch_pool_display_name (str | None) – The display name for the Pool. The display name need not be unique

  • batch_job_display_name (str | None) – The display name for the Job. The display name need not be unique

  • batch_job_manager_task (batch_models.JobManagerTask | None) – Details of a Job Manager Task to be launched when the Job is started.

  • batch_job_preparation_task (batch_models.JobPreparationTask | None) – The Job Preparation Task. If set, the Batch service will run the Job Preparation Task on a Node before starting any Tasks of that Job on that Compute Node. Required if batch_job_release_task is set.

  • batch_job_release_task (batch_models.JobReleaseTask | None) – The Job Release Task. Use to undo changes to Compute Nodes made by the Job Preparation Task

  • batch_task_display_name (str | None) – The display name for the task. The display name need not be unique

  • batch_task_container_settings (batch_models.TaskContainerSettings | None) – The settings for the container under which the Task runs

  • batch_start_task (batch_models.StartTask | None) – A Task specified to run on each Compute Node as it joins the Pool. The Task runs when the Compute Node is added to the Pool or when the Compute Node is restarted.

  • batch_max_retries (int) – The number of times to retry this batch operation before it’s considered a failed operation. Default is 3

  • batch_task_resource_files (list[batch_models.ResourceFile] | None) – A list of files that the Batch service will download to the Compute Node before running the command line.

  • batch_task_output_files (list[batch_models.OutputFile] | None) – A list of files that the Batch service will upload from the Compute Node after running the command line.

  • batch_task_user_identity (batch_models.UserIdentity | None) – The user identity under which the Task runs. If omitted, the Task runs as a non-administrative user unique to the Task.

  • target_low_priority_nodes (int | None) – The desired number of low-priority Compute Nodes in the Pool. This property must not be specified if enable_auto_scale is set to True.

  • target_dedicated_nodes (int | None) – The desired number of dedicated Compute Nodes in the Pool. This property must not be specified if enable_auto_scale is set to True.

  • enable_auto_scale (bool) – Whether the Pool size should automatically adjust over time. Default is False.

  • auto_scale_formula (str | None) – A formula for the desired number of Compute Nodes in the Pool. This property must not be specified if enable_auto_scale is set to False. It is required if enable_auto_scale is set to True.

  • azure_batch_conn_id (str) – The Azure Batch connection id. Default is 'azure_batch_default'.

  • use_latest_verified_vm_image_and_sku (bool) – Whether to use the latest verified virtual machine image and sku in the batch account. Default is False.

  • vm_publisher (str | None) – The publisher of the Azure Virtual Machines Marketplace Image. For example, Canonical or MicrosoftWindowsServer. Required if use_latest_verified_vm_image_and_sku is set to True.

  • vm_offer (str | None) – The offer type of the Azure Virtual Machines Marketplace Image. For example, UbuntuServer or WindowsServer. Required if use_latest_verified_vm_image_and_sku is set to True.

  • sku_starts_with (str | None) – The starting string of the Virtual Machine SKU. Required if use_latest_verified_vm_image_and_sku is set to True.

  • vm_sku (str | None) – The name of the virtual machine sku to use

  • vm_version (str | None) – The version of the virtual machine

  • vm_node_agent_sku_id (str) – The node agent sku id of the virtual machine

  • os_family (str | None) – The Azure Guest OS family to be installed on the virtual machines in the Pool.

  • os_version (str | None) – The OS family version

  • timeout (int) – The time to wait for the job to complete, in minutes. Default is 25.

  • should_delete_job (bool) – Whether to delete job after execution. Default is False

  • should_delete_pool (bool) – Whether to delete pool after execution of jobs. Default is False
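Several of the parameters above constrain one another (auto-scaling versus fixed node counts, the latest-verified-image option, and the job release task). The sketch below collects those documented rules into a small pre-flight check. The helper name `check_batch_kwargs` and every value in `example_kwargs` are illustrative assumptions; the helper only mirrors the parameter descriptions and is not the operator's own validation logic.

```python
from __future__ import annotations

# Keyword arguments the signature lists without a default value.
REQUIRED = (
    "batch_pool_id",
    "batch_pool_vm_size",
    "batch_job_id",
    "batch_task_command_line",
    "batch_task_id",
    "vm_node_agent_sku_id",
)


def check_batch_kwargs(kwargs: dict) -> list[str]:
    """Return the documented constraints that ``kwargs`` violates.

    Hypothetical helper: it encodes only the rules stated in the
    parameter descriptions, nothing more.
    """
    problems = []
    for name in REQUIRED:
        if not kwargs.get(name):
            problems.append(f"{name} is required")

    formula = kwargs.get("auto_scale_formula")
    if kwargs.get("enable_auto_scale", False):
        if not formula:
            problems.append("auto_scale_formula is required when enable_auto_scale is True")
        for name in ("target_dedicated_nodes", "target_low_priority_nodes"):
            if kwargs.get(name) is not None:
                problems.append(f"{name} must not be set when enable_auto_scale is True")
    elif formula:
        problems.append("auto_scale_formula must not be set when enable_auto_scale is False")

    if kwargs.get("use_latest_verified_vm_image_and_sku", False):
        for name in ("vm_publisher", "vm_offer", "sku_starts_with"):
            if not kwargs.get(name):
                problems.append(
                    f"{name} is required when use_latest_verified_vm_image_and_sku is True"
                )

    if (
        kwargs.get("batch_job_release_task") is not None
        and kwargs.get("batch_job_preparation_task") is None
    ):
        problems.append("batch_job_preparation_task is required when batch_job_release_task is set")
    return problems


# Placeholder values for illustration only; substitute ones valid for your account.
example_kwargs = {
    "batch_pool_id": "example-pool",
    "batch_pool_vm_size": "standard_d2_v2",
    "batch_job_id": "example-job",
    "batch_task_command_line": "/bin/bash -c 'echo hello'",
    "batch_task_id": "example-task",
    "vm_node_agent_sku_id": "batch.node.ubuntu 18.04",
    "vm_publisher": "Canonical",
    "vm_offer": "UbuntuServer",
    "vm_sku": "18.04-LTS",
    "target_dedicated_nodes": 1,
}

print(check_batch_kwargs(example_kwargs))
```

A dictionary that passes the check could then be supplied inside a DAG as `AzureBatchOperator(task_id="...", **example_kwargs)`, where `task_id` is the usual BaseOperator argument.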

template_fields: Sequence[str] = ('batch_pool_id', 'batch_pool_vm_size', 'batch_job_id', 'batch_task_id', 'batch_task_command_line')[source]
ui_color = '#f0f0e4'[source]
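Only the five names in template_fields are rendered with Jinja before execute runs; any other argument is passed through unchanged, even if it contains template syntax. The following sketch separates a set of keyword arguments accordingly. The helper `split_by_template_fields` and the example values are hypothetical; `ds` and `ds_nodash` are standard Airflow template variables.

```python
# Copied from the template_fields attribute documented above.
TEMPLATE_FIELDS = (
    "batch_pool_id",
    "batch_pool_vm_size",
    "batch_job_id",
    "batch_task_id",
    "batch_task_command_line",
)


def split_by_template_fields(kwargs: dict) -> tuple:
    """Split kwargs into (rendered by Jinja, passed through literally)."""
    templated = {k: v for k, v in kwargs.items() if k in TEMPLATE_FIELDS}
    literal = {k: v for k, v in kwargs.items() if k not in TEMPLATE_FIELDS}
    return templated, literal


example_kwargs = {
    # Rendered per run, e.g. a job id that embeds the logical date.
    "batch_job_id": "nightly-{{ ds_nodash }}",
    "batch_task_command_line": "/bin/bash -c 'echo {{ ds }}'",
    # Not a template field: the braces would reach Azure Batch verbatim.
    "batch_task_display_name": "run for {{ ds }}",
}

templated, literal = split_by_template_fields(example_kwargs)
print(sorted(templated))
print(sorted(literal))
```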
execute(context)[source]

This is the main method to derive when creating an operator. Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.

on_kill()[source]

Override this method to clean up subprocesses when a task instance gets killed. Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.

get_hook()[source]

Create and return an AzureBatchHook.

clean_up(pool_id=None, job_id=None)[source]

Delete the given pool and job in the batch account.

Parameters
  • pool_id (str | None) – The id of the pool to delete

  • job_id (str | None) – The id of the job to delete
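Since both arguments of clean_up are optional, the should_delete_job and should_delete_pool flags can each map onto their own call. The sketch below shows one plausible mapping; the function `planned_clean_up_calls` is hypothetical, and the operator's actual logic inside execute is not documented on this page.

```python
from __future__ import annotations


def planned_clean_up_calls(
    pool_id: str,
    job_id: str,
    should_delete_job: bool = False,
    should_delete_pool: bool = False,
) -> list[dict]:
    """Return keyword arguments for each clean_up call the flags imply.

    Assumption: the job is removed before the pool that hosted it.
    """
    calls = []
    if should_delete_job:
        calls.append({"job_id": job_id})
    if should_delete_pool:
        calls.append({"pool_id": pool_id})
    return calls


# Each dict could be passed as operator.clean_up(**call).
for call in planned_clean_up_calls(
    "example-pool", "example-job", should_delete_job=True, should_delete_pool=True
):
    print(call)
```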
