airflow.providers.cncf.kubernetes.operators.job

Executes a Kubernetes Job.

Module Contents

Classes

KubernetesJobOperator

Executes a Kubernetes Job.

KubernetesDeleteJobOperator

Delete a Kubernetes Job.

KubernetesPatchJobOperator

Update a Kubernetes Job.

Attributes

log

airflow.providers.cncf.kubernetes.operators.job.log[source]
class airflow.providers.cncf.kubernetes.operators.job.KubernetesJobOperator(*, job_template_file=None, full_job_spec=None, backoff_limit=None, completion_mode=None, completions=None, manual_selector=None, parallelism=None, selector=None, suspend=None, ttl_seconds_after_finished=None, wait_until_job_complete=False, job_poll_interval=10, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)[source]

Bases: airflow.providers.cncf.kubernetes.operators.pod.KubernetesPodOperator

Executes a Kubernetes Job.

See also

For more information on how to use this operator, take a look at the guide: KubernetesJobOperator

Note

If you use Google Kubernetes Engine and Airflow is not running in the same cluster, consider using GKEStartJobOperator, which simplifies the authorization process.

Parameters
  • job_template_file (str | None) – path to job template file (templated)

  • full_job_spec (kubernetes.client.models.V1Job | None) – The complete JodSpec

  • backoff_limit (int | None) – Specifies the number of retries before marking this job failed. Defaults to 6

  • completion_mode (str | None) – CompletionMode specifies how Pod completions are tracked. It can be NonIndexed (default) or Indexed.

  • completions (int | None) – Specifies the desired number of successfully finished pods the job should be run with.

  • manual_selector (bool | None) – manualSelector controls generation of pod labels and pod selectors.

  • parallelism (int | None) – Specifies the maximum desired number of pods the job should run at any given time.

  • selector (kubernetes.client.models.V1LabelSelector | None) – The selector of this V1JobSpec.

  • suspend (bool | None) – Suspend specifies whether the Job controller should create Pods or not.

  • ttl_seconds_after_finished (int | None) – ttlSecondsAfterFinished limits the lifetime of a Job that has finished execution (either Complete or Failed).

  • wait_until_job_complete (bool) – Whether to wait until started job finished execution (either Complete or Failed). Default is False.

  • job_poll_interval (float) – Interval in seconds between polling the job status. Default is 10. Used if the parameter wait_until_job_complete set True.

  • deferrable (bool) – Run operator in the deferrable mode. Note that the parameter wait_until_job_complete must be set True.

template_fields: collections.abc.Sequence[str][source]
hook()[source]
job_client()[source]
create_job(job_request_obj)[source]
execute(context)[source]

Based on the deferrable parameter runs the pod asynchronously or synchronously.

execute_deferrable()[source]
execute_complete(context, event, **kwargs)[source]
static deserialize_job_template_file(path)[source]

Generate a Job from a file.

Unfortunately we need access to the private method _ApiClient__deserialize_model from the kubernetes client. This issue is tracked here: https://github.com/kubernetes-client/python/issues/977.

Parameters

path (str) – Path to the file

Returns

a kubernetes.client.models.V1Job

Return type

kubernetes.client.models.V1Job

on_kill()[source]

Override this method to clean up subprocesses when a task instance gets killed.

Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.

build_job_request_obj(context=None)[source]

Return V1Job object based on job template file, full job spec, and other operator parameters.

The V1Job attributes are derived (in order of precedence) from operator params, full job spec, job template file.

static reconcile_jobs(base_job, client_job)[source]

Merge Kubernetes Job objects.

Parameters
  • base_job (kubernetes.client.models.V1Job) – has the base attributes which are overwritten if they exist in the client job and remain if they do not exist in the client_job

  • client_job (kubernetes.client.models.V1Job | None) – the job that the client wants to create.

Returns

the merged jobs

Return type

kubernetes.client.models.V1Job

This can’t be done recursively as certain fields are overwritten and some are concatenated.

static reconcile_job_specs(base_spec, client_spec)[source]

Merge Kubernetes JobSpec objects.

Parameters
  • base_spec (kubernetes.client.models.V1JobSpec | None) – has the base attributes which are overwritten if they exist in the client_spec and remain if they do not exist in the client_spec

  • client_spec (kubernetes.client.models.V1JobSpec | None) – the spec that the client wants to create.

Returns

the merged specs

Return type

kubernetes.client.models.V1JobSpec | None

class airflow.providers.cncf.kubernetes.operators.job.KubernetesDeleteJobOperator(*, name, namespace, kubernetes_conn_id=KubernetesHook.default_conn_name, config_file=None, in_cluster=None, cluster_context=None, delete_on_status=None, wait_for_completion=False, poll_interval=10.0, **kwargs)[source]

Bases: airflow.models.BaseOperator

Delete a Kubernetes Job.

See also

For more information on how to use this operator, take a look at the guide: KubernetesDeleteJobOperator

Parameters
  • name (str) – name of the Job.

  • namespace (str) – the namespace to run within kubernetes.

  • kubernetes_conn_id (str | None) – The kubernetes connection id for the Kubernetes cluster.

  • config_file (str | None) – The path to the Kubernetes config file. (templated) If not specified, default value is ~/.kube/config

  • in_cluster (bool | None) – run kubernetes client with in_cluster configuration.

  • cluster_context (str | None) – context that points to kubernetes cluster. Ignored when in_cluster is True. If None, current-context is used. (templated)

  • delete_on_status (str | None) – Condition for performing delete operation depending on the job status. Values: None - delete the job regardless of its status, “Complete” - delete only successfully completed jobs, “Failed” - delete only failed jobs. (default: None)

  • wait_for_completion (bool) – Whether to wait for the job to complete. (default: False)

  • poll_interval (float) – Interval in seconds between polling the job status. Used when the delete_on_status parameter is set. (default: 10.0)

template_fields: collections.abc.Sequence[str] = ('config_file', 'name', 'namespace', 'cluster_context')[source]
hook()[source]
client()[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

class airflow.providers.cncf.kubernetes.operators.job.KubernetesPatchJobOperator(*, name, namespace, body, kubernetes_conn_id=KubernetesHook.default_conn_name, config_file=None, in_cluster=None, cluster_context=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

Update a Kubernetes Job.

See also

For more information on how to use this operator, take a look at the guide: KubernetesPatchJobOperator

Parameters
  • name (str) – name of the Job

  • namespace (str) – the namespace to run within kubernetes

  • body (object) – Job json object with parameters for update https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#job-v1-batch e.g. {"spec": {"suspend": True}}

  • kubernetes_conn_id (str | None) – The kubernetes connection id for the Kubernetes cluster.

  • config_file (str | None) – The path to the Kubernetes config file. (templated) If not specified, default value is ~/.kube/config

  • in_cluster (bool | None) – run kubernetes client with in_cluster configuration.

  • cluster_context (str | None) – context that points to kubernetes cluster. Ignored when in_cluster is True. If None, current-context is used. (templated)

template_fields: collections.abc.Sequence[str] = ('config_file', 'name', 'namespace', 'body', 'cluster_context')[source]
hook()[source]
execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used as when rendering jinja templates.

Refer to get_template_context for more context.

Was this entry helpful?