airflow.providers.cncf.kubernetes.operators.job¶
Executes a Kubernetes Job.
Module Contents¶
Classes¶
KubernetesJobOperator | Executes a Kubernetes Job.
KubernetesDeleteJobOperator | Delete a Kubernetes Job.
KubernetesPatchJobOperator | Update a Kubernetes Job.
Attributes¶
- class airflow.providers.cncf.kubernetes.operators.job.KubernetesJobOperator(*, job_template_file=None, full_job_spec=None, backoff_limit=None, completion_mode=None, completions=None, manual_selector=None, parallelism=None, selector=None, suspend=None, ttl_seconds_after_finished=None, wait_until_job_complete=False, job_poll_interval=10, deferrable=conf.getboolean('operators', 'default_deferrable', fallback=False), **kwargs)[source]¶
Bases:
airflow.providers.cncf.kubernetes.operators.pod.KubernetesPodOperator
Executes a Kubernetes Job.
See also
For more information on how to use this operator, take a look at the guide: KubernetesJobOperator
Note
If you use Google Kubernetes Engine and Airflow is not running in the same cluster, consider using GKEStartJobOperator, which simplifies the authorization process.
- Parameters
job_template_file (str | None) – path to job template file (templated)
full_job_spec (kubernetes.client.models.V1Job | None) – The complete JobSpec
backoff_limit (int | None) – Specifies the number of retries before marking this job failed. Defaults to 6
completion_mode (str | None) – CompletionMode specifies how Pod completions are tracked. It can be NonIndexed (default) or Indexed.
completions (int | None) – Specifies the desired number of successfully finished pods the job should be run with.
manual_selector (bool | None) – manualSelector controls generation of pod labels and pod selectors.
parallelism (int | None) – Specifies the maximum desired number of pods the job should run at any given time.
selector (kubernetes.client.models.V1LabelSelector | None) – The selector of this V1JobSpec.
suspend (bool | None) – Suspend specifies whether the Job controller should create Pods or not.
ttl_seconds_after_finished (int | None) – ttlSecondsAfterFinished limits the lifetime of a Job that has finished execution (either Complete or Failed).
wait_until_job_complete (bool) – Whether to wait until the started job has finished execution (either Complete or Failed). Default is False.
job_poll_interval (float) – Interval in seconds between polls of the job status. Default is 10. Used only if wait_until_job_complete is set to True.
deferrable (bool) – Run the operator in deferrable mode. Note that wait_until_job_complete must also be set to True.
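The interplay between wait_until_job_complete, job_poll_interval, and deferrable can be sketched as a small pre-flight check. The helper name `check_job_wait_settings` is hypothetical and not part of the provider; it only restates the parameter notes above in plain Python and makes no claim about how the operator itself reacts to these combinations.

```python
from typing import List


def check_job_wait_settings(
    wait_until_job_complete: bool = False,
    job_poll_interval: float = 10,
    deferrable: bool = False,
) -> List[str]:
    """Return warnings for settings that the parameter notes say have no effect.

    Hypothetical helper: it mirrors the documentation, not the operator's code.
    """
    warnings = []
    if deferrable and not wait_until_job_complete:
        warnings.append("deferrable requires wait_until_job_complete=True")
    if job_poll_interval != 10 and not wait_until_job_complete:
        warnings.append(
            "job_poll_interval is only used when wait_until_job_complete=True"
        )
    if job_poll_interval <= 0:
        warnings.append("job_poll_interval should be a positive number of seconds")
    return warnings


# Keyword arguments one might pass to KubernetesJobOperator (illustrative values).
job_kwargs = {
    "wait_until_job_complete": True,
    "job_poll_interval": 5,
    "deferrable": True,
}
print(check_job_wait_settings(**job_kwargs))  # prints []
```

An empty list means the three settings are consistent with each other according to the notes above.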
- template_fields: collections.abc.Sequence[str][source]¶
- execute(context)[source]¶
Run the pod asynchronously or synchronously, depending on the deferrable parameter.
- static deserialize_job_template_file(path)[source]¶
Generate a Job from a file.
Unfortunately we need access to the private method _ApiClient__deserialize_model from the kubernetes client. This issue is tracked here: https://github.com/kubernetes-client/python/issues/977.
- Parameters
path (str) – Path to the file
- Returns
a kubernetes.client.models.V1Job
- Return type
kubernetes.client.models.V1Job
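For orientation, here is a minimal sketch of what a job template file might contain, assuming it holds an ordinary batch/v1 Job manifest. The job name, image, command, and the `write_job_template` helper are placeholders invented for this example. The manifest is written as JSON only to keep the sketch within the standard library; template files are more commonly written as YAML, and whether the loader accepts JSON formatting is an assumption.

```python
import json
import tempfile
from pathlib import Path

# A minimal batch/v1 Job manifest; name, image, and command are placeholders.
job_manifest = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "example-job"},
    "spec": {
        "backoffLimit": 2,
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [
                    {"name": "main", "image": "busybox", "command": ["echo", "hello"]}
                ],
            }
        },
    },
}


def write_job_template(manifest: dict, directory: str) -> Path:
    """Write the manifest as JSON and return the path of the new file."""
    path = Path(directory) / "job_template.json"
    path.write_text(json.dumps(manifest, indent=2))
    return path


with tempfile.TemporaryDirectory() as tmp:
    template_path = write_job_template(job_manifest, tmp)
    # Read the file back to confirm it round-trips unchanged.
    loaded = json.loads(template_path.read_text())

print(loaded["kind"], loaded["metadata"]["name"])  # prints: Job example-job
```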
- on_kill()[source]¶
Override this method to clean up subprocesses when a task instance gets killed.
Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.
- build_job_request_obj(context=None)[source]¶
Return V1Job object based on job template file, full job spec, and other operator parameters.
The V1Job attributes are derived (in order of precedence) from operator params, full job spec, job template file.
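The precedence order can be illustrated with flat dictionaries. This is only a sketch of the lookup order: the real method works on V1Job objects with nested fields, and the field values and the `drop_unset` helper below are hypothetical.

```python
from collections import ChainMap


def drop_unset(values: dict) -> dict:
    """Keep only keys whose value was actually provided (is not None)."""
    return {key: value for key, value in values.items() if value is not None}


# Three hypothetical sources for the same Job spec fields.
template_file_spec = {"backoff_limit": 6, "parallelism": 1, "completions": 1}
full_job_spec = {"backoff_limit": 3, "parallelism": None, "completions": 4}
operator_params = {"backoff_limit": None, "parallelism": 2, "completions": None}

# ChainMap looks keys up left to right, so the leftmost mapping wins:
# operator params first, then the full job spec, then the template file.
resolved = dict(
    ChainMap(
        drop_unset(operator_params),
        drop_unset(full_job_spec),
        drop_unset(template_file_spec),
    )
)
print(resolved)
```

Here parallelism comes from the operator parameters, while backoff_limit and completions fall through to the full job spec because the operator left them unset.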
- static reconcile_jobs(base_job, client_job)[source]¶
Merge Kubernetes Job objects.
- Parameters
base_job (kubernetes.client.models.V1Job) – has the base attributes which are overwritten if they exist in the client job and remain if they do not exist in the client_job
client_job (kubernetes.client.models.V1Job | None) – the job that the client wants to create.
- Returns
the merged jobs
- Return type
kubernetes.client.models.V1Job
This can’t be done recursively as certain fields are overwritten and some are concatenated.
- static reconcile_job_specs(base_spec, client_spec)[source]¶
Merge Kubernetes JobSpec objects.
- Parameters
base_spec (kubernetes.client.models.V1JobSpec | None) – has the base attributes which are overwritten if they exist in the client_spec and remain if they do not exist in the client_spec
client_spec (kubernetes.client.models.V1JobSpec | None) – the spec that the client wants to create.
- Returns
the merged specs
- Return type
kubernetes.client.models.V1JobSpec | None
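The merge rule described for base_spec and client_spec (client values overwrite, base values remain where the client has none) can be sketched with a small dataclass. `SpecSketch` and `reconcile_specs_sketch` are hypothetical stand-ins; the sketch covers scalar fields only and ignores the pod template handling that a real JobSpec merge would need.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass
class SpecSketch:
    """Hypothetical stand-in for V1JobSpec with three scalar fields."""

    backoff_limit: Optional[int] = None
    parallelism: Optional[int] = None
    suspend: Optional[bool] = None


def reconcile_specs_sketch(
    base: Optional[SpecSketch], client: Optional[SpecSketch]
) -> Optional[SpecSketch]:
    """Merge two specs: client values win, base values fill the gaps."""
    if base is None or client is None:
        # With only one side present there is nothing to merge.
        return client if client is not None else base
    overrides = {
        f.name: getattr(client, f.name)
        for f in fields(client)
        if getattr(client, f.name) is not None
    }
    # replace() returns a new object, so neither input is mutated.
    return replace(base, **overrides)


merged = reconcile_specs_sketch(
    SpecSketch(backoff_limit=6, parallelism=1),
    SpecSketch(backoff_limit=2, suspend=True),
)
print(merged)  # SpecSketch(backoff_limit=2, parallelism=1, suspend=True)
```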
- class airflow.providers.cncf.kubernetes.operators.job.KubernetesDeleteJobOperator(*, name, namespace, kubernetes_conn_id=KubernetesHook.default_conn_name, config_file=None, in_cluster=None, cluster_context=None, delete_on_status=None, wait_for_completion=False, poll_interval=10.0, **kwargs)[source]¶
Bases:
airflow.models.BaseOperator
Delete a Kubernetes Job.
See also
For more information on how to use this operator, take a look at the guide: KubernetesDeleteJobOperator
- Parameters
name (str) – name of the Job.
namespace (str) – the namespace to run within kubernetes.
kubernetes_conn_id (str | None) – The kubernetes connection id for the Kubernetes cluster.
config_file (str | None) – The path to the Kubernetes config file. (templated) If not specified, default value is ~/.kube/config
in_cluster (bool | None) – run kubernetes client with in_cluster configuration.
cluster_context (str | None) – context that points to kubernetes cluster. Ignored when in_cluster is True. If None, current-context is used. (templated)
delete_on_status (str | None) – Condition for performing the delete operation, depending on the job status. Values: None - delete the job regardless of its status, “Complete” - delete only successfully completed jobs, “Failed” - delete only failed jobs. (default: None)
wait_for_completion (bool) – Whether to wait for the job to complete. (default: False)
poll_interval (float) – Interval in seconds between polling the job status. Used when the delete_on_status parameter is set. (default: 10.0)
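The three documented delete_on_status values amount to a simple decision rule, sketched below. The `should_delete` function and the plain-string job status are assumptions of this example; the operator itself inspects the Job object returned by the cluster.

```python
from typing import Optional

ALLOWED_DELETE_ON_STATUS = (None, "Complete", "Failed")


def should_delete(job_status: str, delete_on_status: Optional[str] = None) -> bool:
    """Decide whether a job in ``job_status`` should be deleted.

    ``job_status`` is a simplified label ("Complete", "Failed", or anything
    else for a job that has not finished); this is an assumption of the sketch.
    """
    if delete_on_status not in ALLOWED_DELETE_ON_STATUS:
        raise ValueError(f"unsupported delete_on_status: {delete_on_status!r}")
    if delete_on_status is None:
        return True  # delete regardless of status
    return job_status == delete_on_status


for status in ("Complete", "Failed", "Running"):
    print(status, should_delete(status, delete_on_status="Failed"))
```

With delete_on_status="Failed", only the failed job is selected for deletion; the completed and still-running jobs are left in place.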
- template_fields: collections.abc.Sequence[str] = ('config_file', 'name', 'namespace', 'cluster_context')[source]¶
- class airflow.providers.cncf.kubernetes.operators.job.KubernetesPatchJobOperator(*, name, namespace, body, kubernetes_conn_id=KubernetesHook.default_conn_name, config_file=None, in_cluster=None, cluster_context=None, **kwargs)[source]¶
Bases:
airflow.models.BaseOperator
Update a Kubernetes Job.
See also
For more information on how to use this operator, take a look at the guide: KubernetesPatchJobOperator
- Parameters
name (str) – name of the Job
namespace (str) – the namespace to run within kubernetes
body (object) – Job JSON object with the parameters to update (see https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#job-v1-batch), e.g. {"spec": {"suspend": True}}
kubernetes_conn_id (str | None) – The kubernetes connection id for the Kubernetes cluster.
config_file (str | None) – The path to the Kubernetes config file. (templated) If not specified, default value is ~/.kube/config
in_cluster (bool | None) – run kubernetes client with in_cluster configuration.
cluster_context (str | None) – context that points to kubernetes cluster. Ignored when in_cluster is True. If None, current-context is used. (templated)
- template_fields: collections.abc.Sequence[str] = ('config_file', 'name', 'namespace', 'body', 'cluster_context')[source]¶
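As a minimal sketch of the body parameter, the helper below builds the suspend patch given as the example in the parameter description. The `suspend_patch` name is hypothetical; only the {"spec": {"suspend": ...}} shape comes from the documentation above.

```python
import json


def suspend_patch(suspended: bool) -> dict:
    """Build a patch body that only touches spec.suspend."""
    return {"spec": {"suspend": suspended}}


pause_body = suspend_patch(True)    # would suspend the Job
resume_body = suspend_patch(False)  # would resume it

# Serialized form, as it would appear in a JSON request body.
print(json.dumps(pause_body))  # prints {"spec": {"suspend": true}}
```

Keeping the body limited to the fields that should change makes the patch easy to review; either dictionary could be passed as body to KubernetesPatchJobOperator together with the Job's name and namespace.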