apache-airflow-providers-cncf-kubernetes¶
Provider package¶
This is a provider package for the cncf.kubernetes provider. All classes for this provider package are in the airflow.providers.cncf.kubernetes Python package.
Installation¶
You can install this package on top of an existing Airflow 2.3+ installation via
pip install apache-airflow-providers-cncf-kubernetes
PIP requirements¶
PIP package | Version required |
---|---|
Changelog¶
4.0.1¶
Bug Fixes¶
Add k8s container's error message in airflow exception (#22871)
KubernetesHook should try incluster first when not otherwise configured (#23126)
KubernetesPodOperator should patch "already checked" always (#22734)
Delete old Spark Application in SparkKubernetesOperator (#21092)
Cleanup dup code now that k8s provider requires 2.3.0+ (#22845)
Fix ''KubernetesPodOperator'' with ''KubernetesExecutor'' on 2.3.0 (#23371)
Fix KPO to have hyphen instead of period (#22982)
Fix new MyPy errors in main (#22884)
4.0.0¶
Breaking changes¶
The provider in version 4.0.0 only works with Airflow 2.3+. Please upgrade Airflow to version 2.3 or later if you want to use the features or fixes in the 4.* line of the provider.
The main reason for the incompatibility is the use of the latest Kubernetes libraries. The cncf.kubernetes provider requires newer versions of these libraries than Airflow 2.1 and 2.2 used for the Kubernetes Executor, which makes the provider incompatible with those Airflow versions.
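The version constraints stated in this changelog can be captured in a small check. The following is a minimal sketch and not part of the provider. The helper names (parse_version, minimum_airflow, is_compatible) are hypothetical, and the sketch encodes only what this page states: the 4.* line needs Airflow 2.3+, and versions from 2.0.0 onward need Airflow 2.1.0+. No minimum is stated here for the 1.* line, so the sketch reports no constraint for it.

```python
from itertools import takewhile
from typing import Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse '2.3' or '2.3.0rc1' into three ints, padding with zeros.

    Pre-release suffixes such as 'rc1' are ignored (a simplification).
    """
    numbers = []
    for piece in text.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


def minimum_airflow(provider_version: str) -> Optional[Version]:
    """Return the minimum Airflow version named in this changelog, if any."""
    major = parse_version(provider_version)[0]
    if major >= 4:
        return (2, 3, 0)  # stated under 4.0.0 "Breaking changes"
    if major >= 2:
        return (2, 1, 0)  # stated under 2.0.0 "Breaking changes"
    return None  # the changelog gives no minimum for the 1.* line


def is_compatible(provider_version: str, airflow_version: str) -> bool:
    """True if the Airflow version meets the documented minimum (or none is documented)."""
    required = minimum_airflow(provider_version)
    if required is None:
        return True
    return parse_version(airflow_version) >= required
```

For example, is_compatible("4.0.1", "2.2.5") is False, while is_compatible("3.1.0", "2.2.5") is True.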
Features¶
Log traceback only on ''DEBUG'' for KPO logs read interruption (#22595)
Update our approach for executor-bound dependencies (#22573)
Optionally not follow logs in KPO pod_manager (#22412)
Bug Fixes¶
Stop crashing when empty logs are received from kubernetes client (#22566)
3.1.2 (YANKED)¶
Bug Fixes¶
Fix mistakenly added install_requires for all providers (#22382)
Fix "run_id" k8s and elasticsearch compatibility with Airflow 2.1 (#22385)
Misc¶
Remove RefreshConfiguration workaround for K8s token refreshing (#20759)
3.1.0¶
Features¶
Add map_index label to mapped KubernetesPodOperator (#21916)
Change KubePodOperator labels from execution_date to run_id (#21960)
Misc¶
Support for Python 3.10
Fix Kubernetes example with wrong operator casing (#21898)
Remove types from KPO docstring (#21826)
3.0.0¶
Breaking changes¶
Parameter is_delete_operator_pod default is changed to True (#20575)
Simplify KubernetesPodOperator (#19572)
Move pod_mutation_hook call from PodManager to KubernetesPodOperator (#20596)
Rename ''PodLauncher'' to ''PodManager'' (#20576)
Parameter is_delete_operator_pod has new default¶
Previously, the default for the is_delete_operator_pod parameter was False, which meant that after a task ran, its pod was not deleted by the operator and remained on the cluster indefinitely. With this release, the default changes to True.
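If you relied on the old behaviour, one option is to pass is_delete_operator_pod explicitly. The sketch below is illustrative only: the task id, pod name, namespace, image, and command are placeholders, and the helper names pod_operator_kwargs and build_task are hypothetical. The import is kept inside build_task so that the module loads even where Airflow is not installed.

```python
from typing import Any, Dict


def pod_operator_kwargs(keep_pod_for_debugging: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for KubernetesPodOperator (placeholder values)."""
    return {
        "task_id": "example_task",            # placeholder
        "name": "example-pod",                # placeholder
        "namespace": "default",               # placeholder
        "image": "busybox",                   # placeholder
        "cmds": ["sh", "-c", "echo hello"],   # placeholder
        # True is the default since 3.0.0; set False to keep the pod around.
        "is_delete_operator_pod": not keep_pod_for_debugging,
    }


def build_task(keep_pod_for_debugging: bool = False):
    """Create the operator; requires Airflow and this provider to be installed."""
    from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
        KubernetesPodOperator,
    )

    return KubernetesPodOperator(**pod_operator_kwargs(keep_pod_for_debugging))
```

Setting the value explicitly makes the intent visible in the DAG file, whichever provider version is installed.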
Notes on changes to KubernetesPodOperator and PodLauncher¶
Warning
Many methods in KubernetesPodOperator and PodLauncher have been renamed. If you have subclassed KubernetesPodOperator, you will need to update your subclass to reflect the new structure. Additionally, the PodStatus enum has been renamed to PodPhase.
Overview¶
Generally speaking, if you did not subclass KubernetesPodOperator and you did not use the PodLauncher class directly, then you do not need to worry about this change. If, however, you have subclassed KubernetesPodOperator, what follows are some notes on the changes in this release.
One of the principal goals of the refactor is to clearly separate the "get or create pod" and "wait for pod completion" phases. Previously, the "wait for pod completion" logic was invoked differently depending on whether the operator was to "attach to an existing pod" (e.g. after a worker failure) or "create a new pod", which resulted in some code duplication and a bit more nesting of logic. With this refactor, we encapsulate the "get or create" step in the method KubernetesPodOperator.get_or_create_pod and pull the monitoring and XCom logic up into the top level of execute, because it can be the same for "attached" pods and "new" pods.
KubernetesPodOperator.get_or_create_pod first tries to find an existing pod using labels specific to the task instance (see KubernetesPodOperator.find_pod). If one does not exist, it creates a pod (see PodManager.create_pod).
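The "get or create" step can be illustrated without a cluster. The following is a simplified sketch, not the provider's implementation: FakeCluster is a hypothetical in-memory stand-in for the Kubernetes API, and the label keys used in the usage note are illustrative.

```python
from typing import Any, Dict, List, Optional

Pod = Dict[str, Any]


class FakeCluster:
    """In-memory stand-in for the Kubernetes API (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.pods: List[Pod] = []

    def find_pod(self, labels: Dict[str, str]) -> Optional[Pod]:
        """Return the first pod whose labels include all of the given labels."""
        for pod in self.pods:
            if all(pod["labels"].get(key) == value for key, value in labels.items()):
                return pod
        return None

    def create_pod(self, name: str, labels: Dict[str, str]) -> Pod:
        pod = {"name": name, "labels": dict(labels)}
        self.pods.append(pod)
        return pod


def get_or_create_pod(cluster: FakeCluster, name: str, ti_labels: Dict[str, str]) -> Pod:
    """Reuse a pod that matches the task-instance labels, else create one."""
    existing = cluster.find_pod(ti_labels)
    if existing is not None:
        return existing
    return cluster.create_pod(name, ti_labels)
```

Calling get_or_create_pod twice with the same labels, for example {"dag_id": "d", "task_id": "t", "run_id": "r"}, returns the same pod the second time, which mirrors the reattach case after a worker failure.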
The "waiting" part of execution has three components. The first step is to wait for the pod to leave the Pending phase (KubernetesPodOperator.await_pod_start). Next, if configured to do so, the operator follows the base container logs and forwards them to the task logger until the base container is done. If not configured to harvest the logs, the operator instead calls KubernetesPodOperator.await_container_completion. Either way, we must await container completion before harvesting XCom. After (optionally) extracting the XCom value from the base container, we await pod completion (PodManager.await_pod_completion).
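The ordering of the waiting steps can be sketched as follows. This is a simplified model, not the operator's code: RecordingPodManager is a hypothetical stand-in that only records which steps were called, and wait_for_pod is a hypothetical function name. The method names and the get_logs and do_xcom_push flags follow the names used by the operator.

```python
from typing import List, Optional


class RecordingPodManager:
    """Hypothetical stand-in that records which waiting steps were invoked."""

    def __init__(self, xcom_value: Optional[str] = None) -> None:
        self.calls: List[str] = []
        self._xcom_value = xcom_value

    def await_pod_start(self) -> None:
        self.calls.append("await_pod_start")

    def follow_container_logs(self) -> None:
        self.calls.append("follow_container_logs")

    def await_container_completion(self) -> None:
        self.calls.append("await_container_completion")

    def extract_xcom(self) -> Optional[str]:
        self.calls.append("extract_xcom")
        return self._xcom_value

    def await_pod_completion(self) -> None:
        self.calls.append("await_pod_completion")


def wait_for_pod(manager: RecordingPodManager, get_logs: bool, do_xcom_push: bool) -> Optional[str]:
    """Run the waiting steps in the order described above."""
    manager.await_pod_start()                 # pod leaves the Pending phase
    if get_logs:
        manager.follow_container_logs()       # returns once the base container is done
    else:
        manager.await_container_completion()  # same guarantee, without log forwarding
    result = manager.extract_xcom() if do_xcom_push else None
    manager.await_pod_completion()
    return result
```

Both branches end with the base container finished, which is why XCom extraction can follow either one.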
Previously, depending on whether the pod was "reattached to" (e.g. after a worker failure) or created anew, the waiting logic may have occurred in either handle_pod_overlap or create_new_pod_for_operator.
After the pod terminates, we execute different cleanup tasks depending on whether the pod terminated successfully.
If the pod terminates unsuccessfully, we attempt to log the pod events (PodLauncher.read_pod_events). If, additionally, the task is configured not to delete the pod after termination, we apply a label (KubernetesPodOperator.patch_already_checked) indicating that the pod failed and should not be "reattached to" in a retry. If the task is configured to delete its pod, we delete it (KubernetesPodOperator.process_pod_deletion). Finally, we raise an AirflowException to fail the task instance.
If the pod terminates successfully, we delete the pod (KubernetesPodOperator.process_pod_deletion) if configured to delete the pod, and push XCom if configured to push XCom.
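The cleanup branches described above can be summarised in a short sketch. It is a simplified model rather than the operator's code: the cleanup function and PodFailed exception are hypothetical, PodFailed stands in for AirflowException so that the sketch has no Airflow dependency, and the actions are recorded as strings instead of being executed.

```python
from typing import List


class PodFailed(Exception):
    """Stand-in for AirflowException (hypothetical, keeps the sketch dependency-free)."""


def cleanup(pod_succeeded: bool, delete_pod: bool, push_xcom: bool, actions: List[str]) -> None:
    """Append the cleanup steps that apply, then raise if the pod failed."""
    if not pod_succeeded:
        actions.append("read_pod_events")            # log events to aid debugging
        if delete_pod:
            actions.append("process_pod_deletion")
        else:
            actions.append("patch_already_checked")  # prevents reattaching on retry
        raise PodFailed("pod terminated unsuccessfully")
    if delete_pod:
        actions.append("process_pod_deletion")
    if push_xcom:
        actions.append("push_xcom")
```

Note that the "already checked" label only matters when the pod is kept, because a deleted pod cannot be found by a retry.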
Details on method renames, refactors, and deletions¶
In KubernetesPodOperator:
Method create_pod_launcher is converted to cached property pod_manager.
Construction of the k8s CoreV1Api client is now encapsulated within cached property client.
Logic to search for an existing pod (e.g. after an airflow worker failure) is moved out of execute and into method find_pod.
Method handle_pod_overlap is removed. Previously it monitored a "found" pod until completion. With this change the pod monitoring (and log following) is orchestrated directly from execute and it is the same whether it's a "found" pod or a "new" pod. See methods await_pod_start, follow_container_logs, await_container_completion and await_pod_completion.
Method create_pod_request_obj is renamed build_pod_request_obj. It now takes argument context in order to add TI-specific pod labels; previously they were added after return.
Method create_labels_for_pod is renamed _get_ti_pod_labels. This method doesn't return all labels, but only those specific to the TI. We also add parameter include_try_number to control the inclusion of this label instead of possibly filtering it out later.
Method _get_pod_identifying_label_string is renamed _build_find_pod_label_selector.
Method _try_numbers_match is removed.
Method create_new_pod_for_operator is removed. Previously it would mutate the labels on self.pod, launch the pod, monitor the pod to completion etc. Now this logic is in part handled by get_or_create_pod, where a new pod will be created if necessary. The monitoring etc is now orchestrated directly from execute. Again, see the calls to methods await_pod_start, follow_container_logs, await_container_completion and await_pod_completion.
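When auditing a subclass of KubernetesPodOperator against the list above, a small check can help. The sketch below is a hypothetical helper, not something shipped with the provider: the mapping is transcribed from the renames and removals listed above, and find_obsolete_overrides inspects only the attributes defined directly on the class it is given.

```python
from typing import Dict, List, Optional

# Old method name -> new name, or None if the method was removed.
OBSOLETE_KPO_METHODS: Dict[str, Optional[str]] = {
    "create_pod_launcher": "pod_manager",  # now a cached property
    "handle_pod_overlap": None,
    "create_pod_request_obj": "build_pod_request_obj",
    "create_labels_for_pod": "_get_ti_pod_labels",
    "_get_pod_identifying_label_string": "_build_find_pod_label_selector",
    "_try_numbers_match": None,
    "create_new_pod_for_operator": None,
}


def find_obsolete_overrides(cls: type) -> List[str]:
    """List overrides on cls that target renamed or removed methods."""
    messages = []
    for name in sorted(vars(cls)):
        if name not in OBSOLETE_KPO_METHODS:
            continue
        replacement = OBSOLETE_KPO_METHODS[name]
        if replacement is None:
            messages.append(f"{name}: removed")
        else:
            messages.append(f"{name}: renamed to {replacement}")
    return messages
```

Running it on a subclass that still defines create_pod_request_obj reports that the method is now build_pod_request_obj; the new context argument still has to be added by hand.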
In class PodManager (formerly PodLauncher):
Method start_pod is removed and split into two methods: create_pod and await_pod_start.
Method monitor_pod is removed and split into methods follow_container_logs, await_container_completion, await_pod_completion.
Methods pod_not_started, pod_is_running, process_status, and _task_status are removed. These were needed due to the way in which pod phase was mapped to task instance states; but we no longer do such a mapping and instead deal with pod phases directly and untransformed.
Method _extract_xcom is renamed extract_xcom.
Method read_pod_logs now takes kwarg container_name.
Other changes in pod_manager.py (formerly pod_launcher.py):
Class pod_launcher.PodLauncher renamed to pod_manager.PodManager.
Enum-like class PodStatus is renamed PodPhase, and the values are no longer lower-cased.
The airflow.settings.pod_mutation_hook is no longer called in cncf.kubernetes.utils.pod_manager.PodManager.run_pod_async. For KubernetesPodOperator, mutation now occurs in build_pod_request_obj.
Parameter is_delete_operator_pod default is changed to True so that pods are deleted after task completion and not left to accumulate. In practice it seems more common to disable pod deletion only on a temporary basis for debugging purposes and therefore pod deletion is the more sensible default.
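The change from lower-cased values to untransformed pod phases matters for code that compares phase strings. The class below is an illustrative stand-in modelled on the phase strings Kubernetes reports ("Pending", "Running", "Succeeded", "Failed"). It is an assumption about the shape of PodPhase, not a copy of the class in pod_manager.py, and is_terminal is a hypothetical helper.

```python
class PodPhase:
    """Illustrative stand-in: phase values use Kubernetes' own capitalisation."""

    PENDING = "Pending"
    RUNNING = "Running"
    FAILED = "Failed"
    SUCCEEDED = "Succeeded"

    terminal_states = {FAILED, SUCCEEDED}


def is_terminal(phase: str) -> bool:
    """True if the pod has finished, successfully or not."""
    return phase in PodPhase.terminal_states
```

A comparison written against the old lower-cased values, such as phase == "succeeded", no longer matches and should use the constant instead.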
Features¶
Add params config, in_cluster, and cluster_context to KubernetesHook (#19695)
Implement dry_run for KubernetesPodOperator (#20573)
Clarify docstring for ''build_pod_request_obj'' in K8s providers (#20574)
Bug Fixes¶
Fix Volume/VolumeMount KPO DeprecationWarning (#19726)
2.2.0¶
Features¶
Added namespace as a template field in the KPO. (#19718)
Decouple name randomization from name kwarg (#19398)
Bug Fixes¶
Checking event.status.container_statuses before filtering (#19713)
Coalesce 'extra' params to None in KubernetesHook (#19694)
Change to correct type in KubernetesPodOperator (#19459)
2.1.0¶
Features¶
Add more type hints to PodLauncher (#18928)
Add more information to PodLauncher timeout error (#17953)
2.0.3¶
Bug Fixes¶
Fix KubernetesPodOperator reattach when not deleting pods (#18070)
Make Kubernetes job description fit on one log line (#18377)
Do not fail KubernetesPodOperator tasks if log reading fails (#17649)
2.0.2¶
Bug Fixes¶
Fix using XCom with ''KubernetesPodOperator'' (#17760)
Import Hooks lazily individually in providers manager (#17682)
2.0.1¶
Features¶
Enable using custom pod launcher in Kubernetes Pod Operator (#16945)
Bug Fixes¶
BugFix: Using 'json' string in template_field causes issue with K8s Operators (#16930)
2.0.0¶
Breaking changes¶
Auto-apply apply_default decorator (#15667)
Warning
Due to apply_default decorator removal, this version of the provider requires Airflow 2.1.0+.
If your Airflow version is < 2.1.0, and you want to install this provider version, first upgrade
Airflow to at least version 2.1.0. Otherwise your Airflow package version will be upgraded
automatically and you will have to manually run airflow db upgrade
to complete the migration.
Features¶
Add 'KubernetesPodOperat' 'pod-template-file' jinja template support (#15942)
Save pod name to xcom for KubernetesPodOperator (#15755)
Bug Fixes¶
Bug Fix Pod-Template Affinity Ignored due to empty Affinity K8S Object (#15787)
Bug Pod Template File Values Ignored (#16095)
Fix issue with parsing error logs in the KPO (#15638)
Fix unsuccessful KubernetesPod final_state call when 'is_delete_operator_pod=True' (#15490)
1.2.0¶
Features¶
Require 'name' with KubernetesPodOperator (#15373)
Change KPO node_selectors warning to proper deprecationwarning (#15507)
Bug Fixes¶
Fix timeout when using XCom with KubernetesPodOperator (#15388)
Fix labels on the pod created by ''KubernetesPodOperator'' (#15492)
1.1.0¶
Features¶
Separate Kubernetes pod_launcher from core airflow (#15165)
Add ability to specify api group and version for Spark operators (#14898)
Use libyaml C library when available. (#14577)
1.0.2¶
Bug fixes¶
Allow pod name override in KubernetesPodOperator if pod_template is used. (#14186)
Allow users of the KPO to *actually* template environment variables (#14083)
1.0.1¶
Updated documentation and readme files.
Bug fixes¶
Pass image_pull_policy in KubernetesPodOperator correctly (#13289)
1.0.0¶
Initial version of the provider.