Provider packages¶
Apache Airflow 2 is built in a modular way. The “Core” of Apache Airflow provides the core scheduler functionality, which allows you to write some basic tasks, but the capabilities of Apache Airflow can be extended by installing additional packages, called providers.
Providers can contain operators, hooks, sensors, and transfer operators to communicate with a multitude of external systems, but they can also extend Airflow core with new capabilities.
You can install those provider packages separately in order to interface with a given service. The providers for Apache Airflow are designed so that you can easily write your own. The Apache Airflow Community develops and maintains more than 80 provider packages, but you are free to develop your own providers. The providers you build have exactly the same capabilities as the providers written by the community, so you can release and share them with others.
If you want to learn how to build your own custom provider, you can find all the information about it at How to create your own provider.
The full list of all the community managed providers is available at Providers Index.
You can also see an index of all the community providers' operators and hooks in Operators and Hooks Reference.
Extending Airflow core functionality¶
Providers let you extend core Airflow with extra capabilities. Airflow core provides basic, solid scheduling functionality, and the providers extend it. Here we describe all the custom capabilities.
Airflow automatically discovers which providers add those additional capabilities and, once you install a provider package and restart Airflow, they become automatically available to Airflow users.
A summary of all the core functionalities that can be extended is available in Core Extensions.
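To make the discovery step more concrete, the sketch below lists what is registered under the `apache_airflow_provider` entry-point group, which is the group provider packages use to advertise themselves to Airflow. It is only an illustration, not Airflow's own discovery code. The helper name `list_provider_entry_points` is hypothetical, and the function returns an empty list when no providers are installed.

```python
from importlib.metadata import entry_points
from typing import List

# Entry-point group that provider packages register under.
PROVIDER_GROUP = "apache_airflow_provider"


def list_provider_entry_points(group: str = PROVIDER_GROUP) -> List[str]:
    """Return the entry-point targets registered under ``group``, sorted.

    Hypothetical helper for illustration; Airflow performs its own,
    more thorough discovery internally.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ returns a selectable collection.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9 return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return sorted({ep.value for ep in selected})


if __name__ == "__main__":
    for target in list_provider_entry_points():
        print(target)
```

Running the script in an environment with providers installed prints one entry-point target per discovered provider.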
Configuration¶
Providers can have their own configuration options, which allow you to configure how they work.
You can see all community-managed providers with their own configuration in Configurations.
Auth backends¶
Providers can add custom authentication backends that let you configure how your web server authenticates users, integrating it with public or private authentication services.
You can see all the authentication backends available via community-managed providers in Auth backends.
Custom connections¶
Providers can add custom connection types, extending the connection form and handling custom form field behaviour for the connections defined by the provider.
You can see all the custom connections available via community-managed providers in Connections.
Extra links¶
Providers can add extra custom links to the operators they deliver. Those links are visible in the task details view of the task.
You can see all the extra links available via community-managed providers in Extra Links.
Logging¶
Providers can add additional task logging capabilities. By default, Apache Airflow saves task logs locally and makes them available to the Airflow UI via an internal HTTP server. However, providers can add extra logging capabilities, so that Airflow logs can be written to a remote service and retrieved from it.
You can see all task loggers available via community-managed providers in Writing logs.
Secret backends¶
Airflow can read connections, variables, and configuration from Secret Backends rather than from its own database.
You can see all the secret backends available via community-managed providers in Secret backends.
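As a rough illustration of how a provider-supplied secrets backend might be selected, the sketch below builds the environment variables that correspond to the `backend` and `backend_kwargs` options of the `[secrets]` configuration section. The helper `secrets_backend_env` is hypothetical, and the backend class and the `connections_prefix` value are only example inputs, assuming the Amazon provider is installed. Check the provider's own documentation for the options it actually supports.

```python
import json
from typing import Any, Dict


def secrets_backend_env(backend_class: str, backend_kwargs: Dict[str, Any]) -> Dict[str, str]:
    """Build environment variables that select a secrets backend.

    Hypothetical helper: it only formats values and does not import
    or validate the backend class.
    """
    return {
        # Maps to the ``backend`` option in the ``[secrets]`` section.
        "AIRFLOW__SECRETS__BACKEND": backend_class,
        # Maps to ``backend_kwargs``; the value is a JSON object.
        "AIRFLOW__SECRETS__BACKEND_KWARGS": json.dumps(backend_kwargs),
    }


if __name__ == "__main__":
    env = secrets_backend_env(
        # Example backend from the Amazon provider (assumed to be installed).
        "airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend",
        {"connections_prefix": "airflow/connections"},
    )
    for name, value in env.items():
        print(f"{name}={value}")
```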
Notifications¶
Providers can add custom notifications that let you configure how you would like to be notified about the status of your tasks and DAGs.
You can see all the notifications available via community-managed providers in Notifications.
Installing and upgrading providers¶
Separate provider packages offer possibilities that were not available in Airflow 1.10:
You can upgrade to the latest version of a particular provider without upgrading Apache Airflow core.
You can downgrade to a previous version of a particular provider if the new version introduces problems, without impacting the main Apache Airflow core package.
You can release and upgrade or downgrade provider packages incrementally, independently of each other. This means that you can validate each provider package update in your environment one at a time, following your usual tests.
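When validating provider updates one at a time, it can help to record which provider versions are installed before and after each change. The following is a minimal sketch that uses only the standard library. The helper name `installed_provider_versions` is hypothetical, and it relies on the `apache-airflow-providers-` naming prefix used by community packages.

```python
from importlib.metadata import distributions
from typing import Dict

# Naming prefix shared by community-managed provider packages.
PROVIDER_PREFIX = "apache-airflow-providers-"


def installed_provider_versions() -> Dict[str, str]:
    """Map each installed provider package name to its version."""
    versions = {}
    for dist in distributions():
        # Normalise the name so underscores and case do not matter.
        name = (dist.metadata.get("Name") or "").lower().replace("_", "-")
        if name.startswith(PROVIDER_PREFIX):
            versions[name] = dist.version
    return dict(sorted(versions.items()))


if __name__ == "__main__":
    for name, version in installed_provider_versions().items():
        print(f"{name}=={version}")
```

Comparing the output before and after an upgrade or downgrade shows exactly which provider changed, while the `apache-airflow` core package stays untouched.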
Types of providers¶
Providers have the same capabilities, no matter whether they are provided by the community or by third parties. This chapter explains how community-managed providers are versioned and released and how you can create your own providers.
Community maintained providers¶
From the point of view of the community, Airflow is delivered in multiple, separate packages. The core of the Airflow scheduling system is delivered as the apache-airflow package, and there are more than 80 provider packages which can be installed separately as so-called Airflow Provider packages. Those packages are available as apache-airflow-providers packages (for example, there is an apache-airflow-providers-amazon or apache-airflow-providers-google package).
Community-maintained providers are released and versioned separately from the Airflow releases. We follow the SemVer versioning scheme for the packages. Some versions of the provider packages might depend on particular versions of Airflow, but the general approach is that, unless there is a good reason, new versions of providers should work with recent versions of Airflow 2.x. Details vary per provider. If a particular version of a provider constrains the Airflow version that can be used, that limitation is expressed as a dependency constraint in the provider package.
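Because such limitations are recorded as package dependencies, they can be read from the installed package metadata. The sketch below is one way to do that with the standard library. The function names are hypothetical, and the requirement strings in the usage example are made-up samples rather than the real dependencies of any provider release.

```python
import re
from importlib.metadata import PackageNotFoundError, requires
from typing import Iterable, Optional

# Match "apache-airflow" itself, but not "apache-airflow-providers-...".
_AIRFLOW_REQ = re.compile(r"^apache-airflow(?![A-Za-z0-9._-])\s*([^;]*)", re.IGNORECASE)


def airflow_constraint(requirements: Optional[Iterable[str]]) -> Optional[str]:
    """Return the version specifier declared for apache-airflow, if any."""
    for req in requirements or []:
        match = _AIRFLOW_REQ.match(req.strip())
        if match:
            # Older metadata wraps the specifier in parentheses.
            return match.group(1).strip().strip("()").strip()
    return None


def provider_airflow_constraint(package: str) -> Optional[str]:
    """Look up the constraint for an installed provider package."""
    try:
        return airflow_constraint(requires(package))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Made-up sample requirements, for illustration only.
    sample = [
        "apache-airflow-providers-common-sql>=1.3.1",
        "apache-airflow>=2.6.0",
        "boto3>=1.28.0",
    ]
    print(airflow_constraint(sample))  # prints ">=2.6.0"
```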
Each community provider has a corresponding extra which can be used when installing Airflow to install the provider together with Apache Airflow. For example, you can install Airflow with the extras apache-airflow[google,amazon] (with correct constraints, see Installation of Airflow®), and you will install the appropriate versions of the apache-airflow-providers-amazon and apache-airflow-providers-google packages together with Apache Airflow.
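The sketch below shows how such an install command might be assembled, combining the extras with a constraints file. The helper `pip_install_args` is hypothetical, and the Airflow and Python versions in the usage example are illustrative values only. The constraints URL follows the pattern described in the Airflow installation documentation, so check that page for the versions that apply to you.

```python
import sys
from typing import List, Optional, Sequence

# Pattern of the constraints files published for each Airflow release.
CONSTRAINTS_URL = (
    "https://raw.githubusercontent.com/apache/airflow/"
    "constraints-{airflow}/constraints-{python}.txt"
)


def pip_install_args(
    airflow_version: str,
    extras: Sequence[str],
    python_version: Optional[str] = None,
) -> List[str]:
    """Build pip arguments for installing Airflow with provider extras."""
    if python_version is None:
        # Default to the interpreter running this script.
        python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    requirement = f"apache-airflow[{','.join(extras)}]=={airflow_version}"
    constraint = CONSTRAINTS_URL.format(airflow=airflow_version, python=python_version)
    return ["pip", "install", requirement, "--constraint", constraint]


if __name__ == "__main__":
    # Illustrative versions; substitute the ones you actually use.
    print(" ".join(pip_install_args("2.7.3", ["google", "amazon"], "3.8")))
```

The function only builds the argument list. It does not run pip, so the result can be inspected or logged before anything is installed.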
Some of the community providers have cross-provider dependencies as well. Those are not required dependencies; they simply enable certain features (for example, transfer operators often create a dependency between different providers). Again, the general approach here is that the providers are backwards compatible, including cross-dependencies. Any breaking changes and requirements on particular versions of other provider packages are automatically documented in the release notes of every provider.
Note
For Airflow 1.10, we also provided apache-airflow-backport-providers packages that could be installed with those versions. Those were the same providers as for 2.0, but automatically back-ported to work with Airflow 1.10. The last release of backport providers was done on March 17, 2021, and the backport providers will no longer be released, since Airflow 1.10 reached End-Of-Life on June 17, 2021.
If you want to contribute to Apache Airflow, you can see how to build and extend community-managed providers in https://github.com/apache/airflow/blob/main/providers/src/airflow/providers/MANAGING_PROVIDERS_LIFECYCLE.rst.