Reference for package extras
Here’s the list of all the extra dependencies of Apache Airflow.
The entries with * in the Preinstalled column indicate that those extras (providers) are always pre-installed when Airflow is installed.
Note
You can disable the automated installation of providers with extras when installing Airflow. To do so, set the INSTALL_PROVIDERS_FROM_SOURCES environment variable to true before running the pip install command. Contributors need to set it if they install Airflow locally and want to develop providers directly from the Airflow sources. The variable is set automatically in the Breeze development environment. Setting it is not needed in editable mode (pip install -e).
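As a minimal sketch of the note above, the snippet below sets the variable and then installs from the current directory. The setup.py check is only an assumed way of detecting an Airflow source checkout; it is not something the installation itself requires.

```shell
# Disable the automated installation of provider packages with extras.
export INSTALL_PROVIDERS_FROM_SOURCES="true"

# Assumption: a setup.py in the current directory marks an Airflow source checkout.
if [ -f setup.py ]; then
    pip install .
else
    echo "No setup.py here; run this from an Airflow source checkout." >&2
fi
```

Because the variable is exported, it stays set for any later pip commands run in the same shell session.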
Core Airflow extras
These are core Airflow extras that extend the capabilities of core Airflow. They usually do not install provider packages (the celery and cncf.kubernetes extras are the exceptions); they only install the Python dependencies needed for the given feature.
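extra | install command | enables |
---|---|---|
async | pip install 'apache-airflow[async]' | Async worker classes for Gunicorn |
celery | pip install 'apache-airflow[celery]' | CeleryExecutor (also installs the celery provider package!) |
cgroups | pip install 'apache-airflow[cgroups]' | Needed to use CgroupTaskRunner |
cncf.kubernetes | pip install 'apache-airflow[cncf.kubernetes]' | Kubernetes Executor (also installs the Kubernetes provider package) |
dask | pip install 'apache-airflow[dask]' | DaskExecutor |
deprecated_api | pip install 'apache-airflow[deprecated_api]' | Deprecated, experimental API that is replaced with the new REST API |
github_enterprise | pip install 'apache-airflow[github_enterprise]' | GitHub Enterprise auth backend |
google_auth | pip install 'apache-airflow[google_auth]' | Google auth backend |
kerberos | pip install 'apache-airflow[kerberos]' | Kerberos integration for Kerberized services (Hadoop, Presto, Trino) |
ldap | pip install 'apache-airflow[ldap]' | LDAP authentication for users |
leveldb | pip install 'apache-airflow[leveldb]' | Required to use the leveldb extra in the google provider |
pandas | pip install 'apache-airflow[pandas]' | Install Pandas library compatible with Airflow |
password | pip install 'apache-airflow[password]' | Password authentication for users |
rabbitmq | pip install 'apache-airflow[rabbitmq]' | RabbitMQ support as a Celery backend |
sentry | pip install 'apache-airflow[sentry]' | Sentry service for application logging and monitoring |
statsd | pip install 'apache-airflow[statsd]' | Needed by StatsD metrics |
virtualenv | pip install 'apache-airflow[virtualenv]' | Running Python tasks in a local virtualenv |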
Providers extras
These provider extras are convenience extras: each one installs a provider package together with its dependencies in a single command, which lets pip resolve any conflicting dependencies. This is especially useful for a first-time installation, where you want to repeatably install versions of dependencies that are valid for both Airflow and the installed providers.
For example, the command below installs:
apache-airflow
apache-airflow-providers-amazon
apache-airflow-providers-google
apache-airflow-providers-apache-spark
with a consistent set of dependencies, based on the constraint files that the Airflow community provided when version 2.3.1 was released.
pip install "apache-airflow[google,amazon,apache.spark]==2.3.1" \
  --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.3.1/constraints-3.7.txt"
Note that this installs the provider versions that were current when Airflow 2.3.1 was released. You can upgrade those providers manually later if you want to use their latest versions.
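The command above can be parameterized by Airflow and Python version. The sketch below is illustrative only: constraints_url and the three variables are hypothetical names, and the URL pattern is assumed to generalize from the single example in this section. It prints the command rather than running it.

```shell
# Hypothetical helper: build the constraints URL for an Airflow version ($1)
# and a Python version ($2), assuming the URL pattern of the example above
# also holds for other versions.
constraints_url() {
    echo "https://raw.githubusercontent.com/apache/airflow/constraints-$1/constraints-$2.txt"
}

AIRFLOW_VERSION="2.3.1"
PYTHON_VERSION="3.7"
EXTRAS="google,amazon,apache.spark"

# Print the command instead of running it, so it can be reviewed first.
echo "pip install \"apache-airflow[${EXTRAS}]==${AIRFLOW_VERSION}\" --constraint \"$(constraints_url "${AIRFLOW_VERSION}" "${PYTHON_VERSION}")\""
```

To move a single provider to a newer release afterwards, a command such as pip install --upgrade apache-airflow-providers-google is the usual approach; whether the newer provider is compatible with your Airflow version is something to check first.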
Apache Software extras
These extras add dependencies needed for integration with other Apache projects. Note that apache.atlas and apache.webhdfs do not have their own providers; they only install additional libraries that can be used in custom bash/python providers.
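extra | install command | enables |
---|---|---|
apache.atlas | pip install 'apache-airflow[apache.atlas]' | Apache Atlas |
apache.beam | pip install 'apache-airflow[apache.beam]' | Apache Beam operators & hooks |
apache.cassandra | pip install 'apache-airflow[apache.cassandra]' | Cassandra related operators & hooks |
apache.drill | pip install 'apache-airflow[apache.drill]' | Drill related operators & hooks |
apache.druid | pip install 'apache-airflow[apache.druid]' | Druid related operators & hooks |
apache.hdfs | pip install 'apache-airflow[apache.hdfs]' | HDFS hooks and operators |
apache.hive | pip install 'apache-airflow[apache.hive]' | All Hive related operators |
apache.kylin | pip install 'apache-airflow[apache.kylin]' | All Kylin related operators & hooks |
apache.livy | pip install 'apache-airflow[apache.livy]' | All Livy related operators, hooks & sensors |
apache.pig | pip install 'apache-airflow[apache.pig]' | All Pig related operators & hooks |
apache.pinot | pip install 'apache-airflow[apache.pinot]' | All Pinot related hooks |
apache.spark | pip install 'apache-airflow[apache.spark]' | All Spark related operators & hooks |
apache.sqoop | pip install 'apache-airflow[apache.sqoop]' | All Sqoop related operators & hooks |
apache.webhdfs | pip install 'apache-airflow[apache.webhdfs]' | HDFS hooks and operators |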
External Services extras
These extras add dependencies needed for integration with external services, either cloud-based or on-premises.
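extra | install command | enables |
---|---|---|
airbyte | pip install 'apache-airflow[airbyte]' | Airbyte hooks and operators |
alibaba | pip install 'apache-airflow[alibaba]' | Alibaba Cloud |
amazon | pip install 'apache-airflow[amazon]' | Amazon Web Services |
asana | pip install 'apache-airflow[asana]' | Asana hooks and operators |
microsoft.azure | pip install 'apache-airflow[microsoft.azure]' | Microsoft Azure |
cloudant | pip install 'apache-airflow[cloudant]' | Cloudant hook |
databricks | pip install 'apache-airflow[databricks]' | Databricks hooks and operators |
datadog | pip install 'apache-airflow[datadog]' | Datadog hooks and sensors |
dbt.cloud | pip install 'apache-airflow[dbt.cloud]' | dbt Cloud hooks and operators |
dingding | pip install 'apache-airflow[dingding]' | Dingding hooks and sensors |
discord | pip install 'apache-airflow[discord]' | Discord hooks and sensors |
facebook | pip install 'apache-airflow[facebook]' | Facebook Social |
google | pip install 'apache-airflow[google]' | Google Cloud |
hashicorp | pip install 'apache-airflow[hashicorp]' | Hashicorp Services (Vault) |
jira | pip install 'apache-airflow[jira]' | Jira hooks and operators |
opsgenie | pip install 'apache-airflow[opsgenie]' | OpsGenie hooks and operators |
pagerduty | pip install 'apache-airflow[pagerduty]' | Pagerduty hook |
plexus | pip install 'apache-airflow[plexus]' | Plexus service of CoreScientific.com AI platform |
qubole | pip install 'apache-airflow[qubole]' | Enable QDS (Qubole Data Service) support |
salesforce | pip install 'apache-airflow[salesforce]' | Salesforce hook |
sendgrid | pip install 'apache-airflow[sendgrid]' | Send email using SendGrid |
segment | pip install 'apache-airflow[segment]' | Segment hooks and sensors |
slack | pip install 'apache-airflow[slack]' | Slack hooks and operators |
snowflake | pip install 'apache-airflow[snowflake]' | Snowflake hooks and operators |
tableau | pip install 'apache-airflow[tableau]' | Tableau hooks and operators |
telegram | pip install 'apache-airflow[telegram]' | Telegram hooks and operators |
vertica | pip install 'apache-airflow[vertica]' | Vertica hook support as an Airflow backend |
yandex | pip install 'apache-airflow[yandex]' | Yandex.cloud hooks and operators |
zendesk | pip install 'apache-airflow[zendesk]' | Zendesk hooks |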
Locally installed software extras
These extras add dependencies needed for integration with other software packages that are usually installed as part of the Airflow deployment.
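extra | install command | enables |
---|---|---|
docker | pip install 'apache-airflow[docker]' | Docker hooks and operators |
elasticsearch | pip install 'apache-airflow[elasticsearch]' | Elasticsearch hooks and Log Handler |
exasol | pip install 'apache-airflow[exasol]' | Exasol hooks and operators |
github | pip install 'apache-airflow[github]' | GitHub operators and hook |
influxdb | pip install 'apache-airflow[influxdb]' | InfluxDB operators and hook |
jenkins | pip install 'apache-airflow[jenkins]' | Jenkins hooks and operators |
mongo | pip install 'apache-airflow[mongo]' | Mongo hooks and operators |
microsoft.mssql | pip install 'apache-airflow[microsoft.mssql]' | Microsoft SQL Server operators and hook |
mysql | pip install 'apache-airflow[mysql]' | MySQL operators and hook |
neo4j | pip install 'apache-airflow[neo4j]' | Neo4j operators and hook |
odbc | pip install 'apache-airflow[odbc]' | ODBC data sources including MS SQL Server |
openfaas | pip install 'apache-airflow[openfaas]' | OpenFaaS hooks |
oracle | pip install 'apache-airflow[oracle]' | Oracle hooks and operators |
postgres | pip install 'apache-airflow[postgres]' | PostgreSQL operators and hook |
presto | pip install 'apache-airflow[presto]' | All Presto related operators & hooks |
redis | pip install 'apache-airflow[redis]' | Redis hooks and sensors |
samba | pip install 'apache-airflow[samba]' | Samba hooks and operators |
singularity | pip install 'apache-airflow[singularity]' | Singularity container operator |
trino | pip install 'apache-airflow[trino]' | All Trino related operators & hooks |
arangodb | pip install 'apache-airflow[arangodb]' | ArangoDB operators, sensors and hook |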
Other extras
These extras provide support for integration with external systems via (usually) standard protocols.
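extra | install command | enables | Preinstalled |
---|---|---|---|
ftp | pip install 'apache-airflow[ftp]' | FTP hooks and operators | * |
grpc | pip install 'apache-airflow[grpc]' | gRPC hooks and operators | |
http | pip install 'apache-airflow[http]' | HTTP hooks, operators and sensors | * |
imap | pip install 'apache-airflow[imap]' | IMAP hooks and sensors | * |
jdbc | pip install 'apache-airflow[jdbc]' | JDBC hooks and operators | |
papermill | pip install 'apache-airflow[papermill]' | Papermill hooks and operators | |
sftp | pip install 'apache-airflow[sftp]' | SFTP hooks, operators and sensors | |
sqlite | pip install 'apache-airflow[sqlite]' | SQLite hooks and operators | * |
ssh | pip install 'apache-airflow[ssh]' | SSH hooks and operators | |
microsoft.psrp | pip install 'apache-airflow[microsoft.psrp]' | PSRP hooks and operators | |
microsoft.winrm | pip install 'apache-airflow[microsoft.winrm]' | WinRM hooks and operators | |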
Bundle extras
These extras install one or more other extras as a bundle. They should only be used with a “development” version of Airflow, i.e. when Airflow is installed from sources. Because of the way bundle extras are constructed, they might not work when Airflow is installed from PyPI.
If you want to install Airflow from PyPI with “all” extras (which you should almost never need), you need to explicitly list all the non-bundle extras that you want to install.
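Listing extras explicitly can be scripted. The sketch below is only an illustration: airflow_requirement is a hypothetical helper, and the extras passed to it are an arbitrary selection from the tables on this page, not a recommended set.

```shell
# Hypothetical helper: join its arguments with commas into one pip requirement.
# The function body runs in a subshell, so changing IFS does not leak out.
airflow_requirement() (
    IFS=","
    echo "apache-airflow[$*]"
)

# Print the resulting command for review rather than running it.
echo "pip install \"$(airflow_requirement celery cncf.kubernetes postgres ssh)\""
```

Keeping the list of extras in one place like this makes it easier to reuse the same selection across environments.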
extra | install command | enables |
---|---|---|
all | | All Airflow user facing features (no devel and doc requirements) |
all_dbs | | All database integrations |
devel | | Minimum development dependencies (without Hadoop, Kerberos, providers) |
devel_hadoop | | Adds Hadoop stack libraries to devel dependencies |
devel_all | | Everything needed for development including Hadoop and providers |
devel_ci | | All dependencies required for CI tests (same as devel_all) |
Doc extras
This extra is needed to generate the documentation for Airflow. It is used at development time only.
extra | install command | enables |
---|---|---|
doc | | Packages needed to build docs (included in devel) |
Deprecated 1.10 extras
These extras were deprecated in 2.0 and will be removed in Airflow 3.0.0. They were all replaced by new extras whose names are consistent with the names of the provider packages.
The crypto extra is no longer needed, because all crypto dependencies are part of the airflow package, so there is no replacement for the crypto extra.
Deprecated extra | Extra to be used instead |
---|---|
atlas | apache.atlas |
aws | amazon |
azure | microsoft.azure |
cassandra | apache.cassandra |
crypto | |
druid | apache.druid |
gcp | google |
gcp_api | google |
hdfs | apache.hdfs |
hive | apache.hive |
kubernetes | cncf.kubernetes |
mssql | microsoft.mssql |
pinot | apache.pinot |
qds | qubole |
s3 | amazon |
spark | apache.spark |
webhdfs | apache.webhdfs |
winrm | microsoft.winrm |
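When migrating old install scripts, the renames can be applied mechanically. The sketch below is a hypothetical helper, not part of Airflow: replacement_extra covers a selection of rows from the table above and is not exhaustive. It prints an empty line for crypto, which has no replacement, and returns any name it does not know unchanged.

```shell
# Hypothetical helper: map a deprecated 1.10 extra name to its replacement.
replacement_extra() {
    case "$1" in
        atlas)      echo "apache.atlas" ;;
        aws|s3)     echo "amazon" ;;
        azure)      echo "microsoft.azure" ;;
        cassandra)  echo "apache.cassandra" ;;
        druid)      echo "apache.druid" ;;
        hdfs)       echo "apache.hdfs" ;;
        hive)       echo "apache.hive" ;;
        kubernetes) echo "cncf.kubernetes" ;;
        mssql)      echo "microsoft.mssql" ;;
        pinot)      echo "apache.pinot" ;;
        qds)        echo "qubole" ;;
        spark)      echo "apache.spark" ;;
        webhdfs)    echo "apache.webhdfs" ;;
        winrm)      echo "microsoft.winrm" ;;
        crypto)     echo "" ;;      # no replacement: crypto dependencies are built in
        *)          echo "$1" ;;    # not handled by this sketch: returned unchanged
    esac
}

replacement_extra aws
```

The final line prints amazon; a current extra such as postgres is printed back as-is.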