Configuration Reference¶
This page lists all the available Airflow configuration options that you can set in the airflow.cfg file or through environment variables.
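Every option on this page lists its environment variable, and the names follow one visible pattern: AIRFLOW__{SECTION}__{KEY}, upper-cased, with double underscores as separators. The sketch below only mirrors that pattern. The helper name env_var_name and the path /opt/airflow/dags are illustrative assumptions, not part of Airflow.

```python
import os


def env_var_name(section: str, key: str) -> str:
    """Build the environment variable name for a config option.

    Hypothetical helper: it mirrors the naming pattern shown on this page.
    """
    return "AIRFLOW__{}__{}".format(section.upper(), key.upper())


# Override [core] dags_folder for the current process only.
# The path is an example value, not a recommended location.
os.environ[env_var_name("core", "dags_folder")] = "/opt/airflow/dags"
```

Setting the variable this way affects only the current process and its children; for a persistent override, export it in the shell or service definition that starts Airflow.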
Sections:
core¶
dags_folder¶
The folder where your airflow pipelines live, most likely a subfolder in a code repository. This path must be absolute.
- Type
string
- Default
{AIRFLOW_HOME}/dags
- Environment Variable
AIRFLOW__CORE__DAGS_FOLDER
base_log_folder¶
The folder where Airflow should store its log files. This path must be absolute.
- Type
string
- Default
{AIRFLOW_HOME}/logs
- Environment Variable
AIRFLOW__CORE__BASE_LOG_FOLDER
remote_logging¶
Airflow can store logs remotely in AWS S3, Google Cloud Storage or Elasticsearch. Set this to True if you want to enable remote logging.
- Type
string
- Default
False
- Environment Variable
AIRFLOW__CORE__REMOTE_LOGGING
remote_log_conn_id¶
Users must supply an Airflow connection id that provides access to the storage location.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CORE__REMOTE_LOG_CONN_ID
remote_base_log_folder¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER
encrypt_s3_logs¶
- Type
string
- Default
False
- Environment Variable
AIRFLOW__CORE__ENCRYPT_S3_LOGS
logging_level¶
Logging level
- Type
string
- Default
INFO
- Environment Variable
AIRFLOW__CORE__LOGGING_LEVEL
fab_logging_level¶
Logging level for the Flask-AppBuilder UI
- Type
string
- Default
WARN
- Environment Variable
AIRFLOW__CORE__FAB_LOGGING_LEVEL
logging_config_class¶
Logging class. Specify the class that defines the logging configuration. This class has to be on the Python path.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CORE__LOGGING_CONFIG_CLASS
- Example
my.path.default_local_settings.LOGGING_CONFIG
colored_console_log¶
New in version 1.10.4.
Flag to enable or disable colored logs in the console. Logs are colored when the controlling terminal is a TTY.
- Type
string
- Default
True
- Environment Variable
AIRFLOW__CORE__COLORED_CONSOLE_LOG
colored_log_format¶
New in version 1.10.4.
Log format used when colored logs are enabled
- Type
string
- Default
[%%(blue)s%%(asctime)s%%(reset)s] {{%%(blue)s%%(filename)s:%%(reset)s%%(lineno)d}} %%(log_color)s%%(levelname)s%%(reset)s - %%(log_color)s%%(message)s%%(reset)s
- Environment Variable
AIRFLOW__CORE__COLORED_LOG_FORMAT
colored_formatter_class¶
New in version 1.10.4.
- Type
string
- Default
airflow.utils.log.colored_log.CustomTTYColoredFormatter
- Environment Variable
AIRFLOW__CORE__COLORED_FORMATTER_CLASS
log_format¶
Format of a log line
- Type
string
- Default
[%%(asctime)s] {{%%(filename)s:%%(lineno)d}} %%(levelname)s - %%(message)s
- Environment Variable
AIRFLOW__CORE__LOG_FORMAT
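The log format defaults on this page are printed with doubled characters (%% and {{ }}). A plausible reading, stated here as an assumption rather than something this page confirms, is that the doubled braces are escapes for Python's str.format applied to the default configuration template, and %% is the escape used by configparser interpolation. The sketch below shows how the documented default would reduce to an ordinary logging format string under that assumption. The file path and message are made-up sample values.

```python
import configparser
import logging

# Default value exactly as printed on this page.
documented_default = (
    "[%%(asctime)s] {{%%(filename)s:%%(lineno)d}} "
    "%%(levelname)s - %%(message)s"
)

# Assumption: the default template is passed through str.format, which
# collapses "{{" and "}}" into single braces.
cfg_text = "[core]\nlog_format = " + documented_default.format() + "\n"

# configparser's default interpolation collapses "%%" into "%".
parser = configparser.ConfigParser()
parser.read_string(cfg_text)
log_format = parser.get("core", "log_format")

# The result is a regular format string for the standard logging module.
formatter = logging.Formatter(log_format)
record = logging.LogRecord(
    name="example", level=logging.INFO, pathname="/tmp/example_dag.py",
    lineno=42, msg="task started", args=None, exc_info=None,
)
line = formatter.format(record)
```

If you write your own value in airflow.cfg, the practical consequence under this reading is that a literal percent sign needs to be written as %%.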
simple_log_format¶
- Type
string
- Default
%%(asctime)s %%(levelname)s - %%(message)s
- Environment Variable
AIRFLOW__CORE__SIMPLE_LOG_FORMAT
log_filename_template¶
Log filename format
- Type
string
- Default
{{{{ ti.dag_id }}}}/{{{{ ti.task_id }}}}/{{{{ ts }}}}/{{{{ try_number }}}}.log
- Environment Variable
AIRFLOW__CORE__LOG_FILENAME_TEMPLATE
log_processor_filename_template¶
- Type
string
- Default
{{{{ filename }}}}.log
- Environment Variable
AIRFLOW__CORE__LOG_PROCESSOR_FILENAME_TEMPLATE
dag_processor_manager_log_location¶
New in version 1.10.2.
- Type
string
- Default
{AIRFLOW_HOME}/logs/dag_processor_manager/dag_processor_manager.log
- Environment Variable
AIRFLOW__CORE__DAG_PROCESSOR_MANAGER_LOG_LOCATION
task_log_reader¶
Name of the handler used to read task instance logs. Defaults to the task handler.
- Type
string
- Default
task
- Environment Variable
AIRFLOW__CORE__TASK_LOG_READER
hostname_callable¶
Path to a callable that resolves the hostname. The format is “package:function”.
For example, the default value “socket:getfqdn” means that the result of getfqdn() from the “socket” package will be used as the hostname.
The specified function must not require any arguments.
If you prefer to use the IP address as the hostname, use the value airflow.utils.net:get_host_ip_address.
- Type
string
- Default
socket:getfqdn
- Environment Variable
AIRFLOW__CORE__HOSTNAME_CALLABLE
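The “package:function” format described for hostname_callable can be resolved with the standard importlib module. This is a minimal sketch of the idea, not Airflow's own implementation; the function name resolve_hostname is hypothetical.

```python
import importlib


def resolve_hostname(hostname_callable: str) -> str:
    """Resolve a "package:function" string and call the function.

    The function is called without arguments, as the option requires.
    """
    module_name, function_name = hostname_callable.split(":", 1)
    module = importlib.import_module(module_name)
    return getattr(module, function_name)()


# The documented default resolves to socket.getfqdn().
hostname = resolve_hostname("socket:getfqdn")
```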
default_timezone¶
Default timezone used when supplied datetimes are naive. Can be utc (default), system, or any IANA timezone string (e.g. Europe/Amsterdam).
- Type
string
- Default
utc
- Environment Variable
AIRFLOW__CORE__DEFAULT_TIMEZONE
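A naive datetime carries no timezone information, so some default has to be attached before it can be compared with aware values. The sketch below illustrates that idea with the standard library, using UTC to match the documented default. The helper name make_aware is used here for illustration only, and its behavior is an assumption about what "default timezone" means in practice.

```python
from datetime import datetime, timezone


def make_aware(value: datetime, default_tz=timezone.utc) -> datetime:
    """Attach default_tz to naive datetimes; leave aware ones untouched."""
    if value.tzinfo is None or value.tzinfo.utcoffset(value) is None:
        return value.replace(tzinfo=default_tz)
    return value


naive = datetime(2020, 1, 1, 12, 0)  # no tzinfo
aware = make_aware(naive)            # interpreted as UTC
```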
executor¶
The executor class that airflow should use. Choices include SequentialExecutor, LocalExecutor, CeleryExecutor, DaskExecutor, KubernetesExecutor
- Type
string
- Default
SequentialExecutor
- Environment Variable
AIRFLOW__CORE__EXECUTOR
sql_alchemy_conn¶
The SqlAlchemy connection string to the metadata database. SqlAlchemy supports many different database engines; more information is available on the SqlAlchemy website.
- Type
string
- Default
sqlite:///{AIRFLOW_HOME}/airflow.db
- Environment Variable
AIRFLOW__CORE__SQL_ALCHEMY_CONN
sql_engine_encoding¶
New in version 1.10.1.
The encoding for the databases
- Type
string
- Default
utf-8
- Environment Variable
AIRFLOW__CORE__SQL_ENGINE_ENCODING
sql_alchemy_pool_enabled¶
If SqlAlchemy should pool database connections.
- Type
string
- Default
True
- Environment Variable
AIRFLOW__CORE__SQL_ALCHEMY_POOL_ENABLED
sql_alchemy_pool_size¶
The SqlAlchemy pool size is the maximum number of database connections in the pool. 0 indicates no limit.
- Type
string
- Default
5
- Environment Variable
AIRFLOW__CORE__SQL_ALCHEMY_POOL_SIZE
sql_alchemy_max_overflow¶
New in version 1.10.4.
The maximum overflow size of the pool. When the number of checked-out connections reaches the size set in pool_size, additional connections will be returned up to this limit. When those additional connections are returned to the pool, they are disconnected and discarded. It follows then that the total number of simultaneous connections the pool will allow is pool_size + max_overflow, and the total number of “sleeping” connections the pool will allow is pool_size. max_overflow can be set to -1 to indicate no overflow limit; no limit will be placed on the total number of concurrent connections. Defaults to 10.
- Type
string
- Default
10
- Environment Variable
AIRFLOW__CORE__SQL_ALCHEMY_MAX_OVERFLOW
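The arithmetic described for sql_alchemy_pool_size and sql_alchemy_max_overflow can be written out directly. This is a small sketch of the rule as stated on this page (pool_size 0 and max_overflow -1 both mean no limit); the function name is hypothetical.

```python
def max_simultaneous_connections(pool_size: int, max_overflow: int):
    """Return the connection ceiling, or None when there is no limit."""
    if pool_size == 0 or max_overflow == -1:
        return None  # unlimited pool or unlimited overflow
    return pool_size + max_overflow


# With the documented defaults (pool size 5, overflow 10).
default_ceiling = max_simultaneous_connections(5, 10)
```

With the defaults, up to 15 connections may be open at once, of which at most 5 remain idle in the pool.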
sql_alchemy_pool_recycle¶
The SqlAlchemy pool recycle is the number of seconds a connection can be idle in the pool before it is invalidated. This config does not apply to sqlite. If the number of DB connections is ever exceeded, a lower config value will allow the system to recover faster.
- Type
string
- Default
1800
- Environment Variable
AIRFLOW__CORE__SQL_ALCHEMY_POOL_RECYCLE
sql_alchemy_pool_pre_ping¶
New in version 1.10.6.
Check connection at the start of each connection pool checkout. Typically, this is a simple statement like “SELECT 1”. More information here: https://docs.sqlalchemy.org/en/13/core/pooling.html#disconnect-handling-pessimistic
- Type
string
- Default
True
- Environment Variable
AIRFLOW__CORE__SQL_ALCHEMY_POOL_PRE_PING
sql_alchemy_schema¶
New in version 1.10.3.
The schema to use for the metadata database. SqlAlchemy supports databases with the concept of multiple schemas.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CORE__SQL_ALCHEMY_SCHEMA
parallelism¶
The amount of parallelism as a setting to the executor. This defines the max number of task instances that should run simultaneously on this airflow installation
- Type
string
- Default
32
- Environment Variable
AIRFLOW__CORE__PARALLELISM
dag_concurrency¶
The number of task instances allowed to run concurrently by the scheduler
- Type
string
- Default
16
- Environment Variable
AIRFLOW__CORE__DAG_CONCURRENCY
dags_are_paused_at_creation¶
Whether DAGs are paused by default at creation
- Type
string
- Default
True
- Environment Variable
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION
max_active_runs_per_dag¶
The maximum number of active DAG runs per DAG
- Type
string
- Default
16
- Environment Variable
AIRFLOW__CORE__MAX_ACTIVE_RUNS_PER_DAG
load_examples¶
Whether to load the examples that ship with Airflow. It’s good to get started, but you probably want to set this to False in a production environment
- Type
string
- Default
True
- Environment Variable
AIRFLOW__CORE__LOAD_EXAMPLES
plugins_folder¶
Where your Airflow plugins are stored
- Type
string
- Default
{AIRFLOW_HOME}/plugins
- Environment Variable
AIRFLOW__CORE__PLUGINS_FOLDER
fernet_key¶
Secret key to save connection passwords in the db
- Type
string
- Default
{FERNET_KEY}
- Environment Variable
AIRFLOW__CORE__FERNET_KEY
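A Fernet key is 32 random bytes encoded as URL-safe base64. The sketch below produces a value of that shape with the standard library only; when the cryptography package is installed, Fernet.generate_key() is the more usual way to obtain one. The function name generate_fernet_key is hypothetical.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return 32 random bytes encoded as URL-safe base64 text."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


key = generate_fernet_key()
```

Keep the generated key secret and stable: values encrypted with one key cannot be read back with a different one.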
donot_pickle¶
Whether to disable pickling dags
- Type
string
- Default
False
- Environment Variable
AIRFLOW__CORE__DONOT_PICKLE
dagbag_import_timeout¶
How long (in seconds) before timing out a Python file import
- Type
string
- Default
30
- Environment Variable
AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT
dag_file_processor_timeout¶
New in version 1.10.6.
How long (in seconds) before timing out a DagFileProcessor, which processes a DAG file
- Type
string
- Default
50
- Environment Variable
AIRFLOW__CORE__DAG_FILE_PROCESSOR_TIMEOUT
task_runner¶
The class to use for running task instances in a subprocess
- Type
string
- Default
StandardTaskRunner
- Environment Variable
AIRFLOW__CORE__TASK_RUNNER
default_impersonation¶
If set, tasks without a run_as_user argument will be run with this user. Can be used to de-elevate a sudo user running Airflow when executing tasks.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CORE__DEFAULT_IMPERSONATION
security¶
What security module to use (for example kerberos)
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CORE__SECURITY
secure_mode¶
If set to False, enables some insecure features such as Charts and Ad Hoc Queries. Will default to True in 2.0.
- Type
string
- Default
False
- Environment Variable
AIRFLOW__CORE__SECURE_MODE
unit_test_mode¶
Turn unit test mode on (overwrites many configuration options with test values at runtime)
- Type
string
- Default
False
- Environment Variable
AIRFLOW__CORE__UNIT_TEST_MODE
enable_xcom_pickling¶
Whether to enable pickling for xcom (note that this is insecure and allows for RCE exploits). This will be deprecated in Airflow 2.0 (be forced to False).
- Type
string
- Default
True
- Environment Variable
AIRFLOW__CORE__ENABLE_XCOM_PICKLING
killed_task_cleanup_time¶
When a task is killed forcefully, this is the amount of time in seconds that it has to clean up after it is sent a SIGTERM, before it is sent a SIGKILL
- Type
string
- Default
60
- Environment Variable
AIRFLOW__CORE__KILLED_TASK_CLEANUP_TIME
dag_run_conf_overrides_params¶
Whether to override params with dag_run.conf. If you pass some key-value pairs through airflow dags backfill -c or airflow dags trigger -c, the key-value pairs will override the existing ones in params.
- Type
string
- Default
False
- Environment Variable
AIRFLOW__CORE__DAG_RUN_CONF_OVERRIDES_PARAMS
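The override described for dag_run_conf_overrides_params can be pictured as a dictionary merge. The sketch below assumes the merge behaves like dict.update, so keys from dag_run.conf replace same-named keys in params; the function name and the sample keys are hypothetical.

```python
def effective_params(params: dict, dag_run_conf: dict, overrides: bool) -> dict:
    """Return the params a task would see under the given setting."""
    merged = dict(params)  # never mutate the DAG-level defaults
    if overrides and dag_run_conf:
        merged.update(dag_run_conf)
    return merged


dag_params = {"table": "events", "limit": 100}
conf = {"limit": 10}  # e.g. passed with -c '{"limit": 10}'

with_override = effective_params(dag_params, conf, overrides=True)
without_override = effective_params(dag_params, conf, overrides=False)
```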
worker_precheck¶
New in version 1.10.1.
Worker initialisation check to validate Metadata Database connection
- Type
string
- Default
False
- Environment Variable
AIRFLOW__CORE__WORKER_PRECHECK
dag_discovery_safe_mode¶
New in version 1.10.3.
When discovering DAGs, ignore any files that don’t contain the strings DAG and airflow.
- Type
string
- Default
True
- Environment Variable
AIRFLOW__CORE__DAG_DISCOVERY_SAFE_MODE
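The safe-mode rule for DAG discovery amounts to a cheap substring test on each file before it is imported. This is a sketch of such a heuristic; the function name is hypothetical, and treating the match as case-sensitive is an assumption.

```python
def might_contain_dag(source: bytes) -> bool:
    """Return True only if both marker strings appear in the file content."""
    return all(marker in source for marker in (b"DAG", b"airflow"))


dag_file = b"from airflow import DAG\n"
helper_file = b"def add(a, b):\n    return a + b\n"

dag_file_selected = might_contain_dag(dag_file)
helper_file_selected = might_contain_dag(helper_file)
```

A file that builds its DAGs indirectly, without either string appearing in the source, would be skipped by a check like this; that is the case where disabling safe mode becomes relevant.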
default_task_retries¶
New in version 1.10.6.
The number of retries each task is going to have by default. Can be overridden at dag or task level.
- Type
string
- Default
0
- Environment Variable
AIRFLOW__CORE__DEFAULT_TASK_RETRIES
store_serialized_dags¶
New in version 1.10.7.
Whether to serialise DAGs and persist them in the DB. If set to True, the webserver reads from the DB instead of parsing DAG files. More details: https://airflow.apache.org/docs/stable/dag-serialization.html
- Type
string
- Default
False
- Environment Variable
AIRFLOW__CORE__STORE_SERIALIZED_DAGS
min_serialized_dag_update_interval¶
New in version 1.10.7.
A serialized DAG cannot be updated more often than this minimum interval, which reduces the database write rate.
- Type
string
- Default
30
- Environment Variable
AIRFLOW__CORE__MIN_SERIALIZED_DAG_UPDATE_INTERVAL
check_slas¶
New in version 1.10.8.
On each DAG run, check against defined SLAs
- Type
string
- Default
True
- Environment Variable
AIRFLOW__CORE__CHECK_SLAS
cli¶
api_client¶
How the CLI should access the API. The LocalClient will use the database directly, while the json_client will use the API running on the webserver.
- Type
string
- Default
airflow.api.client.local_client
- Environment Variable
AIRFLOW__CLI__API_CLIENT
endpoint_url¶
If you set web_server_url_prefix, do NOT forget to append it here, for example:
endpoint_url = http://localhost:8080/myroot
The API will then look like: http://localhost:8080/myroot/api/experimental/...
- Type
string
- Default
http://localhost:8080
- Environment Variable
AIRFLOW__CLI__ENDPOINT_URL
debug¶
fail_fast¶
New in version 1.10.8.
Used only with DebugExecutor. If set to True, the DAG will fail at the first failed task. Helpful for debugging purposes.
- Type
string
- Default
False
- Environment Variable
AIRFLOW__DEBUG__FAIL_FAST
api¶
auth_backend¶
How to authenticate users of the API
- Type
string
- Default
airflow.api.auth.backend.default
- Environment Variable
AIRFLOW__API__AUTH_BACKEND
lineage¶
backend¶
What lineage backend to use
- Type
string
- Default
''
- Environment Variable
AIRFLOW__LINEAGE__BACKEND
atlas¶
sasl_enabled¶
- Type
string
- Default
False
- Environment Variable
AIRFLOW__ATLAS__SASL_ENABLED
host¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__ATLAS__HOST
port¶
- Type
string
- Default
21000
- Environment Variable
AIRFLOW__ATLAS__PORT
username¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__ATLAS__USERNAME
password¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__ATLAS__PASSWORD
operators¶
default_owner¶
The default owner assigned to each new operator, unless provided explicitly or passed via default_args
- Type
string
- Default
airflow
- Environment Variable
AIRFLOW__OPERATORS__DEFAULT_OWNER
default_cpus¶
- Type
string
- Default
1
- Environment Variable
AIRFLOW__OPERATORS__DEFAULT_CPUS
default_ram¶
- Type
string
- Default
512
- Environment Variable
AIRFLOW__OPERATORS__DEFAULT_RAM
default_disk¶
- Type
string
- Default
512
- Environment Variable
AIRFLOW__OPERATORS__DEFAULT_DISK
default_gpus¶
- Type
string
- Default
0
- Environment Variable
AIRFLOW__OPERATORS__DEFAULT_GPUS
hive¶
default_hive_mapred_queue¶
Default mapreduce queue for HiveOperator tasks
- Type
string
- Default
''
- Environment Variable
AIRFLOW__HIVE__DEFAULT_HIVE_MAPRED_QUEUE
webserver¶
base_url¶
The base URL of your website, as Airflow cannot guess what domain or CNAME you are using. This is used in automated emails that Airflow sends, so that links point to the right web server.
- Type
string
- Default
http://localhost:8080
- Environment Variable
AIRFLOW__WEBSERVER__BASE_URL
web_server_host¶
The IP address specified when starting the web server
- Type
string
- Default
0.0.0.0
- Environment Variable
AIRFLOW__WEBSERVER__WEB_SERVER_HOST
web_server_port¶
The port on which to run the web server
- Type
string
- Default
8080
- Environment Variable
AIRFLOW__WEBSERVER__WEB_SERVER_PORT
web_server_ssl_cert¶
Paths to the SSL certificate and key for the web server. When both are provided SSL will be enabled. This does not change the web server port.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__WEBSERVER__WEB_SERVER_SSL_CERT
web_server_ssl_key¶
Paths to the SSL certificate and key for the web server. When both are provided SSL will be enabled. This does not change the web server port.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__WEBSERVER__WEB_SERVER_SSL_KEY
web_server_master_timeout¶
Number of seconds the webserver waits before killing a gunicorn master that doesn’t respond
- Type
string
- Default
120
- Environment Variable
AIRFLOW__WEBSERVER__WEB_SERVER_MASTER_TIMEOUT
web_server_worker_timeout¶
Number of seconds the gunicorn webserver waits before timing out on a worker
- Type
string
- Default
120
- Environment Variable
AIRFLOW__WEBSERVER__WEB_SERVER_WORKER_TIMEOUT
worker_refresh_batch_size¶
Number of workers to refresh at a time. When set to 0, worker refresh is disabled. When nonzero, airflow periodically refreshes webserver workers by bringing up new ones and killing old ones.
- Type
string
- Default
1
- Environment Variable
AIRFLOW__WEBSERVER__WORKER_REFRESH_BATCH_SIZE
worker_refresh_interval¶
Number of seconds to wait before refreshing a batch of workers.
- Type
string
- Default
30
- Environment Variable
AIRFLOW__WEBSERVER__WORKER_REFRESH_INTERVAL
secret_key¶
Secret key used to run your Flask app. It should be as random as possible.
- Type
string
- Default
temporary_key
- Environment Variable
AIRFLOW__WEBSERVER__SECRET_KEY
workers¶
Number of workers to run the Gunicorn web server
- Type
string
- Default
4
- Environment Variable
AIRFLOW__WEBSERVER__WORKERS
worker_class¶
The worker class gunicorn should use. Choices include sync (default), eventlet, gevent
- Type
string
- Default
sync
- Environment Variable
AIRFLOW__WEBSERVER__WORKER_CLASS
access_logfile¶
Log files for the gunicorn webserver. ‘-’ means log to stderr.
- Type
string
- Default
-
- Environment Variable
AIRFLOW__WEBSERVER__ACCESS_LOGFILE
error_logfile¶
Log files for the gunicorn webserver. ‘-’ means log to stderr.
- Type
string
- Default
-
- Environment Variable
AIRFLOW__WEBSERVER__ERROR_LOGFILE
expose_config¶
Expose the configuration file in the web server
- Type
string
- Default
False
- Environment Variable
AIRFLOW__WEBSERVER__EXPOSE_CONFIG
expose_hostname¶
New in version 1.10.8.
Expose hostname in the web server
- Type
string
- Default
True
- Environment Variable
AIRFLOW__WEBSERVER__EXPOSE_HOSTNAME
expose_stacktrace¶
New in version 1.10.8.
Expose stacktrace in the web server
- Type
string
- Default
True
- Environment Variable
AIRFLOW__WEBSERVER__EXPOSE_STACKTRACE
authenticate¶
Set to true to turn on authentication: https://airflow.apache.org/security.html#web-authentication
- Type
boolean
- Default
False
- Environment Variable
AIRFLOW__WEBSERVER__AUTHENTICATE
filter_by_owner¶
Filter the list of dags by owner name (requires authentication to be enabled)
- Type
boolean
- Default
False
- Environment Variable
AIRFLOW__WEBSERVER__FILTER_BY_OWNER
owner_mode¶
Filtering mode. Choices include user (default) and ldapgroup. LDAP group filtering requires using the ldap backend.
Note that the LDAP server needs the “memberOf” overlay to be set up in order to use the ldapgroup mode.
- Type
string
- Default
user
- Environment Variable
AIRFLOW__WEBSERVER__OWNER_MODE
dag_default_view¶
Default DAG view. Valid values are: tree, graph, duration, gantt, landing_times
- Type
string
- Default
tree
- Environment Variable
AIRFLOW__WEBSERVER__DAG_DEFAULT_VIEW
dag_orientation¶
Default DAG orientation. Valid values are: LR (Left->Right), TB (Top->Bottom), RL (Right->Left), BT (Bottom->Top)
- Type
string
- Default
LR
- Environment Variable
AIRFLOW__WEBSERVER__DAG_ORIENTATION
demo_mode¶
Puts the webserver in demonstration mode; blurs the names of Operators for privacy.
- Type
string
- Default
False
- Environment Variable
AIRFLOW__WEBSERVER__DEMO_MODE
log_fetch_timeout_sec¶
The amount of time (in seconds) the webserver will wait for the initial handshake while fetching logs from another worker machine
- Type
string
- Default
5
- Environment Variable
AIRFLOW__WEBSERVER__LOG_FETCH_TIMEOUT_SEC
log_fetch_delay_sec¶
New in version 1.10.8.
Time interval (in secs) to wait before next log fetching.
- Type
int
- Default
2
- Environment Variable
AIRFLOW__WEBSERVER__LOG_FETCH_DELAY_SEC
log_auto_tailing_offset¶
New in version 1.10.8.
Distance away from page bottom to enable auto tailing.
- Type
int
- Default
30
- Environment Variable
AIRFLOW__WEBSERVER__LOG_AUTO_TAILING_OFFSET
log_animation_speed¶
New in version 1.10.8.
Animation speed for auto tailing log display.
- Type
int
- Default
1000
- Environment Variable
AIRFLOW__WEBSERVER__LOG_ANIMATION_SPEED
hide_paused_dags_by_default¶
By default, the webserver shows paused DAGs. Flip this to hide paused DAGs by default
- Type
string
- Default
False
- Environment Variable
AIRFLOW__WEBSERVER__HIDE_PAUSED_DAGS_BY_DEFAULT
page_size¶
Consistent page size across all listing views in the UI
- Type
string
- Default
100
- Environment Variable
AIRFLOW__WEBSERVER__PAGE_SIZE
rbac¶
Use FAB-based webserver with RBAC feature
- Type
string
- Default
False
- Environment Variable
AIRFLOW__WEBSERVER__RBAC
default_dag_run_display_number¶
Default number of DAG runs to show in the UI
- Type
string
- Default
25
- Environment Variable
AIRFLOW__WEBSERVER__DEFAULT_DAG_RUN_DISPLAY_NUMBER
enable_proxy_fix¶
New in version 1.10.1.
Enable werkzeug ProxyFix middleware for reverse proxy
- Type
boolean
- Default
False
- Environment Variable
AIRFLOW__WEBSERVER__ENABLE_PROXY_FIX
proxy_fix_x_for¶
New in version 1.10.7.
Number of values to trust for X-Forwarded-For.
More info: https://werkzeug.palletsprojects.com/en/0.16.x/middleware/proxy_fix/
- Type
integer
- Default
1
- Environment Variable
AIRFLOW__WEBSERVER__PROXY_FIX_X_FOR
proxy_fix_x_proto¶
New in version 1.10.7.
Number of values to trust for X-Forwarded-Proto
- Type
integer
- Default
1
- Environment Variable
AIRFLOW__WEBSERVER__PROXY_FIX_X_PROTO
proxy_fix_x_host¶
New in version 1.10.7.
Number of values to trust for X-Forwarded-Host
- Type
integer
- Default
1
- Environment Variable
AIRFLOW__WEBSERVER__PROXY_FIX_X_HOST
proxy_fix_x_port¶
New in version 1.10.7.
Number of values to trust for X-Forwarded-Port
- Type
integer
- Default
1
- Environment Variable
AIRFLOW__WEBSERVER__PROXY_FIX_X_PORT
proxy_fix_x_prefix¶
New in version 1.10.7.
Number of values to trust for X-Forwarded-Prefix
- Type
integer
- Default
1
- Environment Variable
AIRFLOW__WEBSERVER__PROXY_FIX_X_PREFIX
cookie_secure¶
New in version 1.10.3.
Set secure flag on session cookie
- Type
string
- Default
False
- Environment Variable
AIRFLOW__WEBSERVER__COOKIE_SECURE
cookie_samesite¶
New in version 1.10.3.
Set samesite policy on session cookie
- Type
string
- Default
''
- Environment Variable
AIRFLOW__WEBSERVER__COOKIE_SAMESITE
default_wrap¶
New in version 1.10.4.
Default setting for wrap toggle on DAG code and TI log views.
- Type
boolean
- Default
False
- Environment Variable
AIRFLOW__WEBSERVER__DEFAULT_WRAP
x_frame_enabled¶
New in version 1.10.8.
Allow the UI to be rendered in a frame
- Type
boolean
- Default
True
- Environment Variable
AIRFLOW__WEBSERVER__X_FRAME_ENABLED
analytics_tool¶
Send anonymous user activity to your analytics tool. Choose from google_analytics, segment, or metarouter.
- Type
string
- Default
None
- Environment Variable
AIRFLOW__WEBSERVER__ANALYTICS_TOOL
analytics_id¶
New in version 1.10.5.
Unique ID of your account in the analytics tool
- Type
string
- Default
None
- Environment Variable
AIRFLOW__WEBSERVER__ANALYTICS_ID
update_fab_perms¶
New in version 1.10.7.
Update FAB permissions and sync security manager roles on webserver startup
- Type
string
- Default
True
- Environment Variable
AIRFLOW__WEBSERVER__UPDATE_FAB_PERMS
force_log_out_after¶
New in version 1.10.8.
Minutes of inactivity before the user is logged out from the UI. 0 means users are never forcibly logged out.
- Type
string
- Default
0
- Environment Variable
AIRFLOW__WEBSERVER__FORCE_LOG_OUT_AFTER
session_lifetime_days¶
New in version 1.10.8.
The UI cookie lifetime in days
- Type
string
- Default
30
- Environment Variable
AIRFLOW__WEBSERVER__SESSION_LIFETIME_DAYS
email¶
email_backend¶
- Type
string
- Default
airflow.utils.email.send_email_smtp
- Environment Variable
AIRFLOW__EMAIL__EMAIL_BACKEND
smtp¶
If you want Airflow to send emails on retries and failures, and you want to use the airflow.utils.email.send_email_smtp function, you have to configure an SMTP server here.
smtp_host¶
- Type
string
- Default
localhost
- Environment Variable
AIRFLOW__SMTP__SMTP_HOST
smtp_starttls¶
- Type
string
- Default
True
- Environment Variable
AIRFLOW__SMTP__SMTP_STARTTLS
smtp_ssl¶
- Type
string
- Default
False
- Environment Variable
AIRFLOW__SMTP__SMTP_SSL
smtp_user¶
- Type
string
- Default
None
- Environment Variable
AIRFLOW__SMTP__SMTP_USER
- Example
airflow
smtp_password¶
- Type
string
- Default
None
- Environment Variable
AIRFLOW__SMTP__SMTP_PASSWORD
- Example
airflow
smtp_port¶
- Type
string
- Default
25
- Environment Variable
AIRFLOW__SMTP__SMTP_PORT
smtp_mail_from¶
- Type
string
- Default
airflow@example.com
- Environment Variable
AIRFLOW__SMTP__SMTP_MAIL_FROM
sentry¶
Sentry (https://docs.sentry.io) integration
sentry_dsn¶
New in version 1.10.6.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__SENTRY__SENTRY_DSN
celery¶
This section only applies if you are using the CeleryExecutor in the [core] section above
celery_app_name¶
The app name that will be used by celery
- Type
string
- Default
airflow.executors.celery_executor
- Environment Variable
AIRFLOW__CELERY__CELERY_APP_NAME
worker_concurrency¶
The concurrency that will be used when starting workers with the airflow celery worker command. This defines the number of task instances that a worker will take, so size up your workers based on the resources on your worker box and the nature of your tasks.
- Type
string
- Default
16
- Environment Variable
AIRFLOW__CELERY__WORKER_CONCURRENCY
worker_autoscale¶
The maximum and minimum concurrency that will be used when starting workers with the airflow celery worker command (always keep the minimum number of processes, but grow to the maximum if necessary). Note that the value should be given as max_concurrency,min_concurrency. Pick these numbers based on the resources on the worker box and the nature of the task. If the autoscale option is available, worker_concurrency will be ignored. See: http://docs.celeryproject.org/en/latest/reference/celery.bin.worker.html#cmdoption-celery-worker-autoscale
- Type
string
- Default
16,12
- Environment Variable
AIRFLOW__CELERY__WORKER_AUTOSCALE
- Example
16,12
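The worker_autoscale value lists the maximum first and the minimum second, which is easy to get backwards. The sketch below parses such a value and checks the ordering; the function name and the validation rule are illustrative assumptions rather than Airflow's own parsing.

```python
def parse_worker_autoscale(value: str):
    """Split "max,min" into two integers and check their ordering."""
    max_concurrency, min_concurrency = (int(part) for part in value.split(","))
    if min_concurrency > max_concurrency:
        raise ValueError("expected max_concurrency,min_concurrency")
    return max_concurrency, min_concurrency


max_procs, min_procs = parse_worker_autoscale("16,12")
```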
worker_log_server_port¶
When you start an Airflow worker, Airflow starts a tiny web server subprocess to serve the worker's local log files to the main Airflow web server, which then builds pages and sends them to users. This defines the port on which the logs are served. The port must be unused and reachable from the main web server so that it can connect to the workers.
- Type
string
- Default
8793
- Environment Variable
AIRFLOW__CELERY__WORKER_LOG_SERVER_PORT
broker_url¶
The Celery broker URL. Celery supports RabbitMQ, Redis and experimentally a sqlalchemy database. Refer to the Celery documentation for more information. http://docs.celeryproject.org/en/latest/userguide/configuration.html#broker-settings
- Type
string
- Default
sqla+mysql://airflow:airflow@localhost:3306/airflow
- Environment Variable
AIRFLOW__CELERY__BROKER_URL
result_backend¶
The Celery result_backend. When a job finishes, it needs to update the metadata of the job. Therefore it will post a message on a message bus, or insert it into a database (depending on the backend). This status is used by the scheduler to update the state of the task. The use of a database is highly recommended. See: http://docs.celeryproject.org/en/latest/userguide/configuration.html#task-result-backend-settings
- Type
string
- Default
db+mysql://airflow:airflow@localhost:3306/airflow
- Environment Variable
AIRFLOW__CELERY__RESULT_BACKEND
flower_host¶
Celery Flower is a sweet UI for Celery. Airflow has a shortcut to start it: airflow flower. This defines the IP that Celery Flower runs on.
- Type
string
- Default
0.0.0.0
- Environment Variable
AIRFLOW__CELERY__FLOWER_HOST
flower_url_prefix¶
The root URL for Flower
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CELERY__FLOWER_URL_PREFIX
- Example
/flower
flower_port¶
This defines the port that Celery Flower runs on
- Type
string
- Default
5555
- Environment Variable
AIRFLOW__CELERY__FLOWER_PORT
flower_basic_auth¶
New in version 1.10.2.
Secures Flower with Basic Authentication. Accepts user:password pairs separated by commas.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CELERY__FLOWER_BASIC_AUTH
- Example
user1:password1,user2:password2
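The flower_basic_auth value is a comma-separated list of user:password pairs. The sketch below splits such a value into a dictionary to make the format concrete; the function name is hypothetical, and the handling of empty input is an assumption.

```python
def parse_basic_auth(value: str) -> dict:
    """Turn "user1:pw1,user2:pw2" into {"user1": "pw1", "user2": "pw2"}."""
    credentials = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # an empty setting yields no credentials
        # Split on the first colon only, so passwords may contain colons.
        user, _, password = pair.partition(":")
        credentials[user] = password
    return credentials


users = parse_basic_auth("user1:password1,user2:password2")
```

Because the comma separates pairs, a password containing a comma cannot be expressed in this format.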
default_queue¶
Default queue that tasks get assigned to and that workers listen on.
- Type
string
- Default
default
- Environment Variable
AIRFLOW__CELERY__DEFAULT_QUEUE
sync_parallelism¶
New in version 1.10.3.
How many processes CeleryExecutor uses to sync task state. 0 means to use max(1, number of cores - 1) processes.
- Type
string
- Default
0
- Environment Variable
AIRFLOW__CELERY__SYNC_PARALLELISM
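The rule for a sync_parallelism of 0 is stated as max(1, number of cores - 1). The sketch below spells it out using multiprocessing.cpu_count() as the core count, which is an assumption about how cores are counted; the function name is hypothetical.

```python
import multiprocessing


def effective_sync_parallelism(configured: int) -> int:
    """Return the number of sync processes for a configured value."""
    if configured == 0:
        # Leave one core free, but always use at least one process.
        return max(1, multiprocessing.cpu_count() - 1)
    return configured


processes = effective_sync_parallelism(0)
```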
celery_config_options¶
Import path for celery configuration options
- Type
string
- Default
airflow.config_templates.default_celery.DEFAULT_CELERY_CONFIG
- Environment Variable
AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS
ssl_active¶
Set to True if you are using SSL
- Type
string
- Default
False
- Environment Variable
AIRFLOW__CELERY__SSL_ACTIVE
ssl_key¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CELERY__SSL_KEY
ssl_cert¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CELERY__SSL_CERT
ssl_cacert¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__CELERY__SSL_CACERT
pool¶
New in version 1.10.4.
Celery Pool implementation. Choices include: prefork (default), eventlet, gevent or solo. See: https://docs.celeryproject.org/en/latest/userguide/workers.html#concurrency https://docs.celeryproject.org/en/latest/userguide/concurrency/eventlet.html
- Type
string
- Default
prefork
- Environment Variable
AIRFLOW__CELERY__POOL
operation_timeout¶
New in version 1.10.8.
The number of seconds to wait before timing out send_task_to_executor or fetch_celery_task_state operations.
- Type
int
- Default
2
- Environment Variable
AIRFLOW__CELERY__OPERATION_TIMEOUT
celery_broker_transport_options¶
This section is for specifying options which can be passed to the underlying celery broker transport. See: http://docs.celeryproject.org/en/latest/userguide/configuration.html#std:setting-broker_transport_options
visibility_timeout¶
The visibility timeout defines the number of seconds to wait for the worker to acknowledge the task before the message is redelivered to another worker. Make sure to increase the visibility timeout to match the time of the longest ETA you’re planning to use. visibility_timeout is only supported for Redis and SQS celery brokers. See: http://docs.celeryproject.org/en/master/userguide/configuration.html#std:setting-broker_transport_options
- Type
string
- Default
None
- Environment Variable
AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__VISIBILITY_TIMEOUT
- Example
21600
dask¶
This section only applies if you are using the DaskExecutor in [core] section above
cluster_address¶
The IP address and port of the Dask cluster’s scheduler.
- Type
string
- Default
127.0.0.1:8786
- Environment Variable
AIRFLOW__DASK__CLUSTER_ADDRESS
tls_ca¶
TLS/SSL settings to access a secured Dask scheduler.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__DASK__TLS_CA
tls_cert¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__DASK__TLS_CERT
tls_key¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__DASK__TLS_KEY
scheduler¶
job_heartbeat_sec¶
Task instances listen for external kill signals (sent when you clear tasks from the CLI or the UI). This defines the frequency at which they should listen (in seconds).
- Type
string
- Default
5
- Environment Variable
AIRFLOW__SCHEDULER__JOB_HEARTBEAT_SEC
scheduler_heartbeat_sec¶
The scheduler constantly tries to trigger new tasks (look at the scheduler section in the docs for more information). This defines how often the scheduler should run (in seconds).
- Type
string
- Default
5
- Environment Variable
AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC
run_duration¶
After how much time (in seconds) the scheduler should terminate. -1 means run continuously (see also num_runs).
- Type
string
- Default
-1
- Environment Variable
AIRFLOW__SCHEDULER__RUN_DURATION
num_runs¶
New in version 1.10.6.
The number of times to try to schedule each DAG file. -1 means an unlimited number.
- Type
string
- Default
-1
- Environment Variable
AIRFLOW__SCHEDULER__NUM_RUNS
processor_poll_interval¶
New in version 1.10.6.
The number of seconds to wait between consecutive DAG file processing
- Type
string
- Default
1
- Environment Variable
AIRFLOW__SCHEDULER__PROCESSOR_POLL_INTERVAL
min_file_process_interval¶
After how much time (in seconds) new DAGs should be picked up from the filesystem.
- Type
string
- Default
0
- Environment Variable
AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL
dag_dir_list_interval¶
How often (in seconds) to scan the DAGs directory for new files. Defaults to 5 minutes.
- Type
string
- Default
300
- Environment Variable
AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL
print_stats_interval¶
How often stats should be printed to the logs. Setting this to 0 disables printing stats.
- Type
string
- Default
30
- Environment Variable
AIRFLOW__SCHEDULER__PRINT_STATS_INTERVAL
scheduler_health_check_threshold¶
New in version 1.10.2.
If the last scheduler heartbeat happened more than scheduler_health_check_threshold ago (in seconds), the scheduler is considered unhealthy. This is used by the health check in the “/health” endpoint.
- Type
string
- Default
30
- Environment Variable
AIRFLOW__SCHEDULER__SCHEDULER_HEALTH_CHECK_THRESHOLD
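The rule described above can be sketched as a simple time comparison. This illustrates the stated rule only and is not Airflow's implementation; scheduler_is_healthy and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta


def scheduler_is_healthy(last_heartbeat: datetime, now: datetime,
                         threshold_sec: int = 30) -> bool:
    """Return False when the last heartbeat is older than the threshold."""
    return (now - last_heartbeat) <= timedelta(seconds=threshold_sec)


now = datetime(2020, 1, 1, 12, 0, 0)
recent = scheduler_is_healthy(datetime(2020, 1, 1, 11, 59, 45), now)
stale = scheduler_is_healthy(datetime(2020, 1, 1, 11, 59, 0), now)
print(recent, stale)
```

With the default threshold of 30 seconds, a heartbeat 15 seconds old counts as healthy and one 60 seconds old does not.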
child_process_log_directory¶
- Type
string
- Default
{AIRFLOW_HOME}/logs/scheduler
- Environment Variable
AIRFLOW__SCHEDULER__CHILD_PROCESS_LOG_DIRECTORY
scheduler_zombie_task_threshold¶
Local task jobs periodically heartbeat to the DB. If the job has not sent a heartbeat in this many seconds, the scheduler will mark the associated task instance as failed and will re-schedule the task.
- Type
string
- Default
300
- Environment Variable
AIRFLOW__SCHEDULER__SCHEDULER_ZOMBIE_TASK_THRESHOLD
catchup_by_default¶
Turn off scheduler catchup by setting this to False. Default behavior is unchanged and command line backfills still work, but the scheduler will not do scheduler catchup if this is False. It can also be set on a per-DAG basis in the DAG definition (catchup).
- Type
string
- Default
True
- Environment Variable
AIRFLOW__SCHEDULER__CATCHUP_BY_DEFAULT
max_tis_per_query¶
This changes the batch size of queries in the scheduling main loop. If this is too high, SQL query performance may be impacted by one or more of the following:
- reversion to full table scan
- complexity of query predicate
- excessive locking
Additionally, you may hit the maximum allowable query length for your db. Set this to 0 for no limit (not advised).
- Type
string
- Default
512
- Environment Variable
AIRFLOW__SCHEDULER__MAX_TIS_PER_QUERY
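The batching behaviour can be pictured as splitting a list of task instance keys into chunks no larger than max_tis_per_query, with 0 meaning a single unbounded chunk. This is a sketch of the idea, not the scheduler's code; chunk is a hypothetical helper.

```python
def chunk(items, max_tis_per_query=512):
    """Split items into batches of at most max_tis_per_query entries.

    A value of 0 means no limit, so everything lands in one batch.
    """
    items = list(items)
    if max_tis_per_query == 0:
        return [items]
    return [items[i:i + max_tis_per_query]
            for i in range(0, len(items), max_tis_per_query)]


sizes = [len(batch) for batch in chunk(range(1200))]
print(sizes)
```

Each batch would correspond to one query, which keeps the predicate and the query length bounded.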
statsd_on¶
Statsd (https://github.com/etsy/statsd) integration settings
- Type
string
- Default
False
- Environment Variable
AIRFLOW__SCHEDULER__STATSD_ON
statsd_host¶
- Type
string
- Default
localhost
- Environment Variable
AIRFLOW__SCHEDULER__STATSD_HOST
statsd_port¶
- Type
string
- Default
8125
- Environment Variable
AIRFLOW__SCHEDULER__STATSD_PORT
statsd_prefix¶
- Type
string
- Default
airflow
- Environment Variable
AIRFLOW__SCHEDULER__STATSD_PREFIX
statsd_allow_list¶
New in version 1.10.6.
If you want to avoid sending all the available metrics to StatsD, you can configure an allow list of prefixes so that only the metrics that start with the elements of the list are sent (e.g. scheduler,executor,dagrun).
- Type
string
- Default
''
- Environment Variable
AIRFLOW__SCHEDULER__STATSD_ALLOW_LIST
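Prefix filtering of this kind can be sketched in a few lines. The helper names are hypothetical, and treating an empty list as "send everything" is an assumption drawn from the description above rather than from Airflow's code.

```python
def parse_allow_list(raw: str):
    """Turn a comma-separated string into a tuple of prefixes."""
    return tuple(p.strip() for p in raw.split(",") if p.strip())


def should_send(metric: str, allow_list) -> bool:
    """Send a metric if no allow list is set or its name matches a prefix."""
    return not allow_list or metric.startswith(allow_list)


allow = parse_allow_list("scheduler,executor,dagrun")
print(should_send("scheduler.heartbeat", allow))
print(should_send("ti_failures", allow))
```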
max_threads¶
The scheduler can run multiple threads in parallel to schedule dags. This defines how many threads will run.
- Type
string
- Default
2
- Environment Variable
AIRFLOW__SCHEDULER__MAX_THREADS
authenticate¶
- Type
string
- Default
False
- Environment Variable
AIRFLOW__SCHEDULER__AUTHENTICATE
use_job_schedule¶
New in version 1.10.2.
Turn off scheduler use of cron intervals by setting this to False. DAGs submitted manually in the web UI or with trigger_dag will still run.
- Type
string
- Default
True
- Environment Variable
AIRFLOW__SCHEDULER__USE_JOB_SCHEDULE
allow_trigger_in_future¶
New in version 1.10.8.
Allow externally triggered DagRuns for execution dates in the future. Only has an effect if schedule_interval is set to None in the DAG.
- Type
string
- Default
False
- Environment Variable
AIRFLOW__SCHEDULER__ALLOW_TRIGGER_IN_FUTURE
ldap¶
uri¶
Set this to ldaps://<your.ldap.server>:<port>
- Type
string
- Default
''
- Environment Variable
AIRFLOW__LDAP__URI
user_filter¶
- Type
string
- Default
objectClass=*
- Environment Variable
AIRFLOW__LDAP__USER_FILTER
user_name_attr¶
- Type
string
- Default
uid
- Environment Variable
AIRFLOW__LDAP__USER_NAME_ATTR
group_member_attr¶
- Type
string
- Default
memberOf
- Environment Variable
AIRFLOW__LDAP__GROUP_MEMBER_ATTR
superuser_filter¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__LDAP__SUPERUSER_FILTER
data_profiler_filter¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__LDAP__DATA_PROFILER_FILTER
bind_user¶
- Type
string
- Default
cn=Manager,dc=example,dc=com
- Environment Variable
AIRFLOW__LDAP__BIND_USER
bind_password¶
- Type
string
- Default
insecure
- Environment Variable
AIRFLOW__LDAP__BIND_PASSWORD
basedn¶
- Type
string
- Default
dc=example,dc=com
- Environment Variable
AIRFLOW__LDAP__BASEDN
cacert¶
- Type
string
- Default
/etc/ca/ldap_ca.crt
- Environment Variable
AIRFLOW__LDAP__CACERT
search_scope¶
- Type
string
- Default
LEVEL
- Environment Variable
AIRFLOW__LDAP__SEARCH_SCOPE
ignore_malformed_schema¶
New in version 1.10.3.
This setting allows the use of LDAP servers that either return a broken schema, or do not return a schema.
- Type
string
- Default
False
- Environment Variable
AIRFLOW__LDAP__IGNORE_MALFORMED_SCHEMA
mesos¶
master¶
Mesos master address which MesosExecutor will connect to.
- Type
string
- Default
localhost:5050
- Environment Variable
AIRFLOW__MESOS__MASTER
framework_name¶
The framework name under which the Airflow scheduler registers itself on Mesos.
- Type
string
- Default
Airflow
- Environment Variable
AIRFLOW__MESOS__FRAMEWORK_NAME
task_cpu¶
Number of CPU cores required for running one task instance using the ‘airflow run <dag_id> <task_id> <execution_date> --local -p <pickle_id>’ command on a mesos slave.
- Type
int
- Default
1
- Environment Variable
AIRFLOW__MESOS__TASK_CPU
task_memory¶
Memory in MB required for running one task instance using the ‘airflow run <dag_id> <task_id> <execution_date> --local -p <pickle_id>’ command on a mesos slave.
- Type
string
- Default
256
- Environment Variable
AIRFLOW__MESOS__TASK_MEMORY
checkpoint¶
Enable framework checkpointing for mesos. See http://mesos.apache.org/documentation/latest/slave-recovery/
- Type
boolean
- Default
False
- Environment Variable
AIRFLOW__MESOS__CHECKPOINT
failover_timeout¶
Failover timeout in milliseconds. When checkpointing is enabled and this option is set, Mesos waits until the configured timeout for the MesosExecutor framework to re-register after a failover. Mesos shuts down running tasks if the MesosExecutor framework fails to re-register within this timeframe.
- Type
int
- Default
None
- Environment Variable
AIRFLOW__MESOS__FAILOVER_TIMEOUT
- Example
604800
authenticate¶
Enable framework authentication for mesos. See http://mesos.apache.org/documentation/latest/configuration/
- Type
boolean
- Default
False
- Environment Variable
AIRFLOW__MESOS__AUTHENTICATE
default_principal¶
Mesos credentials, if authentication is enabled
- Type
string
- Default
None
- Environment Variable
AIRFLOW__MESOS__DEFAULT_PRINCIPAL
- Example
admin
default_secret¶
- Type
string
- Default
None
- Environment Variable
AIRFLOW__MESOS__DEFAULT_SECRET
- Example
admin
docker_image_slave¶
Optional Docker image to run on the slave before running the command. This image should be accessible from the mesos slave, i.e. the mesos slave should be able to pull this docker image before executing the command.
- Type
string
- Default
None
- Environment Variable
AIRFLOW__MESOS__DOCKER_IMAGE_SLAVE
- Example
puckel/docker-airflow
kerberos¶
ccache¶
- Type
string
- Default
/tmp/airflow_krb5_ccache
- Environment Variable
AIRFLOW__KERBEROS__CCACHE
principal¶
Gets augmented with the FQDN.
- Type
string
- Default
airflow
- Environment Variable
AIRFLOW__KERBEROS__PRINCIPAL
reinit_frequency¶
- Type
string
- Default
3600
- Environment Variable
AIRFLOW__KERBEROS__REINIT_FREQUENCY
kinit_path¶
- Type
string
- Default
kinit
- Environment Variable
AIRFLOW__KERBEROS__KINIT_PATH
keytab¶
- Type
string
- Default
airflow.keytab
- Environment Variable
AIRFLOW__KERBEROS__KEYTAB
github_enterprise¶
api_rev¶
- Type
string
- Default
v3
- Environment Variable
AIRFLOW__GITHUB_ENTERPRISE__API_REV
admin¶
hide_sensitive_variable_fields¶
The UI hides sensitive variable fields when this is set to True.
- Type
string
- Default
True
- Environment Variable
AIRFLOW__ADMIN__HIDE_SENSITIVE_VARIABLE_FIELDS
elasticsearch¶
host¶
New in version 1.10.4.
Elasticsearch host
- Type
string
- Default
''
- Environment Variable
AIRFLOW__ELASTICSEARCH__HOST
log_id_template¶
New in version 1.10.4.
Format of the log_id, which is used to query for a given task's logs.
- Type
string
- Default
{{dag_id}}-{{task_id}}-{{execution_date}}-{{try_number}}
- Environment Variable
AIRFLOW__ELASTICSEARCH__LOG_ID_TEMPLATE
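A sketch of how a log_id could be rendered from such a template. It assumes that the doubled braces in the default above are escaping and that the effective template uses single-brace placeholders; render_log_id and the sample values are hypothetical.

```python
# Assumption: the effective template uses single-brace placeholders.
LOG_ID_TEMPLATE = "{dag_id}-{task_id}-{execution_date}-{try_number}"


def render_log_id(dag_id, task_id, execution_date, try_number):
    """Fill the four placeholders of the log_id template."""
    return LOG_ID_TEMPLATE.format(
        dag_id=dag_id,
        task_id=task_id,
        execution_date=execution_date,
        try_number=try_number,
    )


log_id = render_log_id("example_dag", "example_task",
                       "2020-01-01T00:00:00+00:00", 1)
print(log_id)
```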
end_of_log_mark¶
New in version 1.10.4.
Used to mark the end of a log stream for a task
- Type
string
- Default
end_of_log
- Environment Variable
AIRFLOW__ELASTICSEARCH__END_OF_LOG_MARK
frontend¶
New in version 1.10.4.
Qualified URL for an elasticsearch frontend (like Kibana) with a template argument for log_id. The code will construct log_id using the log_id template from the argument above. NOTE: The code will prefix the https:// automatically, don’t include that here.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__ELASTICSEARCH__FRONTEND
write_stdout¶
New in version 1.10.4.
Write the task logs to the stdout of the worker, rather than the default files
- Type
string
- Default
False
- Environment Variable
AIRFLOW__ELASTICSEARCH__WRITE_STDOUT
json_format¶
New in version 1.10.4.
Instead of the default log formatter, write the log lines as JSON
- Type
string
- Default
False
- Environment Variable
AIRFLOW__ELASTICSEARCH__JSON_FORMAT
json_fields¶
New in version 1.10.4.
Log fields to also attach to the json output, if enabled
- Type
string
- Default
asctime, filename, lineno, levelname, message
- Environment Variable
AIRFLOW__ELASTICSEARCH__JSON_FIELDS
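To illustrate how json_format and json_fields fit together, the sketch below formats a standard library log record as one JSON line containing only the configured fields. JsonLineFormatter is a hypothetical class written for this example and is not Airflow's formatter.

```python
import json
import logging


def parse_json_fields(raw: str):
    """Split the comma-separated json_fields value into field names."""
    return [f.strip() for f in raw.split(",") if f.strip()]


class JsonLineFormatter(logging.Formatter):
    """Emit the selected LogRecord attributes as a JSON object."""

    def __init__(self, fields):
        super().__init__()
        self.fields = fields

    def format(self, record):
        # Populate the derived attributes the default formatter would set.
        record.message = record.getMessage()
        record.asctime = self.formatTime(record)
        return json.dumps({f: getattr(record, f, None) for f in self.fields})


fields = parse_json_fields("asctime, filename, lineno, levelname, message")
record = logging.LogRecord("example", logging.INFO, "/tmp/example.py", 10,
                           "hello %s", ("world",), None)
line = JsonLineFormatter(fields).format(record)
print(line)
```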
elasticsearch_configs¶
use_ssl¶
New in version 1.10.5.
- Type
string
- Default
False
- Environment Variable
AIRFLOW__ELASTICSEARCH_CONFIGS__USE_SSL
verify_certs¶
New in version 1.10.5.
- Type
string
- Default
True
- Environment Variable
AIRFLOW__ELASTICSEARCH_CONFIGS__VERIFY_CERTS
kubernetes¶
worker_container_repository¶
The repository, tag and imagePullPolicy of the Kubernetes Image for the Worker to Run
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__WORKER_CONTAINER_REPOSITORY
worker_container_tag¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__WORKER_CONTAINER_TAG
worker_container_image_pull_policy¶
New in version 1.10.2.
- Type
string
- Default
IfNotPresent
- Environment Variable
AIRFLOW__KUBERNETES__WORKER_CONTAINER_IMAGE_PULL_POLICY
delete_worker_pods¶
If True (default), worker pods will be deleted upon termination
- Type
string
- Default
True
- Environment Variable
AIRFLOW__KUBERNETES__DELETE_WORKER_PODS
worker_pods_creation_batch_size¶
New in version 1.10.3.
Number of Kubernetes Worker Pod creation calls per scheduler loop
- Type
string
- Default
1
- Environment Variable
AIRFLOW__KUBERNETES__WORKER_PODS_CREATION_BATCH_SIZE
namespace¶
The Kubernetes namespace where airflow workers should be created. Defaults to default
- Type
string
- Default
default
- Environment Variable
AIRFLOW__KUBERNETES__NAMESPACE
airflow_configmap¶
The name of the Kubernetes ConfigMap containing the Airflow Configuration (this file)
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__AIRFLOW_CONFIGMAP
- Example
airflow-configmap
airflow_local_settings_configmap¶
New in version 1.10.8.
The name of the Kubernetes ConfigMap containing the airflow_local_settings.py file.
For example:
airflow_local_settings_configmap = "airflow-configmap"
if you have the following ConfigMap.
airflow-configmap.yaml:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-configmap
data:
  airflow_local_settings.py: |
      def pod_mutation_hook(pod):
          ...
  airflow.cfg: |
      ...
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__AIRFLOW_LOCAL_SETTINGS_CONFIGMAP
- Example
airflow-configmap
dags_in_image¶
New in version 1.10.2.
If the docker image already contains DAGs, set this to True and the worker will search for DAGs in dags_folder. Otherwise, use git sync or a DAGs volume claim to mount DAGs.
- Type
string
- Default
False
- Environment Variable
AIRFLOW__KUBERNETES__DAGS_IN_IMAGE
dags_volume_subpath¶
For either git sync or volume mounted DAGs, the worker will look in this subpath for DAGs
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__DAGS_VOLUME_SUBPATH
dags_volume_claim¶
For DAGs mounted via a volume claim (mutually exclusive with git-sync and host path)
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__DAGS_VOLUME_CLAIM
logs_volume_subpath¶
For volume mounted logs, the worker will look in this subpath for logs
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__LOGS_VOLUME_SUBPATH
logs_volume_claim¶
A shared volume claim for the logs
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__LOGS_VOLUME_CLAIM
dags_volume_host¶
New in version 1.10.2.
For DAGs mounted via a hostPath volume (mutually exclusive with volume claim and git-sync). Useful in a local environment, discouraged in production.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__DAGS_VOLUME_HOST
logs_volume_host¶
New in version 1.10.2.
A hostPath volume for the logs. Useful in a local environment, discouraged in production.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__LOGS_VOLUME_HOST
env_from_configmap_ref¶
New in version 1.10.3.
A list of configMapsRefs to envFrom. If more than one configMap is specified, provide a comma separated list: configmap_a,configmap_b
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__ENV_FROM_CONFIGMAP_REF
env_from_secret_ref¶
New in version 1.10.3.
A list of secretRefs to envFrom. If more than one secret is specified, provide a comma separated list: secret_a,secret_b
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__ENV_FROM_SECRET_REF
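The two comma-separated options above map naturally onto the envFrom list of a Kubernetes container spec, where each entry is a configMapRef or a secretRef. The sketch below builds that list as plain dictionaries; the helper names are hypothetical and this is not the executor's own code.

```python
def split_refs(raw: str):
    """Split a comma-separated list of names, ignoring blanks."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def env_from_entries(configmap_refs: str, secret_refs: str):
    """Build envFrom entries for the given ConfigMap and Secret names."""
    entries = [{"configMapRef": {"name": n}} for n in split_refs(configmap_refs)]
    entries += [{"secretRef": {"name": n}} for n in split_refs(secret_refs)]
    return entries


entries = env_from_entries("configmap_a,configmap_b", "secret_a")
print(entries)
```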
git_repo¶
Git credentials and repository for DAGs mounted via Git (mutually exclusive with volume claim)
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_REPO
git_branch¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_BRANCH
git_subpath¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SUBPATH
git_sync_rev¶
New in version 1.10.7.
The specific rev or hash the git_sync init container will check out. This becomes the GIT_SYNC_REV environment variable in the git_sync init container for worker pods.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SYNC_REV
git_user¶
Use git_user and git_password for user authentication or git_ssh_key_secret_name and git_ssh_key_secret_key for SSH authentication
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_USER
git_password¶
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_PASSWORD
git_sync_root¶
New in version 1.10.2.
- Type
string
- Default
/git
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SYNC_ROOT
git_sync_dest¶
New in version 1.10.2.
- Type
string
- Default
repo
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SYNC_DEST
git_dags_folder_mount_point¶
New in version 1.10.2.
Mount point of the volume if git-sync is being used. i.e. {AIRFLOW_HOME}/dags
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_DAGS_FOLDER_MOUNT_POINT
git_ssh_key_secret_name¶
New in version 1.10.3.
To get Git-sync SSH authentication set up, follow this format.
airflow-secrets.yaml:
---
apiVersion: v1
kind: Secret
metadata:
  name: airflow-secrets
data:
  # key needs to be gitSshKey
  gitSshKey: <base64_encoded_data>
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SSH_KEY_SECRET_NAME
- Example
airflow-secrets
git_ssh_known_hosts_configmap_name¶
New in version 1.10.3.
To get Git-sync SSH authentication set up, follow this format.
airflow-configmap.yaml:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: airflow-configmap
data:
  known_hosts: |
      github.com ssh-rsa <...>
  airflow.cfg: |
      ...
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SSH_KNOWN_HOSTS_CONFIGMAP_NAME
- Example
airflow-configmap
git_sync_credentials_secret¶
New in version 1.10.5.
To give the git_sync init container credentials via a secret, create a secret with two fields: GIT_SYNC_USERNAME and GIT_SYNC_PASSWORD (example below) and add git_sync_credentials_secret = <secret_name> to your airflow config under the kubernetes section.
Secret Example:
---
apiVersion: v1
kind: Secret
metadata:
  name: git-credentials
data:
  GIT_SYNC_USERNAME: <base64_encoded_git_username>
  GIT_SYNC_PASSWORD: <base64_encoded_git_password>
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SYNC_CREDENTIALS_SECRET
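Values under the data key of a Kubernetes Secret must be base64-encoded. A minimal sketch of producing the two fields with the standard library; the username and password shown are placeholders, and b64 is a hypothetical helper.

```python
import base64


def b64(value: str) -> str:
    """Base64-encode a string for use under a Secret's data key."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


# Placeholder credentials for illustration only.
secret_data = {
    "GIT_SYNC_USERNAME": b64("example-user"),
    "GIT_SYNC_PASSWORD": b64("example-password"),
}
for field, encoded in secret_data.items():
    print("{}: {}".format(field, encoded))
```

The printed values could then be pasted in place of the <base64_encoded_...> placeholders of a Secret manifest.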
git_sync_container_repository¶
For cloning DAGs from git repositories into volumes: https://github.com/kubernetes/git-sync
- Type
string
- Default
k8s.gcr.io/git-sync
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SYNC_CONTAINER_REPOSITORY
git_sync_container_tag¶
- Type
string
- Default
v3.1.1
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SYNC_CONTAINER_TAG
git_sync_init_container_name¶
- Type
string
- Default
git-sync-clone
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SYNC_INIT_CONTAINER_NAME
git_sync_run_as_user¶
New in version 1.10.5.
- Type
string
- Default
65533
- Environment Variable
AIRFLOW__KUBERNETES__GIT_SYNC_RUN_AS_USER
worker_service_account_name¶
The name of the Kubernetes service account to be associated with airflow workers, if any. Service accounts are required for workers that require access to secrets or cluster resources. See the Kubernetes RBAC documentation for more: https://kubernetes.io/docs/admin/authorization/rbac/
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__WORKER_SERVICE_ACCOUNT_NAME
image_pull_secrets¶
Any image pull secrets to be given to worker pods. If more than one secret is required, provide a comma separated list: secret_a,secret_b
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__IMAGE_PULL_SECRETS
gcp_service_account_keys¶
GCP Service Account Keys to be provided to tasks run on Kubernetes Executors. Should be supplied in the format: key-name-1:key-path-1,key-name-2:key-path-2
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__GCP_SERVICE_ACCOUNT_KEYS
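The key-name:key-path format can be parsed into a mapping as sketched below. parse_key_paths is a hypothetical helper that only illustrates the format and does not reflect how Airflow reads the option.

```python
def parse_key_paths(raw: str):
    """Parse 'name-1:path-1,name-2:path-2' into a dict of name -> path."""
    result = {}
    for pair in raw.split(","):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first colon only, so the path is kept intact.
        name, _, path = pair.partition(":")
        result[name.strip()] = path.strip()
    return result


keys = parse_key_paths("key-name-1:key-path-1,key-name-2:key-path-2")
print(keys)
```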
in_cluster¶
Use the service account kubernetes gives to pods to connect to kubernetes cluster. It’s intended for clients that expect to be running inside a pod running on kubernetes. It will raise an exception if called from a process not running in a kubernetes environment.
- Type
string
- Default
True
- Environment Variable
AIRFLOW__KUBERNETES__IN_CLUSTER
cluster_context¶
New in version 1.10.3.
When running with in_cluster=False, change the default cluster_context or config_file options passed to the Kubernetes client. Leave these blank to use the same default behaviour as kubectl.
- Type
string
- Default
None
- Environment Variable
AIRFLOW__KUBERNETES__CLUSTER_CONTEXT
config_file¶
New in version 1.10.3.
- Type
string
- Default
None
- Environment Variable
AIRFLOW__KUBERNETES__CONFIG_FILE
affinity¶
New in version 1.10.2.
Affinity configuration as a single line formatted JSON object. See the affinity model for top-level key names (e.g. nodeAffinity, etc.): https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.12/#affinity-v1-core
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__AFFINITY
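Because the value must be a single line of JSON, it can be convenient to build the object in code and serialise it. The sketch below uses field names from the Kubernetes affinity model; the node label key and value are placeholders.

```python
import json

# Placeholder label key and value; the structure follows the
# Kubernetes nodeAffinity model.
affinity = {
    "nodeAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
                {
                    "matchExpressions": [
                        {
                            "key": "example.com/pool",
                            "operator": "In",
                            "values": ["airflow-workers"],
                        }
                    ]
                }
            ]
        }
    }
}

# Compact separators keep the output on one line without extra spaces.
affinity_value = json.dumps(affinity, separators=(",", ":"))
print(affinity_value)
```

The same approach works for any option on this page that expects a single-line JSON value.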
tolerations¶
New in version 1.10.2.
A list of toleration objects as a single line formatted JSON array. See: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.12/#toleration-v1-core
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__TOLERATIONS
kube_client_request_args¶
New in version 1.10.4.
Keyword parameters to pass when calling the kubernetes client core_v1_api methods from the Kubernetes Executor, provided as a single line formatted JSON dictionary string. The list of supported params is similar for all core_v1_apis, hence a single config variable for all apis. See: https://raw.githubusercontent.com/kubernetes-client/python/master/kubernetes/client/apis/core_v1_api.py Note that if no _request_timeout is specified, the kubernetes client will wait indefinitely for kubernetes api responses, which will cause the scheduler to hang. The timeout is specified as [connect timeout, read timeout].
- Type
string
- Default
{{"_request_timeout" : [60,60] }}
- Environment Variable
AIRFLOW__KUBERNETES__KUBE_CLIENT_REQUEST_ARGS
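A sketch of reading this value back into keyword arguments. It assumes the doubled braces in the default above are escaping, so the effective value is an ordinary JSON object; parse_request_args is a hypothetical helper, and converting the timeout pair to a tuple is a choice made for this example.

```python
import json


def parse_request_args(raw: str):
    """Parse the JSON dictionary string into keyword arguments."""
    args = json.loads(raw) if raw else {}
    timeout = args.get("_request_timeout")
    if isinstance(timeout, list):
        # (connect timeout, read timeout)
        args["_request_timeout"] = tuple(timeout)
    return args


kwargs = parse_request_args('{"_request_timeout" : [60,60] }')
print(kwargs)
```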
run_as_user¶
New in version 1.10.3.
Specifies the uid to run the first process of the worker pods' containers as.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__RUN_AS_USER
fs_group¶
New in version 1.10.3.
Specifies a gid to associate with all containers in the worker pods. If using a git_ssh_key_secret_name, use an fs_group that allows the key to be read, e.g. 65533.
- Type
string
- Default
''
- Environment Variable
AIRFLOW__KUBERNETES__FS_GROUP
kubernetes_node_selectors¶
The Key-value pairs to be given to worker pods. The worker pods will be scheduled to the nodes of the specified key-value pairs. Should be supplied in the format: key = value
kubernetes_annotations¶
The key-value annotation pairs to be given to worker pods. Should be supplied in the format: key = value
kubernetes_environment_variables¶
The scheduler sets the following environment variables into your workers. You may define as many environment variables as needed and the kubernetes launcher will set them in the launched workers.
Environment variables in this section are defined as follows:
<environment_variable_key> = <environment_variable_value>
For example, if you wanted to set an environment variable with value prod and key ENVIRONMENT, you would use the following format:
ENVIRONMENT = prod
Additionally, you may override worker airflow settings with the AIRFLOW__<SECTION>__<KEY> formatting as supported by airflow normally.
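A sketch of reading such a section with the standard library configparser. The inline configuration text is a made-up fragment; setting optionxform to str is needed here because configparser lower-cases option names by default, which would turn ENVIRONMENT into environment.

```python
import configparser

# Hypothetical airflow.cfg fragment for illustration.
CFG_TEXT = """
[kubernetes_environment_variables]
ENVIRONMENT = prod
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # preserve the case of environment variable names
parser.read_string(CFG_TEXT)

env_vars = dict(parser["kubernetes_environment_variables"])
print(env_vars)
```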
kubernetes_secrets¶
The scheduler mounts the following secrets into your workers as they are launched by the scheduler. You may define as many secrets as needed and the kubernetes launcher will parse the defined secrets and mount them as secret environment variables in the launched workers.
Secrets in this section are defined as follows:
<environment_variable_mount> = <kubernetes_secret_object>=<kubernetes_secret_key>
For example, if you wanted to mount a kubernetes secret key named postgres_password from the kubernetes secret object airflow-secret as the environment variable POSTGRES_PASSWORD into your workers, you would use the following format:
POSTGRES_PASSWORD = airflow-secret=postgres_password
Additionally, you may override worker airflow settings with the AIRFLOW__<SECTION>__<KEY> formatting as supported by airflow normally.
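Each entry in this section packs two pieces of information into the value: the secret object and the key within it. The sketch below splits one entry according to the format given above; parse_secret_entry is a hypothetical helper and the names are example values.

```python
def parse_secret_entry(env_var: str, value: str):
    """Split '<secret_object>=<secret_key>' into its two parts."""
    secret_object, _, secret_key = value.partition("=")
    return {
        "env_var": env_var,
        "secret_object": secret_object.strip(),
        "secret_key": secret_key.strip(),
    }


entry = parse_secret_entry("POSTGRES_PASSWORD",
                           "airflow-secret=postgres_password")
print(entry)
```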
kubernetes_labels¶
The Key-value pairs to be given to worker pods.
The worker pods will be given these static labels, as well as some additional dynamic labels
to identify the task.
Should be supplied in the format: key = value