Configuration Reference¶
This page contains the list of all available Airflow configurations for the apache-airflow-providers-celery provider that can be set in the airflow.cfg file or using environment variables.
Note
The configuration embedded in provider packages started to be used as of Airflow 2.7.0. Previously this configuration was described and configured in the Airflow core package, so if you are using an Airflow version below 2.7.0, refer to the Airflow core documentation for the list of available configuration options.
Note
For more information see Setting Configuration Options.
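As a quick illustration (the option and value below are arbitrary placeholders), every option on this page can be set either in airflow.cfg or through its AIRFLOW__{SECTION}__{KEY} environment variable; environment variables take precedence over values in the file:

    [celery]
    worker_concurrency = 16

    # equivalent environment variable form:
    # AIRFLOW__CELERY__WORKER_CONCURRENCY=16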
[celery]¶
This section only applies if you are using the CeleryExecutor configured in the [core] section.
broker_url¶
The Celery broker URL. Celery supports RabbitMQ, Redis and experimentally a sqlalchemy database. Refer to the Celery documentation for more information.
- Type: string
- Default: redis://redis:6379/0
- Environment Variables: AIRFLOW__CELERY__BROKER_URL, AIRFLOW__CELERY__BROKER_URL_CMD, AIRFLOW__CELERY__BROKER_URL_SECRET
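For example, a minimal airflow.cfg sketch pointing the broker at Redis, with a commented-out RabbitMQ alternative (hostnames and credentials are placeholders):

    [celery]
    broker_url = redis://redis:6379/0
    # RabbitMQ alternative (placeholder credentials):
    # broker_url = amqp://guest:guest@rabbitmq:5672//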
celery_app_name¶
The app name that will be used by celery
- Type: string
- Default: airflow.providers.celery.executors.celery_executor
- Environment Variable: AIRFLOW__CELERY__CELERY_APP_NAME
celery_config_options¶
Import path for celery configuration options
- Type: string
- Default: airflow.providers.celery.executors.default_celery.DEFAULT_CELERY_CONFIG
- Environment Variable: AIRFLOW__CELERY__CELERY_CONFIG_OPTIONS
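As a hedged sketch, you can point this option at your own configuration dictionary. The path my_celery_config.CELERY_CONFIG below is hypothetical: it stands for a module-level dict (typically built by copying DEFAULT_CELERY_CONFIG and overriding selected keys) that must be importable by the scheduler and the workers:

    [celery]
    celery_config_options = my_celery_config.CELERY_CONFIG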
flower_basic_auth¶
Securing Flower with Basic Authentication. Accepts user:password pairs separated by a comma.
- Type: string
- Default: ''
- Environment Variables: AIRFLOW__CELERY__FLOWER_BASIC_AUTH, AIRFLOW__CELERY__FLOWER_BASIC_AUTH_CMD, AIRFLOW__CELERY__FLOWER_BASIC_AUTH_SECRET
- Example: user1:password1,user2:password2
flower_host¶
Celery Flower is a sweet UI for Celery. Airflow has a shortcut to start it: airflow celery flower. This defines the IP that Celery Flower runs on.
- Type: string
- Default: 0.0.0.0
- Environment Variable: AIRFLOW__CELERY__FLOWER_HOST
flower_port¶
This defines the port that Celery Flower runs on
- Type: string
- Default: 5555
- Environment Variable: AIRFLOW__CELERY__FLOWER_PORT
flower_url_prefix¶
The root URL for Flower
- Type: string
- Default: ''
- Environment Variable: AIRFLOW__CELERY__FLOWER_URL_PREFIX
- Example: /flower
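Putting the Flower options together, a sketch for serving Flower under a /flower prefix with basic authentication could look like this (the reverse proxy that actually serves the prefix is configured outside Airflow, and the credentials are placeholders):

    [celery]
    flower_host = 0.0.0.0
    flower_port = 5555
    flower_url_prefix = /flower
    flower_basic_auth = user1:password1,user2:password2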
operation_timeout¶
The number of seconds to wait before timing out send_task_to_executor or fetch_celery_task_state operations.
- Type: float
- Default: 1.0
- Environment Variable: AIRFLOW__CELERY__OPERATION_TIMEOUT
pool¶
Celery Pool implementation. Choices include: prefork (default), eventlet, gevent or solo.
See:
https://docs.celeryq.dev/en/latest/userguide/workers.html#concurrency
https://docs.celeryq.dev/en/latest/userguide/concurrency/eventlet.html
- Type: string
- Default: prefork
- Environment Variable: AIRFLOW__CELERY__POOL
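For instance, to switch away from the default pool implementation (eventlet and gevent are only usable if the corresponding Python packages are installed on the workers):

    [celery]
    pool = prefork
    # alternatives, assuming the matching package is installed:
    # pool = eventlet
    # pool = gevent
    # pool = solo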
result_backend¶
The Celery result_backend. When a job finishes, it needs to update the metadata of the job. Therefore it will post a message on a message bus, or insert it into a database (depending on the backend). This status is used by the scheduler to update the state of the task. The use of a database is highly recommended. When not specified, sql_alchemy_conn with a db+ scheme prefix will be used.
See: https://docs.celeryq.dev/en/latest/userguide/configuration.html#task-result-backend-settings
- Type: string
- Default: None
- Environment Variables: AIRFLOW__CELERY__RESULT_BACKEND, AIRFLOW__CELERY__RESULT_BACKEND_CMD, AIRFLOW__CELERY__RESULT_BACKEND_SECRET
- Example: db+postgresql://postgres:airflow@postgres/airflow
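A sketch of an explicit result backend reusing a PostgreSQL database (connection details are placeholders); note the db+ prefix that Celery expects for SQLAlchemy-backed result backends:

    [celery]
    result_backend = db+postgresql://postgres:airflow@postgres/airflow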
result_backend_sqlalchemy_engine_options¶
Optional configuration dictionary to pass to the Celery result backend SQLAlchemy engine.
- Type: string
- Default: ''
- Environment Variable: AIRFLOW__CELERY__RESULT_BACKEND_SQLALCHEMY_ENGINE_OPTIONS
- Example: {"pool_recycle": 1800}
ssl_active¶
Whether to use SSL when connecting to the Celery broker.
- Type: string
- Default: False
- Environment Variable: AIRFLOW__CELERY__SSL_ACTIVE
ssl_cacert¶
Path to the CA certificate.
- Type: string
- Default: ''
- Environment Variable: AIRFLOW__CELERY__SSL_CACERT
ssl_cert¶
Path to the client certificate.
- Type: string
- Default: ''
- Environment Variable: AIRFLOW__CELERY__SSL_CERT
ssl_key¶
Path to the client key.
- Type: string
- Default: ''
- Environment Variable: AIRFLOW__CELERY__SSL_KEY
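Taken together, a hedged example of enabling SSL towards the broker; the certificate paths are placeholders and must exist on every host that talks to the broker:

    [celery]
    ssl_active = True
    ssl_cacert = /etc/ssl/certs/broker-ca.pem
    ssl_cert = /etc/ssl/certs/airflow-client.pem
    ssl_key = /etc/ssl/private/airflow-client.key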
sync_parallelism¶
How many processes CeleryExecutor uses to sync task state. 0 means to use max(1, number of cores - 1) processes.
- Type: string
- Default: 0
- Environment Variable: AIRFLOW__CELERY__SYNC_PARALLELISM
task_acks_late¶
New in version 3.6.0.
If an Airflow task’s execution time exceeds the visibility_timeout, Celery will re-assign the task to a Celery worker, even if the original task is still running successfully. The new task instance then runs concurrently with the original task, and the Airflow UI and logs only show the error message: ‘Task Instance Not Running’ FAILED: Task is in the running state. Setting task_acks_late to True will force Celery to wait until a task is finished before a new task instance is assigned. This effectively overrides the visibility timeout.
See also: https://docs.celeryq.dev/en/stable/reference/celery.app.task.html#celery.app.task.Task.acks_late
- Type: boolean
- Default: True
- Environment Variable: AIRFLOW__CELERY__TASK_ACKS_LATE
- Example: True
task_publish_max_retries¶
The maximum number of retries for publishing task messages to the broker when failing due to an AirflowTaskTimeout error, before giving up and marking the task as failed.
- Type: integer
- Default: 3
- Environment Variable: AIRFLOW__CELERY__TASK_PUBLISH_MAX_RETRIES
task_track_started¶
Celery task will report its status as ‘started’ when the task is executed by a worker. This is used in Airflow to keep track of the running tasks and if a Scheduler is restarted or run in HA mode, it can adopt the orphan tasks launched by previous SchedulerJob.
- Type: boolean
- Default: True
- Environment Variable: AIRFLOW__CELERY__TASK_TRACK_STARTED
worker_autoscale¶
The maximum and minimum number of pool processes that will be used to dynamically resize the pool based on load. Enable autoscaling by providing max_concurrency,min_concurrency with the airflow celery worker command (always keep minimum processes, but grow to maximum if necessary). Pick these numbers based on the resources of the worker box and the nature of the task. If the autoscale option is used, worker_concurrency will be ignored.
See: https://docs.celeryq.dev/en/latest/reference/celery.bin.worker.html#cmdoption-celery-worker-autoscale
- Type: string
- Default: None
- Environment Variable: AIRFLOW__CELERY__WORKER_AUTOSCALE
- Example: 16,12
worker_concurrency¶
The concurrency that will be used when starting workers with the airflow celery worker command. This defines the number of task instances that a worker will take, so size up your workers based on the resources on your worker box and the nature of your tasks.
- Type: string
- Default: 16
- Environment Variable: AIRFLOW__CELERY__WORKER_CONCURRENCY
worker_enable_remote_control¶
Specify if remote control of the workers is enabled. In some cases when the broker does not support remote control, Celery creates lots of .*reply-celery-pidbox queues. You can prevent this by setting this to false. However, with this disabled Flower won’t work.
See: https://docs.celeryq.dev/en/stable/getting-started/backends-and-brokers/index.html#broker-overview
- Type: boolean
- Default: true
- Environment Variable: AIRFLOW__CELERY__WORKER_ENABLE_REMOTE_CONTROL
worker_precheck¶
Worker initialisation check to validate Metadata Database connection
- Type: string
- Default: False
- Environment Variable: AIRFLOW__CELERY__WORKER_PRECHECK
worker_prefetch_multiplier¶
Used to increase the number of tasks that a worker prefetches, which can improve performance. The number of processes multiplied by worker_prefetch_multiplier is the number of tasks that are prefetched by a worker. A value greater than 1 can result in tasks being unnecessarily blocked if there are multiple workers and one worker prefetches tasks that sit behind long running tasks while another worker has unutilized processes that are unable to process the already claimed blocked tasks.
See: https://docs.celeryq.dev/en/stable/userguide/optimizing.html#prefetch-limits
- Type: integer
- Default: 1
- Environment Variable: AIRFLOW__CELERY__WORKER_PREFETCH_MULTIPLIER
default_queue (Deprecated)¶
Deprecated since version 2.1.0: The option has been moved to operators.default_queue
stalled_task_timeout (Deprecated)¶
Deprecated since version 2.6.0: The option has been moved to scheduler.task_queued_timeout
task_adoption_timeout (Deprecated)¶
Deprecated since version 2.6.0: The option has been moved to scheduler.task_queued_timeout
worker_log_server_port (Deprecated)¶
Deprecated since version 2.2.0: The option has been moved to logging.worker_log_server_port
[celery_broker_transport_options]¶
This section is for specifying options which can be passed to the underlying celery broker transport. See: https://docs.celeryq.dev/en/latest/userguide/configuration.html#std:setting-broker_transport_options
sentinel_kwargs¶
The sentinel_kwargs parameter allows passing additional options to the Sentinel client. In a typical scenario where Redis Sentinel is used as the broker and Redis servers are password-protected, the password needs to be passed through this parameter. Although its type is string, it is required to pass a string that conforms to the dictionary format. See: https://docs.celeryq.dev/en/stable/getting-started/backends-and-brokers/redis.html#configuration
- Type: string
- Default: None
- Environment Variables: AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__SENTINEL_KWARGS, AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__SENTINEL_KWARGS_CMD, AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__SENTINEL_KWARGS_SECRET
- Example: {"password": "password_for_redis_server"}
visibility_timeout¶
The visibility timeout defines the number of seconds to wait for the worker to acknowledge the task before the message is redelivered to another worker. Make sure to increase the visibility timeout to match the time of the longest ETA you’re planning to use. visibility_timeout is only supported for Redis and SQS celery brokers. See: https://docs.celeryq.dev/en/stable/getting-started/backends-and-brokers/redis.html#visibility-timeout
- Type: string
- Default: None
- Environment Variable: AIRFLOW__CELERY_BROKER_TRANSPORT_OPTIONS__VISIBILITY_TIMEOUT
- Example: 21600
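For example, with a password-protected Redis Sentinel setup the two options above might be combined as follows (the password and timeout are placeholders; a sentinel:// broker_url and a master_name transport option are also needed for Sentinel, see the Celery Redis documentation):

    [celery_broker_transport_options]
    sentinel_kwargs = {"password": "password_for_redis_server"}
    visibility_timeout = 21600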
[celery_kubernetes_executor]¶
This section only applies if you are using the CeleryKubernetesExecutor configured in the [core] section.
kubernetes_queue¶
Define when to send a task to KubernetesExecutor when using CeleryKubernetesExecutor. When the queue of a task is the value of kubernetes_queue (default kubernetes), the task is executed via KubernetesExecutor, otherwise via CeleryExecutor.
- Type: string
- Default: kubernetes
- Environment Variable: AIRFLOW__CELERY_KUBERNETES_EXECUTOR__KUBERNETES_QUEUE
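As a sketch, the default queue name can be kept as-is; individual tasks then opt into the KubernetesExecutor by setting their queue parameter to that value in the DAG file (e.g. queue="kubernetes" on an operator), while all other tasks run via the CeleryExecutor:

    [celery_kubernetes_executor]
    kubernetes_queue = kubernetes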