airflow.executors.celery_executor¶
See also
For more information on how the CeleryExecutor works, take a look at the guide: Celery Executor
Module Contents¶
Classes¶
| ExceptionWithTraceback | Wrapper class used to propagate exceptions to parent processes from subprocesses. |
| CeleryExecutor | CeleryExecutor is recommended for production use of Airflow. It allows distributing the execution of task instances to multiple worker nodes. |
| BulkStateFetcher | Gets status for many Celery tasks using the best method available. |
Functions¶
| execute_command(command_to_exec) | Executes command. |
| send_task_to_executor(task_tuple) | Sends task to executor. |
| on_celery_import_modules(*args, **kwargs) | Preload some "expensive" airflow modules so that every task process doesn't have to import it again and again. |
| fetch_celery_task_state(async_result) | Fetch and return the state of the given celery task. |
Attributes¶
| CELERY_FETCH_ERR_MSG_HEADER |  |
| OPERATION_TIMEOUT |  |
| app | To start the celery worker, run the command: airflow celery worker |
- airflow.executors.celery_executor.CELERY_FETCH_ERR_MSG_HEADER = Error fetching Celery task state[source]¶
- airflow.executors.celery_executor.OPERATION_TIMEOUT[source]¶
- airflow.executors.celery_executor.app[source]¶
- To start the celery worker, run the command: airflow celery worker
- class airflow.executors.celery_executor.ExceptionWithTraceback(exception, exception_traceback)[source]¶
- Wrapper class used to propagate exceptions to parent processes from subprocesses. 
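- Traceback objects cannot be pickled, which is why a wrapper like this is needed when an exception must cross process boundaries. Below is a minimal, hypothetical sketch of the pattern; the risky_work function and pool setup are illustrative, not part of this module:

    import traceback
    from multiprocessing import Pool

    class ExceptionWithTraceback:
        """Carries an exception and its pre-formatted traceback between processes."""

        def __init__(self, exception, exception_traceback):
            self.exception = exception
            self.traceback = exception_traceback

    def risky_work(x):
        try:
            return 1 / x
        except Exception as e:
            # Format the traceback to a string while still in the subprocess;
            # the string (unlike the traceback object) survives pickling.
            return ExceptionWithTraceback(e, traceback.format_exc())

    if __name__ == "__main__":
        with Pool(2) as pool:
            for result in pool.map(risky_work, [1, 0]):
                if isinstance(result, ExceptionWithTraceback):
                    print("subprocess failed:\n" + result.traceback)
                else:
                    print("result:", result)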
- airflow.executors.celery_executor.send_task_to_executor(task_tuple)[source]¶
- Sends task to executor. 
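- A hedged sketch of what sending might look like, assuming the task tuple unpacks to (key, command, queue, Celery task); the tuple layout is an assumption, not confirmed by this page:

    import traceback

    from airflow.executors.celery_executor import ExceptionWithTraceback

    def send_task_to_executor(task_tuple):
        # Assumed layout: task key, command to run, target queue, Celery task object.
        key, command, queue, task_to_run = task_tuple
        try:
            # Enqueue the command on the given Celery queue.
            result = task_to_run.apply_async(args=[command], queue=queue)
        except Exception as e:
            # Wrap the failure so it can be pickled back from a pool subprocess.
            result = ExceptionWithTraceback(e, traceback.format_exc())
        return key, command, result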
- airflow.executors.celery_executor.on_celery_import_modules(*args, **kwargs)[source]¶
- Preload some “expensive” airflow modules so that every task process doesn’t have to import it again and again.
- Loading these for each task adds 0.3-0.5s per task before the task can run. For long running tasks this doesn’t matter, but for short tasks this starts to be a noticeable impact.
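- One way to express this preloading, sketched with Celery's import_modules signal; the hook name and the module imported here are illustrative assumptions:

    from celery.signals import import_modules as celery_import_modules

    @celery_import_modules.connect
    def preload_expensive_modules(*args, **kwargs):
        # Paying the import cost once in the worker parent lets forked task
        # processes inherit the modules instead of re-importing them.
        import airflow.operators.bash  # noqa: F401  (illustrative module)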
- class airflow.executors.celery_executor.CeleryExecutor[source]¶
- Bases: airflow.executors.base_executor.BaseExecutor
- CeleryExecutor is recommended for production use of Airflow. It allows distributing the execution of task instances to multiple worker nodes.
- Celery is a simple, flexible and reliable distributed system to process vast amounts of messages, while providing operations with the tools required to maintain such a system.
- sync()[source]¶
- Sync will get called periodically by the heartbeat method. Executors should override this to gather the statuses of running tasks.
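- A minimal sketch of the heartbeat-driven polling idea; the SyncLoopSketch class and its bookkeeping are assumptions for illustration, not the executor's actual fields:

    class SyncLoopSketch:
        """Poll tracked Celery results each time the heartbeat calls sync()."""

        def __init__(self):
            # Maps an Airflow task key to the AsyncResult returned at queue time.
            self.tasks = {}

        def sync(self):
            for key, result in list(self.tasks.items()):
                state = result.state  # e.g. PENDING, STARTED, SUCCESS, FAILURE
                if state in ("SUCCESS", "FAILURE"):
                    # Terminal state reached: stop tracking this task.
                    del self.tasks[key]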
 - change_state(key, state, info=None)[source]¶
- Changes the state of the task.
- Parameters
- key (airflow.models.taskinstance.TaskInstanceKey) – Unique key for the task instance
- state (str) – State to set for the task.
- info – Executor information for the task instance
 
 
 - end(synchronous=False)[source]¶
- This method is called when the caller is done submitting jobs and wants to wait synchronously for all previously submitted jobs to complete.
 - try_adopt_task_instances(tis)[source]¶
- Try to adopt running task instances that have been abandoned by a dying SchedulerJob.
- Anything that is not adopted will be cleared by the scheduler (and then become eligible for re-scheduling); a sketch of this adoption pattern appears after this entry.
- Returns
- any TaskInstances that were unable to be adopted 
- Return type
- list[airflow.models.TaskInstance] 
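- A hedged sketch of the adoption idea: if a queued TaskInstance still carries the Celery task id it was submitted under (the external_executor_id field used here is an assumption), the executor can rebuild a result handle and resume tracking; everything else is handed back for re-scheduling.

    from celery.result import AsyncResult

    def try_adopt_task_instances(self, tis):
        not_adopted = []
        for ti in tis:
            if ti.external_executor_id:
                # Rebuild the handle to the still-running Celery task and track it.
                self.tasks[ti.key] = AsyncResult(ti.external_executor_id)
            else:
                # No Celery id to resume from; return it to the scheduler.
                not_adopted.append(ti)
        return not_adopted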
 
 
- airflow.executors.celery_executor.fetch_celery_task_state(async_result)[source]¶
- Fetch and return the state of the given celery task. The scope of this function is global so that it can be called by subprocesses in the pool (see the sketch after this entry).
- Parameters
- async_result (celery.result.AsyncResult) – a tuple of the Celery task key and the async Celery object used to fetch the task’s state 
- Returns
- a tuple of the Celery task key and the Celery state and the celery info of the task 
- Return type
 
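- A minimal sketch that mirrors the contract described above, assuming the argument unpacks to a (task key, AsyncResult) pair:

    import traceback

    from airflow.executors.celery_executor import ExceptionWithTraceback

    def fetch_celery_task_state(async_result):
        task_key, result = async_result
        try:
            # .state and .info query the configured Celery result backend.
            return task_key, result.state, result.info
        except Exception as e:
            # Keep failures picklable for the pool subprocess that called us.
            return ExceptionWithTraceback(e, traceback.format_exc())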
- class airflow.executors.celery_executor.BulkStateFetcher(sync_parallelism=None)[source]¶
- Bases: airflow.utils.log.logging_mixin.LoggingMixin
- Gets status for many Celery tasks using the best method available.
- If BaseKeyValueStoreBackend is used as the result backend, the mget method is used. If DatabaseBackend is used as the result backend, a SELECT … WHERE task_id IN (…) query is used. Otherwise, multiprocessing.Pool is used and each task status is fetched individually.
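- A sketch of that selection logic; the helper name and return labels are assumptions, while the backend classes are the ones named above:

    from celery.backends.base import BaseKeyValueStoreBackend
    from celery.backends.database import DatabaseBackend

    def pick_bulk_fetch_method(backend):
        # Choose the cheapest bulk-status strategy the result backend supports.
        if isinstance(backend, BaseKeyValueStoreBackend):
            return "mget"       # one bulk key/value round trip
        if isinstance(backend, DatabaseBackend):
            return "select-in"  # one SELECT ... WHERE task_id IN (...)
        return "pool"           # per-task fetch via multiprocessing.Pool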
