Logging for Tasks

Airflow writes logs for tasks in a way that allows you to see the logs for each task separately in the Airflow UI. Core Airflow implements writing and serving logs locally. However, you can also write logs to remote services via community providers, or write your own loggers.

This page describes local task logging. The Apache Airflow Community also releases providers for many services (Provider packages), and some of them provide handlers that extend the logging capability of Apache Airflow. You can see all those providers in Writing logs.

Writing logs locally

Users can specify the directory in which log files are placed by setting base_log_folder in airflow.cfg. By default, logs are placed in the AIRFLOW_HOME directory.


For more information on setting the configuration, see Setting Configuration Options.

The following convention is followed while naming logs: {dag_id}/{task_id}/{logical_date}/{try_number}.log
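As a rough illustration of the naming convention above, the sketch below assembles a relative log path from its four parts. The function name task_log_path is hypothetical, and rendering the logical date in ISO 8601 form is an assumption; the exact format Airflow uses depends on its configured filename template.

```python
from datetime import datetime, timezone


def task_log_path(dag_id: str, task_id: str, logical_date: datetime, try_number: int) -> str:
    """Build a relative log path of the form dag_id/task_id/logical_date/try_number.log."""
    # Assumption: the logical date is rendered with datetime.isoformat().
    return f"{dag_id}/{task_id}/{logical_date.isoformat()}/{try_number}.log"


# Hypothetical DAG and task names, used only for illustration.
example = task_log_path(
    "example_dag", "extract", datetime(2024, 1, 1, tzinfo=timezone.utc), 1
)
print(example)  # example_dag/extract/2024-01-01T00:00:00+00:00/1.log
```

Each retry of a task increments try_number, so every attempt ends up in its own file under the same directory.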

In addition, users can supply a remote location to store current logs and backups.

In the Airflow Web UI, remote logs take precedence over local logs when remote logging is enabled. If remote logs cannot be found or accessed, local logs will be displayed. Note that logs are only sent to remote storage once a task is complete (including failure); in other words, remote logs for running tasks are unavailable (but local logs are available).
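The precedence rule described above can be sketched as a small fallback function. This is not Airflow's implementation: read_task_log and the two reader callables are hypothetical stand-ins, and treating OSError as "remote log cannot be accessed" is an assumption made for the example.

```python
from typing import Callable, Optional


def read_task_log(
    read_remote: Callable[[], Optional[str]],
    read_local: Callable[[], Optional[str]],
    remote_logging: bool,
) -> Optional[str]:
    """Prefer the remote log when remote logging is enabled, else fall back to local."""
    if remote_logging:
        try:
            remote = read_remote()
        except OSError:
            # Assumption: an inaccessible remote store surfaces as OSError.
            remote = None
        if remote is not None:
            return remote
    # Remote logging is disabled, or the remote log is missing or unreachable.
    return read_local()


def missing_remote() -> Optional[str]:
    # Simulates a task that is still running: nothing has been uploaded yet.
    return None


running_task_log = read_task_log(missing_remote, lambda: "local log", remote_logging=True)
finished_task_log = read_task_log(lambda: "remote log", lambda: "local log", remote_logging=True)
```

Here running_task_log comes from the local reader because the remote copy does not exist yet, while finished_task_log comes from the remote reader.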


If you want to check which task handler is currently set, you can use the airflow info command, as in the example below.

$ airflow info
airflow on PATH: [True]

Executor: [SequentialExecutor]
Task Logging Handlers: [StackdriverTaskHandler]
SQL Alchemy Conn: [sqlite://///root/airflow/airflow.db]
DAGs Folder: [/root/airflow/dags]
Plugins Folder: [/root/airflow/plugins]
Base Log Folder: [/root/airflow/logs]

You can also use airflow config list to check that the logging configuration options have valid values.

Advanced configuration

Not all configuration options are available from the airflow.cfg file. Some configuration options require that the logging config class be overwritten. This can be done with the logging_config_class option in the airflow.cfg file. This option should specify the import path to a configuration compatible with logging.config.dictConfig(). If your file is not in a standard import location, you should set the PYTHONPATH environment variable.

Follow the steps below to enable a custom logging config class:

  1. Start by setting the PYTHONPATH environment variable to a known directory, e.g. ~/airflow/

    export PYTHONPATH=~/airflow/
  2. Create a directory to store the config file e.g. ~/airflow/config

  3. Create a file called ~/airflow/config/log_config.py with the following content:

    from copy import deepcopy
    from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

    LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)
  4. At the end of the file, add code to modify the default dictionary configuration.

  5. Update $AIRFLOW_HOME/airflow.cfg to contain:

    remote_logging = True
    logging_config_class = log_config.LOGGING_CONFIG
  6. Restart the application.

See Modules Management for details on how Python and Airflow manage modules.
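The "modify the default dictionary configuration" step can be sketched as follows. So that the example runs without Airflow installed, DEFAULT_LOGGING_CONFIG here is a small hypothetical stand-in for the dictionary imported from airflow.config_templates.airflow_local_settings; the formatter, handler, and logger names ("plain", "console", "example.task") are assumptions and will differ from the keys in the real default configuration.

```python
import logging
import logging.config
from copy import deepcopy

# Hypothetical stand-in for Airflow's DEFAULT_LOGGING_CONFIG (the real one is larger).
DEFAULT_LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "plain": {"format": "%(asctime)s %(levelname)s %(name)s - %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "plain"},
    },
    "loggers": {
        "example.task": {"handlers": ["console"], "level": "INFO", "propagate": False},
    },
}

# Copy first, so the shared default dictionary is never mutated.
LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)

# Modify only the copy: a shorter format and a more verbose level.
LOGGING_CONFIG["formatters"]["plain"]["format"] = "[%(levelname)s] %(message)s"
LOGGING_CONFIG["loggers"]["example.task"]["level"] = "DEBUG"

# Airflow applies the dictionary itself; it is applied here only to show
# that the result is compatible with logging.config.dictConfig().
logging.config.dictConfig(LOGGING_CONFIG)
logging.getLogger("example.task").debug("custom logging config is active")
```

Working on a deep copy matters because the default dictionary is nested; a shallow copy would still share the inner formatter and logger dictionaries with the original.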

Serving logs from workers

Most task handlers send logs upon completion of a task. In order to view logs in real time, Airflow automatically starts an HTTP server to serve the logs in the following cases:

  • If SequentialExecutor or LocalExecutor is used, then when airflow scheduler is running.

  • If CeleryExecutor is used, then when airflow worker is running.

The server runs on the port specified by the worker_log_server_port option in the [logging] section. By default, it is 8793. Communication between the webserver and the worker is signed with the key specified by the secret_key option in the [webserver] section. You must ensure that the key is the same on the webserver and on every worker so that communication can take place without problems.
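One way to sanity-check these two settings is to compare the configuration seen by the webserver with the one seen by a worker. The sketch below parses two inline configuration snippets with the standard library; the helper names and the secret value are hypothetical, and it ignores environment-variable overrides, which a real deployment may use instead of airflow.cfg.

```python
import hmac
from configparser import ConfigParser

# Hypothetical configuration contents for two machines.
WEBSERVER_CFG = """
[logging]
worker_log_server_port = 8793

[webserver]
secret_key = example-secret
"""

WORKER_CFG = """
[webserver]
secret_key = example-secret
"""


def load_config(text: str) -> ConfigParser:
    # Interpolation is disabled so that "%" characters in values are read literally.
    parser = ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def log_server_port(parser: ConfigParser) -> int:
    # Falls back to the documented default when the option is absent.
    return parser.getint("logging", "worker_log_server_port", fallback=8793)


def secret_keys_match(a: ConfigParser, b: ConfigParser) -> bool:
    key_a = a.get("webserver", "secret_key").encode()
    key_b = b.get("webserver", "secret_key").encode()
    # Constant-time comparison, a reasonable habit when handling secrets.
    return hmac.compare_digest(key_a, key_b)


webserver = load_config(WEBSERVER_CFG)
worker = load_config(WORKER_CFG)
port = log_server_port(worker)
keys_ok = secret_keys_match(webserver, worker)
```

In this example the worker snippet omits the port, so port takes the default value, and keys_ok is true because both snippets carry the same key.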

We are using Gunicorn as a WSGI server. Its configuration options can be overridden with the GUNICORN_CMD_ARGS environment variable. For details, see Gunicorn settings.
