Structure of OpenLineage Airflow integration

OpenLineage integration implements AirflowPlugin. This allows it to be discovered on Airflow start and register Airflow Listener.

The listener is then called when certain events happen in Airflow - when DAGs or TaskInstances start, complete or fail. For DAGs, the listener runs in Airflow Scheduler. For TaskInstances, the listener runs on Airflow Worker.

When TaskInstance listener method gets called, the OpenLineageListener constructs metadata like event’s unique run_id and event time. Then, it tries to find valid Extractor for given operator. The Extractors are a framework for external extraction of metadata from

Was this entry helpful?