Structure of OpenLineage Airflow integration
The OpenLineage integration implements an AirflowPlugin. This allows Airflow to discover it at startup and register its Airflow Listener.
The listener is then called when certain events happen in Airflow: when DAGs or TaskInstances start, complete, or fail. For DAGs, the listener runs in the Airflow Scheduler. For TaskInstances, the listener runs on the Airflow Worker.
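The plugin-and-listener arrangement can be sketched roughly as follows. This is a simplified illustration, not the integration's actual code: it uses stand-in classes so that it runs without Airflow installed. The `name` and `listeners` attributes and the `on_task_instance_*` hook names mirror Airflow's plugin and listener interfaces, while `SketchListener`, `SketchPlugin`, `dispatch`, and the single `task_id` argument are hypothetical simplifications (the real hooks receive richer arguments such as the task instance itself).

```python
from dataclasses import dataclass, field


@dataclass
class SketchListener:
    """Stand-in for a listener; records which hooks were called."""

    calls: list = field(default_factory=list)

    # Hook names mirror Airflow's TaskInstance listener hooks.
    # The single task_id argument is a simplification for this sketch.
    def on_task_instance_running(self, task_id):
        self.calls.append(("START", task_id))

    def on_task_instance_success(self, task_id):
        self.calls.append(("COMPLETE", task_id))

    def on_task_instance_failed(self, task_id):
        self.calls.append(("FAIL", task_id))


class SketchPlugin:
    """Stand-in for an AirflowPlugin subclass that exposes its listeners."""

    name = "sketch_lineage_plugin"
    listeners = [SketchListener()]


def dispatch(plugin, hook, task_id):
    """Hypothetical stand-in for Airflow invoking every registered listener."""
    for listener in plugin.listeners:
        getattr(listener, hook)(task_id)


# Simulate a task that starts and then completes.
dispatch(SketchPlugin, "on_task_instance_running", "extract")
dispatch(SketchPlugin, "on_task_instance_success", "extract")
```

In a real deployment Airflow itself performs the dispatch step, calling the DAG-level hooks in the Scheduler and the TaskInstance-level hooks on the Worker.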
When a TaskInstance listener method is called, the OpenLineageListener constructs metadata such as the event's unique run_id and the event time.
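As a rough illustration of this step, the sketch below builds a run_id and an event time for a task event. The derivation shown here (a UUIDv5 over the DAG ID, task ID, and try number) is an assumption chosen for the example and is not necessarily how OpenLineageListener computes its run_id. The names `SKETCH_NAMESPACE` and `build_event_metadata` are hypothetical.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical namespace for this sketch; any fixed UUID would do.
SKETCH_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "lineage.example")


def build_event_metadata(dag_id, task_id, try_number, now=None):
    """Return a dict with a deterministic run_id and an ISO 8601 event time.

    Assumption: the run_id is derived from identifiers of the task attempt,
    so the START and COMPLETE events of the same attempt share one run_id.
    """
    now = now or datetime.now(timezone.utc)
    run_id = uuid.uuid5(SKETCH_NAMESPACE, f"{dag_id}.{task_id}.{try_number}")
    return {"run_id": str(run_id), "event_time": now.isoformat()}


start = build_event_metadata("etl", "extract", 1)
complete = build_event_metadata("etl", "extract", 1)
# Both events of the same attempt carry the same run_id.
same_run = start["run_id"] == complete["run_id"]
```

Deriving the identifier deterministically means the listener does not need to store state between the start and completion callbacks, which matters when those callbacks may run in different processes.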
Then, it tries to find a valid Extractor for the given operator. Extractors are a framework
for external extraction of metadata from