airflow.providers.openlineage.extractors.manager

Classes

ExtractorManager

Class abstracting management of custom extractors.

Module Contents

class airflow.providers.openlineage.extractors.manager.ExtractorManager

Bases: airflow.utils.log.logging_mixin.LoggingMixin

Class abstracting management of custom extractors.

extractors: dict[str, type[airflow.providers.openlineage.extractors.BaseExtractor]]
default_extractor
add_extractor(operator_class, extractor)
extract_metadata(dagrun, task, task_instance_state, task_instance)
get_extractor_class(task)
extract_inlets_and_outlets(task_metadata, task)
get_hook_lineage(task_instance=None, task_instance_state=None)

Extract lineage from the Hook Lineage Collector.

Combines two sources into a single OperatorLineage:

  • Asset-based inputs/outputs reported via add_input_asset / add_output_asset.

  • SQL-based lineage from sql_job extras reported via send_sql_hook_lineage(). When task_instance is provided, each extra is parsed and separate per-query OpenLineage events are emitted.

Returns None when nothing was collected.
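
The merge-or-None behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use the provider's code: LineageSketch and combine_hook_lineage are hypothetical names, and the shape of each SQL extra (a dict with already-parsed "inputs" and "outputs" lists) is an assumption made for illustration. Emission of per-query OpenLineage events is not modelled.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LineageSketch:
    """Hypothetical stand-in for OperatorLineage: only inputs and outputs."""

    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def combine_hook_lineage(
    asset_inputs: list[str],
    asset_outputs: list[str],
    sql_extras: list[dict],
) -> LineageSketch | None:
    """Merge asset-based and SQL-based lineage into one result.

    Returns None when neither source contributed anything.
    """
    # Start from the asset-based lineage.
    inputs = list(asset_inputs)
    outputs = list(asset_outputs)

    # Assumed shape: each extra already carries parsed dataset names.
    for extra in sql_extras:
        inputs.extend(extra.get("inputs", []))
        outputs.extend(extra.get("outputs", []))

    if not inputs and not outputs:
        return None
    return LineageSketch(inputs=inputs, outputs=outputs)
```

In this sketch, a caller would treat a None result as "no hook-level lineage available" and fall back to whatever other lineage source it has.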

static convert_to_ol_dataset_from_object_storage_uri(uri)
static convert_to_ol_dataset_from_table(table)
static convert_to_ol_dataset(obj)
validate_task_metadata(task_metadata)