airflow.providers.openlineage.utils.utils

Attributes

log

Classes

InfoJsonEncodable

Airflow objects might not be json-encodable overall.

DagInfo

Defines encoding DAG object to JSON.

DagRunInfo

Defines encoding DagRun object to JSON.

TaskInstanceInfo

Defines encoding TaskInstance object to JSON.

AssetInfo

Defines encoding Airflow Asset object to JSON.

TaskInfo

Defines encoding BaseOperator/AbstractOperator object to JSON.

TaskInfoComplete

Defines encoding BaseOperator/AbstractOperator object to JSON used when user enables full task info.

TaskGroupInfo

Defines encoding TaskGroup object to JSON.

OpenLineageRedactor

This class redacts sensitive data similar to SecretsMasker in Airflow logs.

Functions

try_import_from_string(string)

get_operator_class(task)

get_operator_provider_version(operator)

Get the provider package version for the given operator.

get_job_name(task)

get_task_parent_run_facet(parent_run_id, parent_job_name)

Retrieve the parent run facet for task-level events.

get_task_documentation(operator)

Get task documentation and mime type, truncated to _MAX_DOC_BYTES bytes length, if present.

get_dag_documentation(dag)

Get dag documentation and mime type, truncated to _MAX_DOC_BYTES bytes length, if present.

get_airflow_mapped_task_facet(task_instance)

get_user_provided_run_facets(ti, ti_state)

get_fully_qualified_class_name(operator)

is_operator_disabled(operator)

is_selective_lineage_enabled(obj)

If selective enable is active check if DAG or Task is enabled to emit events.

is_ti_rescheduled_already(ti[, session])

get_airflow_dag_run_facet(dag_run)

get_processing_engine_facet()

get_airflow_debug_facet()

get_airflow_run_facet(dag_run, dag, task_instance, ...)

get_airflow_job_facet(dag_run)

get_airflow_state_run_facet(dag_id, run_id, task_ids, ...)

get_unknown_source_attribute_run_facet(task[, name])

is_json_serializable(item)

print_warning(log)

get_filtered_unknown_operator_keys(operator)

should_use_external_connection(hook)

translate_airflow_asset(asset, lineage_context)

Convert an Asset with an AIP-60 compliant URI to an OpenLineageDataset.

Module Contents

airflow.providers.openlineage.utils.utils.log[source]
airflow.providers.openlineage.utils.utils.try_import_from_string(string)[source]
airflow.providers.openlineage.utils.utils.get_operator_class(task)[source]
airflow.providers.openlineage.utils.utils.get_operator_provider_version(operator)[source]

Get the provider package version for the given operator.

airflow.providers.openlineage.utils.utils.get_job_name(task)[source]
airflow.providers.openlineage.utils.utils.get_task_parent_run_facet(parent_run_id, parent_job_name, parent_job_namespace=conf.namespace())[source]

Retrieve the parent run facet for task-level events.

This facet currently always points to the DAG-level run ID and name, as external events for DAG runs are not yet handled.

airflow.providers.openlineage.utils.utils.get_task_documentation(operator)[source]

Get task documentation and mime type, truncated to _MAX_DOC_BYTES bytes length, if present.

airflow.providers.openlineage.utils.utils.get_dag_documentation(dag)[source]

Get dag documentation and mime type, truncated to _MAX_DOC_BYTES bytes length, if present.

airflow.providers.openlineage.utils.utils.get_airflow_mapped_task_facet(task_instance)[source]
airflow.providers.openlineage.utils.utils.get_user_provided_run_facets(ti, ti_state)[source]
airflow.providers.openlineage.utils.utils.get_fully_qualified_class_name(operator)[source]
airflow.providers.openlineage.utils.utils.is_operator_disabled(operator)[source]
airflow.providers.openlineage.utils.utils.is_selective_lineage_enabled(obj)[source]

If selective enable is active check if DAG or Task is enabled to emit events.

airflow.providers.openlineage.utils.utils.is_ti_rescheduled_already(ti, session=NEW_SESSION)[source]
class airflow.providers.openlineage.utils.utils.InfoJsonEncodable(obj)[source]

Bases: dict

Airflow objects might not be json-encodable overall.

The class provides additional attributes to control what and how is encoded:

  • renames: a dictionary of attribute name changes

  • casts: a dictionary consisting of attribute names
    and corresponding methods that should change
    object value
  • includes: list of attributes to be included in encoding

  • excludes: list of attributes to be excluded from encoding

Don’t use both includes and excludes.

renames: dict[str, str][source]
casts: dict[str, Any][source]
includes: list[str] = [][source]
excludes: list[str] = [][source]
obj[source]
class airflow.providers.openlineage.utils.utils.DagInfo(obj)[source]

Bases: InfoJsonEncodable

Defines encoding DAG object to JSON.

includes = ['dag_id', 'description', 'fileloc', 'owner', 'owner_links', 'schedule_interval',...[source]
casts[source]
renames[source]
classmethod serialize_timetable(dag)[source]
class airflow.providers.openlineage.utils.utils.DagRunInfo(obj)[source]

Bases: InfoJsonEncodable

Defines encoding DagRun object to JSON.

includes = ['conf', 'dag_id', 'data_interval_start', 'data_interval_end', 'external_trigger',...[source]
casts[source]
classmethod duration(dagrun)[source]
classmethod dag_version_info(dagrun, key)[source]
class airflow.providers.openlineage.utils.utils.TaskInstanceInfo(obj)[source]

Bases: InfoJsonEncodable

Defines encoding TaskInstance object to JSON.

includes = ['duration', 'try_number', 'pool', 'queued_dttm', 'log_url'][source]
casts[source]
class airflow.providers.openlineage.utils.utils.AssetInfo(obj)[source]

Bases: InfoJsonEncodable

Defines encoding Airflow Asset object to JSON.

includes = ['uri', 'extra'][source]
class airflow.providers.openlineage.utils.utils.TaskInfo(obj)[source]

Bases: InfoJsonEncodable

Defines encoding BaseOperator/AbstractOperator object to JSON.

renames[source]
includes = ['deferrable', 'depends_on_past', 'downstream_task_ids', 'execution_timeout', 'executor_config',...[source]
casts[source]
class airflow.providers.openlineage.utils.utils.TaskInfoComplete(obj)[source]

Bases: TaskInfo

Defines encoding BaseOperator/AbstractOperator object to JSON used when user enables full task info.

includes = [][source]
excludes = ['_BaseOperator__instantiated', '_dag', '_hook', '_log', '_outlets', '_inlets',...[source]
class airflow.providers.openlineage.utils.utils.TaskGroupInfo(obj)[source]

Bases: InfoJsonEncodable

Defines encoding TaskGroup object to JSON.

renames[source]
includes = ['downstream_group_ids', 'downstream_task_ids', 'prefix_group_id', 'tooltip',...[source]
airflow.providers.openlineage.utils.utils.get_airflow_dag_run_facet(dag_run)[source]
airflow.providers.openlineage.utils.utils.get_processing_engine_facet()[source]
airflow.providers.openlineage.utils.utils.get_airflow_debug_facet()[source]
airflow.providers.openlineage.utils.utils.get_airflow_run_facet(dag_run, dag, task_instance, task, task_uuid)[source]
airflow.providers.openlineage.utils.utils.get_airflow_job_facet(dag_run)[source]
airflow.providers.openlineage.utils.utils.get_airflow_state_run_facet(dag_id, run_id, task_ids, dag_run_state)[source]
airflow.providers.openlineage.utils.utils.get_unknown_source_attribute_run_facet(task, name=None)[source]
class airflow.providers.openlineage.utils.utils.OpenLineageRedactor[source]

Bases: airflow.sdk.execution_time.secrets_masker.SecretsMasker

This class redacts sensitive data similar to SecretsMasker in Airflow logs.

The difference is that our default max recursion depth is way higher - due to the structure of OL events we need more depth. Additionally, we allow data structures to specify data that needs not to be redacted by specifying _skip_redact list by deriving RedactMixin.

classmethod from_masker(other)[source]
airflow.providers.openlineage.utils.utils.is_json_serializable(item)[source]
airflow.providers.openlineage.utils.utils.print_warning(log)[source]
airflow.providers.openlineage.utils.utils.get_filtered_unknown_operator_keys(operator)[source]
airflow.providers.openlineage.utils.utils.should_use_external_connection(hook)[source]
airflow.providers.openlineage.utils.utils.translate_airflow_asset(asset, lineage_context)[source]

Convert an Asset with an AIP-60 compliant URI to an OpenLineageDataset.

This function returns None if no URI normalizer is defined, no asset converter is found or some core Airflow changes are missing and ImportError is raised.

Was this entry helpful?