airflow.models.dataset¶
Module Contents¶
Classes¶
DatasetModel | A table to store datasets.
DagScheduleDatasetReference | References from a DAG to a dataset of which it is a consumer.
TaskOutletDatasetReference | References from a task to a dataset that it updates / produces.
DatasetDagRunQueue | Model for storing dataset events that need processing.
DatasetEvent | A table to store dataset events.
- class airflow.models.dataset.DatasetModel(uri, **kwargs)[source]¶
Bases: airflow.models.base.Base
A table to store datasets.
- Parameters
uri (str) -- a string that uniquely identifies the dataset
extra -- JSON field for arbitrary extra info
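As a rough illustration of what "uniquely identifies" implies, the sketch below uses the standard-library sqlite3 module rather than Airflow itself. The table name `dataset_sketch`, its columns, and the helper `add_dataset` are hypothetical stand-ins, not Airflow's actual schema or API. The point is only that a uniqueness constraint on `uri` rejects a second row with the same URI, while `extra` carries arbitrary JSON.

```python
import json
import sqlite3

# Hypothetical stand-in table; this is NOT Airflow's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset_sketch ("
    " id INTEGER PRIMARY KEY,"
    " uri TEXT NOT NULL UNIQUE,"   # one row per URI
    " extra TEXT NOT NULL)"        # JSON-encoded arbitrary extra info
)


def add_dataset(uri, extra=None):
    """Insert a row; return True on success, False if the URI already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO dataset_sketch (uri, extra) VALUES (?, ?)",
                (uri, json.dumps(extra or {})),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint on uri rejected a duplicate.
        return False


first = add_dataset("s3://bucket/orders.csv", {"owner": "data-team"})
duplicate = add_dataset("s3://bucket/orders.csv")
row_count = conn.execute("SELECT COUNT(*) FROM dataset_sketch").fetchone()[0]
```

Here the second call with the same URI is rejected, so the table still holds a single row for that dataset.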
- class airflow.models.dataset.DagScheduleDatasetReference[source]¶
Bases: airflow.models.base.Base
References from a DAG to a dataset of which it is a consumer.
- class airflow.models.dataset.TaskOutletDatasetReference[source]¶
Bases: airflow.models.base.Base
References from a task to a dataset that it updates / produces.
- class airflow.models.dataset.DatasetDagRunQueue[source]¶
Bases: airflow.models.base.Base
Model for storing dataset events that need processing.
- class airflow.models.dataset.DatasetEvent[source]¶
Bases: airflow.models.base.Base
A table to store dataset events.
- Parameters
dataset_id -- reference to DatasetModel record
extra -- JSON field for arbitrary extra info
source_task_id -- the task_id of the TI which updated the dataset
source_dag_id -- the dag_id of the TI which updated the dataset
source_run_id -- the run_id of the TI which updated the dataset
source_map_index -- the map_index of the TI which updated the dataset
timestamp -- the time the event was logged
We use relationships instead of foreign keys so that dataset events are not deleted even if the object they reference is deleted.
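The effect of that design choice can be sketched with the standard-library sqlite3 module. This is not how Airflow implements it (Airflow defines the association at the ORM level); the tables `dataset_sketch` and `dataset_event_sketch` are hypothetical and only show that, when `dataset_id` is a plain column with no foreign key constraint, deleting the dataset row leaves its event rows in place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Enforcement is on, so a declared foreign key WOULD be honored.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical stand-in tables; not Airflow's real schema.
conn.execute(
    "CREATE TABLE dataset_sketch (id INTEGER PRIMARY KEY, uri TEXT UNIQUE)"
)
conn.execute(
    "CREATE TABLE dataset_event_sketch ("
    " id INTEGER PRIMARY KEY,"
    " dataset_id INTEGER NOT NULL,"  # plain column, deliberately no FOREIGN KEY
    " source_task_id TEXT,"
    " timestamp TEXT)"
)

conn.execute(
    "INSERT INTO dataset_sketch (id, uri) VALUES (1, 's3://bucket/orders.csv')"
)
conn.execute(
    "INSERT INTO dataset_event_sketch (dataset_id, source_task_id, timestamp)"
    " VALUES (1, 'load_orders', '2023-01-01T00:00:00+00:00')"
)

# Delete the dataset; nothing cascades to the event table.
conn.execute("DELETE FROM dataset_sketch WHERE id = 1")

remaining_events = conn.execute(
    "SELECT COUNT(*) FROM dataset_event_sketch"
).fetchone()[0]

# Events whose dataset row no longer exists.
orphaned = conn.execute(
    "SELECT e.id FROM dataset_event_sketch e"
    " LEFT JOIN dataset_sketch d ON d.id = e.dataset_id"
    " WHERE d.id IS NULL"
).fetchall()
```

The event history survives as an orphaned row, which is the behavior the note above describes. The trade-off is that readers of the event table must tolerate a `dataset_id` that no longer resolves.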