airflow.models.dataset
Module Contents
Classes
| DatasetModel | A table to store datasets. |
| DagScheduleDatasetReference | References from a DAG to a dataset of which it is a consumer. |
| TaskOutletDatasetReference | References from a task to a dataset that it updates / produces. |
| DatasetDagRunQueue | Model for storing dataset events that need processing. |
| DatasetEvent | A table to store dataset events. |
Attributes
- class airflow.models.dataset.DatasetModel(uri, **kwargs)
  Bases: airflow.models.base.Base
  A table to store datasets.
  Parameters
  - uri (str) – a string that uniquely identifies the dataset
  - extra – JSON field for arbitrary extra info
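The two parameters above can be pictured as a row with a unique URI and a JSON payload. The following is a minimal sketch using only the standard library's sqlite3 module; it is not Airflow's actual schema or API. The table name `dataset`, the helper `add_dataset`, the example URI, and the choice to enforce uniqueness with a UNIQUE constraint are all assumptions made for illustration.

```python
import json
import sqlite3

# Illustrative stand-in for a dataset table (NOT Airflow's real schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset ("
    " id INTEGER PRIMARY KEY,"
    " uri TEXT NOT NULL UNIQUE,"          # assumption: uniqueness enforced by the database
    " extra TEXT NOT NULL DEFAULT '{}'"   # JSON stored as text for arbitrary extra info
    ")"
)


def add_dataset(uri, extra=None):
    """Hypothetical helper: insert one dataset row and return its id."""
    cur = conn.execute(
        "INSERT INTO dataset (uri, extra) VALUES (?, ?)",
        (uri, json.dumps(extra or {})),
    )
    return cur.lastrowid


first_id = add_dataset("s3://bucket/orders.csv", {"owner": "data-team"})

# A second row with the same URI is rejected, since the URI identifies the dataset.
try:
    add_dataset("s3://bucket/orders.csv")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The extra field round-trips as ordinary JSON.
row = conn.execute("SELECT extra FROM dataset WHERE id = ?", (first_id,)).fetchone()
stored_extra = json.loads(row[0])
```

In this sketch the URI acts as the natural key, while `extra` stays free-form so that callers can attach metadata without schema changes.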
 
 
- class airflow.models.dataset.DagScheduleDatasetReference
  Bases: airflow.models.base.Base
  References from a DAG to a dataset of which it is a consumer.
- class airflow.models.dataset.TaskOutletDatasetReference
  Bases: airflow.models.base.Base
  References from a task to a dataset that it updates / produces.
- class airflow.models.dataset.DatasetDagRunQueue
  Bases: airflow.models.base.Base
  Model for storing dataset events that need processing.
- class airflow.models.dataset.DatasetEvent
  Bases: airflow.models.base.Base
  A table to store dataset events.
  Parameters
  - dataset_id – reference to the DatasetModel record
  - extra – JSON field for arbitrary extra info
  - source_task_id – the task_id of the task instance (TI) that updated the dataset
  - source_dag_id – the dag_id of the TI that updated the dataset
  - source_run_id – the run_id of the TI that updated the dataset
  - source_map_index – the map_index of the TI that updated the dataset
  - timestamp – the time the event was logged

  We use relationships instead of foreign keys so that dataset events are not deleted even if the object they reference is deleted.
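The design note above can be illustrated with a small standalone sketch. It uses the standard library's sqlite3 module rather than Airflow or SQLAlchemy, and the table names, column types, and sample values are assumptions chosen for illustration. The point it shows is only the general idea: when the event table stores `dataset_id` as a plain column without a foreign key constraint, deleting the dataset row leaves the event history intact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dataset (
        id INTEGER PRIMARY KEY,
        uri TEXT NOT NULL UNIQUE
    );
    -- dataset_id is a plain integer: no FOREIGN KEY and no ON DELETE CASCADE.
    CREATE TABLE dataset_event (
        id INTEGER PRIMARY KEY,
        dataset_id INTEGER NOT NULL,
        source_dag_id TEXT,
        source_task_id TEXT,
        source_run_id TEXT,
        source_map_index INTEGER,
        timestamp TEXT NOT NULL
    );
    """
)

# Hypothetical sample data: one dataset and one event recorded against it.
conn.execute("INSERT INTO dataset (id, uri) VALUES (1, 's3://bucket/orders.csv')")
conn.execute(
    "INSERT INTO dataset_event (dataset_id, source_dag_id, source_task_id,"
    " source_run_id, source_map_index, timestamp)"
    " VALUES (1, 'producer_dag', 'write_orders', 'manual__2024-01-01', -1,"
    " '2024-01-01T00:00:00+00:00')"
)

# Deleting the dataset does not touch the event rows.
conn.execute("DELETE FROM dataset WHERE id = 1")
remaining_events = conn.execute("SELECT COUNT(*) FROM dataset_event").fetchone()[0]

# The link is resolved at query time; a missing dataset simply yields NULL.
joined = conn.execute(
    "SELECT e.id, d.uri FROM dataset_event e"
    " LEFT JOIN dataset d ON d.id = e.dataset_id"
).fetchall()
```

The trade-off in this kind of design is that the database no longer guarantees every `dataset_id` points at an existing row, so readers of the event table need to tolerate a missing dataset.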
