Source code for airflow.example_dags.example_datasets
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""
Example DAG for demonstrating the behavior of the Datasets feature in Airflow, including conditional and
dataset expression-based scheduling.

Notes on usage:

Turn on all the DAGs.

dataset_produces_1 is scheduled to run daily. Once it completes, it triggers several DAGs due to its dataset
being updated. dataset_consumes_1 is triggered immediately, as it depends solely on the dataset produced by
dataset_produces_1. consume_1_or_2_with_dataset_expressions will also be triggered, as its condition of
either dataset_produces_1 or dataset_produces_2 being updated is satisfied with dataset_produces_1.

dataset_consumes_1_and_2 will not be triggered after dataset_produces_1 runs because it requires the dataset
from dataset_produces_2, which has no schedule and must be manually triggered.

After manually triggering dataset_produces_2, several DAGs will be affected. dataset_consumes_1_and_2 should
run because both its dataset dependencies are now met. consume_1_and_2_with_dataset_expressions will be
triggered, as it requires both dataset_produces_1 and dataset_produces_2 datasets to be updated.
consume_1_or_2_with_dataset_expressions will be triggered again, since it's conditionally set to run when
either dataset is updated.

consume_1_or_both_2_and_3_with_dataset_expressions demonstrates complex dataset dependency logic.
This DAG triggers if dataset_produces_1 is updated or if both dataset_produces_2 and dag3_dataset
are updated. This example highlights the capability to combine updates from multiple datasets with logical
expressions for advanced scheduling.

conditional_dataset_and_time_based_timetable illustrates the integration of time-based scheduling with
dataset dependencies. This DAG is configured to execute either when both dataset_produces_1 and
dataset_produces_2 datasets have been updated or according to a specific cron schedule, showcasing
Airflow's versatility in handling mixed triggers for dataset and time-based scheduling.

The DAGs dataset_consumes_1_never_scheduled and dataset_consumes_unknown_never_scheduled will not run
automatically as they depend on datasets that do not get updated or are not produced by any scheduled tasks.
"""

from __future__ import annotations

import pendulum

from airflow.datasets import Dataset
from airflow.models.dag import DAG
from airflow.operators.bash import BashOperator
from airflow.timetables.datasets import DatasetOrTimeSchedule
from airflow.timetables.trigger import CronTriggerTimetable

# [START dataset_def]