airflow.providers.apache.hdfs.sensors.hdfs¶
Module Contents¶
- class airflow.providers.apache.hdfs.sensors.hdfs.HdfsSensor(*, filepath: str, hdfs_conn_id: str = 'hdfs_default', ignored_ext: Optional[List[str]] = None, ignore_copying: bool = True, file_size: Optional[int] = None, hook: Type[HDFSHook] = HDFSHook, **kwargs)[source]¶
Bases: airflow.sensors.base.BaseSensorOperator
Waits for a file or folder to land in HDFS.
- Parameters
filepath (str) -- The path of the file or folder to wait for in HDFS.
hdfs_conn_id (str) -- The Airflow connection used for HDFS credentials.
ignored_ext (Optional[List[str]]) -- The list of file extensions to ignore when matching files.
ignore_copying (bool) -- Whether to ignore files with an extension in ignored_ext, i.e. files still being copied into HDFS.
file_size (Optional[int]) -- The minimum size, in MB, the file must reach before the sensor succeeds.
See also
For more information on how to use this operator, take a look at the guide: Waits for a file or folder to land in HDFS
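The ignore_copying and ignored_ext options exist because files still being written into HDFS typically carry a temporary extension. A minimal pure-Python sketch of that filtering step, under the assumption that listings arrive as Snakebite-style dicts with a path key (the function name and sample paths here are illustrative, not the sensor's exact implementation):

```python
# Sketch (assumed behavior): drop listing entries whose path ends with
# one of the ignored extensions when ignore_copying is enabled.
from typing import Any, Dict, List


def filter_for_ignored_ext(result: List[Dict[str, Any]],
                           ignored_ext: List[str],
                           ignore_copying: bool) -> List[Dict[str, Any]]:
    """Keep only entries whose path does not end in an ignored extension."""
    if not ignore_copying:
        return result
    return [
        entry for entry in result
        if not any(entry["path"].endswith(ext) for ext in ignored_ext)
    ]


# Hypothetical listing: one finished file, one still being copied.
listing = [
    {"path": "/data/in/part-0000"},
    {"path": "/data/in/part-0001._COPYING_"},
]
kept = filter_for_ignored_ext(listing, ["_COPYING_"], ignore_copying=True)
print([e["path"] for e in kept])  # the half-copied file is filtered out
```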
- static filter_for_filesize(result: List[Dict[Any, Any]], size: Optional[int] = None)[source]¶
Tests the filepath result and checks whether its size is at least self.filesize.
- Parameters
result -- a list of dicts returned by Snakebite ls
size -- the minimum file size, in MB, a file must reach to trigger True
- Returns
True if the matching criteria are met, False otherwise
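Snakebite's ls returns one dict per entry with a length field in bytes, while the threshold here is given in MB, so a unit conversion is implied. A hedged sketch of that comparison (the sample entry and exact return convention are assumptions, not the provider's verbatim code):

```python
# Sketch (assumed behavior): succeed if any listed entry is at least
# `size` MB; with no size given, succeed if anything was listed at all.
from typing import Any, Dict, List, Optional


def filter_for_filesize(result: List[Dict[Any, Any]],
                        size: Optional[int] = None) -> bool:
    """Return True if any listed entry is at least `size` MB."""
    if size is None:
        return bool(result)
    threshold = size * 1024 ** 2  # MB -> bytes, matching the byte lengths in ls output
    return any(entry.get("length", 0) >= threshold for entry in result)


# Hypothetical listing: a single 5 MB file.
listing = [{"path": "/data/in/part-0000", "length": 5 * 1024 ** 2}]
print(filter_for_filesize(listing, size=1))  # 5 MB clears the 1 MB threshold
```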
- class airflow.providers.apache.hdfs.sensors.hdfs.HdfsRegexSensor(regex: Pattern[str], *args, **kwargs)[source]¶
Bases: airflow.providers.apache.hdfs.sensors.hdfs.HdfsSensor
Waits for files in HDFS whose names match the given regular expression.
See also
For more information on how to use this operator, take a look at the guide: HdfsRegexSensor
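The matching step can be pictured as filtering a directory listing against a compiled pattern. A small sketch under the same Snakebite-style-dict assumption as above (matching on the last path component is an illustrative choice, not necessarily the sensor's exact rule):

```python
# Sketch (assumed behavior): keep listing entries whose file name
# matches a compiled regular expression.
import re
from typing import Any, Dict, List, Pattern


def filter_for_regex(result: List[Dict[str, Any]],
                     regex: Pattern[str]) -> List[Dict[str, Any]]:
    """Keep entries whose file name (last path component) matches the regex."""
    return [
        entry for entry in result
        if regex.match(entry["path"].rsplit("/", 1)[-1])
    ]


# Hypothetical listing with one matching and one non-matching file.
listing = [
    {"path": "/data/in/events_2021.csv"},
    {"path": "/data/in/readme.txt"},
]
matched = filter_for_regex(listing, re.compile(r"events_\d{4}\.csv"))
print([e["path"] for e in matched])  # only the events file matches
```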
- class airflow.providers.apache.hdfs.sensors.hdfs.HdfsFolderSensor(be_empty: bool = False, *args, **kwargs)[source]¶
Bases: airflow.providers.apache.hdfs.sensors.hdfs.HdfsSensor
Waits for a non-empty directory (or, when be_empty is True, an empty one).
See also
For more information on how to use this operator, take a look at the guide: HdfsFolderSensor
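One way to picture the be_empty check: an HDFS directory listing that includes the directory entry itself yields exactly one entry when the directory is empty, and more when it has contents. A sketch under that assumption (the real sensor's logic may differ in detail):

```python
# Sketch (assumed behavior): decide whether a directory listing satisfies
# the sensor, given that the listing includes the directory entry itself.
from typing import Any, Dict, List


def folder_matches(result: List[Dict[str, Any]], be_empty: bool) -> bool:
    """True when the directory is empty (be_empty) or non-empty (default)."""
    if be_empty:
        return len(result) == 1  # only the directory entry itself
    return len(result) > 1       # directory entry plus at least one child


# Hypothetical listing of a directory holding one file.
listing = [{"path": "/data/in"}, {"path": "/data/in/part-0000"}]
print(folder_matches(listing, be_empty=False))  # directory has contents
```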