airflow.sensors.hdfs_sensor

Module Contents

class airflow.sensors.hdfs_sensor.HdfsSensor(filepath, hdfs_conn_id='hdfs_default', ignored_ext=None, ignore_copying=True, file_size=None, hook=HDFSHook, *args, **kwargs)[source]

Bases: airflow.sensors.base_sensor_operator.BaseSensorOperator

Waits for a file or folder to land in HDFS

template_fields = ['filepath'][source]
ui_color[source]
static filter_for_filesize(result, size=None)[source]

Will test the filepath result and test if its size is at least self.filesize

Parameters
  • result – a list of dicts returned by Snakebite ls

  • size – the file size in MB a file should be at least to trigger True

Returns

(bool) depending on the matching criteria

static filter_for_ignored_ext(result, ignored_ext, ignore_copying)[source]

Will filter if instructed to do so the result to remove matching criteria

Parameters
  • result (list[dict]) – list of dicts returned by Snakebite ls

  • ignored_ext (list) – list of ignored extensions

  • ignore_copying (bool) – shall we ignore ?

Returns

list of dicts which were not removed

Return type

list[dict]

poke(self, context)[source]