airflow.providers.apache.hive.macros.hive

Module Contents

Functions

max_partition(table[, schema, field, filter_map, ...])

Get the max partition for a table.

closest_ds_partition(table, ds[, before, schema, ...])

Find the date in a list closest to the target date.

airflow.providers.apache.hive.macros.hive.max_partition(table, schema='default', field=None, filter_map=None, metastore_conn_id='metastore_default')

Get the max partition for a table.

Parameters
  • schema – The hive schema the table lives in

  • table – The hive table you are interested in. Dot notation is supported, as in “my_database.my_table”; if a dot is found, the schema param is disregarded

  • metastore_conn_id – The hive metastore connection to use. If your default connection is set, you don’t need to pass this parameter.

  • filter_map – partition_key:partition_value map used for partition filtering, e.g. {'key1': 'value1', 'key2': 'value2'}. Only partitions matching all partition_key:partition_value pairs are considered as candidates for the max partition.

  • field – the field to get the max value from. If there’s only one partition field, this will be inferred

>>> max_partition("airflow.static_babynames_partitioned")
'2015-01-01'
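
The interplay of field and filter_map can be modelled on an in-memory list of partitions. This is a minimal sketch of the semantics described above, not the provider's implementation: max_partition_sketch and the sample partition data are hypothetical, and the real macro reads partitions from the hive metastore.

```python
def max_partition_sketch(partitions, field=None, filter_map=None):
    """Hypothetical model: max value of `field` among partitions matching `filter_map`."""
    if not partitions:
        return None
    if field is None:
        # Infer the field only when there is exactly one partition key.
        keys = list(partitions[0].keys())
        if len(keys) != 1:
            raise ValueError("field is required when there are several partition keys")
        field = keys[0]
    filter_map = filter_map or {}
    # Keep only partitions that match every key/value pair in filter_map.
    candidates = [
        p[field]
        for p in partitions
        if all(p.get(k) == v for k, v in filter_map.items())
    ]
    return max(candidates) if candidates else None


# Made-up partitions, each a partition_key -> partition_value mapping.
parts = [
    {"ds": "2015-01-01", "country": "us"},
    {"ds": "2015-01-02", "country": "fr"},
    {"ds": "2014-12-31", "country": "us"},
]
result_us = max_partition_sketch(parts, field="ds", filter_map={"country": "us"})
result_all = max_partition_sketch(parts, field="ds")
result_inferred = max_partition_sketch([{"ds": "2015-01-01"}, {"ds": "2015-01-03"}])
```

In this sketch the values are compared as strings, which orders zero-padded yyyy-mm-dd datestamps chronologically.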

airflow.providers.apache.hive.macros.hive.closest_ds_partition(table, ds, before=True, schema='default', metastore_conn_id='metastore_default')

Find the date in a list closest to the target date.

An optional parameter can be given to get the closest before or after.

Parameters
  • table – A hive table name

  • ds – A datestamp in %Y-%m-%d format (yyyy-mm-dd), e.g. 2015-01-02

  • before – closest before (True), closest after (False), or closest on either side of ds (None)

  • schema – table schema

  • metastore_conn_id – which metastore connection to use

Returns

The closest date

Return type

str | None

>>> tbl = "airflow.static_babynames_partitioned"
>>> closest_ds_partition(tbl, "2015-01-02")
'2015-01-01'
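
The effect of before can be illustrated on a plain list of datestamps. This is a minimal sketch under stated assumptions, not the provider's code: closest_date_sketch is a hypothetical helper, treating before=None as "either side" is this sketch's reading of the parameter description, and returning None when nothing lies on the requested side is a choice made here rather than confirmed behaviour of the macro.

```python
from datetime import datetime


def closest_date_sketch(ds, partition_values, before=True):
    """Hypothetical model: closest yyyy-mm-dd value to `ds` on the requested side."""
    target = datetime.strptime(ds, "%Y-%m-%d").date()
    dates = [datetime.strptime(v, "%Y-%m-%d").date() for v in partition_values]
    if before is True:
        # Only dates on or before the target qualify.
        dates = [d for d in dates if d <= target]
    elif before is False:
        # Only dates on or after the target qualify.
        dates = [d for d in dates if d >= target]
    # before=None keeps every date, so either side can win.
    if not dates:
        return None
    return min(dates, key=lambda d: abs(d - target)).isoformat()


# Made-up partition values for illustration.
values = ["2015-01-01", "2015-01-05", "2015-01-10"]
closest_before = closest_date_sketch("2015-01-04", values, before=True)
closest_after = closest_date_sketch("2015-01-04", values, before=False)
closest_any = closest_date_sketch("2015-01-04", values, before=None)
```

With these sample values, 2015-01-04 sits three days after the first partition and one day before the second, so the three modes select different partitions.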
