airflow.providers.google.cloud.sensors.dataproc_metastore

Module Contents

Classes

MetastoreHivePartitionSensor

Waits for partitions to show up in Hive.

class airflow.providers.google.cloud.sensors.dataproc_metastore.MetastoreHivePartitionSensor(service_id, region, table, partitions, gcp_conn_id='google_cloud_default', impersonation_chain=None, *args, **kwargs)[source]

Bases: airflow.sensors.base.BaseSensorOperator

Waits for partitions to show up in Hive.

This sensor uses Google Cloud SDK and passes requests via gRPC.

Parameters
  • service_id (str) – Required. Dataproc Metastore service id.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • table (str) – Required. Name of the partitioned table.

  • partitions (list[str] | None) – List of table partitions to wait for. A partition name should look like “ds=1”, or “a=1/b=2” in the case of nested partitions. Note that you cannot use logical or comparison operators as in HivePartitionSensor. If not specified, the sensor will wait for at least one partition, regardless of its name.

  • gcp_conn_id (str) – Airflow Google Cloud connection ID.

  • impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account.
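The partition-spec format accepted by ``partitions`` can be illustrated with a short sketch. The helper below is hypothetical and not part of the provider; it only shows how a spec such as “a=1/b=2” denotes nested partition levels as column/value pairs.

```python
# Illustrative sketch (not part of the provider): maps a partition spec
# string, in the format the sensor accepts, to column/value pairs.
# "ds=1" is a single-level partition; "a=1/b=2" is a nested partition.
def parse_partition_spec(spec: str) -> dict[str, str]:
    """Split a spec such as 'a=1/b=2' into {'a': '1', 'b': '2'}."""
    pairs = (level.split("=", 1) for level in spec.split("/"))
    return {key: value for key, value in pairs}

print(parse_partition_spec("ds=1"))      # single partition level
print(parse_partition_spec("a=1/b=2"))   # nested partition levels
```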

template_fields: Sequence[str] = ('service_id', 'region', 'table', 'partitions', 'impersonation_chain')[source]
poke(context)[source]

Check whether the requested table partitions exist in the metastore; return True once they are found.
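A minimal usage sketch for this sensor inside a DAG, assuming Airflow 2.4+ and the Google provider are installed. The service ID, region, table, and partition names below are illustrative placeholders, not values from this documentation.

```python
# Hypothetical DAG fragment: wait for two daily partitions of a Hive
# table managed by a Dataproc Metastore service. All identifiers are
# placeholders chosen for illustration.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.sensors.dataproc_metastore import (
    MetastoreHivePartitionSensor,
)

with DAG(
    dag_id="example_metastore_hive_partition_sensor",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    wait_for_partitions = MetastoreHivePartitionSensor(
        task_id="wait_for_partitions",
        service_id="my-metastore-service",        # placeholder service id
        region="us-central1",                     # placeholder region
        table="my_table",                         # placeholder table name
        partitions=["ds=2024-01-01", "ds=2024-01-02"],
    )
```

Since the sensor passes requests via gRPC using the Google Cloud SDK, the connection referenced by ``gcp_conn_id`` (here the default ``google_cloud_default``) must carry credentials with access to the Dataproc Metastore service.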
