airflow.contrib.operators.gcp_bigtable_operator

Module Contents

class airflow.contrib.operators.gcp_bigtable_operator.BigtableValidationMixin[source]

Bases: object

Common class for Cloud Bigtable operators for validating required fields.

REQUIRED_ATTRIBUTES = [][source]
_validate_inputs(self)[source]
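The validation pattern described above can be sketched in plain Python. This is a minimal illustration, not the module's actual code: the class names `RequiredFieldsMixin` and `FakeTableOperator` are hypothetical, and `ValueError` stands in for whatever exception the real mixin raises.

```python
class RequiredFieldsMixin(object):
    """Hypothetical stand-in for BigtableValidationMixin."""

    # Subclasses list the attribute names that must be non-empty.
    REQUIRED_ATTRIBUTES = []

    def _validate_inputs(self):
        for attr_name in self.REQUIRED_ATTRIBUTES:
            # Assumption: None, '' and other falsy values count as missing.
            if not getattr(self, attr_name, None):
                raise ValueError('Empty parameter: {}'.format(attr_name))


class FakeTableOperator(RequiredFieldsMixin):
    """Hypothetical operator-like class that validates on construction."""

    REQUIRED_ATTRIBUTES = ['instance_id', 'table_id']

    def __init__(self, instance_id, table_id):
        self.instance_id = instance_id
        self.table_id = table_id
        self._validate_inputs()
```

With this sketch, `FakeTableOperator('my-instance', '')` fails at construction time instead of later, when the task would run.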
class airflow.contrib.operators.gcp_bigtable_operator.BigtableInstanceCreateOperator(instance_id, main_cluster_id, main_cluster_zone, project_id=None, replica_cluster_id=None, replica_cluster_zone=None, instance_display_name=None, instance_type=None, instance_labels=None, cluster_nodes=None, cluster_storage_type=None, timeout=None, *args, **kwargs)[source]

Bases: airflow.models.BaseOperator, airflow.contrib.operators.gcp_bigtable_operator.BigtableValidationMixin

Creates a new Cloud Bigtable instance. If the Cloud Bigtable instance with the given ID exists, the operator does not compare its configuration and immediately succeeds. No changes are made to the existing instance.

For more details about instance creation, see the reference: https://googleapis.github.io/google-cloud-python/latest/bigtable/instance.html#google.cloud.bigtable.instance.Instance.create

See also

For more information on how to use this operator, take a look at the guide: BigtableInstanceCreateOperator

Parameters
  • instance_id (str) – The ID of the Cloud Bigtable instance to create.

  • main_cluster_id (str) – The ID of the main cluster for the new instance.

  • main_cluster_zone (str) – The zone of the main cluster. See https://cloud.google.com/bigtable/docs/locations for more details.

  • project_id (str) – Optional, the ID of the GCP project. If set to None or missing, the default project_id from the GCP connection is used.

  • replica_cluster_id (str) – (optional) The ID of the replica cluster for the new instance.

  • replica_cluster_zone (str) – (optional) The zone of the replica cluster.

  • instance_type (enums.IntEnum) – (optional) The type of the instance.

  • instance_display_name (str) – (optional) Human-readable name of the instance. Defaults to instance_id.

  • instance_labels (dict) – (optional) Dictionary of labels to associate with the instance.

  • cluster_nodes (int) – (optional) Number of nodes for the cluster.

  • cluster_storage_type (enums.IntEnum) – (optional) The type of storage.

  • timeout (int) – (optional) Timeout (in seconds) for instance creation. If None or not specified, the operator waits indefinitely.

REQUIRED_ATTRIBUTES = ['instance_id', 'main_cluster_id', 'main_cluster_zone'][source]
template_fields = ['project_id', 'instance_id', 'main_cluster_id', 'main_cluster_zone'][source]
execute(self, context)[source]
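A minimal usage sketch for the parameters listed above. The IDs, zone, labels, and task name are hypothetical placeholders. The Airflow import is deferred into the helper function so the snippet can be loaded without Airflow installed; `make_create_instance_task` is an illustrative helper, not part of the module.

```python
# Hypothetical values; replace with IDs and a zone from your own project.
CREATE_INSTANCE_KWARGS = {
    'instance_id': 'example-instance',
    'main_cluster_id': 'example-cluster',
    'main_cluster_zone': 'europe-west1-b',
    'instance_display_name': 'Example instance',
    'instance_labels': {'env': 'dev'},
    'cluster_nodes': 3,
    'timeout': 600,  # seconds to wait for instance creation
}


def make_create_instance_task(dag):
    """Build the operator for the given DAG (requires Airflow at call time)."""
    from airflow.contrib.operators.gcp_bigtable_operator import (
        BigtableInstanceCreateOperator,
    )
    return BigtableInstanceCreateOperator(
        task_id='create_bigtable_instance',  # hypothetical task name
        dag=dag,
        **CREATE_INSTANCE_KWARGS
    )
```

Because project_id is omitted here, the default project_id from the GCP connection would be used, as described in the parameter list.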
class airflow.contrib.operators.gcp_bigtable_operator.BigtableInstanceDeleteOperator(instance_id, project_id=None, *args, **kwargs)[source]

Bases: airflow.models.BaseOperator, airflow.contrib.operators.gcp_bigtable_operator.BigtableValidationMixin

Deletes the Cloud Bigtable instance, including its clusters and all related tables.

For more details about deleting an instance, see the reference: https://googleapis.github.io/google-cloud-python/latest/bigtable/instance.html#google.cloud.bigtable.instance.Instance.delete

See also

For more information on how to use this operator, take a look at the guide: BigtableInstanceDeleteOperator

Parameters
  • instance_id (str) – The ID of the Cloud Bigtable instance to delete.

  • project_id (str) – Optional, the ID of the GCP project. If set to None or missing, the default project_id from the GCP connection is used.

REQUIRED_ATTRIBUTES = ['instance_id'][source]
template_fields = ['project_id', 'instance_id'][source]
execute(self, context)[source]
class airflow.contrib.operators.gcp_bigtable_operator.BigtableTableCreateOperator(instance_id, table_id, project_id=None, initial_split_keys=None, column_families=None, *args, **kwargs)[source]

Bases: airflow.models.BaseOperator, airflow.contrib.operators.gcp_bigtable_operator.BigtableValidationMixin

Creates the table in the Cloud Bigtable instance.

For more details about creating a table, see the reference: https://googleapis.github.io/google-cloud-python/latest/bigtable/table.html#google.cloud.bigtable.table.Table.create

See also

For more information on how to use this operator, take a look at the guide: BigtableTableCreateOperator

Parameters
  • instance_id (str) – The ID of the Cloud Bigtable instance that will hold the new table.

  • table_id (str) – The ID of the table to be created.

  • project_id (str) – Optional, the ID of the GCP project. If set to None or missing, the default project_id from the GCP connection is used.

  • initial_split_keys (list) – (Optional) List of row keys, as bytes, used to initially split the table into several tablets.

  • column_families (dict) – (Optional) A map of column families to create. Each key is the column family ID (str) and each value is a google.cloud.bigtable.column_family.GarbageCollectionRule.

REQUIRED_ATTRIBUTES = ['instance_id', 'table_id'][source]
template_fields = ['project_id', 'instance_id', 'table_id'][source]
_compare_column_families(self)[source]
execute(self, context)[source]
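A minimal sketch of passing initial_split_keys and column_families, under stated assumptions: the IDs, the family name `cf1`, and the task name are hypothetical, and `MaxVersionsGCRule` is used as one example of a garbage collection rule from `google.cloud.bigtable.column_family`. Imports are deferred so the snippet loads without Airflow or the Bigtable client installed.

```python
# Hypothetical values; replace with your own instance and table IDs.
CREATE_TABLE_KWARGS = {
    'instance_id': 'example-instance',
    'table_id': 'example-table',
    # Row keys are given as bytes; each key marks an initial split point.
    'initial_split_keys': [b'g', b'n', b't'],
}


def make_create_table_task(dag):
    """Build the operator (requires Airflow and google-cloud-bigtable)."""
    from airflow.contrib.operators.gcp_bigtable_operator import (
        BigtableTableCreateOperator,
    )
    from google.cloud.bigtable.column_family import MaxVersionsGCRule

    return BigtableTableCreateOperator(
        task_id='create_bigtable_table',  # hypothetical task name
        dag=dag,
        # Keep at most two versions of each cell in the 'cf1' family.
        column_families={'cf1': MaxVersionsGCRule(2)},
        **CREATE_TABLE_KWARGS
    )
```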
class airflow.contrib.operators.gcp_bigtable_operator.BigtableTableDeleteOperator(instance_id, table_id, project_id=None, app_profile_id=None, *args, **kwargs)[source]

Bases: airflow.models.BaseOperator, airflow.contrib.operators.gcp_bigtable_operator.BigtableValidationMixin

Deletes the Cloud Bigtable table.

For more details about deleting a table, see the reference: https://googleapis.github.io/google-cloud-python/latest/bigtable/table.html#google.cloud.bigtable.table.Table.delete

See also

For more information on how to use this operator, take a look at the guide: BigtableTableDeleteOperator

Parameters
  • instance_id (str) – The ID of the Cloud Bigtable instance.

  • table_id (str) – The ID of the table to be deleted.

  • project_id (str) – Optional, the ID of the GCP project. If set to None or missing, the default project_id from the GCP connection is used.

  • app_profile_id (str) – (optional) Application profile.

REQUIRED_ATTRIBUTES = ['instance_id', 'table_id'][source]
template_fields = ['project_id', 'instance_id', 'table_id'][source]
execute(self, context)[source]
class airflow.contrib.operators.gcp_bigtable_operator.BigtableClusterUpdateOperator(instance_id, cluster_id, nodes, project_id=None, *args, **kwargs)[source]

Bases: airflow.models.BaseOperator, airflow.contrib.operators.gcp_bigtable_operator.BigtableValidationMixin

Updates a Cloud Bigtable cluster.

For more details about updating a Cloud Bigtable cluster, have a look at the reference: https://googleapis.github.io/google-cloud-python/latest/bigtable/cluster.html#google.cloud.bigtable.cluster.Cluster.update

See also

For more information on how to use this operator, take a look at the guide: BigtableClusterUpdateOperator

Parameters
  • instance_id (str) – The ID of the Cloud Bigtable instance.

  • cluster_id (str) – The ID of the Cloud Bigtable cluster to update.

  • nodes (int) – The desired number of nodes for the Cloud Bigtable cluster.

  • project_id (str) – Optional, the ID of the GCP project.

REQUIRED_ATTRIBUTES = ['instance_id', 'cluster_id', 'nodes'][source]
template_fields = ['project_id', 'instance_id', 'cluster_id', 'nodes'][source]
execute(self, context)[source]
class airflow.contrib.operators.gcp_bigtable_operator.BigtableTableWaitForReplicationSensor(instance_id, table_id, project_id=None, *args, **kwargs)[source]

Bases: airflow.sensors.base_sensor_operator.BaseSensorOperator, airflow.contrib.operators.gcp_bigtable_operator.BigtableValidationMixin

Sensor that waits for a Cloud Bigtable table to be fully replicated to its clusters. No exception is raised if the instance or the table does not exist.

For more details about cluster states for a table, have a look at the reference: https://googleapis.github.io/google-cloud-python/latest/bigtable/table.html#google.cloud.bigtable.table.Table.get_cluster_states

See also

For more information on how to use this sensor, take a look at the guide: BigtableTableWaitForReplicationSensor

Parameters
  • instance_id (str) – The ID of the Cloud Bigtable instance.

  • table_id (str) – The ID of the table whose replication status is checked.

  • project_id (str) – Optional, the ID of the GCP project.

REQUIRED_ATTRIBUTES = ['instance_id', 'table_id'][source]
template_fields = ['project_id', 'instance_id', 'table_id'][source]
poke(self, context)[source]
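A minimal usage sketch for the sensor. The IDs and task name are hypothetical; `poke_interval` and `timeout` are arguments of the BaseSensorOperator base class rather than of this sensor. Since no exception is raised for a missing instance or table, a bounded timeout seems prudent so that a misconfigured task does not wait for the base class's much longer default.

```python
# Hypothetical values; replace with your own instance and table IDs.
WAIT_FOR_REPLICATION_KWARGS = {
    'instance_id': 'example-instance',
    'table_id': 'example-table',
    'poke_interval': 60,  # seconds between replication checks
    'timeout': 30 * 60,   # stop waiting after 30 minutes
}


def make_wait_for_replication_task(dag):
    """Build the sensor for the given DAG (requires Airflow at call time)."""
    from airflow.contrib.operators.gcp_bigtable_operator import (
        BigtableTableWaitForReplicationSensor,
    )
    return BigtableTableWaitForReplicationSensor(
        task_id='wait_for_table_replication',  # hypothetical task name
        dag=dag,
        **WAIT_FOR_REPLICATION_KWARGS
    )
```

In a DAG, this task would typically be placed downstream of the task that creates the table.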