airflow.providers.google.cloud.hooks.dataproc_metastore

This module contains a Google Cloud Dataproc Metastore hook.

Module Contents

Classes

DataprocMetastoreHook

Hook for Google Cloud Dataproc Metastore APIs.

class airflow.providers.google.cloud.hooks.dataproc_metastore.DataprocMetastoreHook(**kwargs)[source]

Bases: airflow.providers.google.common.hooks.base_google.GoogleBaseHook

Hook for Google Cloud Dataproc Metastore APIs.

get_dataproc_metastore_client()[source]

Returns a DataprocMetastoreClient instance.

wait_for_operation(timeout, operation)[source]

Waits for a long-running operation to complete.
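The waiting pattern can be sketched as follows. This is an illustration only, not the hook's actual implementation: it assumes the operation object exposes a blocking result(timeout=...) method, as google.api_core.operation.Operation does. FakeOperation and wait_for are hypothetical names used so the example runs without any Google Cloud dependency.

```python
class FakeOperation:
    """Hypothetical stand-in for a long-running operation handle."""

    def __init__(self, value=None, error=None):
        self._value = value
        self._error = error

    def result(self, timeout=None):
        # A real operation would block here until completion or timeout;
        # this fake returns or raises immediately.
        if self._error is not None:
            raise self._error
        return self._value


def wait_for(operation, timeout=None):
    """Block until the operation finishes and return its result."""
    try:
        return operation.result(timeout=timeout)
    except Exception as exc:
        # Re-raise as one descriptive error so callers handle a single type.
        raise RuntimeError(f"Operation failed: {exc}") from exc


ok = wait_for(FakeOperation(value="done"), timeout=5)

try:
    wait_for(FakeOperation(error=ValueError("boom")))
except RuntimeError as exc:
    failure_message = str(exc)
```

Note that timeout comes first in the hook's signature, wait_for_operation(timeout, operation), so passing both arguments by keyword avoids mixing them up.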

create_backup(project_id, region, service_id, backup, backup_id, request_id=None, retry=DEFAULT, timeout=None, metadata=())[source]

Creates a new backup in a given project and location.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • backup (dict[Any, Any] | Backup) –

    Required. The backup to create. The name field is ignored. The ID of the created backup must be provided in the request’s backup_id field.

    This corresponds to the backup field on the request instance; if request is provided, this should not be set.

  • backup_id (str) –

    Required. The ID of the backup, which is used as the final component of the backup’s name. This value must be between 1 and 64 characters long, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the backup_id field on the request instance; if request is provided, this should not be set.

  • request_id (str | None) – Optional. A unique id used to identify the request.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
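The naming rules for service_id and backup_id above can be checked locally before a request is sent. The sketch below is one possible reading of those rules as regular expressions; the helper names are hypothetical, and accepting both upper- and lower-case letters is an assumption, since the text only says "alphanumeric ASCII characters".

```python
import re

# service_id: 2 to 63 characters, starts with a letter,
# ends with a letter or digit, letters/digits/hyphens in between.
_SERVICE_ID = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,61}[A-Za-z0-9]")

# backup_id: 1 to 64 characters with the same start/end rules.
# A single letter is valid because it both starts and ends the ID.
_BACKUP_ID = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,62}[A-Za-z0-9])?")


def is_valid_service_id(value: str) -> bool:
    """Return True if value satisfies the documented service_id rules."""
    return _SERVICE_ID.fullmatch(value) is not None


def is_valid_backup_id(value: str) -> bool:
    """Return True if value satisfies the documented backup_id rules."""
    return _BACKUP_ID.fullmatch(value) is not None
```

The server remains the authority on validity; a local check like this only helps fail fast with a clearer message.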

create_metadata_import(project_id, region, service_id, metadata_import, metadata_import_id, request_id=None, retry=DEFAULT, timeout=None, metadata=())[source]

Creates a new MetadataImport in a given project and location.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • metadata_import (dict | MetadataImport) –

    Required. The metadata import to create. The name field is ignored. The ID of the created metadata import must be provided in the request’s metadata_import_id field.

    This corresponds to the metadata_import field on the request instance; if request is provided, this should not be set.

  • metadata_import_id (str) –

    Required. The ID of the metadata import, which is used as the final component of the metadata import’s name. This value must be between 1 and 64 characters long, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the metadata_import_id field on the request instance; if request is provided, this should not be set.

  • request_id (str | None) – Optional. A unique id used to identify the request.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
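A call to create_metadata_import might be wrapped as shown below. This is a sketch under stated assumptions: the keys inside the metadata_import dict (description, database_dump, gcs_uri, database_type) are illustrative and should be checked against the MetadataImport message, uuid4 is just one convenient way to produce a unique request_id, and RecordingHook is a hypothetical stand-in so the example runs without Airflow installed.

```python
import uuid


def start_metadata_import(hook, project_id, region, service_id, import_id, gcs_uri):
    """Build the request pieces and hand them to the hook."""
    # Illustrative payload; field names are assumptions, not confirmed here.
    metadata_import = {
        "description": "Import of a SQL dump",
        "database_dump": {"gcs_uri": gcs_uri, "database_type": "MYSQL"},
    }
    return hook.create_metadata_import(
        project_id=project_id,
        region=region,
        service_id=service_id,
        metadata_import=metadata_import,
        metadata_import_id=import_id,
        request_id=str(uuid.uuid4()),  # unique per logical request
    )


class RecordingHook:
    """Hypothetical stub that records the keyword arguments it receives."""

    def __init__(self):
        self.calls = []

    def create_metadata_import(self, **kwargs):
        self.calls.append(kwargs)
        return "operation-handle"


stub = RecordingHook()
handle = start_metadata_import(
    stub, "my-project", "us-central1", "my-service", "import-1",
    "gs://my-bucket/dumps/hive.sql",
)
```

With a real DataprocMetastoreHook in place of the stub, the returned handle would typically be passed to wait_for_operation.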

create_service(region, project_id, service, service_id, request_id=None, retry=DEFAULT, timeout=None, metadata=())[source]

Creates a metastore service in a project and location.

Parameters
  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • service (dict | Service) –

    Required. The Metastore service to create. The name field is ignored. The ID of the created metastore service must be provided in the request’s service_id field.

    This corresponds to the service field on the request instance; if request is provided, this should not be set.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • request_id (str | None) – Optional. A unique id used to identify the request.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
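Creating a service and then blocking until it is ready combines create_service with wait_for_operation. The sketch below uses a hypothetical StubHook so it runs without Airflow; the return values are invented for illustration. Because create_service lists region before project_id, unlike the other methods on this hook, keyword arguments are used throughout.

```python
def create_service_and_wait(hook, project_id, region, service_id, service, timeout=None):
    """Start service creation, then block until the operation completes."""
    operation = hook.create_service(
        region=region,
        project_id=project_id,
        service=service,
        service_id=service_id,
    )
    return hook.wait_for_operation(timeout=timeout, operation=operation)


class StubHook:
    """Hypothetical stand-in that mimics the two hook methods used above."""

    def create_service(self, region, project_id, service, service_id, **kwargs):
        self.created = {"project_id": project_id, "region": region, "service_id": service_id}
        return "operation-handle"

    def wait_for_operation(self, timeout, operation):
        self.waited = (timeout, operation)
        # Invented result shape, for illustration only.
        return {"service_id": self.created["service_id"], "finished": True}


stub = StubHook()
result = create_service_and_wait(
    stub, "my-project", "us-central1", "my-service", service={}, timeout=600,
)
```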

delete_backup(project_id, region, service_id, backup_id, request_id=None, retry=DEFAULT, timeout=None, metadata=())[source]

Deletes a single backup.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • backup_id (str) –

    Required. The ID of the backup, which is used as the final component of the backup’s name. This value must be between 1 and 64 characters long, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the backup_id field on the request instance; if request is provided, this should not be set.

  • request_id (str | None) – Optional. A unique id used to identify the request.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

delete_service(project_id, region, service_id, request_id=None, retry=DEFAULT, timeout=None, metadata=())[source]

Deletes a single service.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • request_id (str | None) – Optional. A unique id used to identify the request.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

export_metadata(destination_gcs_folder, project_id, region, service_id, request_id=None, database_dump_type=None, retry=DEFAULT, timeout=None, metadata=())[source]

Exports metadata from a service.

Parameters
  • destination_gcs_folder (str) – A Cloud Storage URI of a folder, in the format gs://<bucket_name>/<path_inside_bucket>. A sub-folder <export_folder> containing exported files will be created below it.

  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • request_id (str | None) – Optional. A unique id used to identify the request.

  • database_dump_type (DatabaseDumpSpec | None) – Optional. The type of the database dump. If unspecified, defaults to MYSQL.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
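The destination_gcs_folder format described above, gs://<bucket_name>/<path_inside_bucket>, can be checked and split with the standard library. This is a minimal sketch with hypothetical helper names; it validates only the shape of the URI, not whether the bucket exists or is writable.

```python
from urllib.parse import urlsplit


def split_gcs_folder(uri: str) -> tuple[str, str]:
    """Split a gs:// folder URI into (bucket_name, path_inside_bucket)."""
    parts = urlsplit(uri)
    if parts.scheme != "gs" or not parts.netloc:
        raise ValueError(f"Expected gs://<bucket_name>/<path_inside_bucket>, got {uri!r}")
    # Drop leading and trailing slashes so the path is a clean folder prefix.
    return parts.netloc, parts.path.strip("/")


def is_gcs_folder(uri: str) -> bool:
    """Return True if uri has the expected gs:// shape."""
    try:
        split_gcs_folder(uri)
    except ValueError:
        return False
    return True
```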

get_service(project_id, region, service_id, retry=DEFAULT, timeout=None, metadata=())[source]

Gets the details of a single service.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

get_backup(project_id, region, service_id, backup_id, retry=DEFAULT, timeout=None, metadata=())[source]

Gets a backup from a service.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • backup_id (str) – Required. The ID of the metastore service backup to retrieve.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

list_backups(project_id, region, service_id, page_size=None, page_token=None, filter=None, order_by=None, retry=DEFAULT, timeout=None, metadata=())[source]

Lists backups in a service.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • page_size (int | None) – Optional. The maximum number of backups to return. The response may contain fewer than the maximum number. If unspecified, no more than 500 backups are returned. The maximum value is 1000; values above 1000 are changed to 1000.

  • page_token (str | None) – Optional. A page token, received from a previous DataprocMetastore.ListBackups call. Provide this token to retrieve the subsequent page. To retrieve the first page, supply an empty page token. When paginating, other parameters provided to DataprocMetastore.ListBackups must match the call that provided the page token.

  • filter (str | None) – Optional. The filter to apply to list results.

  • order_by (str | None) – Optional. Specify the ordering of results as described in Sorting Order. If not specified, the results will be sorted in the default order.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
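The page_token protocol described above can be sketched as a generic loop. Everything here is hypothetical: fetch_page stands for any callable that returns one page of items plus the token for the next page, and fake_fetch_page simulates a server. The object returned by list_backups may already iterate across pages by itself, in which case a manual loop like this is unnecessary.

```python
def iter_all_pages(fetch_page, page_size=500):
    """Yield items from every page, following next-page tokens."""
    token = ""  # an empty token requests the first page
    while True:
        # Other parameters (here only page_size) stay the same on every call.
        items, token = fetch_page(page_size=page_size, page_token=token)
        yield from items
        if not token:
            break  # an empty next token means there are no more pages


BACKUPS = ["backup-a", "backup-b", "backup-c", "backup-d", "backup-e"]


def fake_fetch_page(page_size, page_token):
    """Simulated server: the token is simply the next start index."""
    start = int(page_token) if page_token else 0
    end = start + page_size
    next_token = str(end) if end < len(BACKUPS) else ""
    return BACKUPS[start:end], next_token


all_backups = list(iter_all_pages(fake_fetch_page, page_size=2))
```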

restore_service(project_id, region, service_id, backup_project_id, backup_region, backup_service_id, backup_id, restore_type=None, request_id=None, retry=DEFAULT, timeout=None, metadata=())[source]

Restores a service from a backup.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • backup_project_id (str) – Required. The ID of the Google Cloud project that contains the metastore service backup to restore from.

  • backup_region (str) – Required. The ID of the Google Cloud region that contains the metastore service backup to restore from.

  • backup_service_id (str) – Required. The ID of the metastore service that owns the backup to restore from, which is used as the final component of that metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

  • backup_id (str) – Required. The ID of the metastore service backup to restore from.

  • restore_type (Restore | None) – Optional. The type of restore. If unspecified, defaults to METADATA_ONLY.

  • request_id (str | None) – Optional. A unique id used to identify the request.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
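The four backup_* parameters together identify one backup, possibly in a different project or region than the service being restored. The sketch below shows how they could combine into a single resource name, assuming the usual Google Cloud projects/.../locations/.../services/.../backups/... layout; backup_resource_name is a hypothetical helper, not part of the hook.

```python
def backup_resource_name(backup_project_id, backup_region, backup_service_id, backup_id):
    """Compose a fully qualified backup name from its four components."""
    return (
        f"projects/{backup_project_id}"
        f"/locations/{backup_region}"
        f"/services/{backup_service_id}"
        f"/backups/{backup_id}"
    )


name = backup_resource_name("source-project", "europe-west1", "prod-metastore", "nightly-1")
```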

update_service(project_id, region, service_id, service, update_mask, request_id=None, retry=DEFAULT, timeout=None, metadata=())[source]

Updates the parameters of a single service.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the service belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the service belongs to.

  • service_id (str) –

    Required. The ID of the metastore service, which is used as the final component of the metastore service’s name. This value must be between 2 and 63 characters long inclusive, begin with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.

    This corresponds to the service_id field on the request instance; if request is provided, this should not be set.

  • service (dict | Service) –

    Required. The metastore service to update. The server only merges fields in the service if they are specified in update_mask.

    The metastore service’s name field is used to identify the metastore service to be updated.

    This corresponds to the service field on the request instance; if request is provided, this should not be set.

  • update_mask (google.protobuf.field_mask_pb2.FieldMask) –

    Required. A field mask used to specify the fields to be overwritten in the metastore service resource by the update. Fields specified in the update_mask are relative to the resource (not to the full request). A field is overwritten if it is in the mask.

    This corresponds to the update_mask field on the request instance; if request is provided, this should not be set.

  • request_id (str | None) – Optional. A unique id used to identify the request.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
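The effect of update_mask can be illustrated with plain dictionaries: only the paths named in the mask are copied from the submitted service onto the stored one, and everything else is left alone. This is a simplified model of the semantics described above, not the server's implementation. apply_update_mask is a hypothetical helper, the field names are illustrative, and the sketch assumes every masked path is present in the update.

```python
import copy


def apply_update_mask(current, update, paths):
    """Return a copy of current with only the masked paths taken from update."""
    result = copy.deepcopy(current)
    for path in paths:
        keys = path.split(".")
        src, dst = update, result
        # Walk down to the parent of the field named by the path.
        for key in keys[:-1]:
            src = src[key]
            dst = dst.setdefault(key, {})
        # Overwrite just that field; siblings are untouched.
        dst[keys[-1]] = copy.deepcopy(src[keys[-1]])
    return result


stored = {"labels": {"env": "dev"}, "port": 9083, "config": {"version": "1", "mode": "a"}}
submitted = {"labels": {"env": "prod"}, "port": 9999, "config": {"version": "2"}}

# Only "labels" is in the mask, so the submitted port is ignored.
labels_only = apply_update_mask(stored, submitted, ["labels"])

# A dotted path overwrites one nested field and keeps its siblings.
nested = apply_update_mask(stored, submitted, ["config.version"])
```

When calling the hook itself, the mask is passed as a google.protobuf.field_mask_pb2.FieldMask, as the parameter type above indicates.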
