airflow.providers.google.cloud.hooks.automl

This module contains a Google AutoML hook.

Module Contents

Classes

CloudAutoMLHook

Google Cloud AutoML hook.

class airflow.providers.google.cloud.hooks.automl.CloudAutoMLHook(gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.common.hooks.base_google.GoogleBaseHook

Google Cloud AutoML hook.

All the methods in the hook where project_id is used must be called with keyword arguments rather than positional.
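One way such a keyword-only rule can be enforced is with a decorator that rejects positional arguments and fills in a default project. The sketch below is a simplified illustration, loosely modeled on the project ID fallback idea; `keyword_only_with_default_project`, `DemoHook`, and `default_project_id` are hypothetical names and are not part of Airflow.

```python
import functools


def keyword_only_with_default_project(func):
    """Hypothetical decorator: reject positional args, fill in a default project_id."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError("Use keyword arguments rather than positional ones")
        # Fall back to the object-level default when project_id is missing or None.
        kwargs["project_id"] = kwargs.get("project_id") or self.default_project_id
        return func(self, **kwargs)

    return wrapper


class DemoHook:
    """Toy class standing in for a hook; not part of Airflow."""

    default_project_id = "default-project"

    @keyword_only_with_default_project
    def get_model(self, model_id, location, project_id=None):
        return f"projects/{project_id}/locations/{location}/models/{model_id}"


print(DemoHook().get_model(model_id="m1", location="us-central1"))
```

With the real hook, the practical consequence is the same: write `hook.get_model(model_id=..., location=..., project_id=...)` rather than passing the values positionally.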

static extract_object_id(obj)[source]

Returns the unique ID of the object.
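As a rough illustration, AutoML resource names take the form projects/{project}/locations/{location}/datasets/{id}, so the ID is the final path segment. The pure-Python sketch below mirrors that idea. It is an assumption about the behavior rather than the hook's source, and `last_segment` is a hypothetical helper.

```python
def last_segment(obj: dict) -> str:
    """Return the final path segment of obj["name"] (hypothetical helper)."""
    return obj["name"].rpartition("/")[-1]


dataset = {"name": "projects/my-project/locations/us-central1/datasets/TBL123"}
print(last_segment(dataset))  # TBL123
```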

get_conn()[source]

Retrieves connection to AutoML.

Returns

Google Cloud AutoML client object.

Return type

google.cloud.automl_v1beta1.AutoMlClient

wait_for_operation(operation, timeout=None)[source]

Waits for a long-running operation to complete.
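Conceptually, waiting comes down to calling `result()` on the operation object with an optional timeout. The sketch below shows only that idea; `FakeOperation` and `wait` are hypothetical stand-ins, and the real method may handle errors differently.

```python
class FakeOperation:
    """Stand-in for google.api_core.operation.Operation (illustration only)."""

    def __init__(self, value):
        self._value = value

    def result(self, timeout=None):
        # A real Operation polls the service until it is done or the timeout expires.
        return self._value


def wait(operation, timeout=None):
    """Block until the operation finishes and return its result."""
    return operation.result(timeout=timeout)


print(wait(FakeOperation("model-ready"), timeout=60))
```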

prediction_client()[source]

Creates PredictionServiceClient.

Returns

Google Cloud AutoML PredictionServiceClient client object.

Return type

google.cloud.automl_v1beta1.PredictionServiceClient

create_model(model, location, project_id=PROVIDE_PROJECT_ID, timeout=None, metadata=(), retry=DEFAULT)[source]

Creates a model and returns a Model in the response field when the operation completes.

When you create a model, several model evaluations are created for it: a global evaluation, and one evaluation for each annotation spec.

Parameters
  • model (dict | google.cloud.automl_v1beta1.Model) – The model to create. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.Model.

  • project_id (str) – ID of the Google Cloud project where the model will be created. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types._OperationFuture instance

Return type

google.api_core.operation.Operation
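A minimal sketch of a create_model call for a Tables model, assuming a dataset already exists. `build_tables_model` and `train_model` are hypothetical helpers, and the display name, dataset ID, and budget are placeholder values. The hook import is kept inside the function so the dict builder can be used without Airflow installed.

```python
def build_tables_model(display_name: str, dataset_id: str, budget: int = 1000) -> dict:
    """Build a dict in the shape of the Model protobuf message (Tables variant)."""
    return {
        "display_name": display_name,
        "dataset_id": dataset_id,
        "tables_model_metadata": {"train_budget_milli_node_hours": budget},
    }


def train_model(model: dict, location: str, project_id: str):
    """Start training and block until the long-running operation finishes."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    hook = CloudAutoMLHook(gcp_conn_id="google_cloud_default")
    operation = hook.create_model(model=model, location=location, project_id=project_id)
    return hook.wait_for_operation(operation=operation)


model = build_tables_model("churn_model", "TBL123")
print(model["tables_model_metadata"])
```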

batch_predict(model_id, input_config, output_config, location, project_id=PROVIDE_PROJECT_ID, params=None, retry=DEFAULT, timeout=None, metadata=())[source]

Performs a batch prediction and returns a long-running operation object.

Unlike the online Predict, the batch prediction result is not immediately available in the response. Instead, a long-running operation object is returned.

Parameters
  • model_id (str) – Name of the model requested to serve the batch prediction.

  • input_config (dict | google.cloud.automl_v1beta1.BatchPredictInputConfig) – Required. The input configuration for batch prediction. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictInputConfig.

  • output_config (dict | google.cloud.automl_v1beta1.BatchPredictOutputConfig) – Required. The configuration specifying where output predictions should be written. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictOutputConfig.

  • params (dict[str, str] | None) – Additional domain-specific parameters for the predictions. Each string must be at most 25000 characters long.

  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types._OperationFuture instance

Return type

google.api_core.operation.Operation
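The input and output configurations can be passed as plain dicts. The sketch below assumes Cloud Storage is used on both sides; `gcs_input`, `gcs_output`, and `run_batch` are hypothetical helpers, and the bucket paths are placeholders.

```python
def gcs_input(uris: list) -> dict:
    """Dict in the shape of BatchPredictInputConfig, reading from Cloud Storage."""
    return {"gcs_source": {"input_uris": list(uris)}}


def gcs_output(prefix: str) -> dict:
    """Dict in the shape of BatchPredictOutputConfig, writing to Cloud Storage."""
    return {"gcs_destination": {"output_uri_prefix": prefix}}


def run_batch(model_id: str, location: str, project_id: str, uris: list, prefix: str):
    """Submit a batch prediction and return the long-running operation."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    hook = CloudAutoMLHook()
    return hook.batch_predict(
        model_id=model_id,
        input_config=gcs_input(uris),
        output_config=gcs_output(prefix),
        location=location,
        project_id=project_id,
    )


print(gcs_input(["gs://my-bucket/in.csv"]))
print(gcs_output("gs://my-bucket/out/"))
```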

predict(model_id, payload, location, project_id=PROVIDE_PROJECT_ID, params=None, retry=DEFAULT, timeout=None, metadata=())[source]

Performs an online prediction and returns the prediction result in the response.

Parameters
  • model_id (str) – Name of the model requested to serve the prediction.

  • payload (dict | google.cloud.automl_v1beta1.ExamplePayload) – Required. Payload to perform a prediction on. The payload must match the problem type that the model was trained to solve. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.ExamplePayload.

  • params (dict[str, str] | None) – Additional domain-specific parameters. Each string must be at most 25000 characters long.

  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types.PredictResponse instance

Return type

google.cloud.automl_v1beta1.PredictResponse
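As an illustration of a payload that matches the model's problem type, the sketch below builds a text snippet payload for a text model. `text_payload` and `classify` are hypothetical helpers; a Tables or image model would need a different payload shape.

```python
def text_payload(content: str, mime_type: str = "text/plain") -> dict:
    """Dict in the shape of ExamplePayload carrying a text snippet."""
    return {"text_snippet": {"content": content, "mime_type": mime_type}}


def classify(model_id: str, location: str, project_id: str, content: str):
    """Run an online prediction and return the PredictResponse."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    hook = CloudAutoMLHook()
    return hook.predict(
        model_id=model_id,
        payload=text_payload(content),
        location=location,
        project_id=project_id,
    )


print(text_payload("The delivery arrived two days late."))
```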

create_dataset(dataset, location, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())[source]

Creates a dataset.

Parameters
  • dataset (dict | google.cloud.automl_v1beta1.Dataset) – The dataset to create. If a dict is provided, it must be of the same form as the protobuf message Dataset.

  • project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types.Dataset instance.

Return type

google.cloud.automl_v1beta1.Dataset
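A short sketch of creating a translation dataset from a plain dict. `translation_dataset` and `create` are hypothetical helpers, and the display name and language codes are placeholder values.

```python
def translation_dataset(display_name: str, source: str, target: str) -> dict:
    """Dict in the shape of the Dataset protobuf message (translation variant)."""
    return {
        "display_name": display_name,
        "translation_dataset_metadata": {
            "source_language_code": source,
            "target_language_code": target,
        },
    }


def create(dataset: dict, location: str, project_id: str) -> str:
    """Create the dataset and return its full resource name."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    hook = CloudAutoMLHook()
    result = hook.create_dataset(dataset=dataset, location=location, project_id=project_id)
    return result.name


print(translation_dataset("en_es_pairs", "en", "es"))
```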

import_data(dataset_id, location, input_config, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())[source]

Imports data into a dataset. For Tables this method can only be called on an empty Dataset.

Parameters
  • dataset_id (str) – Name of the AutoML dataset.

  • input_config (dict | google.cloud.automl_v1beta1.InputConfig) – The desired input location and its domain-specific semantics, if any. If a dict is provided, it must be of the same form as the protobuf message InputConfig.

  • project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types._OperationFuture instance

Return type

google.api_core.operation.Operation
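The input location can come from Cloud Storage or BigQuery. The sketch below picks the InputConfig shape from the URI scheme; `import_source` and `load` are hypothetical helpers, and the URIs are placeholders.

```python
def import_source(uri: str) -> dict:
    """Dict in the shape of InputConfig, chosen from the URI scheme."""
    if uri.startswith("gs://"):
        return {"gcs_source": {"input_uris": [uri]}}
    if uri.startswith("bq://"):
        return {"bigquery_source": {"input_uri": uri}}
    raise ValueError(f"Unsupported source URI: {uri}")


def load(dataset_id: str, location: str, project_id: str, uri: str):
    """Start the import and block until it finishes."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    hook = CloudAutoMLHook()
    operation = hook.import_data(
        dataset_id=dataset_id,
        location=location,
        input_config=import_source(uri),
        project_id=project_id,
    )
    return hook.wait_for_operation(operation=operation)


print(import_source("gs://my-bucket/train.csv"))
print(import_source("bq://my-project.my_dataset.my_table"))
```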

list_column_specs(dataset_id, table_spec_id, location, project_id=PROVIDE_PROJECT_ID, field_mask=None, filter_=None, page_size=None, retry=DEFAULT, timeout=None, metadata=())[source]

Lists column specs in a table spec.

Parameters
  • dataset_id (str) – Name of the AutoML dataset.

  • table_spec_id (str) – ID of the table spec, used to build the resource path.

  • field_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – Mask specifying which fields to read. If a dict is provided, it must be of the same form as the protobuf message google.protobuf.field_mask_pb2.FieldMask.

  • filter_ – Filter expression, see go/filtering.

  • page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per-page, this determines the maximum number of resources in a page.

  • project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

A pager that iterates over google.cloud.automl_v1beta1.types.ColumnSpec instances.

Return type

google.cloud.automl_v1beta1.services.auto_ml.pagers.ListColumnSpecsPager
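The returned pager can be iterated like any other iterable. The sketch below requests only the display name of each column spec and collects the names; `field_mask`, `display_names`, and `column_names` are hypothetical helpers, and the stub objects only imitate ColumnSpec for the demonstration.

```python
from types import SimpleNamespace


def field_mask(*paths: str) -> dict:
    """Dict in the shape of the FieldMask protobuf message."""
    return {"paths": list(paths)}


def display_names(column_specs) -> list:
    """Collect display_name from any iterable of ColumnSpec-like objects."""
    return [spec.display_name for spec in column_specs]


def column_names(dataset_id: str, table_spec_id: str, location: str, project_id: str) -> list:
    """List the column names of one table spec."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    hook = CloudAutoMLHook()
    pager = hook.list_column_specs(
        dataset_id=dataset_id,
        table_spec_id=table_spec_id,
        location=location,
        project_id=project_id,
        field_mask=field_mask("display_name"),
    )
    return display_names(pager)


# Stub objects standing in for ColumnSpec instances.
fake_specs = [SimpleNamespace(display_name="age"), SimpleNamespace(display_name="income")]
print(display_names(fake_specs))
```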

get_model(model_id, location, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())[source]

Gets an AutoML model.

Parameters
  • model_id (str) – Name of the model.

  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types.Model instance.

Return type

google.cloud.automl_v1beta1.Model

delete_model(model_id, location, project_id=PROVIDE_PROJECT_ID, retry=DEFAULT, timeout=None, metadata=())[source]

Deletes an AutoML model.

Parameters
  • model_id (str) – Name of the model.

  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types._OperationFuture instance.

Return type

google.api_core.operation.Operation

update_dataset(dataset, update_mask=None, retry=DEFAULT, timeout=None, metadata=())[source]

Updates a dataset.

Parameters
  • dataset (dict | google.cloud.automl_v1beta1.Dataset) – The dataset which replaces the resource on the server. If a dict is provided, it must be of the same form as the protobuf message Dataset.

  • update_mask (dict | google.protobuf.field_mask_pb2.FieldMask | None) – The update mask applies to the resource. If a dict is provided, it must be of the same form as the protobuf message FieldMask.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types.Dataset instance.

Return type

google.cloud.automl_v1beta1.Dataset
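A minimal sketch of renaming a dataset: the dataset dict carries the full resource name plus the new value, and the update mask limits the change to one field. `rename_request` and `rename` are hypothetical helpers, and the resource name is a placeholder.

```python
def rename_request(dataset_name: str, new_display_name: str) -> tuple:
    """Return (dataset, update_mask) dicts that change only display_name."""
    dataset = {"name": dataset_name, "display_name": new_display_name}
    update_mask = {"paths": ["display_name"]}
    return dataset, update_mask


def rename(dataset_name: str, new_display_name: str):
    """Apply the rename and return the updated Dataset."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    dataset, update_mask = rename_request(dataset_name, new_display_name)
    return CloudAutoMLHook().update_dataset(dataset=dataset, update_mask=update_mask)


name = "projects/my-project/locations/us-central1/datasets/TBL123"
print(rename_request(name, "customers_v2"))
```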

deploy_model(model_id, location, project_id=PROVIDE_PROJECT_ID, image_detection_metadata=None, retry=DEFAULT, timeout=None, metadata=())[source]

Deploys a model.

If a model is already deployed, deploying it with the same parameters has no effect. Deploying with different parameters (for example, changing node_number) will reset the deployment state without pausing the model’s availability.

Only applicable for Text Classification, Image Object Detection and Tables; all other domains manage deployment automatically.

Parameters
  • model_id (str) – Name of the model requested to serve the prediction.

  • image_detection_metadata (google.cloud.automl_v1beta1.ImageObjectDetectionModelDeploymentMetadata | dict | None) – Model deployment metadata specific to Image Object Detection. If a dict is provided, it must be of the same form as the protobuf message ImageObjectDetectionModelDeploymentMetadata.

  • project_id (str) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types._OperationFuture instance.

Return type

google.api_core.operation.Operation
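For an Image Object Detection model, the deployment metadata can be passed as a dict with a node count. The sketch below is an illustration under that assumption; `detection_deployment` and `deploy` are hypothetical helpers, and the lower bound check is a local sanity check rather than the service's full validation.

```python
def detection_deployment(node_count: int) -> dict:
    """Dict in the shape of ImageObjectDetectionModelDeploymentMetadata."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return {"node_count": node_count}


def deploy(model_id: str, location: str, project_id: str, node_count: int):
    """Deploy the model and block until the operation finishes."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    hook = CloudAutoMLHook()
    operation = hook.deploy_model(
        model_id=model_id,
        location=location,
        project_id=project_id,
        image_detection_metadata=detection_deployment(node_count),
    )
    return hook.wait_for_operation(operation=operation)


print(detection_deployment(2))
```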

list_table_specs(dataset_id, location, project_id=None, filter_=None, page_size=None, retry=DEFAULT, timeout=None, metadata=())[source]

Lists table specs in a dataset.

Parameters
  • dataset_id (str) – Name of the dataset.

  • filter_ – Filter expression, see go/filtering.

  • page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per-page, this determines the maximum number of resources in a page.

  • project_id (str | None) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

A google.gax.PageIterator instance. By default, this is an iterable of google.cloud.automl_v1beta1.types.TableSpec instances. This object can also be configured to iterate over the pages of the response through the options parameter.

Return type

google.cloud.automl_v1beta1.services.auto_ml.pagers.ListTableSpecsPager

list_datasets(location, project_id, retry=DEFAULT, timeout=None, metadata=())[source]

Lists datasets in a project.

Parameters
  • project_id (str) – ID of the Google Cloud project where the datasets are located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

A google.gax.PageIterator instance. By default, this is an iterable of google.cloud.automl_v1beta1.types.Dataset instances. This object can also be configured to iterate over the pages of the response through the options parameter.

Return type

google.cloud.automl_v1beta1.services.auto_ml.pagers.ListDatasetsPager
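Because the pager is an iterable, the number of datasets consumed can be capped with `itertools.islice` instead of materializing the whole listing. The sketch below shows that pattern; `first_n` and `first_datasets` are hypothetical helpers, and the generator at the end only simulates a pager.

```python
from itertools import islice


def first_n(pager, n: int) -> list:
    """Take at most n items from a pager (or any iterable) without reading further."""
    return list(islice(pager, n))


def first_datasets(location: str, project_id: str, n: int = 10) -> list:
    """Return up to n datasets from the project."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    pager = CloudAutoMLHook().list_datasets(location=location, project_id=project_id)
    return first_n(pager, n)


# Simulated pager: records which items were actually pulled.
pulled = []


def fake_pager():
    for i in range(10):
        pulled.append(i)
        yield f"dataset-{i}"


print(first_n(fake_pager(), 3))
```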

delete_dataset(dataset_id, location, project_id, retry=DEFAULT, timeout=None, metadata=())[source]

Deletes a dataset and all of its contents.

Parameters
  • dataset_id (str) – ID of dataset to be deleted.

  • project_id (str) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.

  • location (str) – The location of the project.

  • retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.

Returns

google.cloud.automl_v1beta1.types._OperationFuture instance

Return type

google.api_core.operation.Operation
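When only a full resource name is at hand, it can be split into the three arguments this method expects. The sketch below assumes the usual projects/{project}/locations/{location}/datasets/{id} layout; `parse_dataset_name` and `delete_by_name` are hypothetical helpers.

```python
def parse_dataset_name(name: str) -> dict:
    """Split a full dataset resource name into delete_dataset keyword arguments."""
    parts = name.split("/")
    if len(parts) != 6 or parts[0::2] != ["projects", "locations", "datasets"]:
        raise ValueError(f"Unexpected dataset resource name: {name}")
    return {"project_id": parts[1], "location": parts[3], "dataset_id": parts[5]}


def delete_by_name(name: str):
    """Delete the dataset and block until the operation finishes."""
    # Imported lazily: requires apache-airflow-providers-google at call time.
    from airflow.providers.google.cloud.hooks.automl import CloudAutoMLHook

    hook = CloudAutoMLHook()
    operation = hook.delete_dataset(**parse_dataset_name(name))
    return hook.wait_for_operation(operation=operation)


print(parse_dataset_name("projects/my-project/locations/us-central1/datasets/TBL123"))
```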
