airflow.providers.google.cloud.operators.automl
This module contains Google AutoML operators.
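Module Contents
Classes
- AutoMLTrainModelOperator – Creates Google Cloud AutoML model.
- AutoMLPredictOperator – Runs prediction operation on Google Cloud AutoML.
- AutoMLBatchPredictOperator – Perform a batch prediction on Google Cloud AutoML.
- AutoMLCreateDatasetOperator – Creates a Google Cloud AutoML dataset.
- AutoMLImportDataOperator – Imports data to a Google Cloud AutoML dataset.
- AutoMLTablesListColumnSpecsOperator – Lists column specs in a table.
- AutoMLTablesUpdateDatasetOperator – Updates a dataset.
- AutoMLGetModelOperator – Get Google Cloud AutoML model.
- AutoMLDeleteModelOperator – Delete Google Cloud AutoML model.
- AutoMLDeployModelOperator – Deploys a model; if a model is already deployed, deploying it with the same parameters has no effect.
- AutoMLTablesListTableSpecsOperator – Lists table specs in a dataset.
- AutoMLListDatasetOperator – Lists AutoML Datasets in project.
- AutoMLDeleteDatasetOperator – Deletes a dataset and all of its contents.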
Attributes
- class airflow.providers.google.cloud.operators.automl.AutoMLTrainModelOperator(*, model, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Creates Google Cloud AutoML model.
AutoMLTrainModelOperator for text prediction is deprecated. Please use
airflow.providers.google.cloud.operators.vertex_ai.auto_ml.CreateAutoMLTextTrainingJobOperator
instead.
See also
For more information on how to use this operator, take a look at the guide: Operations On Models
- Parameters
model (dict) – Model definition.
project_id (str | None) – ID of the Google Cloud project where the model will be created. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
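A minimal sketch of how the model dict and the common arguments might be passed to AutoMLTrainModelOperator. The dataset ID, display name, location, task name, and the tables_model_metadata budget are placeholder assumptions for a Tables model, not values from this page. The operator import sits inside the function so the dict-building part can be run without Airflow installed.

```python
def build_model_definition(dataset_id: str, display_name: str) -> dict:
    """Return a model definition dict (assumed shape for an AutoML Tables model)."""
    return {
        "display_name": display_name,
        "dataset_id": dataset_id,
        # Assumed training budget: 1000 milli node hours, i.e. 1 node hour.
        "tables_model_metadata": {"train_budget_milli_node_hours": 1000},
    }


def make_train_task(model: dict):
    """Create the training task; needs apache-airflow-providers-google installed."""
    from airflow.providers.google.cloud.operators.automl import AutoMLTrainModelOperator

    return AutoMLTrainModelOperator(
        task_id="train_model",  # hypothetical task name
        model=model,
        location="us-central1",  # placeholder location
        project_id=None,  # None means the default project_id is used
        timeout=300.0,  # seconds to wait for the request
        gcp_conn_id="google_cloud_default",
    )


# Placeholder identifiers, for illustration only.
model = build_model_definition("TBL1234567890", "example_model")
```

Inside a DAG file, make_train_task(model) would be called within the DAG context so the task is attached to that DAG.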
- class airflow.providers.google.cloud.operators.automl.AutoMLPredictOperator(*, model_id, location, payload, operation_params=None, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Runs prediction operation on Google Cloud AutoML.
See also
For more information on how to use this operator, take a look at the guide: Making Predictions
- Parameters
model_id (str) – Name of the model requested to serve the prediction.
payload (dict) – The payload (input example) on which to run the prediction.
project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
operation_params (dict[str, str] | None) – Additional domain-specific parameters for the predictions.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
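As a rough illustration of the payload argument, the sketch below builds a text payload and hands it to AutoMLPredictOperator. The text_snippet shape is an assumption about how a text model's payload is laid out; a Tables or image model would need a different payload. The model ID, location, and task name are placeholders.

```python
def build_text_payload(content: str) -> dict:
    """Return a prediction payload for a text model (assumed payload shape)."""
    if not content:
        raise ValueError("content must not be empty")
    return {"text_snippet": {"content": content, "mime_type": "text/plain"}}


def make_predict_task(model_id: str, payload: dict):
    """Create the prediction task; needs apache-airflow-providers-google installed."""
    from airflow.providers.google.cloud.operators.automl import AutoMLPredictOperator

    return AutoMLPredictOperator(
        task_id="predict",  # hypothetical task name
        model_id=model_id,
        location="us-central1",  # placeholder location
        payload=payload,
        operation_params=None,  # no extra domain-specific parameters
    )


payload = build_text_payload("The delivery arrived two days late.")
```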
- class airflow.providers.google.cloud.operators.automl.AutoMLBatchPredictOperator(*, model_id, input_config, output_config, location, project_id=None, prediction_params=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Perform a batch prediction on Google Cloud AutoML.
See also
For more information on how to use this operator, take a look at the guide: Making Predictions
- Parameters
project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
model_id (str) – Name of the model requested to serve the batch prediction.
input_config (dict) – Required. The input configuration for batch prediction. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictInputConfig
output_config (dict) – Required. The Configuration specifying where output predictions should be written. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictOutputConfig
prediction_params (dict[str, str] | None) – Additional domain-specific parameters for the predictions. Each string may be at most 25000 characters long.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
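The input_config and output_config dicts might be assembled as in the sketch below. It assumes Cloud Storage is used for both input and output (gcs_source and gcs_destination); the bucket paths, model ID, location, and task name are placeholders, and the helper names are hypothetical.

```python
from __future__ import annotations


def gcs_input_config(input_uris: list[str]) -> dict:
    """Input config reading from Cloud Storage (assumed gcs_source shape)."""
    if not input_uris:
        raise ValueError("at least one input URI is required")
    return {"gcs_source": {"input_uris": list(input_uris)}}


def gcs_output_config(output_uri_prefix: str) -> dict:
    """Output config writing under a Cloud Storage prefix (assumed gcs_destination shape)."""
    return {"gcs_destination": {"output_uri_prefix": output_uri_prefix}}


def make_batch_predict_task(model_id: str, input_config: dict, output_config: dict):
    """Create the batch prediction task; needs apache-airflow-providers-google installed."""
    from airflow.providers.google.cloud.operators.automl import AutoMLBatchPredictOperator

    return AutoMLBatchPredictOperator(
        task_id="batch_predict",  # hypothetical task name
        model_id=model_id,
        input_config=input_config,
        output_config=output_config,
        location="us-central1",  # placeholder location
    )


# Placeholder bucket paths, for illustration only.
input_config = gcs_input_config(["gs://example-bucket/input/rows.csv"])
output_config = gcs_output_config("gs://example-bucket/output/")
```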
- class airflow.providers.google.cloud.operators.automl.AutoMLCreateDatasetOperator(*, dataset, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Creates a Google Cloud AutoML dataset.
See also
For more information on how to use this operator, take a look at the guide: Creating Datasets
- Parameters
dataset (dict) – The dataset to create. If a dict is provided, it must be of the same form as the protobuf message Dataset.
project_id (str | None) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
params – Additional domain-specific parameters for the predictions.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
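A small sketch of a dataset dict for AutoMLCreateDatasetOperator, assuming a Tables dataset. The tables_dataset_metadata field with an empty target_column_spec_id is an assumption about the Dataset message for that domain; the display name, location, and task name are placeholders.

```python
def build_tables_dataset(display_name: str) -> dict:
    """Return a dataset dict for a Tables dataset (assumed Dataset shape)."""
    return {
        "display_name": display_name,
        # The target column is left empty here; it can be set by a later update.
        "tables_dataset_metadata": {"target_column_spec_id": ""},
    }


def make_create_dataset_task(dataset: dict):
    """Create the dataset task; needs apache-airflow-providers-google installed."""
    from airflow.providers.google.cloud.operators.automl import AutoMLCreateDatasetOperator

    return AutoMLCreateDatasetOperator(
        task_id="create_dataset",  # hypothetical task name
        dataset=dataset,
        location="us-central1",  # placeholder location
    )


dataset = build_tables_dataset("example_dataset")
```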
- class airflow.providers.google.cloud.operators.automl.AutoMLImportDataOperator(*, dataset_id, location, input_config, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Imports data to a Google Cloud AutoML dataset.
See also
For more information on how to use this operator, take a look at the guide: Creating Datasets
- Parameters
dataset_id (str) – ID of the dataset to be updated.
input_config (dict) – The desired input location and its domain-specific semantics, if any. If a dict is provided, it must be of the same form as the protobuf message InputConfig.
project_id (str | None) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
params – Additional domain-specific parameters for the predictions.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
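The sketch below shows one way the input_config for AutoMLImportDataOperator could be built and checked before use. It assumes the data lives in Cloud Storage (gcs_source); the URI check is a hypothetical convenience, not something the operator does, and the dataset ID, bucket path, location, and task name are placeholders.

```python
from __future__ import annotations


def build_import_config(gcs_uris: list[str]) -> dict:
    """Return an import input config for Cloud Storage files (assumed gcs_source shape)."""
    bad = [uri for uri in gcs_uris if not uri.startswith("gs://")]
    if bad:
        raise ValueError(f"not Cloud Storage URIs: {bad}")
    return {"gcs_source": {"input_uris": list(gcs_uris)}}


def make_import_task(dataset_id: str, input_config: dict):
    """Create the import task; needs apache-airflow-providers-google installed."""
    from airflow.providers.google.cloud.operators.automl import AutoMLImportDataOperator

    return AutoMLImportDataOperator(
        task_id="import_data",  # hypothetical task name
        dataset_id=dataset_id,
        location="us-central1",  # placeholder location
        input_config=input_config,
    )


import_config = build_import_config(["gs://example-bucket/data/train.csv"])
```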
- class airflow.providers.google.cloud.operators.automl.AutoMLTablesListColumnSpecsOperator(*, dataset_id, table_spec_id, location, field_mask=None, filter_=None, page_size=None, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Lists column specs in a table.
See also
For more information on how to use this operator, take a look at the guide: Listing Table And Columns Specs
- Parameters
dataset_id (str) – Name of the dataset.
table_spec_id (str) – table_spec_id for path builder.
field_mask (dict | None) – Mask specifying which fields to read. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.FieldMask
filter_ – Filter expression, see go/filtering.
page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per page, this determines the maximum number of resources in a page.
project_id (str | None) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- class airflow.providers.google.cloud.operators.automl.AutoMLTablesUpdateDatasetOperator(*, dataset, location, update_mask=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Updates a dataset.
See also
For more information on how to use this operator, take a look at the guide: Creating Datasets
- Parameters
dataset (dict) – The dataset which replaces the resource on the server. If a dict is provided, it must be of the same form as the protobuf message Dataset.
update_mask (dict | None) – The update mask applies to the resource. If a dict is provided, it must be of the same form as the protobuf message FieldMask.
location (str) – The location of the project.
params – Additional domain-specific parameters for the predictions.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
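A hedged sketch of preparing the dataset and update_mask arguments for AutoMLTablesUpdateDatasetOperator. It assumes the dataset dict must carry the full resource name under "name", that the field being changed is tables_dataset_metadata.target_column_spec_id, and that the mask follows the FieldMask form with a "paths" list. All identifiers are placeholders and the helper name is hypothetical.

```python
from copy import deepcopy


def build_dataset_update(dataset: dict, dataset_name: str, target_column_spec_id: str):
    """Return (updated dataset dict, update mask) without mutating the input dict."""
    update = deepcopy(dataset)
    update["name"] = dataset_name
    metadata = update.setdefault("tables_dataset_metadata", {})
    metadata["target_column_spec_id"] = target_column_spec_id
    # Only the listed field path is meant to be overwritten on the server.
    update_mask = {"paths": ["tables_dataset_metadata.target_column_spec_id"]}
    return update, update_mask


def make_update_task(update: dict, update_mask: dict):
    """Create the update task; needs apache-airflow-providers-google installed."""
    from airflow.providers.google.cloud.operators.automl import (
        AutoMLTablesUpdateDatasetOperator,
    )

    return AutoMLTablesUpdateDatasetOperator(
        task_id="update_dataset",  # hypothetical task name
        dataset=update,
        update_mask=update_mask,
        location="us-central1",  # placeholder location
    )


original = {
    "display_name": "example_dataset",
    "tables_dataset_metadata": {"target_column_spec_id": ""},
}
update, update_mask = build_dataset_update(
    original,
    "projects/example-project/locations/us-central1/datasets/TBL123",
    "456",
)
```

Copying the dict first keeps the definition used for creation unchanged, so the same constant can be reused by other tasks.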
- class airflow.providers.google.cloud.operators.automl.AutoMLGetModelOperator(*, model_id, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Get Google Cloud AutoML model.
See also
For more information on how to use this operator, take a look at the guide: Operations On Models
- Parameters
model_id (str) – Name of the model to retrieve.
project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
params – Additional domain-specific parameters for the predictions.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- class airflow.providers.google.cloud.operators.automl.AutoMLDeleteModelOperator(*, model_id, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Delete Google Cloud AutoML model.
See also
For more information on how to use this operator, take a look at the guide: Operations On Models
- Parameters
model_id (str) – Name of the model to delete.
project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
params – Additional domain-specific parameters for the predictions.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- class airflow.providers.google.cloud.operators.automl.AutoMLDeployModelOperator(*, model_id, location, project_id=None, image_detection_metadata=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Deploys a model; if a model is already deployed, deploying it with the same parameters has no effect.
Deploying with different parameters (for example, changing node_number) will reset the deployment state without pausing the model’s availability.
Only applicable for Text Classification, Image Object Detection and Tables; all other domains manage deployment automatically.
See also
For more information on how to use this operator, take a look at the guide: Operations On Models
- Parameters
model_id (str) – Name of the model to be deployed.
image_detection_metadata (dict | None) – Model deployment metadata specific to Image Object Detection. If a dict is provided, it must be of the same form as the protobuf message ImageObjectDetectionModelDeploymentMetadata
project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
location (str) – The location of the project.
params – Additional domain-specific parameters for the predictions.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
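The following sketch shows how the optional image_detection_metadata argument might be added only when it is needed. It assumes the deployment metadata message exposes a node_count field; the helper name, model ID, location, and task name are hypothetical placeholders.

```python
from __future__ import annotations


def build_deploy_kwargs(model_id: str, node_count: int | None = None) -> dict:
    """Return keyword arguments for the deploy operator (hypothetical helper)."""
    kwargs = {"model_id": model_id, "location": "us-central1"}  # placeholder location
    if node_count is not None:
        if node_count < 1:
            raise ValueError("node_count must be at least 1")
        # Assumed field name; only meaningful for Image Object Detection models.
        kwargs["image_detection_metadata"] = {"node_count": node_count}
    return kwargs


def make_deploy_task(kwargs: dict):
    """Create the deploy task; needs apache-airflow-providers-google installed."""
    from airflow.providers.google.cloud.operators.automl import AutoMLDeployModelOperator

    return AutoMLDeployModelOperator(task_id="deploy_model", **kwargs)


plain_kwargs = build_deploy_kwargs("IOD1234567890")
scaled_kwargs = build_deploy_kwargs("IOD1234567890", node_count=2)
```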
- class airflow.providers.google.cloud.operators.automl.AutoMLTablesListTableSpecsOperator(*, dataset_id, location, page_size=None, filter_=None, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Lists table specs in a dataset.
See also
For more information on how to use this operator, take a look at the guide: Listing Table And Columns Specs
- Parameters
dataset_id (str) – Name of the dataset.
filter_ – Filter expression, see go/filtering.
page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per page, this determines the maximum number of resources in a page.
project_id (str | None) – ID of the Google Cloud project. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- class airflow.providers.google.cloud.operators.automl.AutoMLListDatasetOperator(*, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Lists AutoML Datasets in project.
See also
For more information on how to use this operator, take a look at the guide: Listing And Deleting Datasets
- Parameters
project_id (str | None) – ID of the Google Cloud project where the datasets are located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
- class airflow.providers.google.cloud.operators.automl.AutoMLDeleteDatasetOperator(*, dataset_id, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
Bases:
airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
Deletes a dataset and all of its contents.
See also
For more information on how to use this operator, take a look at the guide: Listing And Deleting Datasets
- Parameters
dataset_id (str | list[str]) – ID of the dataset to delete. Accepts a single dataset ID, a list of dataset IDs, or a comma-separated string of dataset IDs.
project_id (str | None) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
location (str) – The location of the project.
retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (MetaData) – Additional metadata that is provided to the method.
gcp_conn_id (str) – The connection ID to use to connect to Google Cloud.
impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
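The three accepted forms of dataset_id can be pictured with the small helper below. This is an illustrative sketch of the idea, not the operator's own parsing code; normalize_dataset_ids is a hypothetical name and the IDs are placeholders.

```python
from __future__ import annotations


def normalize_dataset_ids(dataset_id: str | list[str]) -> list[str]:
    """Turn a single ID, a list of IDs, or a comma-separated string into a list of IDs."""
    if isinstance(dataset_id, str):
        parts = dataset_id.split(",")
    else:
        parts = list(dataset_id)
    # Drop surrounding whitespace and any empty entries left by stray commas.
    return [part.strip() for part in parts if part.strip()]


single = normalize_dataset_ids("TBL111")
from_list = normalize_dataset_ids(["TBL111", "TBL222"])
from_csv = normalize_dataset_ids("TBL111, TBL222,TBL333")
```

Whichever form is used, the same value would be passed unchanged as dataset_id when creating the AutoMLDeleteDatasetOperator task.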