airflow.providers.google.cloud.operators.automl
This module contains Google AutoML operators.
Module Contents
Classes
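| AutoMLTrainModelOperator | Creates Google Cloud AutoML model. |
| AutoMLPredictOperator | Runs prediction operation on Google Cloud AutoML. |
| AutoMLBatchPredictOperator | Perform a batch prediction on Google Cloud AutoML. |
| AutoMLCreateDatasetOperator | Creates a Google Cloud AutoML dataset. |
| AutoMLImportDataOperator | Imports data to a Google Cloud AutoML dataset. |
| AutoMLTablesListColumnSpecsOperator | Lists column specs in a table. |
| AutoMLTablesUpdateDatasetOperator | Updates a dataset. |
| AutoMLGetModelOperator | Get Google Cloud AutoML model. |
| AutoMLDeleteModelOperator | Delete Google Cloud AutoML model. |
| AutoMLDeployModelOperator | Deploys a model; if a model is already deployed, deploying it with the same parameters has no effect. |
| AutoMLTablesListTableSpecsOperator | Lists table specs in a dataset. |
| AutoMLListDatasetOperator | Lists AutoML Datasets in project. |
| AutoMLDeleteDatasetOperator | Deletes a dataset and all of its contents. |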
Attributes
- class airflow.providers.google.cloud.operators.automl.AutoMLTrainModelOperator(*, model, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Creates a Google Cloud AutoML model.
- AutoMLTrainModelOperator for text prediction is deprecated. Please use airflow.providers.google.cloud.operators.vertex_ai.auto_ml.CreateAutoMLTextTrainingJobOperator instead.
- See also: For more information on how to use this operator, take a look at the guide: Operations On Models
- Parameters
- model (dict) – Model definition.
- project_id (str | None) – ID of the Google Cloud project where the model will be created. If None, the default project_id is used.
- location (str) – The location of the project. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
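As a rough illustration of the parameters above, the following sketch assembles keyword arguments for AutoMLTrainModelOperator. It assumes an AutoML Tables model; the project ID, location, dataset ID, display name, and the helper `build_train_model_kwargs` are hypothetical placeholders, and the `model` dict is assumed to follow the field names of the AutoML `Model` protobuf message.

```python
# Hypothetical placeholder values; replace them with your own.
GCP_PROJECT_ID = "my-project"
GCP_LOCATION = "us-central1"
DATASET_ID = "TBL1234567890"


def build_train_model_kwargs(display_name, dataset_id, budget_milli_node_hours):
    """Assemble keyword arguments for AutoMLTrainModelOperator (illustrative helper)."""
    model = {
        "display_name": display_name,
        "dataset_id": dataset_id,
        # Tables-specific metadata; other domains use a different metadata key.
        "tables_model_metadata": {
            "train_budget_milli_node_hours": budget_milli_node_hours
        },
    }
    return {
        "task_id": "train_model",
        "model": model,
        "location": GCP_LOCATION,
        "project_id": GCP_PROJECT_ID,
    }


train_kwargs = build_train_model_kwargs("demo_model", DATASET_ID, 1000)

# Inside a DAG file, with the Google provider installed, this could be used as:
# from airflow.providers.google.cloud.operators.automl import AutoMLTrainModelOperator
# train_model = AutoMLTrainModelOperator(**train_kwargs)
```

Keeping the arguments in a plain dict makes them easy to inspect before the operator is created.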
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLPredictOperator(*, model_id, location, payload, operation_params=None, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Runs prediction operation on Google Cloud AutoML.
- See also: For more information on how to use this operator, take a look at the guide: Making Predictions
- Parameters
- model_id (str) – Name of the model requested to serve the prediction.
- payload (dict) – Payload to perform the prediction on. If a dict is provided, it must be of the same form as the protobuf message ExamplePayload.
- project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
- location (str) – The location of the project. 
- operation_params (dict[str, str] | None) – Additional domain-specific parameters for the predictions. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
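The sketch below shows one possible shape for the `payload` argument, assuming a text model and a payload that mirrors the `ExamplePayload` protobuf message (`text_snippet` with `content` and `mime_type`). The model ID, location, and the helper `build_text_payload` are hypothetical.

```python
def build_text_payload(text):
    """Return a dict shaped like ExamplePayload for a text model (illustrative helper)."""
    return {"text_snippet": {"content": text, "mime_type": "text/plain"}}


predict_kwargs = {
    "task_id": "predict",
    "model_id": "TCN1234567890",  # hypothetical model ID
    "location": "us-central1",  # hypothetical location
    "payload": build_text_payload("The delivery arrived two days late."),
}

# Inside a DAG file, with the Google provider installed, this could be used as:
# from airflow.providers.google.cloud.operators.automl import AutoMLPredictOperator
# predict = AutoMLPredictOperator(**predict_kwargs)
```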
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLBatchPredictOperator(*, model_id, input_config, output_config, location, project_id=None, prediction_params=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Perform a batch prediction on Google Cloud AutoML.
- See also: For more information on how to use this operator, take a look at the guide: Making Predictions
- Parameters
- project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
- location (str) – The location of the project.
- model_id (str) – Name of the model requested to serve the batch prediction.
- input_config (dict) – Required. The input configuration for batch prediction. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictInputConfig.
- output_config (dict) – Required. The configuration specifying where output predictions should be written. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.BatchPredictOutputConfig.
- prediction_params (dict[str, str] | None) – Additional domain-specific parameters for the predictions. Any string value must be at most 25000 characters long.
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
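A minimal sketch of `input_config` and `output_config` dicts, assuming both the input files and the results live in Cloud Storage. The bucket paths and the helper `gcs_batch_configs` are hypothetical; the key names are assumed to follow the `BatchPredictInputConfig` and `BatchPredictOutputConfig` messages named above.

```python
def gcs_batch_configs(input_uris, output_prefix):
    """Build (input_config, output_config) dicts for a GCS-based batch prediction."""
    input_config = {"gcs_source": {"input_uris": list(input_uris)}}
    output_config = {"gcs_destination": {"output_uri_prefix": output_prefix}}
    return input_config, output_config


input_config, output_config = gcs_batch_configs(
    ["gs://my-bucket/batch/input.csv"],  # hypothetical input file
    "gs://my-bucket/batch/output/",  # hypothetical output prefix
)

batch_kwargs = {
    "task_id": "batch_predict",
    "model_id": "TBL1234567890",  # hypothetical model ID
    "location": "us-central1",  # hypothetical location
    "input_config": input_config,
    "output_config": output_config,
}

# Inside a DAG file, with the Google provider installed, this could be used as:
# from airflow.providers.google.cloud.operators.automl import AutoMLBatchPredictOperator
# batch_predict = AutoMLBatchPredictOperator(**batch_kwargs)
```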
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLCreateDatasetOperator(*, dataset, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Creates a Google Cloud AutoML dataset.
- See also: For more information on how to use this operator, take a look at the guide: Creating Datasets
- Parameters
- dataset (dict) – The dataset to create. If a dict is provided, it must be of the same form as the protobuf message Dataset.
- project_id (str | None) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
- location (str) – The location of the project. 
- params – Additional domain-specific parameters for the predictions. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLImportDataOperator(*, dataset_id, location, input_config, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Imports data to a Google Cloud AutoML dataset.
- See also: For more information on how to use this operator, take a look at the guide: Creating Datasets
- Parameters
- dataset_id (str) – ID of the dataset to import data into.
- input_config (dict) – The desired input location and its domain-specific semantics, if any. If a dict is provided, it must be of the same form as the protobuf message InputConfig.
- project_id (str | None) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
- location (str) – The location of the project. 
- params – Additional domain-specific parameters for the predictions. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
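The following sketch builds keyword arguments for AutoMLImportDataOperator, assuming the source data is one or more CSV files in Cloud Storage. The dataset ID, bucket path, location, and the helper `build_import_kwargs` are hypothetical; the `input_config` keys are assumed to follow the `InputConfig` message.

```python
def build_import_kwargs(dataset_id, gcs_uris):
    """Assemble keyword arguments for AutoMLImportDataOperator (illustrative helper)."""
    if not gcs_uris:
        raise ValueError("at least one gs:// URI is required")
    return {
        "task_id": "import_dataset",
        "dataset_id": dataset_id,
        "location": "us-central1",  # hypothetical location
        "input_config": {"gcs_source": {"input_uris": list(gcs_uris)}},
    }


import_kwargs = build_import_kwargs(
    "TBL1234567890",  # hypothetical dataset ID
    ["gs://my-bucket/data/train.csv"],  # hypothetical source file
)

# Inside a DAG file, with the Google provider installed, this could be used as:
# from airflow.providers.google.cloud.operators.automl import AutoMLImportDataOperator
# import_dataset = AutoMLImportDataOperator(**import_kwargs)
```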
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLTablesListColumnSpecsOperator(*, dataset_id, table_spec_id, location, field_mask=None, filter_=None, page_size=None, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Lists column specs in a table.
- See also: For more information on how to use this operator, take a look at the guide: Listing Table And Columns Specs
- Parameters
- dataset_id (str) – Name of the dataset.
- table_spec_id (str) – ID of the table spec, used to build the resource path.
- field_mask (dict | None) – Mask specifying which fields to read. If a dict is provided, it must be of the same form as the protobuf message google.cloud.automl_v1beta1.types.FieldMask.
- filter_ – Filter expression, see go/filtering.
- page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per page, this determines the maximum number of resources in a page.
- project_id (str | None) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
- location (str) – The location of the project. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLTablesUpdateDatasetOperator(*, dataset, location, update_mask=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Updates a dataset.
- See also: For more information on how to use this operator, take a look at the guide: Creating Datasets
- Parameters
- dataset (dict) – The dataset which replaces the resource on the server. If a dict is provided, it must be of the same form as the protobuf message Dataset. 
- update_mask (dict | None) – The update mask applies to the resource. If a dict is provided, it must be of the same form as the protobuf message FieldMask. 
- location (str) – The location of the project. 
- params – Additional domain-specific parameters for the predictions. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
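A minimal sketch of a `dataset` and `update_mask` pair, assuming a Tables dataset whose target column is being changed. The resource name, column spec ID, and the helper `build_dataset_update` are hypothetical; the `update_mask` is assumed to follow the `FieldMask` message, which lists field paths under `paths`.

```python
def build_dataset_update(dataset_name, target_column_spec_id):
    """Return (dataset, update_mask) dicts for a target-column update (illustrative helper)."""
    dataset = {
        "name": dataset_name,
        "tables_dataset_metadata": {"target_column_spec_id": target_column_spec_id},
    }
    # Restrict the update to the single field that is being changed.
    update_mask = {"paths": ["tables_dataset_metadata.target_column_spec_id"]}
    return dataset, update_mask


dataset, update_mask = build_dataset_update(
    "projects/my-project/locations/us-central1/datasets/TBL1234567890",  # hypothetical
    "1234567890",  # hypothetical column spec ID
)

update_kwargs = {
    "task_id": "update_dataset",
    "dataset": dataset,
    "update_mask": update_mask,
    "location": "us-central1",  # hypothetical location
}

# Inside a DAG file, with the Google provider installed, this could be used as:
# from airflow.providers.google.cloud.operators.automl import AutoMLTablesUpdateDatasetOperator
# update_dataset = AutoMLTablesUpdateDatasetOperator(**update_kwargs)
```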
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLGetModelOperator(*, model_id, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Get Google Cloud AutoML model.
- See also: For more information on how to use this operator, take a look at the guide: Operations On Models
- Parameters
- model_id (str) – Name of the model to retrieve.
- project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
- location (str) – The location of the project. 
- params – Additional domain-specific parameters for the predictions. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLDeleteModelOperator(*, model_id, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Delete Google Cloud AutoML model.
- See also: For more information on how to use this operator, take a look at the guide: Operations On Models
- Parameters
- model_id (str) – Name of the model to delete.
- project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
- location (str) – The location of the project. 
- params – Additional domain-specific parameters for the predictions. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLDeployModelOperator(*, model_id, location, project_id=None, image_detection_metadata=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Deploys a model; if a model is already deployed, deploying it with the same parameters has no effect.
- Deploying with different parameters (for example, changing node_number) will reset the deployment state without pausing the model's availability.
- Only applicable for Text Classification, Image Object Detection and Tables; all other domains manage deployment automatically.
- See also: For more information on how to use this operator, take a look at the guide: Operations On Models
- Parameters
- model_id (str) – Name of the model to be deployed.
- image_detection_metadata (dict | None) – Model deployment metadata specific to Image Object Detection. If a dict is provided, it must be of the same form as the protobuf message ImageObjectDetectionModelDeploymentMetadata.
- project_id (str | None) – ID of the Google Cloud project where the model is located. If None, the default project_id is used.
- location (str) – The location of the project. 
- params – Additional domain-specific parameters for the predictions. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (Sequence[tuple[str, str]]) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLTablesListTableSpecsOperator(*, dataset_id, location, page_size=None, filter_=None, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Lists table specs in a dataset.
- See also: For more information on how to use this operator, take a look at the guide: Listing Table And Columns Specs
- Parameters
- dataset_id (str) – Name of the dataset.
- filter_ – Filter expression, see go/filtering.
- page_size (int | None) – The maximum number of resources contained in the underlying API response. If page streaming is performed per resource, this parameter does not affect the return value. If page streaming is performed per page, this determines the maximum number of resources in a page.
- project_id (str | None) – ID of the Google Cloud project. If None, the default project_id is used.
- location (str) – The location of the project. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLListDatasetOperator(*, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Lists AutoML Datasets in project.
- See also: For more information on how to use this operator, take a look at the guide: Listing And Deleting Datasets
- Parameters
- project_id (str | None) – ID of the Google Cloud project where the datasets are located. If None, the default project_id is used.
- location (str) – The location of the project. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
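The `impersonation_chain` rule above can be hard to read in prose. The sketch below, which is only an illustration and not part of the provider, lists which account must grant the Service Account Token Creator role to which identity for a given chain. The function name and the example account names are hypothetical.

```python
from __future__ import annotations

from collections.abc import Sequence


def required_token_creator_grants(
    originating: str, impersonation_chain: str | Sequence[str]
) -> list[tuple[str, str]]:
    """Return (granting_account, receiving_identity) pairs implied by the chain.

    Each account in the chain must grant the Service Account Token Creator
    role to the identity directly preceding it; the first account grants it
    to the originating account.
    """
    if isinstance(impersonation_chain, str):
        chain = [impersonation_chain]
    else:
        chain = list(impersonation_chain)

    grants = []
    previous = originating
    for account in chain:
        grants.append((account, previous))
        previous = account
    return grants


grants = required_token_creator_grants(
    "composer-sa@my-project.iam.gserviceaccount.com",  # hypothetical originating account
    [
        "intermediate-sa@my-project.iam.gserviceaccount.com",  # hypothetical
        "automl-sa@my-project.iam.gserviceaccount.com",  # hypothetical, impersonated last
    ],
)
for granting_account, receiving_identity in grants:
    print(f"{granting_account} grants Token Creator to {receiving_identity}")
```

The last account in the chain is the one whose access token is used for the request.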
 
 
- class airflow.providers.google.cloud.operators.automl.AutoMLDeleteDatasetOperator(*, dataset_id, location, project_id=None, metadata=(), timeout=None, retry=DEFAULT, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)
- Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator
- Deletes a dataset and all of its contents.
- See also: For more information on how to use this operator, take a look at the guide: Listing And Deleting Datasets
- Parameters
- dataset_id (str | list[str]) – ID of the dataset to delete, given as a single ID, a list of IDs, or a comma-separated string of IDs.
- project_id (str | None) – ID of the Google Cloud project where the dataset is located. If None, the default project_id is used.
- location (str) – The location of the project. 
- retry (google.api_core.retry.Retry | google.api_core.gapic_v1.method._MethodDefault) – A retry object used to retry requests. If None is specified, requests will not be retried. 
- timeout (float | None) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt. 
- metadata (MetaData) – Additional metadata that is provided to the method. 
- gcp_conn_id (str) – The connection ID to use to connect to Google Cloud. 
- impersonation_chain (str | Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
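Because `dataset_id` accepts several forms, it can help to normalize the value before building the task. The helper below is a hypothetical sketch of such a normalization, not the operator's own parsing logic; the dataset IDs are placeholders.

```python
from __future__ import annotations


def normalize_dataset_ids(dataset_id: str | list[str]) -> list[str]:
    """Turn a single ID, a comma-separated string, or a list into a list of IDs."""
    if isinstance(dataset_id, str):
        # Split on commas and drop surrounding whitespace and empty entries.
        return [part.strip() for part in dataset_id.split(",") if part.strip()]
    return list(dataset_id)


single = normalize_dataset_ids("TBL111")
several = normalize_dataset_ids("TBL111, TBL222,TBL333")
already_list = normalize_dataset_ids(["TBL111", "TBL222"])

# Inside a DAG file, with the Google provider installed, this could be used as:
# from airflow.providers.google.cloud.operators.automl import AutoMLDeleteDatasetOperator
# delete_datasets = AutoMLDeleteDatasetOperator(
#     task_id="delete_datasets",
#     dataset_id=several,
#     location="us-central1",  # hypothetical location
# )
```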
 
 
