airflow.providers.google.cloud.operators.gen_ai¶
This module contains Google Gen AI operators.
Classes¶
| Uses the Gemini AI Embeddings API to generate embeddings for words, phrases, sentences, and code. | |
| Generate a model response based on given configuration. Input capabilities differ between models, including tuned models. | |
| Create a tuning job to adapt model behavior with a labeled dataset. | |
| Use Count Tokens API to calculate the number of input tokens before sending a request to Gemini API. | |
| Create CachedContent resource to reduce the cost of requests that contain repeat content with high input token counts. | 
Module Contents¶
- class airflow.providers.google.cloud.operators.gen_ai.GenAIGenerateEmbeddingsOperator(*, project_id, location, model, contents, config=None, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]¶
- Bases: - airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator- Uses the Gemini AI Embeddings API to generate embeddings for words, phrases, sentences, and code. - Parameters:
- project_id (str) – Required. The ID of the Google Cloud project that the service belongs to (templated). 
- location (str) – Required. The ID of the Google Cloud location that the service belongs to (templated). 
- model (str) – Required. The name of the model to use for content generation, which can be a text-only or multimodal model. For example, gemini-pro or gemini-pro-vision. 
- contents (google.genai.types.ContentListUnion | google.genai.types.ContentListUnionDict | list[str]) – Optional. The contents to use for embedding. 
- config (google.genai.types.EmbedContentConfigOrDict | None) – Optional. Configuration for embeddings. 
- gcp_conn_id (str) – Optional. The connection ID to use connecting to Google Cloud. 
- impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional. Service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.gen_ai.GenAIGenerateContentOperator(*, project_id, location, contents, model, generation_config=None, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]¶
- Bases: - airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator- Generate a model response based on given configuration. Input capabilities differ between models, including tuned models. - Parameters:
- project_id (str) – Required. The ID of the Google Cloud project that the service belongs to (templated). 
- location (str) – Required. The ID of the Google Cloud location that the service belongs to (templated). 
- model (str) – Required. The name of the model to use for content generation, which can be a text-only or multimodal model. For example, gemini-pro or gemini-pro-vision. 
- contents (google.genai.types.ContentListUnionDict) – Required. The multi-part content of a message that a user or a program gives to the generative model, in order to elicit a specific response. 
- generation_config (google.genai.types.GenerateContentConfig | dict[str, Any] | None) – Optional. Generation configuration settings. 
- gcp_conn_id (str) – The connection ID to use connecting to Google Cloud. 
- impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.gen_ai.GenAISupervisedFineTuningTrainOperator(*, project_id, location, source_model, training_dataset, tuning_job_config=None, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]¶
- Bases: - airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator- Create a tuning job to adapt model behavior with a labeled dataset. - Parameters:
- project_id (str) – Required. The ID of the Google Cloud project that the service belongs to. 
- location (str) – Required. The ID of the Google Cloud location that the service belongs to. 
- source_model (str) – Required. A pre-trained model optimized for performing natural language tasks such as classification, summarization, extraction, content creation, and ideation. 
- training_dataset (google.genai.types.TuningDatasetOrDict) – Required. Cloud Storage URI of your training dataset. The dataset must be formatted as a JSONL file. For best results, provide at least 100 to 500 examples. 
- tuning_job_config (google.genai.types.CreateTuningJobConfigOrDict | dict[str, Any] | None) – Optional. Configuration of the Tuning job to be created. 
- gcp_conn_id (str) – The connection ID to use connecting to Google Cloud. 
- impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.gen_ai.GenAICountTokensOperator(*, project_id, location, contents, model, config=None, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]¶
- Bases: - airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator- Use Count Tokens API to calculate the number of input tokens before sending a request to Gemini API. - Parameters:
- project_id (str) – Required. The ID of the Google Cloud project that the service belongs to (templated). 
- location (str) – Required. The ID of the Google Cloud location that the service belongs to (templated). 
- contents (google.genai.types.ContentListUnion | google.genai.types.ContentListUnionDict) – Required. The multi-part content of a message that a user or a program gives to the generative model, in order to elicit a specific response. 
- model (str) – Required. Model, supporting prompts with text-only input, including natural language tasks, multi-turn text and code chat, and code generation. It can output text and code. 
- config (google.genai.types.CountTokensConfigOrDict | None) – Optional. Configuration for Count Tokens. 
- gcp_conn_id (str) – The connection ID to use connecting to Google Cloud. 
- impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated). 
 
 
- class airflow.providers.google.cloud.operators.gen_ai.GenAICreateCachedContentOperator(*, project_id, location, model, cached_content_config=None, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]¶
- Bases: - airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator- Create CachedContent resource to reduce the cost of requests that contain repeat content with high input token counts. - Parameters:
- project_id (str) – Required. The ID of the Google Cloud project that the service belongs to. 
- location (str) – Required. The ID of the Google Cloud location that the service belongs to. 
- model (str) – Required. The name of the publisher model to use for cached content. 
- cached_content_config (google.genai.types.CreateCachedContentConfigOrDict | None) – Optional. Configuration of the Cached Content. 
- gcp_conn_id (str) – The connection ID to use connecting to Google Cloud. 
- impersonation_chain (str | collections.abc.Sequence[str] | None) – Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).