airflow.providers.google.cloud.operators.natural_language

This module contains Google Cloud Language operators.

Module Contents

Classes

CloudNaturalLanguageAnalyzeEntitiesOperator

Finds named entities in the text along with entity types, salience, mentions for each entity, and other properties.

CloudNaturalLanguageAnalyzeEntitySentimentOperator

Finds entities in the text, similar to AnalyzeEntities, and analyzes the sentiment associated with each entity and its mentions.

CloudNaturalLanguageAnalyzeSentimentOperator

Analyzes the sentiment of the provided text.

CloudNaturalLanguageClassifyTextOperator

Classifies a document into categories.

Attributes

MetaData

airflow.providers.google.cloud.operators.natural_language.MetaData[source]
class airflow.providers.google.cloud.operators.natural_language.CloudNaturalLanguageAnalyzeEntitiesOperator(*, document, encoding_type=None, retry=None, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

Finds named entities in the text along with entity types, salience, mentions for each entity, and other properties.

See also

For more information on how to use this operator, take a look at the guide: Analyzing Entities

Parameters
  • document (Union[dict, google.cloud.language_v1.types.Document]) – Input document. If a dict is provided, it must be of the same form as the protobuf message Document

  • encoding_type (Optional[google.cloud.language_v1.enums.EncodingType]) – The encoding type used by the API to calculate offsets.

  • retry (Optional[google.api_core.retry.Retry]) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (Optional[float]) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use when connecting to Google Cloud.

  • impersonation_chain (Optional[Union[str, Sequence[str]]]) – Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).

template_fields :Sequence[str] = ['document', 'gcp_conn_id', 'impersonation_chain'][source]
execute(self, context)[source]

This is the main method to override when creating an operator. The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
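
The parameters above can be put together as in the following minimal sketch. The document text and the task_id are illustrative, and the dict uses the field name "type" as an assumption: depending on the installed google-cloud-language version, the field may instead be named "type_". The operator import is placed inside the factory function so the snippet can be read and parsed without Airflow installed.

```python
# A document in dict form, mirroring the Document protobuf message.
# Assumption: the document-type field is named "type"; newer client
# releases may expect "type_" instead.
DOCUMENT = {
    "content": "Airflow schedules workflows on Google Cloud.",  # illustrative text
    "type": "PLAIN_TEXT",
}


def make_analyze_entities_task():
    """Build the operator; Airflow is imported lazily, only when called."""
    from airflow.providers.google.cloud.operators.natural_language import (
        CloudNaturalLanguageAnalyzeEntitiesOperator,
    )

    return CloudNaturalLanguageAnalyzeEntitiesOperator(
        task_id="analyze_entities",  # hypothetical task name
        document=DOCUMENT,
        gcp_conn_id="google_cloud_default",  # the documented default
    )
```

Calling make_analyze_entities_task() inside a DAG definition would register the task; encoding_type, retry and timeout are left at their documented defaults here.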

class airflow.providers.google.cloud.operators.natural_language.CloudNaturalLanguageAnalyzeEntitySentimentOperator(*, document, encoding_type=None, retry=None, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

Finds entities in the text, similar to AnalyzeEntities, and analyzes the sentiment associated with each entity and its mentions.

See also

For more information on how to use this operator, take a look at the guide: Analyzing Entity Sentiment

Parameters
  • document (Union[dict, google.cloud.language_v1.types.Document]) – Input document. If a dict is provided, it must be of the same form as the protobuf message Document

  • encoding_type (Optional[google.cloud.language_v1.enums.EncodingType]) – The encoding type used by the API to calculate offsets.

  • retry (Optional[google.api_core.retry.Retry]) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (Optional[float]) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use when connecting to Google Cloud.

  • impersonation_chain (Optional[Union[str, Sequence[str]]]) – Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).

Return type

google.cloud.language_v1.types.AnalyzeEntitySentimentResponse

template_fields :Sequence[str] = ['document', 'gcp_conn_id', 'impersonation_chain'][source]
execute(self, context)[source]

This is the main method to override when creating an operator. The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
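
The two forms of impersonation_chain described in the parameter list can be sketched as follows. All account names are hypothetical, and the two helper functions are not part of the provider: they only restate, in code, which account ends up impersonated and which account must grant the Service Account Token Creator role to which identity.

```python
# Hypothetical identities, used only for illustration.
ORIGINATING = "airflow-worker@example-project.iam.gserviceaccount.com"
FIRST = "first-sa@example-project.iam.gserviceaccount.com"
MIDDLE = "middle-sa@example-project.iam.gserviceaccount.com"
TARGET = "target-sa@example-project.iam.gserviceaccount.com"

# String form: a single account, impersonated directly.
DIRECT_CHAIN = TARGET

# Sequence form: a chain of accounts; the last one is impersonated.
DELEGATED_CHAIN = [FIRST, MIDDLE, TARGET]


def impersonated_account(chain):
    """Return the account that is impersonated in the request."""
    return chain if isinstance(chain, str) else list(chain)[-1]


def required_grants(originating, chain):
    """List (granting account, identity receiving the role) pairs.

    Each account grants Service Account Token Creator to the identity
    directly preceding it; the first grants it to the originating account.
    """
    accounts = [chain] if isinstance(chain, str) else list(chain)
    callers = [originating] + accounts[:-1]
    return list(zip(accounts, callers))
```

Either DIRECT_CHAIN or DELEGATED_CHAIN could then be passed as impersonation_chain when constructing the operator.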

class airflow.providers.google.cloud.operators.natural_language.CloudNaturalLanguageAnalyzeSentimentOperator(*, document, encoding_type=None, retry=None, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

Analyzes the sentiment of the provided text.

See also

For more information on how to use this operator, take a look at the guide: Analyzing Sentiment

Parameters
  • document (Union[dict, google.cloud.language_v1.types.Document]) – Input document. If a dict is provided, it must be of the same form as the protobuf message Document

  • encoding_type (Optional[google.cloud.language_v1.enums.EncodingType]) – The encoding type used by the API to calculate offsets.

  • retry (Optional[google.api_core.retry.Retry]) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (Optional[float]) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use when connecting to Google Cloud.

  • impersonation_chain (Optional[Union[str, Sequence[str]]]) – Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).

Return type

google.cloud.language_v1.types.AnalyzeSentimentResponse

template_fields :Sequence[str] = ['document', 'gcp_conn_id', 'impersonation_chain'][source]
execute(self, context)[source]

This is the main method to override when creating an operator. The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
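
As a sketch of how retry and timeout interact, the snippet below passes both to the operator. The backoff values, the 5-second timeout, the task_id and the document text are illustrative assumptions, as is the "type" field name (newer client versions may use "type_"). Because a retry object is given, the timeout applies to each attempt rather than to the request as a whole.

```python
# Per-attempt timeout in seconds (illustrative value).
PER_ATTEMPT_TIMEOUT_SECONDS = 5.0

# Illustrative document; field name "type" is an assumption (see lead-in).
SENTIMENT_DOCUMENT = {
    "content": "The new release is a pleasure to work with.",
    "type": "PLAIN_TEXT",
}


def make_analyze_sentiment_task():
    """Build the operator with an explicit retry policy (lazy imports)."""
    from google.api_core.retry import Retry

    from airflow.providers.google.cloud.operators.natural_language import (
        CloudNaturalLanguageAnalyzeSentimentOperator,
    )

    return CloudNaturalLanguageAnalyzeSentimentOperator(
        task_id="analyze_sentiment",  # hypothetical task name
        document=SENTIMENT_DOCUMENT,
        # Exponential backoff: wait 1s, then 2s, 4s, ... capped at 10s.
        retry=Retry(initial=1.0, maximum=10.0, multiplier=2.0),
        # With retry set, this bounds each individual attempt.
        timeout=PER_ATTEMPT_TIMEOUT_SECONDS,
    )
```

Leaving retry as None would mean the request is attempted once and not retried.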

class airflow.providers.google.cloud.operators.natural_language.CloudNaturalLanguageClassifyTextOperator(*, document, retry=None, timeout=None, metadata=(), gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.models.BaseOperator

Classifies a document into categories.

See also

For more information on how to use this operator, take a look at the guide: Classifying Content

Parameters
  • document (Union[dict, google.cloud.language_v1.types.Document]) – Input document. If a dict is provided, it must be of the same form as the protobuf message Document

  • retry (Optional[google.api_core.retry.Retry]) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (Optional[float]) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (MetaData) – Additional metadata that is provided to the method.

  • gcp_conn_id (str) – The connection ID to use when connecting to Google Cloud.

  • impersonation_chain (Optional[Union[str, Sequence[str]]]) – Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).

template_fields :Sequence[str] = ['document', 'gcp_conn_id', 'impersonation_chain'][source]
execute(self, context)[source]

This is the main method to override when creating an operator. The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
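
Since document is listed in template_fields, its values can carry Jinja expressions. The sketch below assumes that Airflow renders templates nested inside dict values of a templated field, and supplies the text through the operator's params argument. The parameter name text_to_classify, the sample text and the task_id are hypothetical, and the "type" field name may be "type_" in newer client versions.

```python
# The content is a Jinja expression, rendered by Airflow before execute() runs.
TEMPLATED_DOCUMENT = {
    "content": "{{ params.text_to_classify }}",  # hypothetical param name
    "type": "PLAIN_TEXT",
}


def make_classify_text_task():
    """Build the classification operator (Airflow imported lazily)."""
    from airflow.providers.google.cloud.operators.natural_language import (
        CloudNaturalLanguageClassifyTextOperator,
    )

    return CloudNaturalLanguageClassifyTextOperator(
        task_id="classify_text",  # hypothetical task name
        document=TEMPLATED_DOCUMENT,
        # Value substituted into the template at render time.
        params={
            "text_to_classify": (
                "Workflow schedulers run data pipelines as directed "
                "acyclic graphs of tasks with explicit dependencies."
            )
        },
    )
```

Note that this operator, unlike the three analysis operators, takes no encoding_type argument.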
