Google Cloud Natural Language Operators

The Google Cloud Natural Language API reveals the structure and meaning of text via powerful machine learning models. You can use it to extract information about people, places, events, and much more mentioned in text documents, news articles, or blog posts. You can use it to understand sentiment about your product on social media or to parse intent from customer conversations in a call center or a messaging app.

Prerequisite Tasks

To use these operators, you must do a few things:

* Select or create a Cloud Platform project.
* Enable billing for your project.
* Enable the Cloud Natural Language API.
* Install the API client libraries, for example via pip install 'apache-airflow[gcp]'.

Documents

Each operator uses a Document to represent the text to analyze.

Here is an example of a document with the text provided as a string:

airflow/contrib/example_dags/example_gcp_natural_language.py

TEXT = """
Airflow is a platform to programmatically author, schedule and monitor workflows.

Use Airflow to author workflows as Directed Acyclic Graphs (DAGs) of tasks. The Airflow scheduler executes
 your tasks on an array of workers while following the specified dependencies. Rich command line utilities
 make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize
 pipelines running in production, monitor progress, and troubleshoot issues when needed.
"""
# Document is the google-cloud-language client library's document type
# (imported in the full example file linked above).
document = Document(content=TEXT, type="PLAIN_TEXT")

In addition to supplying a string, a document can refer to content stored in Google Cloud Storage.

airflow/contrib/example_dags/example_gcp_natural_language.py

GCS_CONTENT_URI = "gs://my-text-bucket/sentiment-me.txt"
document_gcs = Document(gcs_content_uri=GCS_CONTENT_URI, type="PLAIN_TEXT")

Analyzing Entities

Entity Analysis inspects the given text for known entities (proper nouns such as public figures, landmarks, etc.), and returns information about those entities. Entity analysis is performed with the CloudLanguageAnalyzeEntitiesOperator operator.

airflow/contrib/example_dags/example_gcp_natural_language.py

analyze_entities = CloudLanguageAnalyzeEntitiesOperator(document=document, task_id="analyze_entities")
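
The same operator also accepts the GCS-backed document defined above. A brief sketch (the task id here is illustrative, not part of the example file):

# Analyze entities in the text stored in Google Cloud Storage.
analyze_entities_gcs = CloudLanguageAnalyzeEntitiesOperator(
    document=document_gcs, task_id="analyze_entities_gcs"
)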

You can use Jinja templating with the document and gcp_conn_id parameters, which allows you to determine values dynamically. The result is saved to XCom, so it can be used by other operators.

airflow/contrib/example_dags/example_gcp_natural_language.py

analyze_entities_result = BashOperator(
    bash_command="echo \"{{ task_instance.xcom_pull('analyze_entities') }}\"",
    task_id="analyze_entities_result",
)
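
For the xcom_pull above to return a value, the analysis task has to run before the task that reads its result. Below is a minimal sketch of how the two tasks might be wired together in a DAG; the DAG id, start date, and import paths are assumptions based on the contrib-era modules this page refers to, not an excerpt from the example file.

from airflow import models
from airflow.contrib.operators.gcp_natural_language_operator import CloudLanguageAnalyzeEntitiesOperator
from airflow.operators.bash_operator import BashOperator
from airflow.utils.dates import days_ago

with models.DAG(
    "example_gcp_natural_language",  # illustrative DAG id
    schedule_interval=None,
    start_date=days_ago(1),
) as dag:
    analyze_entities = CloudLanguageAnalyzeEntitiesOperator(document=document, task_id="analyze_entities")
    analyze_entities_result = BashOperator(
        bash_command="echo \"{{ task_instance.xcom_pull('analyze_entities') }}\"",
        task_id="analyze_entities_result",
    )
    # The result task reads the analysis output from XCom, so it must run downstream.
    analyze_entities >> analyze_entities_result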

Analyzing Entity Sentiment

Entity Sentiment Analysis combines both entity analysis and sentiment analysis and attempts to determine the sentiment (positive or negative) expressed about entities within the text. Entity sentiment analysis is performed with the CloudLanguageAnalyzeEntitySentimentOperator operator.

airflow/contrib/example_dags/example_gcp_natural_language.py

analyze_entity_sentiment = CloudLanguageAnalyzeEntitySentimentOperator(
    document=document, task_id="analyze_entity_sentiment"
)

You can use Jinja templating with the document and gcp_conn_id parameters, which allows you to determine values dynamically. The result is saved to XCom, so it can be used by other operators.

airflow/contrib/example_dags/example_gcp_natural_language.py

analyze_entity_sentiment_result = BashOperator(
    bash_command="echo \"{{ task_instance.xcom_pull('analyze_entity_sentiment') }}\"",
    task_id="analyze_entity_sentiment_result",
)

Analyzing Sentiment

Sentiment Analysis inspects the given text and identifies the prevailing emotional opinion within the text, especially to determine a writer’s attitude as positive, negative, or neutral. Sentiment analysis is performed through the CloudLanguageAnalyzeSentimentOperator operator.

airflow/contrib/example_dags/example_gcp_natural_language.py

analyze_sentiment = CloudLanguageAnalyzeSentimentOperator(document=document, task_id="analyze_sentiment")

You can use Jinja templating with the document and gcp_conn_id parameters, which allows you to determine values dynamically. The result is saved to XCom, so it can be used by other operators.

airflow/contrib/example_dags/example_gcp_natural_language.py

analyze_sentiment_result = BashOperator(
    bash_command="echo \"{{ task_instance.xcom_pull('analyze_sentiment') }}\"",
    task_id="analyze_sentiment_result",
)
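
Because the result lives in XCom, it can also be consumed by a Python task instead of a BashOperator. The sketch below is an illustration under stated assumptions: the callable and task names are made up, and the documentSentiment/score keys assume the pushed value mirrors the Cloud Natural Language AnalyzeSentiment response converted to a dict.

from airflow.operators.python_operator import PythonOperator

def print_sentiment_score(**context):
    # Pull the dict pushed to XCom by the analyze_sentiment task.
    result = context["task_instance"].xcom_pull("analyze_sentiment")
    # Assumption: the overall sentiment is exposed as documentSentiment.score
    # (ranging from -1.0 for negative to 1.0 for positive).
    score = result.get("documentSentiment", {}).get("score")
    print("Overall sentiment score: {}".format(score))

print_sentiment_score_task = PythonOperator(
    task_id="print_sentiment_score",
    python_callable=print_sentiment_score,
    provide_context=True,  # needed on Airflow 1.10 to receive the context kwargs
)
analyze_sentiment >> print_sentiment_score_task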

Classifying Content

Content Classification analyzes a document and returns a list of content categories that apply to the text found in the document. To classify the content in a document, use the CloudLanguageClassifyTextOperator operator.

airflow/contrib/example_dags/example_gcp_natural_language.py

analyze_classify_text = CloudLanguageClassifyTextOperator(
    document=document, task_id="analyze_classify_text"
)

You can use Jinja templating with the document and gcp_conn_id parameters, which allows you to determine values dynamically. The result is saved to XCom, so it can be used by other operators.

airflow/contrib/example_dags/example_gcp_natural_language.py

analyze_classify_text_result = BashOperator(
    bash_command="echo \"{{ task_instance.xcom_pull('analyze_classify_text') }}\"",
    task_id="analyze_classify_text_result",
)
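
Each analysis task is simply wired upstream of the task that echoes its result. Assuming all of the tasks above are defined in the same DAG, the dependencies might be declared like this:

# Every *_result task pulls the corresponding output from XCom,
# so the analysis task must complete first.
analyze_entities >> analyze_entities_result
analyze_entity_sentiment >> analyze_entity_sentiment_result
analyze_sentiment >> analyze_sentiment_result
analyze_classify_text >> analyze_classify_text_result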
