Google Cloud Text to Speech Operators

Prerequisite Tasks

To use these operators, you must do a few things:

GcpTextToSpeechSynthesizeOperator

Synthesizes text into an audio file and stores it in Google Cloud Storage.

For parameter definitions, see airflow.contrib.operators.gcp_text_to_speech_operator.GcpTextToSpeechSynthesizeOperator.

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

airflow/contrib/example_dags/example_gcp_speech.py

GCP_PROJECT_ID = os.environ.get("GCP_PROJECT_ID", "example-project")
BUCKET_NAME = os.environ.get("GCP_SPEECH_TEST_BUCKET", "gcp-speech-test-bucket")
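As a minimal sketch of how the fallback above behaves, the snippet below simulates an environment where only the bucket variable is set. The value "my-team-bucket" is a made-up override used purely for illustration.

```python
import os

# Hypothetical setup: the bucket variable is set, the project variable is not.
os.environ["GCP_SPEECH_TEST_BUCKET"] = "my-team-bucket"
os.environ.pop("GCP_PROJECT_ID", None)

# os.environ.get returns the second argument only when the variable is unset.
GCP_PROJECT_ID = os.environ.get("GCP_PROJECT_ID", "example-project")
BUCKET_NAME = os.environ.get("GCP_SPEECH_TEST_BUCKET", "gcp-speech-test-bucket")

print(GCP_PROJECT_ID)  # example-project
print(BUCKET_NAME)     # my-team-bucket
```

Note that a variable set to an empty string counts as set, so the default is not used in that case.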

The input_data, voice and audio_config arguments must be dicts or objects of the corresponding classes from the google.cloud.texttospeech_v1.types module.

For more information, see: https://googleapis.github.io/google-cloud-python/latest/texttospeech/gapic/v1/api.html#google.cloud.texttospeech_v1.TextToSpeechClient.synthesize_speech

airflow/contrib/example_dags/example_gcp_speech.py

INPUT = {"text": "Sample text for demo purposes"}
VOICE = {"language_code": "en-US", "ssml_gender": "FEMALE"}
AUDIO_CONFIG = {"audio_encoding": "LINEAR16"}
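When the arguments are passed as plain dicts, a typo in a key or enum name only surfaces once the task runs. The following is a rough pre-flight check, not part of Airflow or the Google client: check_synthesize_args is a hypothetical helper, and the enum name sets are a hand-written subset of the Text-to-Speech v1 enums that may be incomplete for newer API versions.

```python
# Hand-written subsets of the Text-to-Speech v1 enum names (assumption: may be incomplete).
KNOWN_GENDERS = {"SSML_VOICE_GENDER_UNSPECIFIED", "MALE", "FEMALE", "NEUTRAL"}
KNOWN_ENCODINGS = {"AUDIO_ENCODING_UNSPECIFIED", "LINEAR16", "MP3", "OGG_OPUS"}


def check_synthesize_args(input_data, voice, audio_config):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    # The synthesis input carries either plain text or SSML.
    if not ({"text", "ssml"} & set(input_data)):
        problems.append("input_data needs a 'text' or 'ssml' key")
    if "language_code" not in voice:
        problems.append("voice needs a 'language_code' key")
    gender = voice.get("ssml_gender")
    if gender is not None and gender not in KNOWN_GENDERS:
        problems.append("unknown ssml_gender: {!r}".format(gender))
    encoding = audio_config.get("audio_encoding")
    if encoding not in KNOWN_ENCODINGS:
        problems.append("unknown audio_encoding: {!r}".format(encoding))
    return problems


INPUT = {"text": "Sample text for demo purposes"}
VOICE = {"language_code": "en-US", "ssml_gender": "FEMALE"}
AUDIO_CONFIG = {"audio_encoding": "LINEAR16"}

print(check_synthesize_args(INPUT, VOICE, AUDIO_CONFIG))  # []
```

Running a check like this at DAG-parse time is a design choice: it trades a little parsing overhead for earlier feedback than a failed task would give.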

target_filename is a simple string argument:

airflow/contrib/example_dags/example_gcp_speech.py

FILENAME = "gcp-speech-test-file"

Using the operator

airflow/contrib/example_dags/example_gcp_speech.py

text_to_speech_synthesize_task = GcpTextToSpeechSynthesizeOperator(
    project_id=GCP_PROJECT_ID,
    input_data=INPUT,
    voice=VOICE,
    audio_config=AUDIO_CONFIG,
    target_bucket_name=BUCKET_NAME,
    target_filename=FILENAME,
    task_id="text_to_speech_synthesize_task",
)

Templating

template_fields = (
    "input_data",
    "voice",
    "audio_config",
    "project_id",
    "gcp_conn_id",
    "target_bucket_name",
    "target_filename",
)
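Only the fields listed in template_fields are rendered with Jinja; a template expression placed in any other argument, such as task_id, is passed through as a literal string. The sketch below flags that mistake using plain Python, without importing Airflow. Both helper functions are hypothetical, and the assumption that nested dict values are rendered as well should be checked against your Airflow version.

```python
# Copied from the operator's template_fields shown above.
TEMPLATE_FIELDS = (
    "input_data",
    "voice",
    "audio_config",
    "project_id",
    "gcp_conn_id",
    "target_bucket_name",
    "target_filename",
)


def contains_jinja(value):
    """Return True if a string, or any value nested in a dict/list/tuple, holds '{{'."""
    if isinstance(value, str):
        return "{{" in value
    if isinstance(value, dict):
        return any(contains_jinja(v) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(contains_jinja(v) for v in value)
    return False


def jinja_outside_template_fields(kwargs, template_fields=TEMPLATE_FIELDS):
    """Names of arguments that contain Jinja markers but would not be rendered."""
    return sorted(
        name
        for name, value in kwargs.items()
        if contains_jinja(value) and name not in template_fields
    )


# Illustrative operator arguments; {{ ds }} and {{ ds_nodash }} are Airflow date macros.
operator_kwargs = {
    "task_id": "synthesize_{{ ds_nodash }}",          # not a template field
    "input_data": {"text": "Report for {{ ds }}"},    # template field (nested dict)
    "target_filename": "speech-{{ ds_nodash }}.wav",  # template field
    "target_bucket_name": "gcp-speech-test-bucket",
}

print(jinja_outside_template_fields(operator_kwargs))  # ['task_id']
```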

Google Cloud Speech to Text Operators

GcpSpeechToTextRecognizeSpeechOperator

Recognizes speech in audio input and returns text.

For parameter definitions, see airflow.contrib.operators.gcp_speech_to_text_operator.GcpSpeechToTextRecognizeSpeechOperator.

Arguments

The config and audio arguments must be dicts or objects of the corresponding classes from the google.cloud.speech_v1.types module.

For more information, see: https://googleapis.github.io/google-cloud-python/latest/speech/gapic/v1/api.html#google.cloud.speech_v1.SpeechClient.recognize

airflow/contrib/example_dags/example_gcp_speech.py

CONFIG = {"encoding": "LINEAR16", "language_code": "en-US"}
AUDIO = {"uri": "gs://{bucket}/{object}".format(bucket=BUCKET_NAME, object=FILENAME)}
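The audio URI above is built from the same bucket and file name that the synthesize task writes to, so the two tasks can share one definition. A small sketch of that construction follows; gcs_uri is a hypothetical helper, not an Airflow or Google client function.

```python
def gcs_uri(bucket, object_name):
    """Build a gs:// URI from a bucket name and an object name."""
    if not bucket or not object_name:
        raise ValueError("both bucket and object_name are required")
    # Strip a leading slash so the URI does not contain an empty path segment.
    return "gs://{}/{}".format(bucket, object_name.lstrip("/"))


BUCKET_NAME = "gcp-speech-test-bucket"
FILENAME = "gcp-speech-test-file"

AUDIO = {"uri": gcs_uri(BUCKET_NAME, FILENAME)}
print(AUDIO["uri"])  # gs://gcp-speech-test-bucket/gcp-speech-test-file
```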

Using the operator

airflow/contrib/example_dags/example_gcp_speech.py

speech_to_text_recognize_task = GcpSpeechToTextRecognizeSpeechOperator(
    project_id=GCP_PROJECT_ID, config=CONFIG, audio=AUDIO, task_id="speech_to_text_recognize_task"
)
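To work with the recognized text downstream, the result has to be unpacked. The sketch below assumes the result is available as a dict shaped like the Speech-to-Text API's JSON response (results, each with ranked alternatives); how the operator exposes its result is not stated on this page, so treat that as an assumption. The transcripts helper and the sample response are hand-written for illustration.

```python
def transcripts(response):
    """Return the top-ranked transcript of each result in a recognize response dict."""
    lines = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # The API ranks alternatives, with the most probable one first.
            lines.append(alternatives[0].get("transcript", ""))
    return lines


# Hand-written sample in the API's response shape; not real recognizer output.
sample_response = {
    "results": [
        {
            "alternatives": [
                {"transcript": "Sample text for demo purposes", "confidence": 0.98}
            ]
        }
    ]
}

print(" ".join(transcripts(sample_response)))  # Sample text for demo purposes
```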

Templating

template_fields = ("audio", "config", "project_id", "gcp_conn_id", "timeout")

Reference

For further information, look at:
