Google Cloud Vision Operators

Prerequisite Tasks

To use these operators, you must do a few things:

CloudVisionAddProductToProductSetOperator

Adds a Product to the specified ProductSet.

For parameter definition, take a look at CloudVisionAddProductToProductSetOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

airflow/contrib/example_dags/example_gcp_vision.py

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_ID = os.environ.get('GCP_VISION_PRODUCT_ID', 'product_explicit_id')
GCP_VISION_PRODUCT_SET_ID = os.environ.get('GCP_VISION_PRODUCT_SET_ID', 'product_set_explicit_id')

Using the operator

We are using the Product, ProductSet and Retry objects from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry
from google.cloud.vision_v1.types import Product, ProductSet

If product_set_id and product_id were generated by the API, they can be extracted from XCOM:

add_product_to_product_set = CloudVisionAddProductToProductSetOperator(
    location=GCP_VISION_LOCATION,
    product_set_id="{{ task_instance.xcom_pull('product_set_create') }}",
    product_id="{{ task_instance.xcom_pull('product_create') }}",
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='add_product_to_product_set',
)

Otherwise it can be specified explicitly:

add_product_to_product_set_2 = CloudVisionAddProductToProductSetOperator(
    location=GCP_VISION_LOCATION,
    product_set_id=GCP_VISION_PRODUCT_SET_ID,
    product_id=GCP_VISION_PRODUCT_ID,
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='add_product_to_product_set_2',
)
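
Both examples above pass retry=Retry(maximum=10.0). As a rough illustration of what that cap means, the sketch below prints an exponential backoff schedule whose delays never exceed the maximum. The initial delay of 1.0 and the multiplier of 2.0 are assumptions made for this illustration, and the real Retry object also randomizes each delay, so treat this as a mental model rather than the library's implementation. The helper name backoff_delays is hypothetical.

```python
from itertools import islice


def backoff_delays(initial=1.0, multiplier=2.0, maximum=10.0):
    """Yield un-jittered retry delays in seconds, capped at ``maximum``.

    ``initial`` and ``multiplier`` are assumed values for illustration only.
    """
    delay = initial
    while True:
        # Each delay grows geometrically but is never allowed above the cap.
        yield min(delay, maximum)
        delay *= multiplier


# First six delays with the cap used in the examples above.
first_delays = list(islice(backoff_delays(maximum=10.0), 6))
print(first_delays)
```

Under these assumptions the delays grow until they reach the cap and then stay there, so a lower maximum means more frequent retries within the same overall deadline.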

Templating

template_fields = ("location", "product_set_id", "product_id", "project_id", "gcp_conn_id")

CloudVisionAnnotateImageOperator

Run image detection and annotation for an image.

For parameter definition, take a look at CloudVisionAnnotateImageOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_ANNOTATE_IMAGE_URL = os.environ.get('GCP_VISION_ANNOTATE_IMAGE_URL', 'gs://bucket/image2.jpg')

Using the operator

We are using the enums and Retry objects from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry
from google.cloud.vision import enums

annotate_image = CloudVisionAnnotateImageOperator(
    request=annotate_image_request, retry=Retry(maximum=10.0), timeout=5, task_id='annotate_image'
)

The result can be extracted from XCOM:

annotate_image_result = BashOperator(
    bash_command="echo {{ task_instance.xcom_pull('annotate_image')"
    "['logoAnnotations'][0]['description'] }}",
    task_id='annotate_image_result',
)
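
The template above indexes ['logoAnnotations'][0]['description'] directly, which fails to render when the response contains no logo annotations. The sketch below shows the same lookup in plain Python with a guard for the empty case. The sample_response dictionary and the first_logo_description helper are hypothetical and only illustrate the shape the template expects; a real response comes from the Vision API.

```python
# Hypothetical response used only to illustrate the lookup.
sample_response = {
    "logoAnnotations": [
        {"description": "Example Logo", "score": 0.97},
    ]
}


def first_logo_description(response):
    """Return the description of the first logo annotation, or None."""
    annotations = response.get("logoAnnotations") or []
    if not annotations:
        # No logo was detected, so there is nothing to index into.
        return None
    return annotations[0].get("description")


print(first_logo_description(sample_response))
print(first_logo_description({}))
```

A guard like this is easier to express in a downstream PythonOperator callable than inside a Jinja template.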

Templating

template_fields = ('request', 'gcp_conn_id')

CloudVisionProductCreateOperator

Creates and returns a new product resource.

Possible errors regarding the Product object provided:

  • Returns INVALID_ARGUMENT if display_name is missing or longer than 4096 characters.

  • Returns INVALID_ARGUMENT if description is longer than 4096 characters.

  • Returns INVALID_ARGUMENT if product_category is missing or invalid.
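
The rules listed above can be checked locally before the task runs, so that an obviously invalid Product fails fast instead of after an API round trip. The following is a minimal sketch: ProductDraft is a hypothetical stand-in for the fields of the real Product object, and validate_product is not part of Airflow or the Google client library. It does not check whether product_category is a value the API accepts, because that list is not given here.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_TEXT_LENGTH = 4096  # limit quoted in the error list above


@dataclass
class ProductDraft:
    """Hypothetical stand-in for the fields of a Vision Product."""

    display_name: Optional[str] = None
    description: Optional[str] = None
    product_category: Optional[str] = None


def validate_product(draft: ProductDraft) -> List[str]:
    """Return the problems the API would be expected to reject."""
    problems = []
    if not draft.display_name:
        problems.append("display_name is missing")
    elif len(draft.display_name) > MAX_TEXT_LENGTH:
        problems.append("display_name is longer than 4096 characters")
    if draft.description and len(draft.description) > MAX_TEXT_LENGTH:
        problems.append("description is longer than 4096 characters")
    if not draft.product_category:
        problems.append("product_category is missing")
    return problems


print(validate_product(ProductDraft(display_name="My Product 1", product_category="toys")))
print(validate_product(ProductDraft(description="x" * 5000)))
```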

For parameter definition, take a look at CloudVisionProductCreateOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_ID = os.environ.get('GCP_VISION_PRODUCT_ID', 'product_explicit_id')

Using the operator

We are using the Product and Retry objects from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry
from google.cloud.vision_v1.types import Product

product = Product(display_name='My Product 1', product_category='toys')

The product_id argument can be omitted (it will be generated by the API):

product_create = CloudVisionProductCreateOperator(
    location=GCP_VISION_LOCATION,
    product=product,
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='product_create',
)

Or it can be specified explicitly:

product_create_2 = CloudVisionProductCreateOperator(
    product_id=GCP_VISION_PRODUCT_ID,
    location=GCP_VISION_LOCATION,
    product=product,
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='product_create_2',
)

Templating

template_fields = ('location', 'project_id', 'product_id', 'gcp_conn_id')

CloudVisionProductDeleteOperator

Permanently deletes a product and its reference images.

Metadata of the product and all its images will be deleted right away, but search queries against ProductSets containing the product may still work until all related caches are refreshed.

Possible errors:

  • Returns NOT_FOUND if the product does not exist.

For parameter definition, take a look at CloudVisionProductDeleteOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_ID = os.environ.get('GCP_VISION_PRODUCT_ID', 'product_explicit_id')

Using the operator

If product_id was generated by the API it can be extracted from XCOM:

product_delete = CloudVisionProductDeleteOperator(
    location=GCP_VISION_LOCATION,
    product_id="{{ task_instance.xcom_pull('product_create') }}",
    task_id='product_delete',
)

Otherwise it can be specified explicitly:

product_delete_2 = CloudVisionProductDeleteOperator(
    location=GCP_VISION_LOCATION, product_id=GCP_VISION_PRODUCT_ID, task_id='product_delete_2'
)

Templating

template_fields = ('location', 'project_id', 'product_id', 'gcp_conn_id')

CloudVisionProductGetOperator

Gets information associated with a Product.

Possible errors:

  • Returns NOT_FOUND if the Product does not exist.

For parameter definition, take a look at CloudVisionProductGetOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_ID = os.environ.get('GCP_VISION_PRODUCT_ID', 'product_explicit_id')

Using the operator

If product_id was generated by the API it can be extracted from XCOM:

product_get = CloudVisionProductGetOperator(
    location=GCP_VISION_LOCATION,
    product_id="{{ task_instance.xcom_pull('product_create') }}",
    task_id='product_get',
)

Otherwise it can be specified explicitly:

product_get_2 = CloudVisionProductGetOperator(
    location=GCP_VISION_LOCATION, product_id=GCP_VISION_PRODUCT_ID, task_id='product_get_2'
)

Templating

template_fields = ('location', 'project_id', 'product_id', 'gcp_conn_id')

CloudVisionProductSetCreateOperator

Creates a new ProductSet resource.

For parameter definition, take a look at CloudVisionProductSetCreateOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_SET_ID = os.environ.get('GCP_VISION_PRODUCT_SET_ID', 'product_set_explicit_id')

Using the operator

We are using the ProductSet and Retry objects from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry
from google.cloud.vision_v1.types import ProductSet

product_set = ProductSet(display_name='My Product Set')

The product_set_id argument can be omitted (it will be generated by the API):

product_set_create = CloudVisionProductSetCreateOperator(
    location=GCP_VISION_LOCATION,
    product_set=product_set,
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='product_set_create',
)

Or it can be specified explicitly:

product_set_create_2 = CloudVisionProductSetCreateOperator(
    product_set_id=GCP_VISION_PRODUCT_SET_ID,
    location=GCP_VISION_LOCATION,
    product_set=product_set,
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='product_set_create_2',
)

Templating

template_fields = ("location", "project_id", "product_set_id", "gcp_conn_id")

CloudVisionProductSetDeleteOperator

Permanently deletes a ProductSet. Products and ReferenceImages in the ProductSet are not deleted. The actual image files are not deleted from Google Cloud Storage.

For parameter definition, take a look at CloudVisionProductSetDeleteOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_SET_ID = os.environ.get('GCP_VISION_PRODUCT_SET_ID', 'product_set_explicit_id')

Using the operator

If product_set_id was generated by the API it can be extracted from XCOM:

product_set_delete = CloudVisionProductSetDeleteOperator(
    location=GCP_VISION_LOCATION,
    product_set_id="{{ task_instance.xcom_pull('product_set_create') }}",
    task_id='product_set_delete',
)

Otherwise it can be specified explicitly:

product_set_delete_2 = CloudVisionProductSetDeleteOperator(
    location=GCP_VISION_LOCATION, product_set_id=GCP_VISION_PRODUCT_SET_ID, task_id='product_set_delete_2'
)

Templating

template_fields = ('location', 'project_id', 'product_set_id', 'gcp_conn_id')

CloudVisionProductSetGetOperator

Gets information associated with a ProductSet.

For parameter definition, take a look at CloudVisionProductSetGetOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_SET_ID = os.environ.get('GCP_VISION_PRODUCT_SET_ID', 'product_set_explicit_id')

Using the operator

If product_set_id was generated by the API it can be extracted from XCOM:

product_set_get = CloudVisionProductSetGetOperator(
    location=GCP_VISION_LOCATION,
    product_set_id="{{ task_instance.xcom_pull('product_set_create') }}",
    task_id='product_set_get',
)

Otherwise it can be specified explicitly:

product_set_get_2 = CloudVisionProductSetGetOperator(
    location=GCP_VISION_LOCATION, product_set_id=GCP_VISION_PRODUCT_SET_ID, task_id='product_set_get_2'
)

Templating

template_fields = ('location', 'project_id', 'product_set_id', 'gcp_conn_id')

CloudVisionProductSetUpdateOperator

Makes changes to a ProductSet resource. Only display_name can be updated currently.

Note

To locate the ProductSet resource, its name in the form projects/PROJECT_ID/locations/LOC_ID/productSets/PRODUCT_SET_ID is necessary.

You can provide the name directly as an attribute of the product_set object. However, you can leave it blank and provide location and product_set_id instead (and optionally project_id - if not present, the connection default will be used) and the name will be created by the operator itself.

This mechanism exists for your convenience, to allow leaving the project_id empty and having Airflow use the connection default project_id.
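
To make the name format in the note concrete, here is a minimal sketch of how a full resource name could be assembled from its parts. The function product_set_name is hypothetical and is not the operator's own code; it only mirrors the projects/PROJECT_ID/locations/LOC_ID/productSets/PRODUCT_SET_ID pattern described above. The project ID "my-project" is a placeholder.

```python
def product_set_name(project_id, location, product_set_id):
    """Build a ProductSet resource name from its three components."""
    parts = {
        "project_id": project_id,
        "location": location,
        "product_set_id": product_set_id,
    }
    for label, value in parts.items():
        # An empty component would produce a malformed resource name.
        if not value:
            raise ValueError("{} must not be empty".format(label))
    return "projects/{}/locations/{}/productSets/{}".format(
        project_id, location, product_set_id
    )


print(product_set_name("my-project", "europe-west1", "product_set_explicit_id"))
```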

For parameter definition, take a look at CloudVisionProductSetUpdateOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_SET_ID = os.environ.get('GCP_VISION_PRODUCT_SET_ID', 'product_set_explicit_id')

Using the operator

We are using the ProductSet object from the Google Cloud Vision library:

from google.cloud.vision_v1.types import ProductSet

product_set = ProductSet(display_name='My Product Set')

Initialization of the task:

If product_set_id was generated by the API it can be extracted from XCOM:

product_set_update = CloudVisionProductSetUpdateOperator(
    location=GCP_VISION_LOCATION,
    product_set_id="{{ task_instance.xcom_pull('product_set_create') }}",
    product_set=ProductSet(display_name='My Product Set 2'),
    task_id='product_set_update',
)

Otherwise it can be specified explicitly:

product_set_update_2 = CloudVisionProductSetUpdateOperator(
    location=GCP_VISION_LOCATION,
    product_set_id=GCP_VISION_PRODUCT_SET_ID,
    product_set=ProductSet(display_name='My Product Set 2'),
    task_id='product_set_update_2',
)

Templating

template_fields = ('location', 'project_id', 'product_set_id', 'gcp_conn_id')

CloudVisionProductUpdateOperator

Makes changes to a Product resource. Only the display_name, description, and labels fields can be updated right now. If labels are updated, the change will not be reflected in queries until the next index time.

Note

To locate the Product resource, its name in the form projects/PROJECT_ID/locations/LOC_ID/products/PRODUCT_ID is necessary.

You can provide the name directly as an attribute of the product object. However, you can leave it blank and provide location and product_id instead (and optionally project_id - if not present, the connection default will be used) and the name will be created by the operator itself.

This mechanism exists for your convenience, to allow leaving the project_id empty and having Airflow use the connection default project_id.

Possible errors:

  • Returns NOT_FOUND if the Product does not exist.

  • Returns INVALID_ARGUMENT if display_name is present in update_mask but is missing from the request or longer than 4096 characters.

  • Returns INVALID_ARGUMENT if description is present in update_mask but is longer than 4096 characters.

  • Returns INVALID_ARGUMENT if product_category is present in update_mask.
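
The update_mask rules above can be expressed as a small local check. The sketch below is illustrative only: check_product_update is a hypothetical helper, the mask is modeled as a plain list of field names, and the product as a plain dictionary rather than the real Product and FieldMask objects.

```python
MAX_TEXT_LENGTH = 4096  # limit quoted in the error list above


def check_product_update(mask_paths, product_fields):
    """Return the INVALID_ARGUMENT conditions that the update would hit."""
    problems = []
    if "display_name" in mask_paths:
        name = product_fields.get("display_name")
        if not name:
            problems.append("display_name is in update_mask but missing")
        elif len(name) > MAX_TEXT_LENGTH:
            problems.append("display_name is longer than 4096 characters")
    if "description" in mask_paths:
        description = product_fields.get("description") or ""
        if len(description) > MAX_TEXT_LENGTH:
            problems.append("description is longer than 4096 characters")
    if "product_category" in mask_paths:
        # The category is fixed at creation time and cannot be updated.
        problems.append("product_category cannot be present in update_mask")
    return problems


print(check_product_update(["display_name"], {"display_name": "My Product 2"}))
print(check_product_update(["display_name", "product_category"], {}))
```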

For parameter definition, take a look at CloudVisionProductUpdateOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_ID = os.environ.get('GCP_VISION_PRODUCT_ID', 'product_explicit_id')

Using the operator

We are using the Product object from the Google Cloud Vision library:

from google.cloud.vision_v1.types import Product

product = Product(display_name='My Product 1', product_category='toys')

If product_id was generated by the API it can be extracted from XCOM:

product_update = CloudVisionProductUpdateOperator(
    location=GCP_VISION_LOCATION,
    product_id="{{ task_instance.xcom_pull('product_create') }}",
    product=Product(display_name='My Product 2', description='My updated description'),
    task_id='product_update',
)

Otherwise it can be specified explicitly:

product_update_2 = CloudVisionProductUpdateOperator(
    location=GCP_VISION_LOCATION,
    product_id=GCP_VISION_PRODUCT_ID,
    product=Product(display_name='My Product 2', description='My updated description'),
    task_id='product_update_2',
)

Templating

template_fields = ('location', 'project_id', 'product_id', 'gcp_conn_id')

CloudVisionReferenceImageCreateOperator

Creates a new ReferenceImage resource.

For parameter definition, take a look at CloudVisionReferenceImageCreateOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_REFERENCE_IMAGE_ID = os.environ.get('GCP_VISION_REFERENCE_IMAGE_ID', 'reference_image_explicit_id')
GCP_VISION_REFERENCE_IMAGE_URL = os.environ.get('GCP_VISION_REFERENCE_IMAGE_URL', 'gs://bucket/image1.jpg')

Using the operator

We are using the ReferenceImage and Retry objects from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry
from google.cloud.vision_v1.types import ReferenceImage

reference_image = ReferenceImage(uri=GCP_VISION_REFERENCE_IMAGE_URL)

If product_id was generated by the API it can be extracted from XCOM:

reference_image_create = CloudVisionReferenceImageCreateOperator(
    location=GCP_VISION_LOCATION,
    reference_image=reference_image,
    product_id="{{ task_instance.xcom_pull('product_create') }}",
    reference_image_id=GCP_VISION_REFERENCE_IMAGE_ID,
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='reference_image_create',
)

Otherwise it can be specified explicitly:

reference_image_create_2 = CloudVisionReferenceImageCreateOperator(
    location=GCP_VISION_LOCATION,
    reference_image=reference_image,
    product_id=GCP_VISION_PRODUCT_ID,
    reference_image_id=GCP_VISION_REFERENCE_IMAGE_ID,
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='reference_image_create_2',
)

Templating

template_fields = (
    "location",
    "reference_image",
    "product_id",
    "reference_image_id",
    "project_id",
    "gcp_conn_id",
)

CloudVisionRemoveProductFromProductSetOperator

Removes a Product from the specified ProductSet.

For parameter definition, take a look at CloudVisionRemoveProductFromProductSetOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_LOCATION = os.environ.get('GCP_VISION_LOCATION', 'europe-west1')
GCP_VISION_PRODUCT_ID = os.environ.get('GCP_VISION_PRODUCT_ID', 'product_explicit_id')
GCP_VISION_PRODUCT_SET_ID = os.environ.get('GCP_VISION_PRODUCT_SET_ID', 'product_set_explicit_id')

Using the operator

We are using the Product, ProductSet and Retry objects from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry
from google.cloud.vision_v1.types import Product, ProductSet

If product_set_id and product_id were generated by the API, they can be extracted from XCOM:

remove_product_from_product_set = CloudVisionRemoveProductFromProductSetOperator(
    location=GCP_VISION_LOCATION,
    product_set_id="{{ task_instance.xcom_pull('product_set_create') }}",
    product_id="{{ task_instance.xcom_pull('product_create') }}",
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='remove_product_from_product_set',
)

Otherwise it can be specified explicitly:

remove_product_from_product_set_2 = CloudVisionRemoveProductFromProductSetOperator(
    location=GCP_VISION_LOCATION,
    product_set_id=GCP_VISION_PRODUCT_SET_ID,
    product_id=GCP_VISION_PRODUCT_ID,
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id='remove_product_from_product_set_2',
)

Templating

template_fields = ("location", "product_set_id", "product_id", "project_id", "gcp_conn_id")

More information

See Google Cloud Vision Remove Product From Product Set documentation.

CloudVisionDetectTextOperator

Run text detection for an image.

For parameter definition, take a look at CloudVisionDetectTextOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_ANNOTATE_IMAGE_URL = os.environ.get('GCP_VISION_ANNOTATE_IMAGE_URL', 'gs://bucket/image2.jpg')

Using the operator

We are using the Retry object from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry

detect_text = CloudVisionDetectTextOperator(
    image=DETECT_IMAGE,
    retry=Retry(maximum=10.0),
    timeout=5,
    task_id="detect_text",
    language_hints="en",
    web_detection_params={'include_geo_results': True},
)
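
The DETECT_IMAGE value passed as image is not defined in this excerpt. A plausible definition, assuming it is a dictionary that mirrors the Vision API Image message and points at the image URL from the environment variable above, might look like the following sketch. Treat the exact shape as an assumption rather than the example DAG's confirmed code.

```python
import os

GCP_VISION_ANNOTATE_IMAGE_URL = os.environ.get(
    'GCP_VISION_ANNOTATE_IMAGE_URL', 'gs://bucket/image2.jpg'
)

# Assumed shape: an Image message expressed as a dict, with the image
# referenced by URI instead of being embedded as raw bytes.
DETECT_IMAGE = {"source": {"image_uri": GCP_VISION_ANNOTATE_IMAGE_URL}}

print(DETECT_IMAGE)
```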

The result can be extracted from XCOM:

detect_text_result = BashOperator(
    bash_command="echo {{ task_instance.xcom_pull('detect_text')['textAnnotations'][0] }}",
    task_id="detect_text_result",
)

Templating

template_fields = ("image", "max_results", "timeout", "gcp_conn_id")

More information

See Google Cloud Vision Text Detection documentation.

CloudVisionDetectDocumentTextOperator

Run document text detection for an image.

For parameter definition, take a look at CloudVisionDetectDocumentTextOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_ANNOTATE_IMAGE_URL = os.environ.get('GCP_VISION_ANNOTATE_IMAGE_URL', 'gs://bucket/image2.jpg')

Using the operator

We are using the Retry object from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry

document_detect_text = CloudVisionDetectDocumentTextOperator(
    image=DETECT_IMAGE, retry=Retry(maximum=10.0), timeout=5, task_id="document_detect_text"
)

The result can be extracted from XCOM:

document_detect_text_result = BashOperator(
    bash_command="echo {{ task_instance.xcom_pull('document_detect_text')['textAnnotations'][0] }}",
    task_id="document_detect_text_result",
)

Templating

template_fields = ("image", "max_results", "timeout", "gcp_conn_id")

More information

See Google Cloud Vision Document Text Detection documentation.

CloudVisionDetectImageLabelsOperator

Run image label detection for an image.

For parameter definition, take a look at CloudVisionDetectImageLabelsOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_ANNOTATE_IMAGE_URL = os.environ.get('GCP_VISION_ANNOTATE_IMAGE_URL', 'gs://bucket/image2.jpg')

Using the operator

We are using the Retry object from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry

detect_labels = CloudVisionDetectImageLabelsOperator(
    image=DETECT_IMAGE, retry=Retry(maximum=10.0), timeout=5, task_id="detect_labels"
)

The result can be extracted from XCOM:

detect_labels_result = BashOperator(
    bash_command="echo {{ task_instance.xcom_pull('detect_labels')['labelAnnotations'][0] }}",
    task_id="detect_labels_result",
)

Templating

template_fields = ("image", "max_results", "timeout", "gcp_conn_id")

More information

See Google Cloud Vision Label Detection documentation.

CloudVisionDetectImageSafeSearchOperator

Run safe search detection for an image.

For parameter definition, take a look at CloudVisionDetectImageSafeSearchOperator

Arguments

Some arguments in the example DAG are taken from the OS environment variables:

GCP_VISION_ANNOTATE_IMAGE_URL = os.environ.get('GCP_VISION_ANNOTATE_IMAGE_URL', 'gs://bucket/image2.jpg')

Using the operator

We are using the Retry object from Google libraries:

from airflow.utils.dates import days_ago
from google.api_core.retry import Retry

detect_safe_search = CloudVisionDetectImageSafeSearchOperator(
    image=DETECT_IMAGE, retry=Retry(maximum=10.0), timeout=5, task_id="detect_safe_search"
)

The result can be extracted from XCOM:

detect_safe_search_result = BashOperator(
    bash_command="echo {{ task_instance.xcom_pull('detect_safe_search') }}",
    task_id="detect_safe_search_result",
)

Templating

template_fields = ("image", "max_results", "timeout", "gcp_conn_id")
