airflow.contrib.operators.gcp_vision_operator
Module Contents
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionProductSetCreateOperator
(product_set, location, project_id=None, product_set_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Creates a new ProductSet resource.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionProductSetCreateOperator
- Parameters
product_set (dict or google.cloud.vision_v1.types.ProductSet) – (Required) The ProductSet to create. If a dict is provided, it must be of the same form as the protobuf message ProductSet.
location (str) – (Required) The region where the ProductSet should be created. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
project_id (str) – (Optional) The project in which the ProductSet should be created. If set to None or missing, the default project_id from the GCP connection is used.
product_set_id (str) – (Optional) A user-supplied resource id for this ProductSet. If set, the server will attempt to use this value as the resource id. If it is already in use, an error is returned with code ALREADY_EXISTS. Must be at most 128 characters long. It cannot contain the character /.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
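The limits on product_set_id and location described above can be checked before the task is built. The following is a minimal sketch, not part of the module: the helper names check_resource_id and product_set_create_kwargs are hypothetical, and the region set simply copies the list given in this reference (valid as of 2019-02-05).

```python
# Regions listed in this reference as valid on 2019-02-05; the live API may differ.
VALID_LOCATIONS = {"us-east1", "us-west1", "europe-west1", "asia-east1"}


def check_resource_id(resource_id):
    """Hypothetical helper: apply the documented limits to a user-supplied id."""
    if len(resource_id) > 128:
        raise ValueError("resource id must be at most 128 characters long")
    if "/" in resource_id:
        raise ValueError("resource id cannot contain the character '/'")
    return resource_id


def product_set_create_kwargs(display_name, location, product_set_id=None):
    """Hypothetical helper: keyword arguments for CloudVisionProductSetCreateOperator."""
    if location not in VALID_LOCATIONS:
        raise ValueError("unknown location: %s" % location)
    kwargs = {
        "product_set": {"display_name": display_name},  # dict form of ProductSet
        "location": location,
    }
    if product_set_id is not None:
        kwargs["product_set_id"] = check_resource_id(product_set_id)
    return kwargs


create_kwargs = product_set_create_kwargs("My Product Set", "europe-west1", "my-set-1")
```

The resulting dictionary can then be passed as CloudVisionProductSetCreateOperator(task_id=..., **create_kwargs), where task_id is the usual BaseOperator argument.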
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionProductSetGetOperator
(location, product_set_id, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Gets information associated with a ProductSet.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionProductSetGetOperator
- Parameters
location (str) – (Required) The region where the ProductSet is located. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
product_set_id (str) – (Required) The resource id of this ProductSet.
project_id (str) – (Optional) The project in which the ProductSet is located. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionProductSetUpdateOperator
(product_set, location=None, product_set_id=None, project_id=None, update_mask=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Makes changes to a ProductSet resource. Only display_name can be updated currently.
Note
To locate the ProductSet resource, its name in the form projects/PROJECT_ID/locations/LOC_ID/productSets/PRODUCT_SET_ID is necessary.
You can provide the name directly as an attribute of the product_set object. However, you can leave it blank and provide location and product_set_id instead (and optionally project_id - if not present, the connection default will be used) and the name will be created by the operator itself.
This mechanism exists for your convenience, to allow leaving the project_id empty and having Airflow use the connection default project_id.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionProductSetUpdateOperator
- Parameters
product_set (dict or google.cloud.vision_v1.types.ProductSet) – (Required) The ProductSet resource which replaces the one on the server. If a dict is provided, it must be of the same form as the protobuf message ProductSet.
location (str) – (Optional) The region where the ProductSet is located. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
product_set_id (str) – (Optional) The resource id of this ProductSet.
project_id (str) – (Optional) The project in which the ProductSet is located. If set to None or missing, the default project_id from the GCP connection is used.
update_mask (dict or google.cloud.vision_v1.types.FieldMask) – (Optional) The FieldMask that specifies which fields to update. If update_mask isn’t specified, all mutable fields are to be updated. Valid mask path is display_name. If a dict is provided, it must be of the same form as the protobuf message FieldMask.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
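The two ways of identifying the ProductSet described in the note above can be sketched as follows. This is an illustration only: product_set_name is a hypothetical helper, and the project, region, and id values are placeholders.

```python
def product_set_name(project_id, location, product_set_id):
    """Hypothetical helper: build the full resource name in the documented form."""
    return "projects/{}/locations/{}/productSets/{}".format(
        project_id, location, product_set_id
    )


# Option 1: put the full name inside the product_set object.
explicit_kwargs = {
    "product_set": {
        "name": product_set_name("my-project", "europe-west1", "my-set-1"),
        "display_name": "Renamed set",
    },
    "update_mask": {"paths": ["display_name"]},  # dict form of FieldMask
}

# Option 2: leave the name blank and let the operator build it from
# location and product_set_id (project_id falls back to the connection default).
implicit_kwargs = {
    "product_set": {"display_name": "Renamed set"},
    "location": "europe-west1",
    "product_set_id": "my-set-1",
    "update_mask": {"paths": ["display_name"]},
}
```

Either dictionary can be unpacked into CloudVisionProductSetUpdateOperator together with a task_id.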
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionProductSetDeleteOperator
(location, product_set_id, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Permanently deletes a ProductSet. Products and ReferenceImages in the ProductSet are not deleted. The actual image files are not deleted from Google Cloud Storage.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionProductSetDeleteOperator
- Parameters
location (str) – (Required) The region where the ProductSet is located. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
product_set_id (str) – (Required) The resource id of this ProductSet.
project_id (str) – (Optional) The project in which the ProductSet is located. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionProductCreateOperator
(location, product, project_id=None, product_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Creates and returns a new product resource.
Possible errors regarding the Product object provided:
Returns INVALID_ARGUMENT if display_name is missing or longer than 4096 characters.
Returns INVALID_ARGUMENT if description is longer than 4096 characters.
Returns INVALID_ARGUMENT if product_category is missing or invalid.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionProductCreateOperator
- Parameters
location (str) – (Required) The region where the Product should be created. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
product (dict or google.cloud.vision_v1.types.Product) – (Required) The product to create. If a dict is provided, it must be of the same form as the protobuf message Product.
project_id (str) – (Optional) The project in which the Product should be created. If set to None or missing, the default project_id from the GCP connection is used.
product_id (str) – (Optional) A user-supplied resource id for this Product. If set, the server will attempt to use this value as the resource id. If it is already in use, an error is returned with code ALREADY_EXISTS. Must be at most 128 characters long. It cannot contain the character /.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
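The INVALID_ARGUMENT conditions listed above can be mirrored client-side to catch mistakes before the request is sent. This is a rough sketch under stated assumptions: product_errors is a hypothetical helper, it only checks that product_category is present (which values the server accepts is not listed in this reference), and the category value "apparel" in the example is an assumption.

```python
MAX_TEXT_LENGTH = 4096  # documented limit for display_name and description


def product_errors(product):
    """Hypothetical helper: list documented problems found in a Product dict."""
    errors = []
    display_name = product.get("display_name")
    if not display_name:
        errors.append("display_name is missing")
    elif len(display_name) > MAX_TEXT_LENGTH:
        errors.append("display_name is longer than 4096 characters")
    if len(product.get("description", "")) > MAX_TEXT_LENGTH:
        errors.append("description is longer than 4096 characters")
    if not product.get("product_category"):
        errors.append("product_category is missing")
    return errors


# "apparel" is an assumed category value used for illustration only.
good_product = {"display_name": "Blue jeans", "product_category": "apparel"}
bad_product = {"description": "x" * 5000}
```

An empty list from product_errors(good_product) means none of the documented conditions were detected; the server remains the authority on validity.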
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionProductGetOperator
(location, product_id, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Gets information associated with a Product.
Possible errors:
Returns NOT_FOUND if the Product does not exist.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionProductGetOperator
- Parameters
location (str) – (Required) The region where the Product is located. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
product_id (str) – (Required) The resource id of this Product.
project_id (str) – (Optional) The project in which the Product is located. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionProductUpdateOperator
(product, location=None, product_id=None, project_id=None, update_mask=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Makes changes to a Product resource. Only the display_name, description, and labels fields can be updated right now.
If labels are updated, the change will not be reflected in queries until the next index time.
Note
To locate the Product resource, its name in the form projects/PROJECT_ID/locations/LOC_ID/products/PRODUCT_ID is necessary.
You can provide the name directly as an attribute of the product object. However, you can leave it blank and provide location and product_id instead (and optionally project_id - if not present, the connection default will be used) and the name will be created by the operator itself.
This mechanism exists for your convenience, to allow leaving the project_id empty and having Airflow use the connection default project_id.
Possible errors related to the provided Product:
Returns NOT_FOUND if the Product does not exist.
Returns INVALID_ARGUMENT if display_name is present in update_mask but is missing from the request or longer than 4096 characters.
Returns INVALID_ARGUMENT if description is present in update_mask but is longer than 4096 characters.
Returns INVALID_ARGUMENT if product_category is present in update_mask.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionProductUpdateOperator
- Parameters
product (dict or google.cloud.vision_v1.types.Product) – (Required) The Product resource which replaces the one on the server. product.name is immutable. If a dict is provided, it must be of the same form as the protobuf message Product.
location (str) – (Optional) The region where the Product is located. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
product_id (str) – (Optional) The resource id of this Product.
project_id (str) – (Optional) The project in which the Product is located. If set to None or missing, the default project_id from the GCP connection is used.
update_mask (dict or google.cloud.vision_v1.types.FieldMask) – (Optional) The FieldMask that specifies which fields to update. If update_mask isn’t specified, all mutable fields are to be updated. Valid mask paths include product_labels, display_name, and description. If a dict is provided, it must be of the same form as the protobuf message FieldMask.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
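Because product_category in update_mask is rejected with INVALID_ARGUMENT, it can help to build the mask through a small guard. The sketch below assumes only the mask paths named in this reference; product_update_mask is a hypothetical helper, not part of the module.

```python
# Mask paths this reference lists as valid for a Product update.
UPDATABLE_PATHS = {"product_labels", "display_name", "description"}


def product_update_mask(*paths):
    """Hypothetical helper: build a FieldMask dict, rejecting undocumented paths."""
    rejected = [path for path in paths if path not in UPDATABLE_PATHS]
    if rejected:
        raise ValueError("paths cannot be updated: %s" % ", ".join(rejected))
    return {"paths": list(paths)}


mask = product_update_mask("display_name", "description")
```

The resulting dictionary can be passed as the update_mask argument of CloudVisionProductUpdateOperator.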
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionProductDeleteOperator
(location, product_id, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Permanently deletes a product and its reference images.
Metadata of the product and all its images will be deleted right away, but search queries against ProductSets containing the product may still work until all related caches are refreshed.
Possible errors:
Returns NOT_FOUND if the product does not exist.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionProductDeleteOperator
- Parameters
location (str) – (Required) The region where the Product is located. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
product_id (str) – (Required) The resource id of this Product.
project_id (str) – (Optional) The project in which the Product is located. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionAnnotateImageOperator
(request, retry=None, timeout=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Run image detection and annotation for an image or a batch of images.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionAnnotateImageOperator
- Parameters
request (dict or google.cloud.vision_v1.types.AnnotateImageRequest for a single image; list[dict or google.cloud.vision_v1.types.AnnotateImageRequest] for a batch) – (Required) Annotation request for an image or a batch of images. If a dict is provided, it must be of the same form as the protobuf message google.cloud.vision_v1.types.AnnotateImageRequest.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
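The difference between a single request and a batch can be sketched with plain dictionaries. This is an illustration under assumptions: annotate_request is a hypothetical helper, the bucket paths are placeholders, and passing the feature type as a string name (rather than the client library's enum value) is an assumption that may depend on the installed google-cloud-vision version.

```python
def annotate_request(image_uri, feature_types):
    """Hypothetical helper: build a dict shaped like AnnotateImageRequest."""
    return {
        "image": {"source": {"image_uri": image_uri}},
        # Feature type given by name here; the enum value from the client
        # library may be required instead (assumption, see lead-in).
        "features": [{"type": feature_type} for feature_type in feature_types],
    }


# Single image: one dict.
single_request = annotate_request("gs://my-bucket/logo.png", ["LOGO_DETECTION"])

# Batch: a list of dicts, one per image.
batch_request = [
    annotate_request(uri, ["LABEL_DETECTION"])
    for uri in ("gs://my-bucket/a.png", "gs://my-bucket/b.png")
]
```

Either value can be passed as the request argument of CloudVisionAnnotateImageOperator.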
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionReferenceImageCreateOperator
(location, reference_image, product_id, reference_image_id=None, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Creates and returns a new ReferenceImage resource.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionReferenceImageCreateOperator
- Parameters
location (str) – (Required) The region where the Product is located. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
reference_image (dict or google.cloud.vision_v1.types.ReferenceImage) – (Required) The reference image to create. If an image ID is specified, it is ignored. If a dict is provided, it must be of the same form as the protobuf message google.cloud.vision_v1.types.ReferenceImage.
reference_image_id (str) – (Optional) A user-supplied resource id for the ReferenceImage to be added. If set, the server will attempt to use this value as the resource id. If it is already in use, an error is returned with code ALREADY_EXISTS. Must be at most 128 characters long. It cannot contain the character /.
product_id (str) – (Required) The resource id of the Product to which the reference image is added.
project_id (str) – (Optional) The project in which the Product is located. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
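A reference image is usually described by the Cloud Storage URI of the image file. The sketch below is an assumption-laden illustration: reference_image_kwargs is a hypothetical helper, the uri field and its gs:// prefix requirement come from the Vision API's ReferenceImage message rather than from this page, and all values are placeholders.

```python
def reference_image_kwargs(uri, location, product_id, reference_image_id=None):
    """Hypothetical helper: arguments for CloudVisionReferenceImageCreateOperator."""
    # Assumption from the Vision API: reference images live in Cloud Storage.
    if not uri.startswith("gs://"):
        raise ValueError("reference image uri must start with gs://")
    kwargs = {
        "location": location,
        "product_id": product_id,  # no default in the signature above
        "reference_image": {"uri": uri},  # dict form of ReferenceImage
    }
    if reference_image_id is not None:
        kwargs["reference_image_id"] = reference_image_id
    return kwargs


image_kwargs = reference_image_kwargs(
    "gs://my-bucket/jeans-front.jpg", "europe-west1", "my-product-1", "front-view"
)
```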
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionAddProductToProductSetOperator
(product_set_id, product_id, location, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Adds a Product to the specified ProductSet. If the Product is already present, no change is made.
One Product can be added to at most 100 ProductSets.
Possible errors:
Returns NOT_FOUND if the Product or the ProductSet doesn’t exist.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionAddProductToProductSetOperator
- Parameters
product_set_id (str) – (Required) The resource id for the ProductSet to modify.
product_id (str) – (Required) The resource id of this Product.
location (str) – (Required) The region where the ProductSet is located. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
project_id (str) – (Optional) The project in which the Product is located. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionRemoveProductFromProductSetOperator
(product_set_id, product_id, location, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Removes a Product from the specified ProductSet.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionRemoveProductFromProductSetOperator
- Parameters
product_set_id (str) – (Required) The resource id for the ProductSet to modify.
product_id (str) – (Required) The resource id of this Product.
location (str) – (Required) The region where the ProductSet is located. Valid regions (as of 2019-02-05) are: us-east1, us-west1, europe-west1, asia-east1
project_id (str) – (Optional) The project in which the Product is located. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionDetectTextOperator
(image, max_results=None, retry=None, timeout=None, language_hints=None, web_detection_params=None, additional_properties=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Detects text in the image.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionDetectTextOperator
- Parameters
image (dict or google.cloud.vision_v1.types.Image) – (Required) The image to analyze. See more: https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.Image
max_results (int) – (Optional) Number of results to return.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – Number of seconds before timing out.
language_hints (str, list or google.cloud.vision.v1.ImageContext.language_hints) – List of languages to use for TEXT_DETECTION. In most cases, an empty value yields the best results since it enables automatic language detection. For languages based on the Latin alphabet, setting language_hints is not needed.
web_detection_params (dict or google.cloud.vision.v1.ImageContext.web_detection_params) – Parameters for web detection.
additional_properties (dict) – Additional properties to be set on the AnnotateImageRequest. See more: google.cloud.vision_v1.types.AnnotateImageRequest
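The arguments for a text detection task can be assembled as below. This is a minimal sketch: detect_text_kwargs is a hypothetical helper, the image URI is a placeholder, and the optional arguments are left out unless set, since an empty language_hints enables automatic language detection.

```python
def detect_text_kwargs(image_uri, max_results=None, language_hints=None):
    """Hypothetical helper: arguments for CloudVisionDetectTextOperator."""
    kwargs = {"image": {"source": {"image_uri": image_uri}}}  # dict form of Image
    if max_results is not None:
        kwargs["max_results"] = max_results
    if language_hints:
        # Only set hints when needed; leaving them out keeps auto-detection on.
        kwargs["language_hints"] = language_hints
    return kwargs


text_kwargs = detect_text_kwargs("gs://my-bucket/receipt.png", max_results=10)
```

The dictionary can be unpacked into CloudVisionDetectTextOperator together with a task_id.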
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionDetectDocumentTextOperator
(image, max_results=None, retry=None, timeout=None, language_hints=None, web_detection_params=None, additional_properties=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Detects document text in the image.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionDetectDocumentTextOperator
- Parameters
image (dict or google.cloud.vision_v1.types.Image) – (Required) The image to analyze. See more: https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.Image
max_results (int) – Number of results to return.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – Number of seconds before timing out.
language_hints (str, list or google.cloud.vision.v1.ImageContext.language_hints) – List of languages to use for TEXT_DETECTION. In most cases, an empty value yields the best results since it enables automatic language detection. For languages based on the Latin alphabet, setting language_hints is not needed.
web_detection_params (dict or google.cloud.vision.v1.ImageContext.web_detection_params) – Parameters for web detection.
additional_properties (dict) – Additional properties to be set on the AnnotateImageRequest. See more: https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.AnnotateImageRequest
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionDetectImageLabelsOperator
(image, max_results=None, retry=None, timeout=None, additional_properties=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Detects labels in the image.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionDetectImageLabelsOperator
- Parameters
image (dict or google.cloud.vision_v1.types.Image) – (Required) The image to analyze. See more: https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.Image
max_results (int) – Number of results to return.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – Number of seconds before timing out.
additional_properties (dict) – Additional properties to be set on the AnnotateImageRequest. See more: https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.AnnotateImageRequest
-
class
airflow.contrib.operators.gcp_vision_operator.
CloudVisionDetectImageSafeSearchOperator
(image, max_results=None, retry=None, timeout=None, additional_properties=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Detects SafeSearch properties (explicit content) in the image.
See also
For more information on how to use this operator, take a look at the guide: CloudVisionDetectImageSafeSearchOperator
- Parameters
image (dict or google.cloud.vision_v1.types.Image) – (Required) The image to analyze. See more: https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.Image
max_results (int) – Number of results to return.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – Number of seconds before timing out.
additional_properties (dict) – Additional properties to be set on the AnnotateImageRequest. See more: https://googleapis.github.io/google-cloud-python/latest/vision/gapic/v1/types.html#google.cloud.vision_v1.types.AnnotateImageRequest