airflow.contrib.operators.gcp_dlp_operator
This module contains various GCP Cloud DLP operators which allow you to perform basic operations using Cloud DLP.
Module Contents
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPCancelDLPJobOperator(dlp_job_id, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Starts asynchronous cancellation on a long-running DlpJob.
- Parameters
dlp_job_id (str) – ID of the DLP job resource to be cancelled.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPCreateDeidentifyTemplateOperator(organization_id=None, project_id=None, deidentify_template=None, template_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Creates a DeidentifyTemplate for re-using frequently used configuration for de-identifying content, images, and storage.
- Parameters
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
deidentify_template (dict or google.cloud.dlp_v2.types.DeidentifyTemplate) – (Optional) The DeidentifyTemplate to create.
template_id (str) – (Optional) The template ID.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPCreateDLPJobOperator(project_id=None, inspect_job=None, risk_job=None, job_id=None, retry=None, timeout=None, metadata=None, wait_until_finished=True, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Creates a new job to inspect storage or calculate risk metrics.
- Parameters
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
inspect_job (dict or google.cloud.dlp_v2.types.InspectJobConfig) – (Optional) The configuration for the inspect job.
risk_job (dict or google.cloud.dlp_v2.types.RiskAnalysisJobConfig) – (Optional) The configuration for the risk job.
job_id (str) – (Optional) The job ID.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
wait_until_finished (bool) – (Optional) If True, the operator keeps polling the job state until it is DONE.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
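As an illustration of the inspect_job parameter above, the following is a minimal sketch of a dict shaped like google.cloud.dlp_v2.types.InspectJobConfig. The helper name build_inspect_job, the bucket URL, and the chosen infoTypes are assumptions made for the example, not part of this module.

```python
def build_inspect_job(bucket_url, info_type_names):
    """Return a dict shaped like google.cloud.dlp_v2.types.InspectJobConfig.

    Hypothetical helper: it only assembles the nested dict, it does not
    call the DLP API.
    """
    return {
        # Where to look: a Cloud Storage file set.
        "storage_config": {
            "cloud_storage_options": {"file_set": {"url": bucket_url}},
        },
        # What to look for: a list of infoTypes by name.
        "inspect_config": {
            "info_types": [{"name": name} for name in info_type_names],
        },
    }


inspect_job = build_inspect_job(
    "gs://example-bucket/*",  # hypothetical bucket
    ["EMAIL_ADDRESS", "PHONE_NUMBER"],
)
```

The resulting dict would then be passed as the inspect_job argument of CloudDLPCreateDLPJobOperator, alongside the usual task_id.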
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPCreateInspectTemplateOperator(organization_id=None, project_id=None, inspect_template=None, template_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Creates an InspectTemplate for re-using frequently used configuration for inspecting content, images, and storage.
- Parameters
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
inspect_template (dict or google.cloud.dlp_v2.types.InspectTemplate) – (Optional) The InspectTemplate to create.
template_id (str) – (Optional) The template ID.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPCreateJobTriggerOperator(project_id=None, job_trigger=None, trigger_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Creates a job trigger to run DLP actions such as scanning storage for sensitive information on a set schedule.
- Parameters
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
job_trigger (dict or google.cloud.dlp_v2.types.JobTrigger) – (Optional) The JobTrigger to create.
trigger_id (str) – (Optional) The JobTrigger ID.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPCreateStoredInfoTypeOperator(organization_id=None, project_id=None, config=None, stored_info_type_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Creates a pre-built stored infoType to be used for inspection.
- Parameters
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
config (dict or google.cloud.dlp_v2.types.StoredInfoTypeConfig) – (Optional) The config for the StoredInfoType.
stored_info_type_id (str) – (Optional) The StoredInfoType ID.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPDeidentifyContentOperator(project_id=None, deidentify_config=None, inspect_config=None, item=None, inspect_template_name=None, deidentify_template_name=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
De-identifies potentially sensitive info from a ContentItem. This method has limits on input size and output size.
- Parameters
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
deidentify_config (dict or google.cloud.dlp_v2.types.DeidentifyConfig) – (Optional) Configuration for the de-identification of the content item. Items specified here will override the template referenced by the deidentify_template_name argument.
inspect_config (dict or google.cloud.dlp_v2.types.InspectConfig) – (Optional) Configuration for the inspector. Items specified here will override the template referenced by the inspect_template_name argument.
item (dict or google.cloud.dlp_v2.types.ContentItem) – (Optional) The item to de-identify. Will be treated as text.
inspect_template_name (str) – (Optional) Template to use. Any configuration directly specified in inspect_config will override the settings in the template.
deidentify_template_name (str) – (Optional) Template to use. Any configuration directly specified in deidentify_config will override the settings in the template.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
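To make the item, inspect_config, and deidentify_config parameters above more concrete, here is a minimal sketch of dicts shaped like the corresponding google.cloud.dlp_v2.types messages. The sample text, the PHONE_NUMBER infoType, and the choice of the replace-with-infoType transformation are assumptions picked for illustration.

```python
# The content to de-identify; plain text goes in the "value" field.
item = {"value": "Call me at 555-0100."}

# Which infoTypes the inspector should look for.
inspect_config = {"info_types": [{"name": "PHONE_NUMBER"}]}

# How to transform each finding: here, replace it with its infoType name.
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {"primitive_transformation": {"replace_with_info_type_config": {}}}
        ]
    }
}

# Keyword arguments that could be handed to CloudDLPDeidentifyContentOperator
# (task_id is a hypothetical name).
operator_kwargs = {
    "task_id": "deidentify_phone_numbers",
    "item": item,
    "inspect_config": inspect_config,
    "deidentify_config": deidentify_config,
}
```

Settings given this way take precedence over any template named in inspect_template_name or deidentify_template_name, as described in the parameter list.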
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPDeleteDeidentifyTemplateOperator(template_id, organization_id=None, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Deletes a DeidentifyTemplate.
- Parameters
template_id (str) – The ID of the deidentify template to be deleted.
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
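The template and stored infoType operators accept either organization_id or project_id to identify the parent resource. The sketch below shows one way that choice could map onto a DLP resource path. The function dlp_parent is hypothetical, and giving the organization precedence when both are set is an assumption, not confirmed behaviour of this module.

```python
def dlp_parent(organization_id=None, project_id=None):
    """Build a DLP parent resource path from the two optional IDs.

    Hypothetical helper: the organization is checked first (an assumption),
    then the project; passing neither is treated as an error.
    """
    if organization_id:
        return "organizations/{}".format(organization_id)
    if project_id:
        return "projects/{}".format(project_id)
    raise ValueError("Provide either organization_id or project_id.")


org_parent = dlp_parent(organization_id="123456")
project_parent = dlp_parent(project_id="my-project")
```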
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPDeleteDlpJobOperator(dlp_job_id, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Deletes a long-running DlpJob. This method indicates that the client is no longer interested in the DlpJob result. The job will be cancelled if possible.
- Parameters
dlp_job_id (str) – The ID of the DLP job resource to be deleted.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPDeleteInspectTemplateOperator(template_id, organization_id=None, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Deletes an InspectTemplate.
- Parameters
template_id (str) – The ID of the inspect template to be deleted.
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPDeleteJobTriggerOperator(job_trigger_id, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Deletes a job trigger.
- Parameters
job_trigger_id (str) – The ID of the DLP job trigger to be deleted.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPDeleteStoredInfoTypeOperator(stored_info_type_id, organization_id=None, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Deletes a stored infoType.
- Parameters
stored_info_type_id (str) – The ID of the stored info type to be deleted.
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPGetDeidentifyTemplateOperator(template_id, organization_id=None, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Gets a DeidentifyTemplate.
- Parameters
template_id (str) – The ID of the deidentify template to be read.
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPGetDlpJobOperator(dlp_job_id, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Gets the latest state of a long-running DlpJob.
- Parameters
dlp_job_id (str) – The ID of the DLP job resource to be read.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPGetInspectTemplateOperator(template_id, organization_id=None, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Gets an InspectTemplate.
- Parameters
template_id (str) – The ID of the inspect template to be read.
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPGetJobTripperOperator(job_trigger_id, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Gets a job trigger.
- Parameters
job_trigger_id (str) – The ID of the DLP job trigger to be read.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPGetStoredInfoTypeOperator(stored_info_type_id, organization_id=None, project_id=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Gets a stored infoType.
- Parameters
stored_info_type_id (str) – The ID of the stored info type to be read.
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPInspectContentOperator(project_id=None, inspect_config=None, item=None, inspect_template_name=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Finds potentially sensitive info in content. This method has limits on input size, processing time, and output size.
- Parameters
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
inspect_config (dict or google.cloud.dlp_v2.types.InspectConfig) – (Optional) Configuration for the inspector. Items specified here will override the template referenced by the inspect_template_name argument.
item (dict or google.cloud.dlp_v2.types.ContentItem) – (Optional) The item to inspect. Will be treated as text.
inspect_template_name (str) – (Optional) Template to use. Any configuration directly specified in inspect_config will override the settings in the template.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
google.cloud.dlp_v2.types.InspectContentResponse
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPListDeidentifyTemplatesOperator(organization_id=None, project_id=None, page_size=None, order_by=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Lists DeidentifyTemplates.
- Parameters
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
page_size (int) – (Optional) The maximum number of resources contained in the underlying API response.
order_by (str) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
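The order_by format described in the parameter list (comma-separated fields, each with an asc or desc postfix) can be sketched as a small string builder. The helper build_order_by is hypothetical, and the field names create_time and name are assumed to be sortable fields; check the DLP API reference for the fields each list call supports.

```python
def build_order_by(fields):
    """Join (field, direction) pairs into an order_by string.

    Hypothetical helper; direction must be "asc" or "desc".
    """
    parts = []
    for name, direction in fields:
        if direction not in ("asc", "desc"):
            raise ValueError("direction must be 'asc' or 'desc'")
        parts.append("{} {}".format(name, direction))
    return ",".join(parts)


# Field names here are assumptions made for the example.
order_by = build_order_by([("create_time", "desc"), ("name", "asc")])
```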
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPListDlpJobsOperator(project_id=None, results_filter=None, page_size=None, job_type=None, order_by=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Lists DlpJobs that match the specified filter in the request.
- Parameters
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
results_filter (str) – (Optional) Filter used to specify a subset of results.
page_size (int) – (Optional) The maximum number of resources contained in the underlying API response.
job_type (str) – (Optional) The type of job.
order_by (str) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPListInfoTypesOperator(language_code=None, results_filter=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Returns a list of the sensitive information types that the DLP API supports.
- Parameters
language_code (str) – (Optional) BCP-47 language code for localized infoType friendly names. If omitted, or if localized strings are not available, en-US strings will be returned.
results_filter (str) – (Optional) Filter used to specify a subset of results.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
ListInfoTypesResponse
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPListInspectTemplatesOperator(organization_id=None, project_id=None, page_size=None, order_by=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Lists InspectTemplates.
- Parameters
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
page_size (int) – (Optional) The maximum number of resources contained in the underlying API response.
order_by (str) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPListJobTriggersOperator(project_id=None, page_size=None, order_by=None, results_filter=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Lists job triggers.
- Parameters
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
page_size (int) – (Optional) The maximum number of resources contained in the underlying API response.
order_by (str) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.
results_filter (str) – (Optional) Filter used to specify a subset of results.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPListStoredInfoTypesOperator(organization_id=None, project_id=None, page_size=None, order_by=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Lists stored infoTypes.
- Parameters
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
page_size (int) – (Optional) The maximum number of resources contained in the underlying API response.
order_by (str) – (Optional) Comma-separated list of fields to order by, each followed by an asc or desc postfix.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
-
class airflow.contrib.operators.gcp_dlp_operator.CloudDLPRedactImageOperator(project_id=None, inspect_config=None, image_redaction_configs=None, include_findings=None, byte_item=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Redacts potentially sensitive info from an image. This method has limits on input size, processing time, and output size.
- Parameters
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
inspect_config (dict or google.cloud.dlp_v2.types.InspectConfig) – (Optional) Configuration for the inspector. Items specified here will override the template referenced by the inspect_template_name argument.
image_redaction_configs (list[dict] or list[google.cloud.dlp_v2.types.ImageRedactionConfig]) – (Optional) The configuration for specifying what content to redact from images.
include_findings (bool) – (Optional) Whether the response should include findings along with the redacted image.
byte_item (dict or google.cloud.dlp_v2.types.ByteContentItem) – (Optional) The image content to redact. The content must be PNG, JPEG, SVG or BMP.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
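The byte_item and image_redaction_configs arguments can be passed as plain dicts. The following is a minimal sketch of how they might be built, not the operator's own logic. The helper names and the extension table are hypothetical, and the numeric type values are assumed to match the ByteContentItem bytes-type enum in google.cloud.dlp_v2 (1 JPEG, 2 BMP, 3 PNG, 4 SVG).

```python
import os

# Assumed numeric values of the ByteContentItem type enum for the
# four image formats the operator documents as supported.
BYTES_TYPE_BY_EXTENSION = {
    ".jpg": 1,
    ".jpeg": 1,
    ".bmp": 2,
    ".png": 3,
    ".svg": 4,
}


def build_byte_item(filename, data):
    """Return a dict usable as byte_item for the given image bytes."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in BYTES_TYPE_BY_EXTENSION:
        raise ValueError("unsupported image type: %r" % (extension,))
    return {"type": BYTES_TYPE_BY_EXTENSION[extension], "data": data}


def build_redaction_configs(info_type_names):
    """Return one redaction config per infoType name."""
    return [{"info_type": {"name": name}} for name in info_type_names]
```

The two results would then be supplied as byte_item= and image_redaction_configs= when creating a CloudDLPRedactImageOperator task.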
-
class
airflow.contrib.operators.gcp_dlp_operator.CloudDLPReidentifyContentOperator(project_id=None, reidentify_config=None, inspect_config=None, item=None, inspect_template_name=None, reidentify_template_name=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Re-identifies content that has been de-identified.
- Parameters
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
reidentify_config (dict or google.cloud.dlp_v2.types.DeidentifyConfig) – (Optional) Configuration for the re-identification of the content item.
inspect_config (dict or google.cloud.dlp_v2.types.InspectConfig) – (Optional) Configuration for the inspector.
item (dict or google.cloud.dlp_v2.types.ContentItem) – (Optional) The item to re-identify. Will be treated as text.
inspect_template_name (str) – (Optional) Template to use. Any configuration directly specified in inspect_config will override the values set in the template.
reidentify_template_name (str) – (Optional) Template to use. References an instance of DeidentifyTemplate. Any configuration directly specified in reidentify_config or inspect_config will override the values set in the template.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
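Re-identification only works for reversible transformations, so reidentify_config and inspect_config have to agree on a surrogate infoType. The sketch below shows one way the three dict arguments could be put together, assuming the content was de-identified with a deterministic crypto transformation and an unwrapped key. The helper name, the surrogate name and the key handling are assumptions for illustration, and the dict layout follows the DeidentifyConfig and InspectConfig messages of the DLP v2 API as understood here.

```python
def build_reidentify_args(surrogate_name, key_bytes, text):
    """Return (reidentify_config, inspect_config, item) as plain dicts."""
    # An unwrapped key is expected to be a 128, 192 or 256 bit AES key.
    if len(key_bytes) not in (16, 24, 32):
        raise ValueError("key must be 16, 24 or 32 bytes long")

    surrogate = {"name": surrogate_name}
    reidentify_config = {
        "info_type_transformations": {
            "transformations": [
                {
                    "primitive_transformation": {
                        "crypto_deterministic_config": {
                            "crypto_key": {"unwrapped": {"key": key_bytes}},
                            "surrogate_info_type": surrogate,
                        }
                    }
                }
            ]
        }
    }
    # The inspector must be told to look for the surrogate tokens.
    inspect_config = {
        "custom_info_types": [{"info_type": surrogate, "surrogate_type": {}}]
    }
    item = {"value": text}
    return reidentify_config, inspect_config, item
```

In practice the key would more likely come from a secrets backend or be KMS-wrapped than appear in DAG code; the unwrapped form is used here only to keep the sketch short.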
-
class
airflow.contrib.operators.gcp_dlp_operator.CloudDLPUpdateDeidentifyTemplateOperator(template_id, organization_id=None, project_id=None, deidentify_template=None, update_mask=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Updates the DeidentifyTemplate.
- Parameters
template_id (str) – The ID of the deidentify template to be updated.
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
deidentify_template (dict or google.cloud.dlp_v2.types.DeidentifyTemplate) – New DeidentifyTemplate value.
update_mask (dict or google.cloud.dlp_v2.types.FieldMask) – Mask to control which fields get updated.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
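Because organization_id and project_id describe alternative parents, a DAG author may want to guard against setting both. The sketch below is one possible guard, assuming that passing both is a mistake. It is a convention enforced by this hypothetical helper, not necessarily the operator's own validation. The function names and task id are illustrative.

```python
def resolve_parent(organization_id=None, project_id=None):
    """Return operator kwargs naming exactly one parent, or none."""
    if organization_id and project_id:
        raise ValueError("set organization_id or project_id, not both")
    if organization_id:
        return {"organization_id": organization_id}
    if project_id:
        return {"project_id": project_id}
    # Neither given: leave both unset.
    return {}


def make_update_task(dag, template_id, new_template, **parent):
    """Create the update task for a DAG (hypothetical wiring)."""
    # Deferred import: only needed when the task is actually created.
    from airflow.contrib.operators.gcp_dlp_operator import (
        CloudDLPUpdateDeidentifyTemplateOperator,
    )

    return CloudDLPUpdateDeidentifyTemplateOperator(
        task_id="update_deidentify_template",  # hypothetical task id
        template_id=template_id,
        deidentify_template=new_template,
        dag=dag,
        **resolve_parent(**parent)
    )
```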
-
class
airflow.contrib.operators.gcp_dlp_operator.CloudDLPUpdateInspectTemplateOperator(template_id, organization_id=None, project_id=None, inspect_template=None, update_mask=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Updates the InspectTemplate.
- Parameters
template_id (str) – The ID of the inspect template to be updated.
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
inspect_template (dict or google.cloud.dlp_v2.types.InspectTemplate) – New InspectTemplate value.
update_mask (dict or google.cloud.dlp_v2.types.FieldMask) – Mask to control which fields get updated.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
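The update_mask argument names the fields of the template that should change. The sketch below derives the mask from the top-level keys of a partial template so the two cannot drift apart. It assumes the dict form of a FieldMask is {"paths": [...]}, and the helper name partial_update and the example values are hypothetical.

```python
def partial_update(changes):
    """Return (inspect_template, update_mask) for the changed top-level fields."""
    if not changes:
        raise ValueError("at least one field must change")
    # One mask path per top-level field present in the partial template.
    update_mask = {"paths": sorted(changes)}
    return dict(changes), update_mask


inspect_template, update_mask = partial_update(
    {
        "display_name": "Email scanner",  # hypothetical display name
        "inspect_config": {"info_types": [{"name": "EMAIL_ADDRESS"}]},
    }
)
```

The two values would then be passed as inspect_template= and update_mask= to CloudDLPUpdateInspectTemplateOperator.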
-
class
airflow.contrib.operators.gcp_dlp_operator.CloudDLPUpdateJobTriggerOperator(job_trigger_id, project_id=None, job_trigger=None, update_mask=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Updates a job trigger.
- Parameters
job_trigger_id (str) – The ID of the DLP job trigger to be updated.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. If set to None or missing, the default project_id from the GCP connection is used.
job_trigger (dict or google.cloud.dlp_v2.types.JobTrigger) – New JobTrigger value.
update_mask (dict or google.cloud.dlp_v2.types.FieldMask) – Mask to control which fields get updated.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
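A common use of this operator is changing how often a trigger fires. The sketch below builds a job_trigger dict containing only a new schedule, together with a matching update_mask. The helper name is hypothetical, and the layout (triggers, schedule, recurrence_period_duration with a seconds field) is assumed from the JobTrigger message of the DLP v2 API; any limits the API places on the period are not checked here.

```python
from datetime import timedelta


def scheduled_trigger(period):
    """Return (job_trigger, update_mask) that reschedule a trigger."""
    seconds = int(period.total_seconds())
    if seconds <= 0:
        raise ValueError("period must be positive")
    job_trigger = {
        "triggers": [
            {"schedule": {"recurrence_period_duration": {"seconds": seconds}}}
        ]
    }
    # Only the triggers field is meant to change.
    update_mask = {"paths": ["triggers"]}
    return job_trigger, update_mask


weekly_trigger, weekly_mask = scheduled_trigger(timedelta(days=7))
```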
-
class
airflow.contrib.operators.gcp_dlp_operator.CloudDLPUpdateStoredInfoTypeOperator(stored_info_type_id, organization_id=None, project_id=None, config=None, update_mask=None, retry=None, timeout=None, metadata=None, gcp_conn_id='google_cloud_default', *args, **kwargs)[source]¶ Bases:
airflow.models.BaseOperator
Updates the stored infoType by creating a new version.
- Parameters
stored_info_type_id (str) – The ID of the stored info type to be updated.
organization_id (str) – (Optional) The organization ID. Required if the parent resource is an organization.
project_id (str) – (Optional) Google Cloud Platform project ID where the DLP Instance exists. Only set this field if the parent resource is a project instead of an organization.
config (dict or google.cloud.dlp_v2.types.StoredInfoTypeConfig) – Updated configuration for the storedInfoType. If not provided, a new version of the storedInfoType will be created with the existing configuration.
update_mask (dict or google.cloud.dlp_v2.types.FieldMask) – Mask to control which fields get updated.
retry (google.api_core.retry.Retry) – (Optional) A retry object used to retry requests. If None is specified, requests will not be retried.
timeout (float) – (Optional) The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.
metadata (sequence[tuple[str, str]]) – (Optional) Additional metadata that is provided to the method.
gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud Platform.
- Return type
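For a stored infoType backed by a large custom dictionary, the config argument might look like the dict built below. This is a sketch under assumptions: the helper name, bucket paths and display name are hypothetical, and the field layout (large_custom_dictionary with cloud_storage_file_set and output_path) is assumed from the StoredInfoTypeConfig message of the DLP v2 API.

```python
def large_dictionary_config(source_url, output_path, display_name=None):
    """Return a StoredInfoTypeConfig-style dict for a large custom dictionary."""
    for location in (source_url, output_path):
        if not location.startswith("gs://"):
            raise ValueError("expected a gs:// location: %r" % (location,))
    config = {
        "large_custom_dictionary": {
            # Where the source word lists are read from.
            "cloud_storage_file_set": {"url": source_url},
            # Where the generated dictionary is written.
            "output_path": {"path": output_path},
        }
    }
    if display_name is not None:
        config["display_name"] = display_name
    return config
```

Leaving config out of the operator call altogether corresponds to the documented case where a new version is created from the existing configuration.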