airflow.contrib.operators.gcp_transfer_operator
Module Contents
-
class airflow.contrib.operators.gcp_transfer_operator.TransferJobPreprocessor(body, aws_conn_id='aws_default')
-
class airflow.contrib.operators.gcp_transfer_operator.GcpTransferServiceJobCreateOperator(body, aws_conn_id='aws_default', gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs)
Bases: airflow.models.BaseOperator
Creates a transfer job that runs periodically.
Warning
This operator is NOT idempotent. Each run creates a new transfer job in Google Cloud Platform, so running it repeatedly produces multiple jobs.
See also
For more information on how to use this operator, take a look at the guide: GcpTransferServiceJobCreateOperator
- Parameters
body (dict) –
(Required) The request body, as described in https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs/create#request-body, with three additional conveniences:
dates can be given as datetime.date objects
times can be given as datetime.time objects
credentials for Amazon Web Services should be stored in the connection indicated by the aws_conn_id parameter
aws_conn_id (str) – The connection ID used to retrieve credentials for Amazon Web Services.
gcp_conn_id (str) – The connection ID used to connect to Google Cloud Platform.
api_version (str) – API version used (e.g. v1).
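As a rough illustration of the body conveniences listed above, the sketch below builds a request body that carries datetime.date and datetime.time values and contains no AWS credentials, leaving those to the aws_conn_id connection. The helper names, bucket names, and project ID are hypothetical; the field names follow the REST reference linked above. The Airflow import sits inside a function only so that the body builder can be run without Airflow installed.

```python
import datetime


def build_create_body(project_id, s3_bucket, gcs_bucket, start_date):
    """Build a transferJobs.create request body (hypothetical helper)."""
    return {
        "description": "Nightly S3 to GCS sync",
        "status": "ENABLED",
        "projectId": project_id,
        "schedule": {
            # Plain datetime objects instead of the API's nested dicts.
            "scheduleStartDate": start_date,
            "startTimeOfDay": datetime.time(2, 30),
        },
        "transferSpec": {
            # No access key here: credentials come from aws_conn_id.
            "awsS3DataSource": {"bucketName": s3_bucket},
            "gcsDataSink": {"bucketName": gcs_bucket},
        },
    }


def make_create_task(dag, body):
    """Wrap the body in the operator (requires Airflow to be installed)."""
    from airflow.contrib.operators.gcp_transfer_operator import (
        GcpTransferServiceJobCreateOperator,
    )

    return GcpTransferServiceJobCreateOperator(
        task_id="create_transfer_job",
        body=body,
        aws_conn_id="aws_default",
        gcp_conn_id="google_cloud_default",
        dag=dag,
    )


body = build_create_body(
    "my-gcp-project", "my-s3-bucket", "my-gcs-bucket", datetime.date(2019, 1, 1)
)
```

Because the operator is not idempotent, a task built this way would create a new job on every run, including retries.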
-
class airflow.contrib.operators.gcp_transfer_operator.GcpTransferServiceJobUpdateOperator(job_name, body, aws_conn_id='aws_default', gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs)
Bases: airflow.models.BaseOperator
Updates a transfer job that runs periodically.
See also
For more information on how to use this operator, take a look at the guide: GcpTransferServiceJobUpdateOperator
- Parameters
job_name (str) – (Required) Name of the job to be updated.
body (dict) –
(Required) The request body, as described in https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs/patch#request-body, with three additional conveniences:
dates can be given as datetime.date objects
times can be given as datetime.time objects
credentials for Amazon Web Services should be stored in the connection indicated by the aws_conn_id parameter
aws_conn_id (str) – The connection ID used to retrieve credentials for Amazon Web Services.
gcp_conn_id (str) – The connection ID used to connect to Google Cloud Platform.
api_version (str) – API version used (e.g. v1).
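A minimal sketch of an update follows, assuming only the job description is changed. The patch body names the fields to update through updateTransferJobFieldMask, as described in the REST reference linked above. The helper names, project ID, and job name are hypothetical, and the Airflow import is kept inside a function so the body builder runs on its own.

```python
def build_patch_body(project_id, description):
    """Build a transferJobs.patch request body (hypothetical helper)."""
    return {
        "projectId": project_id,
        # Only the fields named in the mask are applied by the service.
        "transferJob": {"description": description},
        "updateTransferJobFieldMask": "description",
    }


def make_update_task(dag, job_name, body):
    """Wrap the body in the operator (requires Airflow to be installed)."""
    from airflow.contrib.operators.gcp_transfer_operator import (
        GcpTransferServiceJobUpdateOperator,
    )

    return GcpTransferServiceJobUpdateOperator(
        task_id="update_transfer_job",
        job_name=job_name,  # e.g. "transferJobs/1234567890" (made-up ID)
        body=body,
        dag=dag,
    )


patch_body = build_patch_body("my-gcp-project", "Nightly sync (updated)")
```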
-
class airflow.contrib.operators.gcp_transfer_operator.GcpTransferServiceJobDeleteOperator(job_name, gcp_conn_id='google_cloud_default', api_version='v1', project_id=None, *args, **kwargs)
Bases: airflow.models.BaseOperator
Deletes a transfer job. This is a soft delete: after a transfer job is deleted, the job and all of its transfer executions are subject to garbage collection. Transfer jobs become eligible for garbage collection 30 days after the soft delete.
See also
For more information on how to use this operator, take a look at the guide: GcpTransferServiceJobDeleteOperator
- Parameters
job_name (str) – (Required) Name of the transfer job to be deleted.
project_id (str) – (Optional) The ID of the project that owns the transfer job. If set to None or missing, the default project_id from the GCP connection is used.
gcp_conn_id (str) – The connection ID used to connect to Google Cloud Platform.
api_version (str) – API version used (e.g. v1).
-
class airflow.contrib.operators.gcp_transfer_operator.GcpTransferServiceOperationGetOperator(operation_name, gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs)
Bases: airflow.models.BaseOperator
Gets the latest state of a long-running operation in Google Storage Transfer Service.
See also
For more information on how to use this operator, take a look at the guide: GcpTransferServiceOperationGetOperator
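- Parameters
operation_name (str) – (Required) Name of the transfer operation.
gcp_conn_id (str) – The connection ID used to connect to Google Cloud Platform.
api_version (str) – API version used (e.g. v1).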
-
class airflow.contrib.operators.gcp_transfer_operator.GcpTransferServiceOperationsListOperator(filter, gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs)
Bases: airflow.models.BaseOperator
Lists long-running operations in Google Storage Transfer Service that match the specified filter.
See also
For more information on how to use this operator, take a look at the guide: GcpTransferServiceOperationsListOperator
- Parameters
filter (dict) – (Required) A request filter, as described in https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs/list#body.QUERY_PARAMETERS.filter
gcp_conn_id (str) – The connection ID used to connect to Google Cloud Platform.
api_version (str) – API version used (e.g. v1).
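The sketch below shows one way a filter dict might be assembled. The key names project_id and job_names are an assumption: the API reference has used both snake_case and camelCase spellings for these filter keys, so check the linked reference and your Airflow version before relying on them. The helper names, project ID, and job name are hypothetical.

```python
def build_operations_filter(project_id, job_names):
    """Build a filter for listing transfer operations (hypothetical helper).

    The key spellings are an assumption; verify them against the API
    reference for the version in use.
    """
    return {"project_id": project_id, "job_names": list(job_names)}


def make_list_task(dag, request_filter):
    """Wrap the filter in the operator (requires Airflow to be installed)."""
    from airflow.contrib.operators.gcp_transfer_operator import (
        GcpTransferServiceOperationsListOperator,
    )

    return GcpTransferServiceOperationsListOperator(
        task_id="list_transfer_operations",
        filter=request_filter,
        dag=dag,
    )


request_filter = build_operations_filter(
    "my-gcp-project", ["transferJobs/1234567890"]  # made-up job name
)
```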
-
class airflow.contrib.operators.gcp_transfer_operator.GcpTransferServiceOperationPauseOperator(operation_name, gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs)
Bases: airflow.models.BaseOperator
Pauses a transfer operation in Google Storage Transfer Service.
See also
For more information on how to use this operator, take a look at the guide: GcpTransferServiceOperationPauseOperator
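- Parameters
operation_name (str) – (Required) Name of the transfer operation.
gcp_conn_id (str) – The connection ID used to connect to Google Cloud Platform.
api_version (str) – API version used (e.g. v1).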
-
class airflow.contrib.operators.gcp_transfer_operator.GcpTransferServiceOperationResumeOperator(operation_name, gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs)
Bases: airflow.models.BaseOperator
Resumes a transfer operation in Google Storage Transfer Service.
See also
For more information on how to use this operator, take a look at the guide: GcpTransferServiceOperationResumeOperator
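- Parameters
operation_name (str) – (Required) Name of the transfer operation.
gcp_conn_id (str) – The connection ID used to connect to Google Cloud Platform.
api_version (str) – API version used (e.g. v1).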
-
class airflow.contrib.operators.gcp_transfer_operator.GcpTransferServiceOperationCancelOperator(operation_name, api_version='v1', gcp_conn_id='google_cloud_default', *args, **kwargs)
Bases: airflow.models.BaseOperator
Cancels a transfer operation in Google Storage Transfer Service.
See also
For more information on how to use this operator, take a look at the guide: GcpTransferServiceOperationCancelOperator
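- Parameters
operation_name (str) – (Required) Name of the transfer operation.
api_version (str) – API version used (e.g. v1).
gcp_conn_id (str) – The connection ID used to connect to Google Cloud Platform.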
-
class airflow.contrib.operators.gcp_transfer_operator.S3ToGoogleCloudStorageTransferOperator(s3_bucket, gcs_bucket, project_id=None, aws_conn_id='aws_default', gcp_conn_id='google_cloud_default', delegate_to=None, description=None, schedule=None, object_conditions=None, transfer_options=None, wait=True, timeout=None, *args, **kwargs)
Bases: airflow.models.BaseOperator
Synchronizes an S3 bucket with a Google Cloud Storage bucket using the GCP Storage Transfer Service.
Warning
This operator is NOT idempotent. Each run creates a new transfer job in Google Cloud Platform, so running it repeatedly produces multiple jobs.
Example:
s3_to_gcs_transfer_op = S3ToGoogleCloudStorageTransferOperator(
    task_id='s3_to_gcs_transfer_example',
    s3_bucket='my-s3-bucket',
    project_id='my-gcp-project',
    gcs_bucket='my-gcs-bucket',
    dag=my_dag)
- Parameters
s3_bucket (str) – The S3 bucket that contains the source objects. (templated)
gcs_bucket (str) – The destination Google Cloud Storage bucket where you want to store the files. (templated)
project_id (str) – Optional ID of the Google Cloud Platform Console project that owns the job.
aws_conn_id (str) – The connection ID for the source S3 bucket.
gcp_conn_id (str) – The destination connection ID to use when connecting to Google Cloud Storage.
delegate_to (str) – The account to impersonate, if any. For this to work, the service account making the request must have domain-wide delegation enabled.
description (str) – Optional transfer service job description.
schedule (dict) –
Optional transfer service schedule. If not set, the transfer job runs once, as soon as the operator runs. The format is described at https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs, with two additional conveniences:
dates can be passed as datetime.date objects
times can be passed as datetime.time objects
object_conditions (dict) – Optional transfer service object conditions; see https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec
transfer_options (dict) – Optional transfer service transfer options; see https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec
wait (bool) – Wait for transfer to finish
timeout (int) – Time to wait for the operation to end in seconds
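To make the schedule conveniences concrete, the sketch below converts a schedule written with datetime.date and datetime.time values into the nested year/month/day and hours/minutes/seconds dicts that the REST API defines for dates and times of day. The helper functions are hypothetical and only illustrate the equivalence between the two forms; they are not the operator's own implementation.

```python
import datetime


def to_api_date(value):
    """Convert a datetime.date to the REST API's Date dict."""
    return {"year": value.year, "month": value.month, "day": value.day}


def to_api_time(value):
    """Convert a datetime.time to the REST API's TimeOfDay dict."""
    return {"hours": value.hour, "minutes": value.minute, "seconds": value.second}


def to_api_schedule(schedule):
    """Return a copy of the schedule with dates and times in API form."""
    converted = {}
    for key, value in schedule.items():
        if isinstance(value, datetime.date):
            converted[key] = to_api_date(value)
        elif isinstance(value, datetime.time):
            converted[key] = to_api_time(value)
        else:
            converted[key] = value  # already in API form
    return converted


# The form accepted by the schedule parameter:
schedule = {
    "scheduleStartDate": datetime.date(2019, 1, 1),
    "scheduleEndDate": datetime.date(2019, 1, 31),
    "startTimeOfDay": datetime.time(2, 30),
}

# The equivalent raw REST form:
raw_schedule = to_api_schedule(schedule)
```

Passing the first form as schedule keeps DAG code shorter and lets dates be computed with ordinary datetime arithmetic.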
-
class airflow.contrib.operators.gcp_transfer_operator.GoogleCloudStorageToGoogleCloudStorageTransferOperator(source_bucket, destination_bucket, project_id=None, gcp_conn_id='google_cloud_default', delegate_to=None, description=None, schedule=None, object_conditions=None, transfer_options=None, wait=True, timeout=None, *args, **kwargs)
Bases: airflow.models.BaseOperator
Copies objects from one bucket to another using the GCP Storage Transfer Service.
Warning
This operator is NOT idempotent. Each run creates a new transfer job in Google Cloud Platform, so running it repeatedly produces multiple jobs.
Example:
gcs_to_gcs_transfer_op = GoogleCloudStorageToGoogleCloudStorageTransferOperator(
    task_id='gcs_to_gcs_transfer_example',
    source_bucket='my-source-bucket',
    destination_bucket='my-destination-bucket',
    project_id='my-gcp-project',
    dag=my_dag)
- Parameters
source_bucket (str) – The source Google Cloud Storage bucket that contains the objects. (templated)
destination_bucket (str) – The destination Google Cloud Storage bucket to which the objects are copied. (templated)
project_id (str) – The ID of the Google Cloud Platform Console project that owns the job
gcp_conn_id (str) – Optional connection ID to use when connecting to Google Cloud Storage.
delegate_to (str) – The account to impersonate, if any. For this to work, the service account making the request must have domain-wide delegation enabled.
description (str) – Optional transfer service job description.
schedule (dict) –
Optional transfer service schedule. If not set, the transfer job runs once, as soon as the operator runs. See https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs for the format, which is extended with two additional conveniences:
dates can be passed as datetime.date objects
times can be passed as datetime.time objects
object_conditions (dict) – Optional transfer service object conditions; see https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#ObjectConditions
transfer_options (dict) – Optional transfer service transfer options; see https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#TransferOptions
wait (bool) – Wait for transfer to finish; defaults to True
timeout (int) – Time to wait for the operation to end in seconds
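The optional parameters can be combined as in the sketch below, which restricts the copy to one prefix, overwrites objects that already exist in the destination, and waits up to an hour for the transfer to finish. The helper names, bucket names, prefix, and timeout are hypothetical; the keys inside object_conditions and transfer_options follow the TransferSpec reference linked above. The Airflow import is kept inside a function so the argument builder runs without Airflow installed.

```python
def build_gcs_transfer_kwargs(source_bucket, destination_bucket):
    """Collect operator arguments for a filtered copy (hypothetical helper)."""
    return {
        "source_bucket": source_bucket,
        "destination_bucket": destination_bucket,
        # Copy only objects whose names start with this prefix.
        "object_conditions": {"includePrefixes": ["logs/2019/"]},
        # Replace objects that already exist in the destination bucket.
        "transfer_options": {"overwriteObjectsAlreadyExistingInSink": True},
        "wait": True,      # block until the transfer finishes
        "timeout": 3600,   # seconds
    }


def make_gcs_to_gcs_task(dag, operator_kwargs):
    """Create the operator (requires Airflow to be installed)."""
    from airflow.contrib.operators.gcp_transfer_operator import (
        GoogleCloudStorageToGoogleCloudStorageTransferOperator,
    )

    return GoogleCloudStorageToGoogleCloudStorageTransferOperator(
        task_id="gcs_to_gcs_filtered_copy",
        dag=dag,
        **operator_kwargs
    )


operator_kwargs = build_gcs_transfer_kwargs(
    "my-source-bucket", "my-destination-bucket"
)
```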