Google Cloud Storage to Google Drive Transfer Operators

Google offers two services for storing data. Google Cloud Storage is used to store large volumes of data from various applications. Google Drive is used to store everyday data, including documents and photos. Google Cloud Storage integrates tightly with other Google Cloud services. Google Drive has built-in mechanisms that facilitate group work, e.g. a document editor and file sharing.

Prerequisite Tasks

Operator

Transferring files from Google Cloud Storage to Google Drive is performed with the GCSToGoogleDriveOperator operator.

You can use Jinja templating with the source_bucket, source_object, destination_object, and impersonation_chain parameters, which allows you to determine their values dynamically.
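As a minimal sketch of templating, the keyword arguments below use Airflow's built-in `{{ ds }}` macro (the logical date as YYYY-MM-DD) to select a date-specific object. The task_id, bucket name, and object paths are hypothetical placeholders, not values from the example DAG; the dictionary is meant to be unpacked into the operator inside a DAG file.

```python
# Hypothetical keyword arguments for GCSToGoogleDriveOperator.
# The bucket name and object paths are placeholders chosen for illustration.
templated_copy_kwargs = {
    "task_id": "copy_daily_export",
    "source_bucket": "my-sales-bucket",
    # "{{ ds }}" is rendered by Airflow at run time as the logical date (YYYY-MM-DD).
    "source_object": "sales/{{ ds }}.avro",
    "destination_object": "copied_sales/{{ ds }}-backup.avro",
}

# Inside a DAG file, with the Google provider installed, this could be used as:
#   copy_daily_export = GCSToGoogleDriveOperator(**templated_copy_kwargs)
```

Because the strings are rendered only when the task runs, the same task definition can pick up a different object on each DAG run.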

Copy single files

The following operator copies a single file.

tests/system/providers/google/cloud/gcs/example_gcs_to_gdrive.py

copy_single_file = GCSToGoogleDriveOperator(
    task_id="copy_single_file",
    source_bucket=GCS_TO_GDRIVE_BUCKET,
    source_object="sales/january.avro",
    destination_object="copied_sales/january-backup.avro",
)

Copy multiple files

The following operator copies multiple files by using a wildcard in source_object.

tests/system/providers/google/cloud/gcs/example_gcs_to_gdrive.py

copy_files = GCSToGoogleDriveOperator(
    task_id="copy_files",
    source_bucket=GCS_TO_GDRIVE_BUCKET,
    source_object="sales/*",
    destination_object="copied_sales/",
)
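To reason about where wildcard matches end up, the sketch below models the name mapping in plain Python. It assumes the operator accepts a single `*`, treats the text before it as a prefix and the text after it as a suffix, and replaces the prefix with destination_object; this is an approximation for illustration, so verify it against the provider version you use. The helper `plan_destinations` and the sample object names are hypothetical.

```python
from typing import Dict, List, Optional

WILDCARD = "*"


def plan_destinations(
    source_object: str,
    destination_object: Optional[str],
    listed: List[str],
) -> Dict[str, str]:
    """Map each matching GCS object name to its assumed Google Drive path."""
    # Assumption: exactly one wildcard is supported in source_object.
    if source_object.count(WILDCARD) != 1:
        raise ValueError("expected exactly one '*' in source_object")
    prefix, suffix = source_object.split(WILDCARD, 1)

    plan: Dict[str, str] = {}
    for name in listed:
        if not (name.startswith(prefix) and name.endswith(suffix)):
            continue  # not matched by the wildcard pattern
        if destination_object is None:
            # Assumption: without destination_object the object name is kept.
            plan[name] = name
        else:
            # Assumption: only the leading prefix is swapped for destination_object.
            plan[name] = name.replace(prefix, destination_object, 1)
    return plan


# Hypothetical bucket listing.
bucket_objects = ["sales/january.avro", "sales/february.avro", "reports/q1.pdf"]
print(plan_destinations("sales/*", "copied_sales/", bucket_objects))
```

Under these assumptions, `sales/january.avro` maps to `copied_sales/january.avro`, and `reports/q1.pdf` is skipped because it does not start with the `sales/` prefix.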

Move files

Setting the move_object parameter to True moves the files instead of copying them: after a file is copied to Google Drive, the original is deleted from the bucket.

tests/system/providers/google/cloud/gcs/example_gcs_to_gdrive.py

move_files = GCSToGoogleDriveOperator(
    task_id="move_files",
    source_bucket=GCS_TO_GDRIVE_BUCKET,
    source_object="sales/*.avro",
    move_object=True,
)
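The copy-then-delete order can be illustrated with a small in-memory model. This is only a sketch: the dictionaries stand in for the bucket and for Google Drive, the helper `move_object` is hypothetical, and it assumes that an object keeps its name when no destination_object is given.

```python
def move_object(bucket: dict, drive: dict, name: str, destination: str) -> None:
    """Copy one object to the in-memory drive, then delete it from the bucket."""
    # Step 1: copy. A missing source raises KeyError before anything is deleted.
    drive[destination] = bucket[name]
    # Step 2: delete the original only after the copy has completed.
    del bucket[name]


# Hypothetical bucket contents; only the .avro object matches "sales/*.avro".
bucket = {"sales/january.avro": b"jan", "sales/notes.txt": b"notes"}
drive = {}

matches = [n for n in bucket if n.startswith("sales/") and n.endswith(".avro")]
for name in matches:
    # Assumption: with no destination_object, the object name is reused.
    move_object(bucket, drive, name, name)
```

After the loop, the matching object exists only in `drive`, while `sales/notes.txt` remains in `bucket` because it was never matched.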

Reference

For further information, look at:
