Google Drive to Google Cloud Storage Transfer Operator¶
Google offers two services for storing data. Google Cloud Storage stores large volumes of data from various applications and integrates closely with other Google Cloud services. Google Drive stores everyday data, such as documents and photos, and has built-in mechanisms that facilitate group work, e.g. a document editor and file sharing.
Prerequisite Tasks¶
To use these operators, you must do a few things:
Select or create a Cloud Platform project using the Cloud Console.
Enable billing for your project, as described in the Google Cloud documentation.
Enable the API, as described in the Cloud Console documentation.
Install API libraries via pip.
pip install 'apache-airflow[google]'
Detailed information is available for Installation.
Operator¶
Transferring files from Google Drive to Google Cloud Storage is performed with the GoogleDriveToGCSOperator operator.
Copy single files¶
The following Operator copies a single file from a shared Google Drive folder to a Google Cloud Storage Bucket.
Note that you can transfer a file from the root folder of a shared drive by passing the id of the shared drive to both the folder_id and drive_id parameters.
upload_gdrive_to_gcs = GoogleDriveToGCSOperator(
    task_id="upload_gdrive_object_to_gcs",
    gcp_conn_id=CONNECTION_ID,
    folder_id=FOLDER_ID,
    file_name=DRIVE_FILE_NAME,
    bucket_name=BUCKET_NAME,
    object_name=OBJECT,
)
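The shared-drive case mentioned in the note above can be sketched as follows. This is a minimal illustration, not the provider's own example: the helper names `shared_drive_root_kwargs` and `build_shared_drive_task` and the task id are hypothetical, and the operator import is deferred into the builder function so the module can be loaded even where Airflow is not installed.

```python
from typing import Any, Dict


def shared_drive_root_kwargs(shared_drive_id: str) -> Dict[str, str]:
    """Return the arguments that point the operator at a shared drive's root folder."""
    # The same id is passed twice: once as the folder and once as the drive.
    return {"folder_id": shared_drive_id, "drive_id": shared_drive_id}


def build_shared_drive_task(
    shared_drive_id: str, file_name: str, bucket_name: str, object_name: str
) -> Any:
    """Create the transfer task (hypothetical helper, requires the Google provider)."""
    # Deferred import: only needed when the task is actually built.
    from airflow.providers.google.cloud.transfers.gdrive_to_gcs import (
        GoogleDriveToGCSOperator,
    )

    return GoogleDriveToGCSOperator(
        task_id="upload_shared_drive_root_file_to_gcs",  # hypothetical task id
        file_name=file_name,
        bucket_name=bucket_name,
        object_name=object_name,
        **shared_drive_root_kwargs(shared_drive_id),
    )
```

Keeping the duplicated id in one small helper makes it harder to set folder_id and drive_id to different values by accident.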
You can use Jinja templating with the bucket_name, object_name, folder_id, file_name, drive_id, and impersonation_chain parameters, which allows you to dynamically determine values.
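As a rough sketch of what templated values might look like, the snippet below builds the file and object names from the run's logical date using Airflow's built-in `ds` and `ds_nodash` template variables. The file naming scheme, the `TEMPLATED_ARGS` dictionary, the `build_templated_task` helper, and the task id are all assumptions made for illustration.

```python
from typing import Any, Dict

# Template strings for two of the templated parameters. Airflow renders them
# at run time: "{{ ds }}" becomes YYYY-MM-DD and "{{ ds_nodash }}" becomes YYYYMMDD.
TEMPLATED_ARGS: Dict[str, str] = {
    "file_name": "report_{{ ds_nodash }}.csv",  # hypothetical Drive file name
    "object_name": "drive_exports/{{ ds }}/report.csv",  # hypothetical GCS path
}


def build_templated_task(folder_id: str, bucket_name: str) -> Any:
    """Create a transfer task whose names depend on the run date (hypothetical helper)."""
    # Deferred import: only needed when the task is actually built.
    from airflow.providers.google.cloud.transfers.gdrive_to_gcs import (
        GoogleDriveToGCSOperator,
    )

    return GoogleDriveToGCSOperator(
        task_id="upload_dated_report_to_gcs",  # hypothetical task id
        folder_id=folder_id,
        bucket_name=bucket_name,
        **TEMPLATED_ARGS,
    )
```

The strings are stored unrendered on the operator; rendering happens when the task instance runs, so each run resolves its own date.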
Reference¶
For further information, look at: