Google Cloud Storage Transfer Operator to SFTP

Google Cloud Storage (GCS) is Google's service for storing large amounts of data from various applications. SFTP (SSH File Transfer Protocol) is a secure file transfer protocol that runs over SSH and supports the full security and authentication functionality of SSH.

Prerequisite Tasks

To use these operators, you must do a few things:

Operator

Transferring files from Google Cloud Storage to SFTP is performed with the GCSToSFTPOperator operator.

Use Jinja templating with source_bucket, source_object, destination_path, impersonation_chain to define values dynamically.
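To make the effect of templating concrete, the sketch below shows what templated parameter values look like before and after rendering. It uses a small regex-based stand-in for Jinja so that it runs with the standard library alone; the `render` helper, the example paths, and the context values are assumptions made for illustration, and Airflow itself performs the real rendering with Jinja before the operator executes.

```python
import re

# Matches simple "{{ name }}" placeholders only; real Jinja supports far more.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, context: dict) -> str:
    """Substitute each {{ name }} placeholder with its value from context."""
    return PLACEHOLDER.sub(lambda match: str(context[match.group(1)]), template)


# Values as they might be passed to the operator's templated parameters
# (hypothetical paths).
templated_kwargs = {
    "source_object": "exports/{{ ds }}/report.csv",
    "destination_path": "/upload/report_{{ ds_nodash }}.csv",
}

# Hypothetical context for a run whose logical date is 2024-01-01.
context = {"ds": "2024-01-01", "ds_nodash": "20240101"}

rendered = {name: render(value, context) for name, value in templated_kwargs.items()}
print(rendered)
```

In a DAG, the same template strings would be passed directly as the operator's arguments, and each task run would receive values rendered for its own date.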

Copying a single file

The following operator copies a single file.

tests/system/providers/google/cloud/transfers/example_gcs_to_sftp.py

copy_file_from_gcs_to_sftp = GCSToSFTPOperator(
    task_id="file-copy-gsc-to-sftp",
    sftp_conn_id=SFTP_CONN_ID,
    source_bucket=BUCKET_NAME,
    source_object=GCS_SRC_FILE,
    destination_path=DESTINATION_PATH_1,
)

Moving a single file

To move the file, set the move_object parameter to True. Once the file has been copied to the SFTP server, the original file is deleted from Google Cloud Storage. The destination_path parameter defines the full path of the file on the SFTP server.

tests/system/providers/google/cloud/transfers/example_gcs_to_sftp.py

move_file_from_gcs_to_sftp = GCSToSFTPOperator(
    task_id="file-move-gsc-to-sftp",
    sftp_conn_id=SFTP_CONN_ID,
    source_bucket=BUCKET_NAME,
    source_object=GCS_SRC_FILE_IN_DIR,
    destination_path=DESTINATION_PATH_1,
    move_object=True,
)
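The difference between copying and moving can be modelled in a few lines. The sketch below is not the operator's implementation: it uses plain dictionaries as stand-ins for the bucket and the SFTP server, and the `transfer` function is a hypothetical helper. It only illustrates the order described above, where the source object is removed after the upload and only when move_object is set.

```python
def transfer(bucket: dict, sftp_server: dict, source_object: str,
             destination_path: str, move_object: bool = False) -> None:
    """Copy one object to the SFTP stand-in, deleting the source only on a move."""
    sftp_server[destination_path] = bucket[source_object]  # upload first
    if move_object:
        del bucket[source_object]  # delete only after the upload has happened


# In-memory stand-ins for a GCS bucket and an SFTP server (illustrative only).
bucket = {"dir/file.bin": b"payload"}
sftp_server = {}

# A plain copy leaves the source object in the bucket.
transfer(bucket, sftp_server, "dir/file.bin", "/tmp/file.bin")
copied_source_kept = "dir/file.bin" in bucket

# A move removes the source object once it has been uploaded.
transfer(bucket, sftp_server, "dir/file.bin", "/tmp/file.bin", move_object=True)
moved_source_kept = "dir/file.bin" in bucket
```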

Copying a directory

Use a wildcard in the source_object parameter to copy a directory.

tests/system/providers/google/cloud/transfers/example_gcs_to_sftp.py

copy_dir_from_gcs_to_sftp = GCSToSFTPOperator(
    task_id="dir-copy-gsc-to-sftp",
    sftp_conn_id=SFTP_CONN_ID,
    source_bucket=BUCKET_NAME,
    source_object=GCS_SRC_DIR,
    destination_path=DESTINATION_PATH_2,
)
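As a rough illustration of how a wildcard selects objects, the sketch below splits the pattern at the `*` into a prefix and a suffix and keeps the object names that match both. This is an assumption about the general approach, not the provider's code: `match_objects` is a hypothetical helper, the object names are made up, and real GCS listing semantics may differ in detail.

```python
def match_objects(object_names: list[str], source_object: str) -> list[str]:
    """Return the names matching a pattern that contains at most one '*'."""
    if source_object.count("*") > 1:
        raise ValueError("this sketch supports at most one wildcard")
    if "*" not in source_object:
        # No wildcard: only an exact name matches.
        return [name for name in object_names if name == source_object]
    prefix, suffix = source_object.split("*", 1)
    return [
        name
        for name in object_names
        # The length check stops prefix and suffix from overlapping.
        if len(name) >= len(prefix) + len(suffix)
        and name.startswith(prefix)
        and name.endswith(suffix)
    ]


# Hypothetical bucket contents.
objects = ["dir/a.csv", "dir/b.txt", "dir/sub/c.csv", "other/d.csv"]

whole_directory = match_objects(objects, "dir/*")  # everything under dir/
only_csv = match_objects(objects, "dir/*.csv")     # only the .csv objects under dir/
```

A pattern that ends in the wildcard therefore behaves like a directory copy, while a pattern with text after the wildcard narrows the selection to specific files.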

Moving specific files

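Use a wildcard in the source_object parameter to select specific files. The destination_path defines the path that is prefixed to all copied files. As with a single file, the source objects are deleted only when move_object is set to True. The example below sets keep_directory_structure=False, which is expected to place the files under destination_path without recreating the source directory structure.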

tests/system/providers/google/cloud/transfers/example_gcs_to_sftp.py

move_dir_from_gcs_to_sftp = GCSToSFTPOperator(
    task_id="dir-move-gsc-to-sftp",
    sftp_conn_id=SFTP_CONN_ID,
    source_bucket=BUCKET_NAME,
    source_object=GCS_SRC_DIR,
    destination_path=DESTINATION_PATH_3,
    keep_directory_structure=False,
)
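The effect of keep_directory_structure on the final SFTP path can be sketched as follows. This is a simplified model under stated assumptions, not the operator's actual code: `resolve_destination` is a hypothetical helper, the paths are invented, and the assumption is that disabling the option drops the directory part of the source pattern before the name is joined to destination_path.

```python
from pathlib import PurePosixPath


def resolve_destination(source_object: str, destination_path: str,
                        prefix_dir: str = "",
                        keep_directory_structure: bool = True) -> str:
    """Build the SFTP path for one object (simplified model)."""
    relative = PurePosixPath(source_object)
    if not keep_directory_structure:
        if prefix_dir:
            # Drop the directory part of the wildcard pattern.
            relative = relative.relative_to(prefix_dir)
        else:
            # No prefix known: keep only the file name.
            relative = PurePosixPath(relative.name)
    return str(PurePosixPath(destination_path) / relative)


# Hypothetical object name and destination.
kept = resolve_destination("data/2024/a.csv", "/upload", "data/2024")
flattened = resolve_destination("data/2024/a.csv", "/upload", "data/2024",
                                keep_directory_structure=False)
nested = resolve_destination("data/2024/sub/b.csv", "/upload", "data/2024",
                             keep_directory_structure=False)
```

In this model, keeping the structure reproduces the full object name under destination_path, while disabling it keeps only the part of the name below the pattern's directory.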
