Google Cloud Storage Transfer Operator to SFTP

Google Cloud Storage is a Google service for storing large amounts of data from various applications. SFTP (SSH File Transfer Protocol) is a secure file transfer protocol that runs over SSH and supports the full security and authentication functionality of SSH.

Operator

Files are transferred from Google Cloud Storage to SFTP with the GCSToSFTPOperator operator.

Use Jinja templating with source_bucket, source_object, destination_path, and impersonation_chain to define values dynamically.
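
As an illustration, templated fields can carry Airflow macros such as {{ ds }} (the logical date of the run as YYYY-MM-DD), which Airflow renders when the task runs. The sketch below only builds the keyword arguments; the bucket name, paths, and connection ID are placeholders, and in a DAG file the dictionary would be passed as GCSToSFTPOperator(**templated_kwargs). The preview step uses plain string replacement as a stand-in for Jinja rendering.

```python
# Keyword arguments for GCSToSFTPOperator. All values are placeholders.
templated_kwargs = {
    "task_id": "templated-copy-gcs-to-sftp",
    "sftp_conn_id": "sftp_default",
    "source_bucket": "example-bucket",
    # {{ ds }} is the Airflow macro for the run's logical date (YYYY-MM-DD).
    "source_object": "exports/{{ ds }}/report.csv",
    "destination_path": "/upload/{{ ds }}/",
}

# Stand-in for Jinja rendering, only to preview the result for one date.
# Airflow renders the real templates itself when the task runs.
preview = {
    key: value.replace("{{ ds }}", "2021-01-01")
    for key, value in templated_kwargs.items()
}
print(preview["source_object"])
print(preview["destination_path"])
```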

Copying a single file

The following operator copies a single file.

airflow/providers/google/cloud/example_dags/example_gcs_to_sftp.py

copy_file_from_gcs_to_sftp = GCSToSFTPOperator(
    task_id="file-copy-gsc-to-sftp",
    sftp_conn_id=SFTP_CONN_ID,
    source_bucket=BUCKET_SRC,
    source_object=OBJECT_SRC_1,
    destination_path=DESTINATION_PATH_1,
)
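
The snippets on this page refer to module-level constants such as SFTP_CONN_ID and BUCKET_SRC that are not shown here. A minimal sketch of how they might be defined follows. Every value, including the EXAMPLE_GCS_BUCKET environment variable, is an illustrative placeholder and not necessarily what the example DAG file contains.

```python
import os

# Airflow connection ID of the SFTP server (placeholder value).
SFTP_CONN_ID = "ssh_default"

# Source bucket, overridable through a hypothetical environment variable.
BUCKET_SRC = os.environ.get("EXAMPLE_GCS_BUCKET", "example-gcs-sftp-bucket")

# Single objects used by the copy and move examples.
OBJECT_SRC_1 = "parent-1.bin"
OBJECT_SRC_2 = "parent-2.bin"
# A wildcard pattern selecting the objects under subdir/.
OBJECT_SRC_3 = "subdir/*"

# Destination locations on the SFTP server.
DESTINATION_PATH_1 = "/tmp/single-file/"
DESTINATION_PATH_2 = "/tmp/dirs/"
DESTINATION_PATH_3 = "/tmp/dest-dir/"
```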

Moving a single file

To move a file, set the move_object parameter to True. Once the file has been copied to SFTP, the original file is deleted from Google Cloud Storage. The destination_path parameter defines the full path of the file on the SFTP server.

airflow/providers/google/cloud/example_dags/example_gcs_to_sftp.py

move_file_from_gcs_to_sftp = GCSToSFTPOperator(
    task_id="file-move-gsc-to-sftp",
    sftp_conn_id=SFTP_CONN_ID,
    source_bucket=BUCKET_SRC,
    source_object=OBJECT_SRC_2,
    destination_path=DESTINATION_PATH_1,
    move_object=True,
)
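
The difference between copying and moving can be pictured with a small in-memory model in which plain dictionaries stand in for the bucket and the SFTP server. This only illustrates the behavior described above and is not the operator's code; transfer is a hypothetical helper.

```python
def transfer(bucket, sftp_files, source_object, destination_path, move_object=False):
    """Copy one object to the SFTP side and optionally delete the source."""
    # Copy first, so the source is only removed once the copy exists.
    sftp_files[destination_path] = bucket[source_object]
    if move_object:
        del bucket[source_object]


# Dictionaries standing in for the bucket and the SFTP server.
bucket = {"parent-1.bin": b"first", "parent-2.bin": b"second"}
sftp_files = {}

# A copy leaves the source object in place.
transfer(bucket, sftp_files, "parent-1.bin", "/tmp/single-file/parent-1.bin")

# A move deletes the source object after the copy.
transfer(
    bucket,
    sftp_files,
    "parent-2.bin",
    "/tmp/single-file/parent-2.bin",
    move_object=True,
)
print(sorted(bucket))
print(sorted(sftp_files))
```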

Copying a directory

Use a wildcard in the source_object parameter to copy a directory.

airflow/providers/google/cloud/example_dags/example_gcs_to_sftp.py

copy_dir_from_gcs_to_sftp = GCSToSFTPOperator(
    task_id="dir-copy-gsc-to-sftp",
    sftp_conn_id=SFTP_CONN_ID,
    source_bucket=BUCKET_SRC,
    source_object=OBJECT_SRC_3,
    destination_path=DESTINATION_PATH_2,
)
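
To make the wildcard idea concrete, the sketch below shows one way a pattern with a single * could select object names: the pattern is split into a prefix and a suffix, and names matching both are kept. It is a simplified stand-in written for this page, with hypothetical object names. In practice the listing is performed against Cloud Storage, so edge cases may behave differently.

```python
def match_single_wildcard(object_names, pattern):
    """Return the names matching a pattern that contains at most one '*'."""
    if pattern.count("*") > 1:
        raise ValueError("only one wildcard is supported in this sketch")
    if "*" not in pattern:
        return [name for name in object_names if name == pattern]
    # Everything before the wildcard is a prefix, everything after a suffix.
    prefix, suffix = pattern.split("*", 1)
    return [
        name
        for name in object_names
        if name.startswith(prefix)
        and name.endswith(suffix)
        # Guard against the prefix and suffix overlapping in short names.
        and len(name) >= len(prefix) + len(suffix)
    ]


objects = ["parent-1.bin", "subdir/a.txt", "subdir/nested/b.txt", "other/c.txt"]
print(match_single_wildcard(objects, "subdir/*"))
print(match_single_wildcard(objects, "*.bin"))
```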

Moving specific files

Use a wildcard in the source_object parameter to move specific files. The destination_path parameter defines the path that is prefixed to all copied files.

airflow/providers/google/cloud/example_dags/example_gcs_to_sftp.py

move_dir_from_gcs_to_sftp = GCSToSFTPOperator(
    task_id="dir-move-gsc-to-sftp",
    sftp_conn_id=SFTP_CONN_ID,
    source_bucket=BUCKET_SRC,
    source_object=OBJECT_SRC_3,
    destination_path=DESTINATION_PATH_3,
    keep_directory_structure=False,
    move_object=True,
)
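
How destination_path and keep_directory_structure combine can be sketched as follows. This is an assumed interpretation for illustration, not the provider's implementation: resolve_destination and its prefix_dir argument (the directory part of the pattern before the wildcard) are hypothetical names.

```python
import posixpath


def resolve_destination(destination_path, source_object, prefix_dir="",
                        keep_directory_structure=True):
    """Build the SFTP path for one matched object."""
    relative = source_object
    if not keep_directory_structure and prefix_dir:
        # Drop the directory part that came before the wildcard.
        relative = posixpath.relpath(source_object, start=prefix_dir)
    # destination_path is prefixed to every transferred file.
    return posixpath.join(destination_path, relative)


kept = resolve_destination("/tmp/dest-dir/", "subdir/a.txt", prefix_dir="subdir")
flat = resolve_destination(
    "/tmp/dest-dir/",
    "subdir/a.txt",
    prefix_dir="subdir",
    keep_directory_structure=False,
)
print(kept)
print(flat)
```

Under these assumptions the first call keeps the subdir/ component below the destination, while the second places a.txt directly under /tmp/dest-dir/.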
