airflow.providers.google.cloud.transfers.gcs_to_sftp

This module contains the Google Cloud Storage to SFTP operator.

Module Contents

airflow.providers.google.cloud.transfers.gcs_to_sftp.WILDCARD = *
class airflow.providers.google.cloud.transfers.gcs_to_sftp.GCSToSFTPOperator(*, source_bucket: str, source_object: str, destination_path: str, keep_directory_structure: bool = True, move_object: bool = False, gcp_conn_id: str = 'google_cloud_default', sftp_conn_id: str = 'ssh_default', delegate_to: Optional[str] = None, impersonation_chain: Optional[Union[str, Sequence[str]]] = None, **kwargs)

Bases: airflow.models.BaseOperator

Transfer files from a Google Cloud Storage bucket to an SFTP server.

Example:

from datetime import datetime

from airflow import models
from airflow.providers.google.cloud.transfers.gcs_to_sftp import GCSToSFTPOperator

with models.DAG(
    "example_gcs_to_sftp",
    start_date=datetime(2020, 6, 19),
    schedule_interval=None,
) as dag:
    # copies file to /tmp/sftp/folder/subfolder/file.txt on the SFTP server
    copy_file_from_gcs_to_sftp = GCSToSFTPOperator(
        task_id="file-copy-gcs-to-sftp",
        source_bucket="test-gcs-sftp-bucket-name",
        source_object="folder/subfolder/file.txt",
        destination_path="/tmp/sftp",
    )

    # moves file to /tmp/data.txt on the SFTP server
    move_file_from_gcs_to_sftp = GCSToSFTPOperator(
        task_id="file-move-gcs-to-sftp",
        source_bucket="test-gcs-sftp-bucket-name",
        source_object="folder/subfolder/data.txt",
        destination_path="/tmp",
        move_object=True,
        keep_directory_structure=False,
    )

See also

For more information on how to use this operator, take a look at the guide: Operator

Parameters
  • source_bucket (str) – The source Google Cloud Storage bucket where the object is. (templated)

  • source_object (str) – The name of the source object to copy from the Google Cloud Storage bucket. (templated) Only one wildcard is supported for objects (filenames) within the bucket. The wildcard can appear inside the object name or at the end of the object name. Appending a wildcard to the bucket name is unsupported.

  • destination_path (str) – The SFTP remote path. This is the directory on the SFTP server to which files are uploaded. (templated)

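  • keep_directory_structure (bool) – (Optional) When set to True (the default), the path of the file on the bucket is recreated within the path passed in destination_path. When set to False, the bucket directory structure is dropped, so folder/subfolder/data.txt copied to /tmp becomes /tmp/data.txt.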

  • move_object (bool) – When move_object is True, the object is moved to the new location instead of copied. This is the equivalent of a mv command as opposed to a cp command.

  • gcp_conn_id (str) – (Optional) The connection ID used to connect to Google Cloud.

  • sftp_conn_id (str) – The SFTP connection ID: the name or identifier of the connection used to reach the SFTP server.

  • delegate_to (str) – The account to impersonate using domain-wide delegation of authority, if any. For this to work, the service account making the request must have domain-wide delegation enabled.

  • impersonation_chain (Union[str, Sequence[str]]) – Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account. (templated)
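
The single-wildcard rule for source_object can be illustrated with a small, standalone sketch. It is not the operator's implementation: split_wildcard and match_objects are hypothetical helpers, and the matching is done on an in-memory list of names rather than against a real bucket, so actual Google Cloud Storage listing behavior may differ.

```python
from typing import List, Optional, Tuple

WILDCARD = "*"


def split_wildcard(source_object: str) -> Tuple[str, Optional[str]]:
    """Split a pattern at its single wildcard into (prefix, suffix).

    Returns (source_object, None) when the pattern has no wildcard.
    Hypothetical helper, for illustration only.
    """
    count = source_object.count(WILDCARD)
    if count > 1:
        raise ValueError(
            f"Only one wildcard '{WILDCARD}' is allowed in source_object, found {count}."
        )
    if count == 0:
        return source_object, None
    prefix, suffix = source_object.split(WILDCARD, 1)
    return prefix, suffix


def match_objects(object_names: List[str], source_object: str) -> List[str]:
    """Return the names selected by a pattern with at most one wildcard."""
    prefix, suffix = split_wildcard(source_object)
    if suffix is None:
        # No wildcard: only an exact name matches.
        return [name for name in object_names if name == prefix]
    min_len = len(prefix) + len(suffix)
    return [
        name
        for name in object_names
        if len(name) >= min_len and name.startswith(prefix) and name.endswith(suffix)
    ]


names = ["folder/a.txt", "folder/b.csv", "folder/sub/c.txt", "other/d.txt"]
print(match_objects(names, "folder/*.txt"))  # wildcard inside the object name
print(match_objects(names, "folder/*"))      # wildcard at the end of the object name
```

A pattern with two wildcards, such as "folder/*/*.txt", raises ValueError in this sketch, mirroring the one-wildcard restriction described for source_object.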

template_fields = ['source_bucket', 'source_object', 'destination_path', 'impersonation_chain']
ui_color = #f0eee4
execute(self, context)
_resolve_destination_path(self, source_object: str, prefix: Optional[str] = None)
_copy_single_object(self, gcs_hook: GCSHook, sftp_hook: SFTPHook, source_object: str, destination_path: str)

Helper function to copy single object.
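
The effect of keep_directory_structure on the remote path computed for each copied object can be sketched as follows. This is a minimal illustration consistent with the two examples above, not the operator's source: resolve_destination_path is a hypothetical standalone function, and the handling of prefix (stripping the part of the name before a wildcard) is an assumption.

```python
import posixpath
from typing import Optional


def resolve_destination_path(
    source_object: str,
    destination_path: str,
    keep_directory_structure: bool = True,
    prefix: Optional[str] = None,
) -> str:
    """Build the remote SFTP path for one object (illustrative sketch)."""
    if not keep_directory_structure:
        if prefix:
            # Assumption: keep only the part of the name below the wildcard prefix.
            source_object = posixpath.relpath(source_object, start=prefix)
        else:
            # Drop the bucket "directories" and keep the file name only.
            source_object = posixpath.basename(source_object)
    return posixpath.join(destination_path, source_object)


# Default: the bucket path is recreated under destination_path.
print(resolve_destination_path("folder/subfolder/file.txt", "/tmp/sftp"))
# → /tmp/sftp/folder/subfolder/file.txt

# keep_directory_structure=False: only the file name is kept.
print(resolve_destination_path("folder/subfolder/data.txt", "/tmp", keep_directory_structure=False))
# → /tmp/data.txt
```

posixpath is used instead of os.path so that the sketch produces forward-slash remote paths regardless of the platform it runs on.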
