airflow.providers.sftp.hooks.sftp

This module contains SFTP hook.

Module Contents

Classes

SFTPHook

Interact with SFTP.

SFTPHookAsync

Interact with an SFTP server via asyncssh package.

class airflow.providers.sftp.hooks.sftp.SFTPHook(ssh_conn_id='sftp_default', ssh_hook=None, *args, **kwargs)[source]

Bases: airflow.providers.ssh.hooks.ssh.SSHHook

Interact with SFTP.

This hook inherits the SSH hook. Please refer to SSH hook for the input arguments.

Pitfalls:
  • In contrast with FTPHook, describe_directory only returns size, type and modify. It doesn’t return unix.owner, unix.mode, perm, unix.group and unique.

  • retrieve_file and store_file only take a local full path and not a buffer.

  • If no mode is passed to create_directory it will be created with 777 permissions.

Errors may occur throughout but should be handled downstream.

For consistency reasons with SSHHook, the preferred parameter is “ssh_conn_id”.

Parameters
conn_name_attr = 'ssh_conn_id'[source]
default_conn_name = 'sftp_default'[source]
conn_type = 'sftp'[source]
hook_name = 'SFTP'[source]
classmethod get_ui_field_behaviour()[source]

Return custom UI field behaviour for SSH connection.

get_conn()[source]

Open an SFTP connection to the remote host.

close_conn()[source]

Close the SFTP connection.

describe_directory(path)[source]

Get file information in a directory on the remote system.

The return format is {filename: {attributes}}. The remote system supports the MLSD command.

Parameters

path (str) – full path to the remote directory
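
As a minimal sketch of working with the {filename: {attributes}} shape described above, the helper below sums the size entries of a listing. The sample dictionary is hypothetical, including the values shown for type and modify; only the key names size, type and modify come from this page.

```python
def total_size(listing: dict) -> int:
    """Sum the 'size' attribute over every entry of a {filename: {attributes}} mapping."""
    return sum(int(attrs["size"]) for attrs in listing.values())


# Hypothetical listing shaped like the documented return format.
sample = {
    "a.csv": {"size": 120, "type": "file", "modify": "20240101120000"},
    "b.csv": {"size": 80, "type": "file", "modify": "20240102120000"},
}

print(total_size(sample))  # 200
```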

list_directory(path)[source]

List files in a directory on the remote system.

Parameters

path (str) – full path to the remote directory to list

mkdir(path, mode=511)[source]

Create a directory on the remote system.

The default mode is 0o777, but on some systems, the current umask value may be first masked out.

Parameters
  • path (str) – full path to the remote directory to create

  • mode (int) – permissions for the directory, given as an integer (usually written in octal, e.g. 0o777)
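
The default of 511 in the signature is simply 0o777 written in decimal. The sketch below shows that equivalence and how a umask would reduce the requested permissions; the umask value 0o022 is an assumed example, and the helper name effective_mode is hypothetical.

```python
import stat

DEFAULT_MODE = 511  # the default shown in the signature, equal to 0o777


def effective_mode(requested: int, umask: int) -> int:
    """Return the permission bits left after clearing the umask bits from the requested mode."""
    return requested & ~umask & 0o777


print(oct(DEFAULT_MODE))                         # 0o777
print(oct(effective_mode(DEFAULT_MODE, 0o022)))  # 0o755
# Render the result the way `ls -l` would show a directory.
print(stat.filemode(stat.S_IFDIR | effective_mode(DEFAULT_MODE, 0o022)))  # drwxr-xr-x
```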

isdir(path)[source]

Check if the path provided is a directory.

Parameters

path (str) – full path to the remote directory to check

isfile(path)[source]

Check if the path provided is a file.

Parameters

path (str) – full path to the remote file to check

create_directory(path, mode=511)[source]

Create a directory on the remote system.

The default mode is 0o777, but on some systems, the current umask value may be first masked out. Different from mkdir(), this function attempts to create parent directories if needed, and returns silently if the target directory already exists.

Parameters
  • path (str) – full path to the remote directory to create

  • mode (int) – permissions for the directory, given as an integer (usually written in octal, e.g. 0o777)
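
The difference between mkdir() and create_directory() resembles the difference between os.mkdir and os.makedirs(..., exist_ok=True) on a local filesystem. The sketch below is only that local analogy, run in a temporary directory; it does not call the hook and is not the provider's implementation.

```python
import os
import tempfile


def compare_mkdir_variants() -> dict:
    """Contrast single-level and recursive directory creation on the local filesystem."""
    outcome = {}
    with tempfile.TemporaryDirectory() as root:
        nested = os.path.join(root, "a", "b", "c")

        # Single-level creation fails because the parents "a" and "a/b" do not exist yet.
        try:
            os.mkdir(nested)
            outcome["single_level"] = "created"
        except FileNotFoundError:
            outcome["single_level"] = "missing parent"

        # Recursive creation builds the parents; repeating the call is harmless.
        os.makedirs(nested, exist_ok=True)
        os.makedirs(nested, exist_ok=True)
        outcome["recursive"] = os.path.isdir(nested)
    return outcome


print(compare_mkdir_variants())
```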

delete_directory(path)[source]

Delete a directory on the remote system.

Parameters

path (str) – full path to the remote directory to delete

retrieve_file(remote_full_path, local_full_path, prefetch=True)[source]

Transfer the remote file to a local location.

If local_full_path is a string path, the file will be put at that location.

Parameters
  • remote_full_path (str) – full path to the remote file

  • local_full_path (str) – full path to the local file

  • prefetch (bool) – controls whether prefetch is performed (default: True)

store_file(remote_full_path, local_full_path, confirm=True)[source]

Transfer a local file to the remote location.

If local_full_path is a string path, the file will be read from that location.

Parameters
  • remote_full_path (str) – full path to the remote file

  • local_full_path (str) – full path to the local file
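
Both retrieve_file and store_file take the remote path first and the local path second, which is easy to get backwards. The following sketch wraps an upload in a hypothetical helper, upload_and_confirm, that accepts any object exposing store_file and path_exists with the signatures listed on this page; the paths in the usage comment are placeholders.

```python
def upload_and_confirm(hook, local_path: str, remote_path: str) -> bool:
    """Upload a local file, then report whether the remote path exists.

    `hook` is assumed to be an SFTPHook-like object.
    """
    # Argument order matters: the remote path comes first.
    hook.store_file(remote_path, local_path)
    return hook.path_exists(remote_path)


# Usage (needs a configured Airflow connection, so it is not executed here):
# from airflow.providers.sftp.hooks.sftp import SFTPHook
# hook = SFTPHook(ssh_conn_id="sftp_default")
# upload_and_confirm(hook, "/tmp/report.csv", "/upload/report.csv")
```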

delete_file(path)[source]

Remove a file on the server.

Parameters

path (str) – full path to the remote file

get_mod_time(path)[source]

Get an entry’s modification time.

Parameters

path (str) – full path to the remote file

path_exists(path)[source]

Whether a remote entity exists.

Parameters

path (str) – full path to the remote file or directory

walktree(path, fcallback, dcallback, ucallback, recurse=True)[source]

Recursively descend, depth first, the directory tree at path.

This calls discrete callback functions for each regular file, directory, and unknown file type.

Parameters
  • path (str) – root of remote directory to descend, use ‘.’ to start at pwd

  • fcallback (callable) – callback function to invoke for a regular file. (form: func(str))

  • dcallback (callable) – callback function to invoke for a directory. (form: func(str))

  • ucallback (callable) – callback function to invoke for an unknown file type. (form: func(str))

  • recurse (bool) – whether to recurse into subdirectories (default: True)
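
Since each callback has the form func(str), the append method of a list can serve directly as a callback. The sketch below collects the three kinds of paths that way; collect_tree is a hypothetical helper, and hook is assumed to be any object with a walktree method matching the signature above.

```python
def collect_tree(hook, root: str = "."):
    """Walk `root` and return (files, directories, unknown) as three lists of paths."""
    files, dirs, unknown = [], [], []
    hook.walktree(
        path=root,
        fcallback=files.append,    # called once per regular file
        dcallback=dirs.append,     # called once per directory
        ucallback=unknown.append,  # called once per unknown file type
        recurse=True,
    )
    return files, dirs, unknown
```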

get_tree_map(path, prefix=None, delimiter=None)[source]

Get tuple with recursive lists of files, directories and unknown paths.

It is possible to filter results by giving prefix and/or delimiter parameters.

Parameters
  • path (str) – path from which tree will be built

  • prefix (str | None) – if set, only paths that start with prefix are added

  • delimiter (str | None) – if set, only paths that end with delimiter are added

Returns

tuple with list of files, dirs and unknown items

Return type

tuple[list[str], list[str], list[str]]
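
The prefix and delimiter filters can be illustrated with plain string checks. This is a sketch of the behaviour as described above, not the provider's code; the helper name path_matches is hypothetical, and it assumes that when both filters are set a path must satisfy both.

```python
from __future__ import annotations


def path_matches(path: str, prefix: str | None = None, delimiter: str | None = None) -> bool:
    """Return True if `path` passes the optional prefix and suffix (delimiter) filters."""
    if prefix is not None and not path.startswith(prefix):
        return False
    if delimiter is not None and not path.endswith(delimiter):
        return False
    return True


paths = ["/data/in/a.csv", "/data/in/b.txt", "/data/out/c.csv"]
print([p for p in paths if path_matches(p, prefix="/data/in", delimiter=".csv")])
# ['/data/in/a.csv']
```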

test_connection()[source]

Test the SFTP connection by calling path with directory.

get_file_by_pattern(path, fnmatch_pattern)[source]

Get the first matching file based on the given fnmatch type pattern.

Parameters
  • path – path to be checked

  • fnmatch_pattern – The pattern that will be matched with fnmatch

Returns

string containing the first found file, or an empty string if none matched

Return type

str

get_files_by_pattern(path, fnmatch_pattern)[source]

Get all matching files based on the given fnmatch type pattern.

Parameters
  • path – path to be checked

  • fnmatch_pattern – The pattern that will be matched with fnmatch

Returns

list of strings containing the found files, or an empty list if none matched

Return type

list[str]
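
To show what an fnmatch-type pattern selects, the sketch below applies Python's standard fnmatch function to a hypothetical list of file names. It mirrors the documented return conventions (empty string or empty list when nothing matches) but runs locally and does not contact a server; first_match and all_matches are illustrative names.

```python
from fnmatch import fnmatch


def first_match(names: list, pattern: str) -> str:
    """Return the first name matching the pattern, or an empty string if none match."""
    return next((name for name in names if fnmatch(name, pattern)), "")


def all_matches(names: list, pattern: str) -> list:
    """Return every name matching the pattern, or an empty list if none match."""
    return [name for name in names if fnmatch(name, pattern)]


# Hypothetical directory contents.
names = ["report_2024.csv", "report_2023.csv", "notes.txt", "data.csv"]

print(first_match(names, "report_*.csv"))  # report_2024.csv
print(all_matches(names, "*.csv"))         # ['report_2024.csv', 'report_2023.csv', 'data.csv']
print(first_match(names, "*.pdf") == "")   # True
```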

class airflow.providers.sftp.hooks.sftp.SFTPHookAsync(sftp_conn_id=default_conn_name, host='', port=22, username='', password='', known_hosts=default_known_hosts, key_file='', passphrase='', private_key='')[source]

Bases: airflow.hooks.base.BaseHook

Interact with an SFTP server via asyncssh package.

Parameters
  • sftp_conn_id (str) – SFTP connection ID to be used for connecting to SFTP server

  • host (str) – hostname of the SFTP server

  • port (int) – port of the SFTP server

  • username (str) – username used when authenticating to the SFTP server

  • password (str) – password used when authenticating to the SFTP server. Can be left blank if using a key file

  • known_hosts (str) – path to the known_hosts file on the local file system. Defaults to ~/.ssh/known_hosts.

  • key_file (str) – path to the client key file used for authentication to SFTP server

  • passphrase (str) – passphrase used with the key_file for authentication to SFTP server

conn_name_attr = 'ssh_conn_id'[source]
default_conn_name = 'sftp_default'[source]
conn_type = 'sftp'[source]
hook_name = 'SFTP'[source]
default_known_hosts = '~/.ssh/known_hosts'[source]
async list_directory(path='')[source]

Return a list of files on the SFTP server at the provided path.
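
Because list_directory is a coroutine, several directories can be listed from one event loop. The sketch below gathers the listings for a set of paths; list_many is a hypothetical helper, hook is assumed to be an SFTPHookAsync-like object, and whether issuing these calls concurrently suits a given server is an assumption to verify.

```python
import asyncio


async def list_many(hook, paths: list) -> dict:
    """List several remote directories and return {path: listing}."""
    # gather() preserves the order of its arguments, so zip() pairs each path
    # with its own listing.
    listings = await asyncio.gather(*(hook.list_directory(path) for path in paths))
    return dict(zip(paths, listings))


# Usage (needs a configured Airflow connection, so it is not executed here):
# from airflow.providers.sftp.hooks.sftp import SFTPHookAsync
# asyncio.run(list_many(SFTPHookAsync(), ["/incoming", "/archive"]))
```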

async read_directory(path='')[source]

Return a list of files along with their attributes on the SFTP server at the provided path.

async get_files_and_attrs_by_pattern(path='', fnmatch_pattern='')[source]

Get the files, along with their attributes, that match the pattern (e.g. *.pdf) at the provided path, if any exist. Otherwise, raise an AirflowException to be handled upstream for deferring.

async get_mod_time(path)[source]

Make an async SFTP connection.

Look up the last modified time of the given file path and return it.

Parameters

path (str) – full path to the remote file
