airflow.providers.microsoft.azure.hooks.wasb¶
This module contains integration with Azure Blob Storage.
It communicates via the Windows Azure Storage Blob (WASB) protocol. Make sure that an Airflow connection of type wasb exists. Authorization can be done by supplying a login (the storage account name) and a password (the account key), or a login and a SAS token in the extra field (see connection wasb_default for an example).
Attributes¶
Classes¶
| WasbHook | Interact with Azure Blob Storage through the wasb:// protocol. |
| WasbAsyncHook | An async hook that connects to Azure WASB to perform operations. |
Module Contents¶
- class airflow.providers.microsoft.azure.hooks.wasb.WasbHook(wasb_conn_id=default_conn_name, public_read=False)[source]¶
- Bases: airflow.providers.microsoft.azure.version_compat.BaseHook
- Interact with Azure Blob Storage through the wasb:// protocol.
- The following parameters have to be provided through the connection stored in the Airflow database: account_name and account_key.
- Additional options passed in the 'extra' field of the connection will be passed to the BlobServiceClient() constructor. For example, authenticate using a SAS token by adding {"sas_token": "YOUR_TOKEN"}, or using an account key by adding {"account_key": "YOUR_ACCOUNT_KEY"}.
- If no authentication configuration is provided, DefaultAzureCredential will be used (applicable when using Azure compute infrastructure).
- Parameters:
- wasb_conn_id (str) – Reference to the wasb connection. 
- public_read (bool) – Whether anonymous public read access should be used. Defaults to False.
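The connection described above can be defined in several ways. The following is a minimal sketch, assuming an Airflow version that accepts JSON-formatted connections in `AIRFLOW_CONN_<CONN_ID>` environment variables. The helper name `wasb_connection_json` and the account name `mystorageaccount` are hypothetical, and the token is a placeholder.

```python
import json
import os


def wasb_connection_json(account_name, account_key=None, sas_token=None):
    """Build a JSON definition for a connection of type wasb.

    login carries the storage account name, password the account key,
    and a SAS token goes into the extra field.
    """
    conn = {"conn_type": "wasb", "login": account_name}
    if account_key is not None:
        conn["password"] = account_key
    if sas_token is not None:
        conn["extra"] = {"sas_token": sas_token}
    return json.dumps(conn)


# Placeholder values, not real credentials.
os.environ["AIRFLOW_CONN_WASB_DEFAULT"] = wasb_connection_json(
    "mystorageaccount", sas_token="YOUR_TOKEN"
)
```

With such a variable in place, `WasbHook(wasb_conn_id="wasb_default")` would be expected to pick up the connection. A secrets backend or the Airflow UI is usually a better home for real credentials.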
 
 - classmethod get_connection_form_widgets()[source]¶
- Return connection widgets to add to connection form. 
 - property blob_service_client: azure.storage.blob.aio.BlobServiceClient | azure.storage.blob.BlobServiceClient[source]¶
- Return the BlobServiceClient object (cached). 
 - check_for_blob(container_name, blob_name, **kwargs)[source]¶
- Check if a blob exists on Azure Blob Storage. 
 - check_for_prefix(container_name, prefix, **kwargs)[source]¶
- Check if a prefix exists on Azure Blob Storage. 
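As an illustration of the two existence checks, here is a small sketch. The helper `blob_or_prefix_exists`, the container name `my-container` and the path `reports/2024/` are hypothetical. The Airflow import is deferred so that the helper itself does not depend on Airflow being installed.

```python
def blob_or_prefix_exists(hook, container_name, name):
    """Return True if `name` is an existing blob or a prefix of existing blobs.

    `hook` is expected to be a WasbHook; any object offering the same two
    methods works as well.
    """
    if hook.check_for_blob(container_name, name):
        return True
    return hook.check_for_prefix(container_name, name)


def example():
    # Imported lazily; requires the Microsoft Azure provider package.
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

    hook = WasbHook(wasb_conn_id="wasb_default")
    # Placeholder container and path.
    return blob_or_prefix_exists(hook, "my-container", "reports/2024/")
```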
 - get_blobs_list(container_name, prefix=None, include=None, delimiter='/', **kwargs)[source]¶
- List blobs in a given container. - Parameters:
- container_name (str) – The name of the container 
- prefix (str | None) – Filters the results to return only blobs whose names begin with the specified prefix. 
- include (list[str] | None) – Specifies one or more additional datasets to include in the response. Options include: snapshots, metadata, uncommittedblobs, copy, deleted.
- delimiter (str) – Filters objects based on the delimiter (e.g. '.csv'). 
 
 
 - get_blobs_list_recursive(container_name, prefix=None, include=None, endswith='', **kwargs)[source]¶
- List blobs in a given container. - Parameters:
- container_name (str) – The name of the container 
- prefix (str | None) – Filters the results to return only blobs whose names begin with the specified prefix. 
- include (list[str] | None) – Specifies one or more additional datasets to include in the response. Options include: snapshots, metadata, uncommittedblobs, copy, deleted.
- endswith (str) – Filters the results to return only blobs whose names end with the specified string (e.g. '.csv'). 
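The two listing methods can be compared with a short sketch. The helper names are hypothetical, and `hook` is assumed to be a WasbHook. The one-level behaviour described in the second docstring is an expectation based on how a '/' delimiter usually groups blob names, not something this page states.

```python
def list_csv_blobs(hook, container_name, prefix=None):
    """Return the names of all blobs under `prefix` that end with '.csv'."""
    return hook.get_blobs_list_recursive(
        container_name, prefix=prefix, endswith=".csv"
    )


def list_one_level(hook, container_name, prefix=None):
    """Return the names found under `prefix`, using '/' as the delimiter.

    Deeper "directories" are expected to show up as a single entry
    ending in '/' rather than as individual blobs.
    """
    return hook.get_blobs_list(container_name, prefix=prefix, delimiter="/")
```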
 
 
 - load_file(file_path, container_name, blob_name, create_container=False, **kwargs)[source]¶
- Upload a file to Azure Blob Storage. - Parameters:
- file_path (str) – Path to the file to load. 
- container_name (str) – Name of the container. 
- blob_name (str) – Name of the blob. 
- create_container (bool) – Attempt to create the target container prior to uploading the blob. This is useful if the target container may not exist yet. Defaults to False. 
- kwargs – Optional keyword arguments that BlobClient.upload_blob() takes.
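A possible use of load_file is uploading a whole local directory. The sketch below is only one way to do it; the helper `upload_directory` and its `blob_prefix` argument are hypothetical, and `hook` is assumed to be a WasbHook.

```python
from pathlib import Path


def upload_directory(hook, local_dir, container_name, blob_prefix=""):
    """Upload every regular file under `local_dir`.

    Blob names are the paths relative to `local_dir`, prefixed with
    `blob_prefix`. Returns the list of blob names that were submitted.
    """
    root = Path(local_dir)
    uploaded = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # skip directories
        blob_name = blob_prefix + path.relative_to(root).as_posix()
        hook.load_file(str(path), container_name, blob_name, create_container=True)
        uploaded.append(blob_name)
    return uploaded
```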
 
 
 - load_string(string_data, container_name, blob_name, create_container=False, **kwargs)[source]¶
- Upload a string to Azure Blob Storage. - Parameters:
- string_data (str) – String to load. 
- container_name (str) – Name of the container. 
- blob_name (str) – Name of the blob. 
- create_container (bool) – Attempt to create the target container prior to uploading the blob. This is useful if the target container may not exist yet. Defaults to False. 
- kwargs – Optional keyword arguments that BlobClient.upload_blob() takes.
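To illustrate load_string, the following sketch serializes a Python object to JSON and uploads it as text. The helper `upload_json` is hypothetical. Passing `overwrite=True` assumes the extra keyword arguments reach `BlobClient.upload_blob()`, which accepts an `overwrite` flag.

```python
import json


def upload_json(hook, container_name, blob_name, payload):
    """Serialize `payload` to JSON and upload it as a text blob."""
    hook.load_string(
        json.dumps(payload, sort_keys=True),
        container_name,
        blob_name,
        create_container=True,  # attempt to create the container first
        overwrite=True,  # assumption: forwarded to BlobClient.upload_blob()
    )
```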
 
 
 - get_file(file_path, container_name, blob_name, **kwargs)[source]¶
- Download a file from Azure Blob Storage. 
 - read_file(container_name, blob_name, **kwargs)[source]¶
- Read a file from Azure Blob Storage and return as a string. 
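Since read_file returns the blob content as a string, it pairs naturally with a parser. A minimal sketch, with the hypothetical helper `read_json_blob` and `hook` assumed to be a WasbHook:

```python
import json


def read_json_blob(hook, container_name, blob_name):
    """Read a blob as text and parse it as JSON."""
    text = hook.read_file(container_name, blob_name)
    return json.loads(text)
```

For large or binary blobs, get_file, which writes to a local path, is likely the better fit.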
 - upload(container_name, blob_name, data, blob_type='BlockBlob', length=None, create_container=False, **kwargs)[source]¶
- Create a new blob from a data source with automatic chunking. - Parameters:
- container_name (str) – The name of the container to upload data to. 
- blob_name (str) – The name of the blob to upload. The blob need not already exist in the container. 
- data (Any) – The blob data to upload. 
- blob_type (str) – The type of the blob. This can be either BlockBlob, PageBlob or AppendBlob. The default value is BlockBlob.
- length (int | None) – Number of bytes to read from the stream. This is optional, but should be supplied for optimal performance. 
- create_container (bool) – Attempt to create the target container prior to uploading the blob. This is useful if the target container may not exist yet. Defaults to False. 
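The parameters above can be exercised with an in-memory stream. This is a sketch only; the helper `upload_bytes` is hypothetical and `hook` is assumed to be a WasbHook.

```python
import io


def upload_bytes(hook, container_name, blob_name, data):
    """Upload a bytes object as a block blob and return the hook's result."""
    stream = io.BytesIO(data)
    return hook.upload(
        container_name,
        blob_name,
        stream,
        blob_type="BlockBlob",
        length=len(data),  # optional, but recommended for performance
        create_container=True,
    )
```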
 
 
 - download(container_name, blob_name, offset=None, length=None, **kwargs)[source]¶
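- Download a blob to the StorageStreamDownloader. - Parameters:
- container_name (str) – The name of the container containing the blob. 
- blob_name (str) – The name of the blob to download. 
- offset (int | None) – Start of the byte range to use for downloading a section of the blob. Must be set if length is provided. 
- length (int | None) – Number of bytes to read from the stream. 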
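A ranged download can be sketched as follows, using the offset and length arguments from the signature. The helper `read_blob_head` is hypothetical. It relies on the returned StorageStreamDownloader offering `readall()`, as the Azure SDK class does.

```python
def read_blob_head(hook, container_name, blob_name, num_bytes=1024):
    """Return up to the first `num_bytes` bytes of a blob."""
    downloader = hook.download(
        container_name, blob_name, offset=0, length=num_bytes
    )
    # readall() reads the requested range into memory.
    return downloader.readall()
```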
 
 - create_container(container_name)[source]¶
- Create a container if it does not already exist. - Parameters:
- container_name (str) – The name of the container to create 
 
 - delete_container(container_name)[source]¶
- Delete a container object. - Parameters:
- container_name (str) – The name of the container 
 
 - delete_blobs(container_name, *blobs, **kwargs)[source]¶
- Mark the specified blobs or snapshots for deletion. - Parameters:
- container_name (str) – The name of the container containing the blobs 
- blobs – The blobs to delete. This can be a single blob, or multiple values can be supplied, where each value is either the name of the blob (str) or BlobProperties. 
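Because the blobs are passed as positional arguments, a list of names has to be unpacked when calling delete_blobs. The sketch below also splits the work into batches. The helper `delete_in_batches` is hypothetical, and the default batch size of 256 is an assumption about the per-request limit of the underlying batch delete, which should be checked against the Azure SDK documentation.

```python
def delete_in_batches(hook, container_name, blob_names, batch_size=256):
    """Delete blobs by name in batches; return how many names were submitted."""
    names = list(blob_names)
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        # delete_blobs takes the blobs as *args, so unpack the batch.
        hook.delete_blobs(container_name, *batch)
    return len(names)
```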
 
 
 - copy_blobs(source_container_name, source_blob_name, destination_container_name, destination_blob_name)[source]¶
- Copy the specified blobs from one blob prefix to another. - Parameters:
- source_container_name (str) – The name of the source container containing the blobs. 
- source_blob_name (str) – The full source blob path without the container name. 
- destination_container_name (str) – The name of the destination container where the blobs will be copied to. 
- destination_blob_name (str) – The full destination blob path without the container name. 
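As a small illustration of the four arguments, the following sketch copies a blob into a backup container under a prefixed name. The helper `backup_blob` and the default prefix `backup/` are hypothetical.

```python
def backup_blob(hook, container_name, blob_name, backup_container, backup_prefix="backup/"):
    """Copy a blob to `backup_container` and return the destination blob name."""
    destination_blob_name = backup_prefix + blob_name
    hook.copy_blobs(
        source_container_name=container_name,
        source_blob_name=blob_name,
        destination_container_name=backup_container,
        destination_blob_name=destination_blob_name,
    )
    return destination_blob_name
```

The sketch does not wait for or verify the copy, so deleting the source right afterwards would be unsafe without an additional check.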
 
 
 - delete_file(container_name, blob_name, is_prefix=False, ignore_if_missing=False, delimiter='', **kwargs)[source]¶
- Delete a file, or all blobs matching a prefix, from Azure Blob Storage. - Parameters:
- container_name (str) – Name of the container. 
- blob_name (str) – Name of the blob. 
- is_prefix (bool) – If True, treat blob_name as a prefix and delete all matching blobs. 
- ignore_if_missing (bool) – If True, return success even if the blob does not exist. 
- kwargs – Optional keyword arguments that ContainerClient.delete_blobs() takes.
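The is_prefix and ignore_if_missing flags combine into an idempotent cleanup step. A minimal sketch, with the hypothetical helper `purge_prefix` and `hook` assumed to be a WasbHook:

```python
def purge_prefix(hook, container_name, prefix):
    """Delete every blob whose name starts with `prefix`.

    ignore_if_missing=True keeps the call from failing when nothing
    matches, so the step can be retried safely.
    """
    hook.delete_file(
        container_name,
        prefix,
        is_prefix=True,
        ignore_if_missing=True,
    )
```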
 
 
 
- class airflow.providers.microsoft.azure.hooks.wasb.WasbAsyncHook(wasb_conn_id='wasb_default', public_read=False)[source]¶
- Bases: WasbHook
- An async hook that connects to Azure WASB to perform operations. - Parameters:
- wasb_conn_id (str) – Reference to the wasb connection. 
- public_read (bool) – Whether anonymous public read access should be used. Defaults to False. 
 
 - async check_for_blob_async(container_name, blob_name, **kwargs)[source]¶
- Check if a blob exists on Azure Blob Storage. 
 - async get_blobs_list_async(container_name, prefix=None, include=None, delimiter='/', **kwargs)[source]¶
- List blobs in a given container. - Parameters:
- container_name (str) – The name of the container. 
- prefix (str | None) – Filters the results to return only blobs whose names begin with the specified prefix. 
- include (list[str] | None) – Specifies one or more additional datasets to include in the response. Options include: snapshots, metadata, uncommittedblobs, copy, deleted.
- delimiter (str) – Filters objects based on the delimiter (e.g. '.csv').
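The async methods are coroutines and have to be awaited. The following sketch polls for a blob with check_for_blob_async. The helper `wait_for_blob` and its `poke_interval` and `max_attempts` arguments are hypothetical, and opening or closing the underlying async client is deliberately left out.

```python
import asyncio


async def wait_for_blob(hook, container_name, blob_name, poke_interval=5.0, max_attempts=12):
    """Poll until the blob exists; return False if it never shows up.

    `hook` is expected to be a WasbAsyncHook (or an object with a
    compatible check_for_blob_async coroutine).
    """
    for attempt in range(max_attempts):
        if await hook.check_for_blob_async(container_name, blob_name):
            return True
        if attempt < max_attempts - 1:
            # Yield control to the event loop between checks.
            await asyncio.sleep(poke_interval)
    return False
```

From synchronous code, such a coroutine could be driven with `asyncio.run(wait_for_blob(hook, "my-container", "data.csv"))`, where the container and blob names are placeholders.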