airflow.providers.microsoft.azure.hooks.wasb
This module contains integration with Azure Blob Storage.
It communicates via the Windows Azure Storage Blob protocol. Make sure that an Airflow connection of type wasb exists. Authorization can be done by supplying a login (the storage account name) and a password (the account key), or a login and a SAS token in the extra field (see the connection wasb_default for an example).
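The two authorization styles described above can be sketched as a small helper that builds the connection definition. This is a minimal sketch, not the provider's own tooling: it assumes an Airflow version that accepts JSON-serialized connections in AIRFLOW_CONN_* environment variables, and the helper name wasb_connection_json and the credential values are placeholders invented for illustration.

```python
import json
import os


def wasb_connection_json(account_name, account_key=None, sas_token=None):
    """Serialize a wasb connection in Airflow's JSON connection format.

    Hypothetical helper: the name and signature are not part of Airflow.
    """
    conn = {"conn_type": "wasb", "login": account_name}
    if account_key is not None:
        # Shared-key auth: the storage account key goes in the password field.
        conn["password"] = account_key
    if sas_token is not None:
        # SAS auth: the token goes in the connection's extra field.
        conn["extra"] = {"sas_token": sas_token}
    return json.dumps(conn)


# Placeholder credentials; a real deployment would read these from a secret store.
os.environ["AIRFLOW_CONN_WASB_DEFAULT"] = wasb_connection_json(
    "mystorageaccount", sas_token="YOUR_TOKEN"
)
```

Defining the connection through an environment variable keeps the example self-contained; the same fields can equally be entered through the Airflow UI or CLI.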
Module Contents
Classes
| WasbHook | Interacts with Azure Blob Storage through the wasb:// protocol. |
| WasbAsyncHook | An async hook that connects to Azure WASB to perform operations. |
Attributes
- class airflow.providers.microsoft.azure.hooks.wasb.WasbHook(wasb_conn_id=default_conn_name, public_read=False)
- Bases: airflow.hooks.base.BaseHook
- Interacts with Azure Blob Storage through the wasb:// protocol.
- The account_name and account_key parameters must be stored in the Airflow database, as part of the connection.
- Additional options passed in the ‘extra’ field of the connection will be passed to the BlockBlobService() constructor. For example, authenticate using a SAS token by adding {"sas_token": "YOUR_TOKEN"}.
- If no authentication configuration is provided, DefaultAzureCredential will be used (applicable when using Azure compute infrastructure).
- Parameters
- wasb_conn_id (str) – Reference to the wasb connection. 
- public_read (bool) – Whether anonymous public read access should be used. Defaults to False.
 
 - check_for_blob(container_name, blob_name, **kwargs)
- Check if a blob exists on Azure Blob Storage. 
 - check_for_prefix(container_name, prefix, **kwargs)
- Check if a prefix exists on Azure Blob storage. 
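As a rough illustration of how the two existence checks combine, the sketch below classifies a path as an exact blob, a prefix, or missing. The function name describe_path is hypothetical, and the hook is expected to be an already constructed WasbHook; the import is guarded so that it is only needed for type checking.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hint below
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook


def describe_path(hook: "WasbHook", container_name: str, path: str) -> str:
    """Classify `path` as an exact blob, a prefix, or missing (hypothetical helper)."""
    # An exact blob match is checked first, because a blob name is
    # also trivially a prefix of itself.
    if hook.check_for_blob(container_name, path):
        return "blob"
    if hook.check_for_prefix(container_name, path):
        return "prefix"
    return "missing"
```

With a real connection this might be called as describe_path(WasbHook(wasb_conn_id="wasb_default"), "my-container", "logs/").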
 - get_blobs_list(container_name, prefix=None, include=None, delimiter='/', **kwargs)
- List blobs in a given container. - Parameters
- container_name (str) – The name of the container 
- prefix (str | None) – Filters the results to return only blobs whose names begin with the specified prefix. 
- include (list[str] | None) – Specifies one or more additional datasets to include in the response. Options include: snapshots, metadata, uncommittedblobs, copy, deleted.
- delimiter (str) – Filters objects based on the delimiter (e.g. ‘.csv’).
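A short sketch of listing under a prefix follows. It assumes get_blobs_list returns a list of blob names as strings, and it filters by file suffix on the client side rather than relying on any server-side option; the helper name list_csv_blobs is made up for this example.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hint below
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook


def list_csv_blobs(hook: "WasbHook", container_name: str, prefix: str) -> list:
    """Return the sorted names of blobs under `prefix` that end in .csv."""
    # Assumption: the hook returns plain blob names (strings).
    names = hook.get_blobs_list(container_name, prefix=prefix)
    return sorted(name for name in names if name.endswith(".csv"))
```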
 
 
 - load_file(file_path, container_name, blob_name, create_container=False, **kwargs)
- Upload a file to Azure Blob Storage. - Parameters
- file_path (str) – Path to the file to load. 
- container_name (str) – Name of the container. 
- blob_name (str) – Name of the blob. 
- create_container (bool) – Attempt to create the target container prior to uploading the blob. This is useful if the target container may not exist yet. Defaults to False. 
- kwargs – Optional keyword arguments that BlobClient.upload_blob() takes.
 
 
 - load_string(string_data, container_name, blob_name, create_container=False, **kwargs)
- Upload a string to Azure Blob Storage. - Parameters
- string_data (str) – String to load. 
- container_name (str) – Name of the container. 
- blob_name (str) – Name of the blob. 
- create_container (bool) – Attempt to create the target container prior to uploading the blob. This is useful if the target container may not exist yet. Defaults to False. 
- kwargs – Optional keyword arguments that BlobClient.upload_blob() takes.
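The string upload can be illustrated with a small helper that serializes a dictionary and stores it as a blob. This is a sketch under stated assumptions: publish_report is a hypothetical name, and overwrite=True is assumed to be forwarded through **kwargs to the underlying Azure SDK upload call, where overwrite is an accepted keyword.

```python
import json
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hint below
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook


def publish_report(hook: "WasbHook", container_name: str, blob_name: str, report: dict) -> None:
    """Serialize `report` as JSON and upload it as a blob (hypothetical helper)."""
    hook.load_string(
        json.dumps(report, sort_keys=True),
        container_name,
        blob_name,
        create_container=True,  # create the container first if it may not exist
        overwrite=True,  # assumption: passed through **kwargs to the SDK upload
    )
```

For a file that already exists on disk, load_file takes a file path in place of the string and accepts the same remaining arguments.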
 
 
 - get_file(file_path, container_name, blob_name, **kwargs)
- Download a file from Azure Blob Storage. 
 - read_file(container_name, blob_name, **kwargs)
- Read a file from Azure Blob Storage and return as a string. 
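The difference between the two read paths can be sketched as follows: read_file returns the content in memory as a string, while get_file writes it to a local path. Both helper names below are hypothetical, and the existence check before downloading is a design choice of this sketch, not something the hook requires.

```python
import json
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hints below
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook


def read_json_blob(hook: "WasbHook", container_name: str, blob_name: str):
    """Read a blob into memory as a string and parse it as JSON."""
    return json.loads(hook.read_file(container_name, blob_name))


def download_if_present(
    hook: "WasbHook", file_path: str, container_name: str, blob_name: str
) -> bool:
    """Download the blob to `file_path` if it exists; report whether it did."""
    if not hook.check_for_blob(container_name, blob_name):
        return False
    hook.get_file(file_path, container_name, blob_name)
    return True
```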
 - upload(container_name, blob_name, data, blob_type='BlockBlob', length=None, create_container=False, **kwargs)
- Creates a new blob from a data source with automatic chunking. - Parameters
- container_name (str) – The name of the container to upload data to.
- blob_name (str) – The name of the blob to upload. The blob need not already exist in the container.
- data (Any) – The blob data to upload.
- blob_type (str) – The type of the blob. This can be either BlockBlob, PageBlob or AppendBlob. The default value is BlockBlob.
- length (int | None) – Number of bytes to read from the stream. This is optional, but should be supplied for optimal performance. 
- create_container (bool) – Attempt to create the target container prior to uploading the blob. This is useful if the target container may not exist yet. Defaults to False. 
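A possible use of the lower-level upload method is sketched below: it sends an in-memory bytes payload and supplies length explicitly, as the parameter description recommends. The helper name upload_bytes and its append flag are inventions of this example.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hint below
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook


def upload_bytes(
    hook: "WasbHook",
    container_name: str,
    blob_name: str,
    payload: bytes,
    append: bool = False,
) -> None:
    """Upload `payload`, choosing the blob type from the `append` flag."""
    hook.upload(
        container_name,
        blob_name,
        payload,
        blob_type="AppendBlob" if append else "BlockBlob",
        length=len(payload),  # known up front for in-memory data
        create_container=True,
    )
```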
 
 
 - download(container_name, blob_name, offset=None, length=None, **kwargs)
- Downloads a blob and returns it as a StorageStreamDownloader. - Parameters
 
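Reading only the beginning of a blob might look like the sketch below. It assumes the returned StorageStreamDownloader exposes readall(), as in the Azure SDK, and it passes offset together with length because a length alone does not identify a byte range. The helper name read_head is hypothetical.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hint below
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook


def read_head(
    hook: "WasbHook", container_name: str, blob_name: str, num_bytes: int = 1024
) -> bytes:
    """Return up to the first `num_bytes` bytes of a blob."""
    downloader = hook.download(container_name, blob_name, offset=0, length=num_bytes)
    # readall() drains the requested range into memory.
    return downloader.readall()
```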
 - create_container(container_name)
- Create container object if not already existing. - Parameters
- container_name (str) – The name of the container to create 
 
 - delete_container(container_name)
- Delete a container object. - Parameters
- container_name (str) – The name of the container 
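One way to pair the two container methods is a context manager that creates a scratch container and removes it afterwards. This is only a sketch: scratch_container is a hypothetical name, and because create_container does nothing special when the container already exists, the cleanup step would also delete a container that was there before.

```python
from contextlib import contextmanager
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hint below
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook


@contextmanager
def scratch_container(hook: "WasbHook", container_name: str):
    """Create a container for the duration of a with-block, then delete it."""
    hook.create_container(container_name)
    try:
        yield container_name
    finally:
        # Runs even if the body raises, so the container is always cleaned up.
        hook.delete_container(container_name)
```

For that reason the pattern suits uniquely named, temporary containers, for example in integration tests.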
 
 - delete_blobs(container_name, *blobs, **kwargs)
- Marks the specified blobs or snapshots for deletion. - Parameters
- container_name (str) – The name of the container containing the blobs 
- blobs – The blobs to delete. This can be a single blob, or multiple values can be supplied, where each value is either the name of the blob (str) or BlobProperties. 
 
 
 - delete_file(container_name, blob_name, is_prefix=False, ignore_if_missing=False, delimiter='', **kwargs)
- Delete a file, or all blobs matching a prefix, from Azure Blob Storage. - Parameters
- container_name (str) – Name of the container. 
- blob_name (str) – Name of the blob. 
- is_prefix (bool) – If blob_name is a prefix, delete all matching files.
- ignore_if_missing (bool) – If True, return success even if the blob does not exist.
- kwargs – Optional keyword arguments that ContainerClient.delete_blobs() takes.
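The two deletion methods suit different situations, which the sketch below tries to show: delete_blobs removes an explicit set of names, while delete_file with is_prefix=True removes everything under a prefix. Both helper names are hypothetical, and prune_blobs assumes the prefix contains only blobs (no nested virtual directories) and that get_blobs_list returns blob names.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hints below
    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook


def prune_blobs(hook: "WasbHook", container_name: str, prefix: str, keep: int = 3) -> list:
    """Delete all but the last `keep` blobs (by name order) under `prefix`."""
    names = sorted(hook.get_blobs_list(container_name, prefix=prefix))
    stale = names[:-keep] if keep > 0 else names
    if stale:
        # Blob names are passed as positional arguments.
        hook.delete_blobs(container_name, *stale)
    return stale


def purge_prefix(hook: "WasbHook", container_name: str, prefix: str) -> None:
    """Delete every blob under `prefix`, tolerating an already empty prefix."""
    hook.delete_file(container_name, prefix, is_prefix=True, ignore_if_missing=True)
```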
 
 
 
- class airflow.providers.microsoft.azure.hooks.wasb.WasbAsyncHook(wasb_conn_id='wasb_default', public_read=False)
- Bases: WasbHook
- An async hook that connects to Azure WASB to perform operations.
- Parameters
- wasb_conn_id (str) – Reference to the wasb connection.
- public_read (bool) – Whether anonymous public read access should be used. Defaults to False.
 
 - async check_for_blob_async(container_name, blob_name, **kwargs)
- Check if a blob exists on Azure Blob Storage. 
 - async get_blobs_list_async(container_name, prefix=None, include=None, delimiter='/', **kwargs)
- List blobs in a given container. - Parameters
- container_name (str) – The name of the container.
- prefix (str | None) – Filters the results to return only blobs whose names begin with the specified prefix.
- include (list[str] | None) – Specifies one or more additional datasets to include in the response. Options include: snapshots, metadata, uncommittedblobs, copy, deleted.
- delimiter (str) – Filters objects based on the delimiter (e.g. ‘.csv’).
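The async methods are typically awaited from an event loop, for example when polling for a blob to appear. The sketch below shows such a polling loop under several assumptions: wait_for_blob, poke_interval and max_checks are names invented here, and the hook's underlying async client is assumed to be already open, since how the connection is established is not covered in this listing.

```python
import asyncio
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hint below
    from airflow.providers.microsoft.azure.hooks.wasb import WasbAsyncHook


async def wait_for_blob(
    hook: "WasbAsyncHook",
    container_name: str,
    blob_name: str,
    poke_interval: float = 5.0,
    max_checks: int = 12,
) -> bool:
    """Poll until the blob exists or `max_checks` attempts have been made."""
    for attempt in range(max_checks):
        if await hook.check_for_blob_async(container_name, blob_name):
            return True
        if attempt < max_checks - 1:
            # Yield to the event loop between checks instead of blocking.
            await asyncio.sleep(poke_interval)
    return False
```

Because the wait uses asyncio.sleep, other coroutines on the same event loop keep running between checks.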
 
 
 
