Azure Blob Storage Transfer Operator

The Blob service stores text and binary data as objects in the cloud. The Blob service offers the following three resources: the storage account, containers, and blobs. Within your storage account, containers provide a way to organize sets of blobs. For more information about the service, visit the Azure Blob Storage API documentation.

Before you begin

Before using Blob Storage within Airflow, you need to authenticate your account with a token, login, and password. Please follow the Azure instructions to do so.

The token should be added to the connection in Airflow in JSON format (in the Extra field), and the login and password as plain text. See Managing Connections in the Airflow documentation for how to create such a connection.

See the following example; set values for these fields (a code sketch for creating this connection follows the list):

Conn Id: wasb_default
Login: Storage Account Name
Password: KEY1
Extra: {"sas_token": "TOKEN"}

Transfer Data from Blob Storage to Google Cloud Storage

The operator transfers data from Azure Blob Storage to a specified bucket in Google Cloud Storage.

To transfer data from Azure Blob Storage to a Google Cloud Storage bucket, use AzureBlobStorageToGCSOperator. Example usage:

airflow/providers/microsoft/azure/example_dags/example_azure_blob_to_gcs.py

from airflow.providers.microsoft.azure.transfers.azure_blob_to_gcs import (
    AzureBlobStorageToGCSOperator,
)

transfer_files_to_gcs = AzureBlobStorageToGCSOperator(
    task_id="transfer_files_to_gcs",
    # AZURE args
    wasb_conn_id="wasb_default",
    container_name=AZURE_CONTAINER_NAME,
    blob_name=BLOB_NAME,
    file_path=GCP_OBJECT_NAME,
    # GCP args
    gcp_conn_id="google_cloud_default",
    bucket_name=GCP_BUCKET_NAME,
    object_name=GCP_OBJECT_NAME,
    filename=GCP_BUCKET_FILE_PATH,
    gzip=False,
    delegate_to=None,
    impersonation_chain=None,
)
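For context, here is a minimal sketch of how the constants used above might be defined and how the operator fits into a DAG. The environment variable names and default values are illustrative assumptions, not part of the provider's API:

import os
from datetime import datetime

from airflow import models

# Illustrative configuration; adjust to your environment.
AZURE_CONTAINER_NAME = os.environ.get("AZURE_CONTAINER_NAME", "mycontainer")
BLOB_NAME = os.environ.get("AZURE_BLOB_NAME", "file.txt")
GCP_BUCKET_NAME = os.environ.get("GCP_BUCKET_NAME", "mybucket")
GCP_OBJECT_NAME = os.environ.get("GCP_OBJECT_NAME", "file.txt")
GCP_BUCKET_FILE_PATH = os.environ.get("GCP_BUCKET_FILE_PATH", "/tmp/file.txt")

with models.DAG(
    "example_azure_blob_to_gcs",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # transfer_files_to_gcs defined as in the example above
    ...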
