airflow.contrib.operators.adls_to_gcs
Module Contents
class airflow.contrib.operators.adls_to_gcs.AdlsToGoogleCloudStorageOperator(src_adls, dest_gcs, azure_data_lake_conn_id, google_cloud_storage_conn_id, delegate_to=None, replace=False, gzip=False, *args, **kwargs)
Bases: airflow.contrib.operators.adls_list_operator.AzureDataLakeStorageListOperator

Synchronizes an Azure Data Lake Storage path with a GCS bucket.

Parameters:
- src_adls (str) – The Azure Data Lake path in which to find the objects. (templated)
- dest_gcs (str) – The Google Cloud Storage bucket and prefix under which to store the objects. (templated)
- replace (bool) – If True, replaces same-named files in GCS.
- gzip (bool) – Whether to gzip-compress each file for upload.
- azure_data_lake_conn_id (str) – The connection ID to use when connecting to Azure Data Lake Storage. 
- google_cloud_storage_conn_id (str) – The connection ID to use when connecting to Google Cloud Storage. 
- delegate_to (str) – The account to impersonate, if any. For this to work, the service account making the request must have domain-wide delegation enabled. 
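
The effect of ``replace`` can be pictured with a small standalone sketch. This is an illustration of the documented behaviour, not the operator's actual code: the function name ``select_files_to_upload`` and both of its list arguments are hypothetical, and it assumes that "same-named" means an exact match between the ADLS file name and the existing GCS object name.

```python
def select_files_to_upload(source_files, existing_objects, replace=False):
    """Return the source files that would be uploaded (hypothetical helper).

    source_files: names found under the ADLS path.
    existing_objects: names already present in the GCS destination.
    """
    if replace:
        # replace=True: every source file is uploaded, overwriting
        # any same-named object in the destination.
        return sorted(source_files)
    # replace=False: skip files whose name already exists in GCS.
    return sorted(set(source_files) - set(existing_objects))


found = ["a.parquet", "b.parquet"]
already_there = ["a.parquet"]
print(select_files_to_upload(found, already_there))
print(select_files_to_upload(found, already_there, replace=True))
```

With ``replace=False`` only the files missing from the destination are selected, which is why rerunning a task with the default setting leaves existing objects untouched under this reading.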
 
Examples:

The following Operator would copy a single file named ``hello/world.avro`` from ADLS to the GCS bucket ``mybucket``. Its full resulting GCS path will be ``gs://mybucket/hello/world.avro``. ::

    copy_single_file = AdlsToGoogleCloudStorageOperator(
        task_id='copy_single_file',
        src_adls='hello/world.avro',
        dest_gcs='gs://mybucket',
        replace=False,
        azure_data_lake_conn_id='azure_data_lake_default',
        google_cloud_storage_conn_id='google_cloud_default'
    )

The following Operator would copy all parquet files from ADLS to the GCS bucket ``mybucket``. ::

    copy_all_files = AdlsToGoogleCloudStorageOperator(
        task_id='copy_all_files',
        src_adls='*.parquet',
        dest_gcs='gs://mybucket',
        replace=False,
        azure_data_lake_conn_id='azure_data_lake_default',
        google_cloud_storage_conn_id='google_cloud_default'
    )

The following Operator would copy all parquet files from ADLS path ``/hello/world`` to the GCS bucket ``mybucket``. ::

    copy_world_files = AdlsToGoogleCloudStorageOperator(
        task_id='copy_world_files',
        src_adls='hello/world/*.parquet',
        dest_gcs='gs://mybucket',
        replace=False,
        azure_data_lake_conn_id='azure_data_lake_default',
        google_cloud_storage_conn_id='google_cloud_default'
    )
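
The first example states that ``hello/world.avro`` copied to ``gs://mybucket`` ends up at ``gs://mybucket/hello/world.avro``. The mapping from ``dest_gcs`` plus an ADLS path to a final GCS path can be sketched as below. This is a minimal sketch using only the Python standard library; ``split_gcs_url`` and ``destination_uri`` are hypothetical helper names, and the assumption that an optional prefix in ``dest_gcs`` is simply prepended to the ADLS path is an interpretation of the parameter description, not a statement about the operator's internals.

```python
import posixpath
from urllib.parse import urlparse


def split_gcs_url(dest_gcs):
    """Split 'gs://bucket/prefix' into (bucket, prefix). Hypothetical helper."""
    parsed = urlparse(dest_gcs)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError("Expected a URL of the form gs://<bucket>[/<prefix>]")
    # The URL path carries the optional prefix; drop its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


def destination_uri(dest_gcs, adls_path):
    """Build the GCS URI an ADLS file would land at (assumed mapping)."""
    bucket, prefix = split_gcs_url(dest_gcs)
    # Join prefix and source path with forward slashes, as GCS object names use.
    object_name = posixpath.join(prefix, adls_path.lstrip("/"))
    return "gs://{}/{}".format(bucket, object_name)


print(destination_uri("gs://mybucket", "hello/world.avro"))
print(destination_uri("gs://mybucket/landing", "hello/world.avro"))
```

The first call reproduces the path given in the single-file example; the second shows how a prefix (here the made-up ``landing``) would be placed in front of the source path under the same assumption.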