airflow.contrib.operators.gcs_list_operator
Module Contents
class airflow.contrib.operators.gcs_list_operator.GoogleCloudStorageListOperator(bucket, prefix=None, delimiter=None, google_cloud_storage_conn_id='google_cloud_default', delegate_to=None, *args, **kwargs)
Bases: airflow.models.BaseOperator
List all objects from the bucket with the given string prefix and delimiter in name.
This operator returns a Python list with the names of the objects, which can be used by XCom in downstream tasks.
Parameters
bucket (str) – The Google Cloud Storage bucket in which to find the objects. (templated)
prefix (str) – Prefix string which filters objects whose names begin with this prefix. (templated)
delimiter (str) – The delimiter by which you want to filter the objects. (templated) For example, to list the CSV files in a directory in GCS, you would use delimiter='.csv'.
google_cloud_storage_conn_id (str) – The connection ID to use when connecting to Google Cloud Storage.
delegate_to (str) – The account to impersonate, if any. For this to work, the service account making the request must have domain-wide delegation enabled.
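How prefix and delimiter interact can be easier to see on a small in-memory example. The sketch below is a simplified local model of the matching rules that the Google Cloud Storage list API documents (names containing the delimiter after the prefix are truncated just after it and grouped, the rest are returned as plain items). It is not the operator's implementation: the function name `list_like_gcs` and the sample object names are hypothetical, and which of the two groups the operator returns is an assumption not stated on this page.

```python
def list_like_gcs(names, prefix=None, delimiter=None):
    """Hypothetical local model of GCS-style prefix/delimiter matching.

    Returns a tuple (items, prefixes):
      items    - names under `prefix` that do not contain `delimiter`
      prefixes - names under `prefix` truncated right after the first
                 occurrence of `delimiter`, with duplicates omitted
    """
    prefix = prefix or ''
    items, prefixes = [], []
    for name in sorted(names):
        if not name.startswith(prefix):
            continue  # outside the requested prefix
        rest = name[len(prefix):]
        if delimiter and delimiter in rest:
            cut = rest.index(delimiter) + len(delimiter)
            truncated = prefix + rest[:cut]
            if truncated not in prefixes:
                prefixes.append(truncated)
        else:
            items.append(name)
    return items, prefixes


# Illustrative object names only; they do not come from a real bucket.
sample_names = [
    'sales/sales-2017/january.avro',
    'sales/sales-2017/february.avro',
    'sales/sales-2017/notes.txt',
    'sales/sales-2016/december.avro',
]

items, prefixes = list_like_gcs(
    sample_names, prefix='sales/sales-2017/', delimiter='.avro')
print(items)
print(prefixes)
```

In this model the names that end in the delimiter land in the second group, which is consistent with the documented use of delimiter='.csv' to list CSV files, but treat that reading as an assumption and verify it against a real bucket.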
Example:
The following operator would list all the Avro files from the sales/sales-2017 folder in the data bucket.

    GCS_Files = GoogleCloudStorageListOperator(
        task_id='GCS_Files',
        bucket='data',
        prefix='sales/sales-2017/',
        delimiter='.avro',
        google_cloud_storage_conn_id=google_cloud_conn_id
    )
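Because the operator returns its list of object names, a downstream task can read that list through XCom. The following is a minimal sketch of such a consumer, assuming the upstream task is the GCS_Files task from the example. The callable `count_listed_files` and the `FakeTaskInstance` stand-in are hypothetical names introduced here so the logic can be tried without a running scheduler; only `xcom_pull(task_ids=...)` is the real TaskInstance method.

```python
def count_listed_files(**context):
    """Hypothetical downstream callable: read the list pushed by 'GCS_Files'."""
    # 'ti' is the TaskInstance that Airflow places in the task context.
    names = context['ti'].xcom_pull(task_ids='GCS_Files') or []
    avro_names = [n for n in names if n.endswith('.avro')]
    print('Found %d Avro objects' % len(avro_names))
    return avro_names


class FakeTaskInstance(object):
    """Stand-in for a TaskInstance, used only to exercise the callable locally."""

    def __init__(self, pushed):
        self._pushed = pushed  # maps task_id -> value that task returned

    def xcom_pull(self, task_ids=None):
        return self._pushed.get(task_ids)


# Illustrative data only; a real run would receive the operator's return value.
fake_ti = FakeTaskInstance({
    'GCS_Files': [
        'sales/sales-2017/january.avro',
        'sales/sales-2017/february.avro',
    ],
})
result = count_listed_files(ti=fake_ti)
```

In a DAG, a callable like this would typically be wrapped in a PythonOperator with provide_context=True and placed downstream of GCS_Files, for example with GCS_Files >> the_python_task; the exact wiring depends on your Airflow version.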