Google Cloud Storage Operators¶
Prerequisite Tasks¶
To use these operators, you must do a few things:
Select or create a Cloud Platform project using Cloud Console.
Enable billing for your project, as described in Google Cloud documentation.
Enable API, as described in Cloud Console documentation.
Install API libraries via pip.
pip install 'apache-airflow[gcp]'
Detailed information is available Installation
GoogleCloudStorageToBigQueryOperator¶
Use the
GoogleCloudStorageToBigQueryOperator
to execute a BigQuery load job.
load_csv = gcs_to_bq.GoogleCloudStorageToBigQueryOperator(
task_id='gcs_to_bq_example',
bucket='cloud-samples-data',
source_objects=['bigquery/us-states/us-states.csv'],
destination_project_dataset_table='airflow_test.gcs_to_bq_table',
schema_fields=[
{'name': 'name', 'type': 'STRING', 'mode': 'NULLABLE'},
{'name': 'post_abbr', 'type': 'STRING', 'mode': 'NULLABLE'},
],
write_disposition='WRITE_TRUNCATE',
dag=dag)
GoogleCloudStorageBucketCreateAclEntryOperator¶
Creates a new ACL entry on the specified bucket.
For parameter definition, take a look at
GoogleCloudStorageBucketCreateAclEntryOperator
Arguments¶
Some arguments in the example DAG are taken from the OS environment variables:
GCS_ACL_BUCKET = os.environ.get('GCS_ACL_BUCKET', 'example-bucket')
GCS_ACL_OBJECT = os.environ.get('GCS_ACL_OBJECT', 'example-object')
GCS_ACL_ENTITY = os.environ.get('GCS_ACL_ENTITY', 'example-entity')
GCS_ACL_BUCKET_ROLE = os.environ.get('GCS_ACL_BUCKET_ROLE', 'example-bucket-role')
GCS_ACL_OBJECT_ROLE = os.environ.get('GCS_ACL_OBJECT_ROLE', 'example-object-role')
Using the operator¶
gcs_bucket_create_acl_entry_task = GoogleCloudStorageBucketCreateAclEntryOperator(
bucket=GCS_ACL_BUCKET,
entity=GCS_ACL_ENTITY,
role=GCS_ACL_BUCKET_ROLE,
task_id="gcs_bucket_create_acl_entry_task"
)
Templating¶
template_fields = ('bucket', 'entity', 'role', 'user_project')
More information¶
See Google Cloud Storage Documentation to create a new ACL entry for a bucket.
GoogleCloudStorageObjectCreateAclEntryOperator¶
Creates a new ACL entry on the specified object.
For parameter definition, take a look at
GoogleCloudStorageObjectCreateAclEntryOperator
Arguments¶
Some arguments in the example DAG are taken from the OS environment variables:
GCS_ACL_BUCKET = os.environ.get('GCS_ACL_BUCKET', 'example-bucket')
GCS_ACL_OBJECT = os.environ.get('GCS_ACL_OBJECT', 'example-object')
GCS_ACL_ENTITY = os.environ.get('GCS_ACL_ENTITY', 'example-entity')
GCS_ACL_BUCKET_ROLE = os.environ.get('GCS_ACL_BUCKET_ROLE', 'example-bucket-role')
GCS_ACL_OBJECT_ROLE = os.environ.get('GCS_ACL_OBJECT_ROLE', 'example-object-role')
Using the operator¶
gcs_object_create_acl_entry_task = GoogleCloudStorageObjectCreateAclEntryOperator(
bucket=GCS_ACL_BUCKET,
object_name=GCS_ACL_OBJECT,
entity=GCS_ACL_ENTITY,
role=GCS_ACL_OBJECT_ROLE,
task_id="gcs_object_create_acl_entry_task"
)
Templating¶
template_fields = ('bucket', 'object_name', 'entity', 'role', 'generation',
'user_project')
More information¶
See Google Cloud Storage insert documentation to create a ACL entry for ObjectAccess.