Google Cloud Platform Connection

The Google Cloud Platform connection type enables the GCP Integrations.

Authenticating to GCP

There are three ways to connect to GCP using Airflow.

  1. Use Application Default Credentials, such as via the metadata server when running on Google Compute Engine.

  2. Use a service account key file (JSON format) on disk - Keyfile Path.

  3. Use a service account key file (JSON format) from connection configuration - Keyfile JSON.
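As a rough illustration of how the three options relate to the connection fields, the sketch below maps the Keyfile Path and Keyfile JSON fields to the method they imply. The function name `choose_auth_method` is hypothetical, and the precedence when both fields are set is an assumption made for this example, not a statement of how Airflow resolves it.

```python
from typing import Optional


def choose_auth_method(key_path: Optional[str] = None,
                       keyfile_json: Optional[str] = None) -> str:
    """Return which of the three methods the given fields imply (illustrative only)."""
    if key_path:
        # Assumption: a key file on disk wins if both fields are filled in.
        return "keyfile_path"
    if keyfile_json:
        return "keyfile_json"
    # Neither field set: fall back to Application Default Credentials.
    return "application_default_credentials"


print(choose_auth_method())
print(choose_auth_method(key_path="/keys/key.json"))
```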

Default Connection IDs

The following connection IDs are used by default.

bigquery_default

Used by the BigQueryHook hook.

google_cloud_datastore_default

Used by the DatastoreHook hook.

google_cloud_default

Used by the other GCP hooks.

Configuring the Connection

Project Id (optional)

The Google Cloud project ID to connect to. Operators that use this connection take it as their default project ID; it can usually be overridden at the operator level.

Keyfile Path

Path to a service account key file (JSON format) on disk.

Not required if using application default credentials.

Keyfile JSON

Contents of a service account key file (JSON format). If you authenticate with this method, it is recommended that you secure your connections.

Not required if using application default credentials.

Scopes (comma separated)

A list of comma-separated Google Cloud scopes to authenticate with.
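A minimal sketch of how such a comma-separated value could be split into individual scopes. The helper name `parse_scopes` is hypothetical, and trimming whitespace around each entry is an assumption of this example.

```python
from typing import List


def parse_scopes(scopes: str) -> List[str]:
    """Split a comma-separated scope string, dropping blanks and surrounding spaces."""
    return [scope.strip() for scope in scopes.split(",") if scope.strip()]


scopes = parse_scopes(
    "https://www.googleapis.com/auth/cloud-platform, "
    "https://www.googleapis.com/auth/bigquery"
)
print(scopes)
```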

Number of Retries

Integer: the number of times to retry a request, using randomized exponential backoff. If all retries fail, the googleapiclient.errors.HttpError that is raised represents the last request. If zero (the default), the request is attempted only once.
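The retry behaviour described above can be sketched in a few lines. This is a generic illustration of randomized exponential backoff, not the Google API client's actual implementation: the function name `call_with_retries`, the injectable `sleep` and `rand` parameters, and the choice to retry on any exception are all assumptions of this example.

```python
import random
import time


def call_with_retries(request, num_retries=0, sleep=time.sleep, rand=random.random):
    """Call request() once, plus up to num_retries retries with randomized backoff."""
    for attempt in range(num_retries + 1):
        try:
            return request()
        except Exception:
            if attempt == num_retries:
                # Out of retries: surface the failure from the last request.
                raise
            # Random fraction of an exponentially growing ceiling (2, 4, 8, ... seconds).
            sleep(rand() * 2 ** (attempt + 1))
```

With num_retries=0 the loop body runs exactly once, which matches the default of attempting the request a single time.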

When specifying the connection in an environment variable, use URI syntax with the following requirements:

  • The scheme must be google-cloud-platform (note that it uses hyphens, not underscores).

  • The authority (username, password, host, port) and the path are ignored.

  • The query parameters contain information specific to this type of connection. The following keys are accepted:

    • extra__google_cloud_platform__project - Project Id

    • extra__google_cloud_platform__key_path - Keyfile Path

    • extra__google_cloud_platform__key_dict - Keyfile JSON

    • extra__google_cloud_platform__scope - Scopes

    • extra__google_cloud_platform__num_retries - Number of Retries

Note that all components of the URI should be URL-encoded.

For example:

export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fkeys%2Fkey.json&extra__google_cloud_platform__scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform&extra__google_cloud_platform__project=airflow&extra__google_cloud_platform__num_retries=5'
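Encoding the query string by hand is error-prone, so it can help to generate the URI. The sketch below uses only the Python standard library to rebuild the URI from the example above and then decodes it again as a round-trip check. The helper name `build_gcp_conn_uri` and the `PREFIX` constant are hypothetical names introduced for this illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

PREFIX = "extra__google_cloud_platform__"


def build_gcp_conn_uri(**fields):
    """Build a google-cloud-platform connection URI from unprefixed field names."""
    # urlencode percent-encodes every value, including "/" and ":".
    query = urlencode({PREFIX + name: value for name, value in fields.items()})
    # Authority and path are ignored for this connection type, so both stay empty.
    return "google-cloud-platform://?" + query


uri = build_gcp_conn_uri(
    key_path="/keys/key.json",
    scope="https://www.googleapis.com/auth/cloud-platform",
    project="airflow",
    num_retries=5,
)
print(uri)

# Decode the query string again to confirm the values survive the round trip.
decoded = {
    key[len(PREFIX):]: values[0]
    for key, values in parse_qs(urlsplit(uri).query).items()
}
print(decoded)
```

The resulting string can then be assigned to the AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT environment variable, as in the export line above.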
