airflow.providers.amazon.aws.transfers.google_api_to_s3

This module allows you to transfer data from any Google API endpoint into an S3 bucket.

Module Contents

Classes

GoogleApiToS3Operator

Basic class for transferring data from a Google API endpoint into an S3 bucket.

class airflow.providers.amazon.aws.transfers.google_api_to_s3.GoogleApiToS3Operator(*, google_api_service_name, google_api_service_version, google_api_endpoint_path, google_api_endpoint_params, s3_destination_key, google_api_response_via_xcom=None, google_api_endpoint_params_via_xcom=None, google_api_endpoint_params_via_xcom_task_ids=None, google_api_pagination=False, google_api_num_retries=0, s3_overwrite=False, gcp_conn_id='google_cloud_default', aws_conn_id='aws_default', google_impersonation_chain=None, **kwargs)

Bases: airflow.models.BaseOperator

Basic class for transferring data from a Google API endpoint into an S3 bucket.

This discovery-based operator uses GoogleDiscoveryApiHook to communicate with Google services via the Google API Python Client. Note that this library is in maintenance mode, so it will not fully support Google Cloud in the future. For working with the Google Cloud Platform, the custom Google Cloud Service Operators are recommended instead.

See also

For more information on how to use this operator, take a look at the guide: Google Sheets to Amazon S3 transfer operator

Parameters
  • google_api_service_name (str) – The specific API service that is being requested.

  • google_api_service_version (str) – The version of the API that is being requested.

  • google_api_endpoint_path (str) –

    The client library's path to the method that executes the API call. For example: ‘analyticsreporting.reports.batchGet’

    Note

    See https://developers.google.com/apis-explorer for more information on which methods are available.

  • google_api_endpoint_params (dict) – The parameters passed to the endpoint to control its result.

  • s3_destination_key (str) –

    The S3 URL where the data retrieved from the endpoint is written.

  • google_api_response_via_xcom (str | None) – If set, the Google API response is exposed to XCom under this key.

  • google_api_endpoint_params_via_xcom (str | None) – If set, this value is used as the key for pulling from XCom, and the pulled value updates the Google API endpoint params.

  • google_api_endpoint_params_via_xcom_task_ids (str | None) – Task IDs to filter XCom by when pulling the endpoint params.

  • google_api_pagination (bool) –

    If set to True, pagination is enabled for this request so that all data is retrieved.

    Note

    This means the response will be a list of responses.

  • google_api_num_retries (int) – The number of retries for Google API requests that fail.

  • s3_overwrite (bool) – Specifies whether the S3 file is overwritten if it already exists.

  • gcp_conn_id (str) – The connection ID to use when fetching connection info.

  • aws_conn_id (str | None) – The connection ID specifying the authentication information for the S3 bucket.

  • google_impersonation_chain (str | Sequence[str] | None) – Optional Google service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).
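
As an illustration of how the parameters above fit together, the following is a minimal sketch that collects keyword arguments for a Google Sheets read and defers the operator import so the snippet loads even without Airflow installed. The helper names (build_sheets_to_s3_kwargs, make_task), the sheet ID, the range, and the bucket name are hypothetical placeholders, not part of the provider. The Jinja expression in s3_destination_key relies on that field being listed in template_fields.

```python
from typing import Any, Dict


def build_sheets_to_s3_kwargs(sheet_id: str, sheet_range: str, s3_key: str) -> Dict[str, Any]:
    """Collect keyword arguments for GoogleApiToS3Operator (hypothetical helper)."""
    return {
        "task_id": "google_sheet_data_to_s3",
        "google_api_service_name": "sheets",
        "google_api_service_version": "v4",
        # Client-library path of the method that executes the API call.
        "google_api_endpoint_path": "sheets.spreadsheets.values.get",
        # Parameters forwarded to that method.
        "google_api_endpoint_params": {"spreadsheetId": sheet_id, "range": sheet_range},
        # Full S3 URL of the destination object; this field is templated.
        "s3_destination_key": s3_key,
        # Replace the object if it already exists.
        "s3_overwrite": True,
    }


def make_task(**kwargs: Any):
    """Instantiate the operator; requires the Amazon and Google providers (hypothetical helper)."""
    from airflow.providers.amazon.aws.transfers.google_api_to_s3 import GoogleApiToS3Operator

    return GoogleApiToS3Operator(**kwargs)


# Placeholder values, assumed for illustration only.
kwargs = build_sheets_to_s3_kwargs(
    sheet_id="my-sheet-id",
    sheet_range="Sheet1!A1:C10",
    s3_key="s3://my-example-bucket/sheets/{{ ds }}.json",
)
```

Inside a DAG definition, make_task(**kwargs) would then create the task, using the default gcp_conn_id and aws_conn_id connections unless others are passed.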

template_fields: Sequence[str] = ('google_api_endpoint_params', 's3_destination_key', 'google_impersonation_chain', 'gcp_conn_id')
template_ext: Sequence[str] = ()
ui_color = '#cc181e'
execute(context)

Transfers JSON data from a Google API to S3.

Parameters

context (airflow.utils.context.Context) – The context that is being provided when executing.
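
The flow implied by the parameters can be modelled roughly as follows. This is a simplified sketch in plain Python, not the provider's implementation: XCom and S3 are stood in for by dictionaries, the API call by a caller-supplied function, and the ordering of steps and the error raised on an existing key are assumptions. The function name run_transfer and all its arguments are hypothetical.

```python
from __future__ import annotations

import json
from typing import Any, Callable


def run_transfer(
    endpoint_params: dict[str, Any],
    call_api: Callable[[dict[str, Any]], Any],
    s3_store: dict[str, str],
    s3_destination_key: str,
    xcom: dict[str, Any],
    params_xcom_key: str | None = None,
    response_xcom_key: str | None = None,
    s3_overwrite: bool = False,
) -> None:
    """Toy model of the transfer: merge params, call the API, write JSON, expose the response."""
    params = dict(endpoint_params)
    # Optionally update the endpoint params with a value pulled from XCom.
    if params_xcom_key:
        params.update(xcom[params_xcom_key])
    # Retrieve the data (with pagination enabled this would be a list of responses).
    data = call_api(params)
    # Refuse to replace an existing object unless overwriting is allowed (assumed behaviour).
    if s3_destination_key in s3_store and not s3_overwrite:
        raise ValueError(f"The key {s3_destination_key} already exists.")
    s3_store[s3_destination_key] = json.dumps(data)
    # Optionally expose the response to downstream tasks.
    if response_xcom_key:
        xcom[response_xcom_key] = data


# Stand-ins for XCom and an S3 bucket; all values are placeholders.
xcom = {"sheet_params": {"range": "Sheet1!A1:B2"}}
bucket: dict[str, str] = {}
run_transfer(
    endpoint_params={"spreadsheetId": "my-sheet-id"},
    call_api=lambda p: {"echo": p},  # fake API that returns its input
    s3_store=bucket,
    s3_destination_key="s3://my-example-bucket/out.json",
    xcom=xcom,
    params_xcom_key="sheet_params",
    response_xcom_key="sheet_response",
)
```

The model shows why google_api_endpoint_params_via_xcom is useful for chaining: an upstream task can supply part of the request at run time, while the static params stay in the DAG definition.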
