apache-airflow-providers-google

Package apache-airflow-providers-google

Google services including:

Release: 2.1.0

Provider package

This is a provider package for the google provider. All classes for this provider package are in the airflow.providers.google Python package.

Installation

Note

In November 2020, a new version of pip (20.3) was released with a new 2020 resolver. This resolver does not yet work with Apache Airflow and might lead to installation errors, depending on your choice of extras. To install Airflow, either downgrade pip to version 20.2.4 with pip install --upgrade pip==20.2.4 or, if you use pip 20.3, add the option --use-deprecated legacy-resolver to your pip install command.

To install this package on top of an existing Airflow 2.* installation, run pip install apache-airflow-providers-google
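The pip note above can be folded into an install script. The following is a minimal sketch, not part of the provider: the helper name pip_install_command is hypothetical, and adding the flag for every pip release from 20.3 onward is an assumption, since the note itself names only 20.3.

```python
from typing import List

PROVIDER = "apache-airflow-providers-google"


def pip_install_command(pip_version: str, package: str = PROVIDER) -> List[str]:
    """Build a pip command line suited to the given pip version.

    Hypothetical helper. It adds the legacy-resolver option for pip 20.3
    and later (an assumption: the note above only mentions 20.3).
    """
    # Compare only the major and minor components, e.g. "20.2.4" -> (20, 2).
    major, minor = (int(part) for part in pip_version.split(".")[:2])
    command = ["pip", "install"]
    if (major, minor) >= (20, 3):
        command += ["--use-deprecated", "legacy-resolver"]
    command.append(package)
    return command


if __name__ == "__main__":
    print(" ".join(pip_install_command("20.2.4")))
    print(" ".join(pip_install_command("20.3")))
```

Returning the command as a list keeps it ready for subprocess.run without shell quoting.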

PIP requirements

PIP package                         Version required
----------------------------------  ----------------
PyOpenSSL
google-ads                          >=4.0.0,<8.0.0
google-api-core                     >=1.25.1,<2.0.0
google-api-python-client            >=1.6.0,<2.0.0
google-auth-httplib2                >=0.0.1
google-auth                         >=1.0.0,<2.0.0
google-cloud-automl                 >=2.1.0,<3.0.0
google-cloud-bigquery-datatransfer  >=3.0.0,<4.0.0
google-cloud-bigtable               >=1.0.0,<2.0.0
google-cloud-container              >=0.1.1,<2.0.0
google-cloud-datacatalog            >=3.0.0,<4.0.0
google-cloud-dataproc               >=2.2.0,<3.0.0
google-cloud-dlp                    >=0.11.0,<2.0.0
google-cloud-kms                    >=2.0.0,<3.0.0
google-cloud-language               >=1.1.1,<2.0.0
google-cloud-logging                >=2.1.1,<3.0.0
google-cloud-memcache               >=0.2.0
google-cloud-monitoring             >=2.0.0,<3.0.0
google-cloud-os-login               >=2.0.0,<3.0.0
google-cloud-pubsub                 >=2.0.0,<3.0.0
google-cloud-redis                  >=2.0.0,<3.0.0
google-cloud-secret-manager         >=0.2.0,<2.0.0
google-cloud-spanner                >=1.10.0,<2.0.0
google-cloud-speech                 >=0.36.3,<2.0.0
google-cloud-storage                >=1.30,<2.0.0
google-cloud-tasks                  >=2.0.0,<3.0.0
google-cloud-texttospeech           >=0.4.0,<2.0.0
google-cloud-translate              >=1.5.0,<2.0.0
google-cloud-videointelligence      >=1.7.0,<2.0.0
google-cloud-vision                 >=0.35.2,<2.0.0
google-cloud-workflows              >=0.1.0,<2.0.0
grpcio-gcp                          >=0.2.2
json-merge-patch                    ~=0.2
pandas-gbq
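To show how the version ranges in the requirements list read, here is a deliberately simplified sketch of a range check. The function names are hypothetical, it understands only the >= and < clauses, and it ignores pre-releases and the ~= operator. The third-party packaging library provides a complete specifier implementation and is the better choice for real use.

```python
import re
from typing import Tuple

# Only ">=" and "<" clauses with purely numeric versions are recognised here.
_CLAUSE = re.compile(r"^(>=|<)(\d+(?:\.\d+)*)$")


def _parse(version: str) -> Tuple[int, ...]:
    """Turn "2.1.0" into (2, 1, 0)."""
    return tuple(int(part) for part in version.split("."))


def _pad(a: Tuple[int, ...], b: Tuple[int, ...]):
    """Right-pad the shorter tuple with zeros so that 1.30 equals 1.30.0."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a spec such as ">=2.1.0,<3.0.0" (simplified)."""
    candidate = _parse(version)
    for clause in spec.split(","):
        match = _CLAUSE.match(clause.strip())
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        operator, bound = match.group(1), _parse(match.group(2))
        left, right = _pad(candidate, bound)
        if operator == ">=" and not left >= right:
            return False
        if operator == "<" and not left < right:
            return False
    return True


if __name__ == "__main__":
    # google-cloud-automl is listed as >=2.1.0,<3.0.0
    print(satisfies("2.1.0", ">=2.1.0,<3.0.0"))
    print(satisfies("3.0.0", ">=2.1.0,<3.0.0"))
```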

Cross provider package dependencies

These are dependencies that might be needed to use all the features of the package. You need to install the specified provider packages to use them.

You can install such cross-provider dependencies when installing from PyPI. For example:

pip install apache-airflow-providers-google[amazon]

Dependent package                          Extra
-----------------------------------------  ----------------
apache-airflow-providers-amazon            amazon
apache-airflow-providers-apache-beam       apache.beam
apache-airflow-providers-apache-cassandra  apache.cassandra
apache-airflow-providers-cncf-kubernetes   cncf.kubernetes
apache-airflow-providers-facebook          facebook
apache-airflow-providers-microsoft-azure   microsoft.azure
apache-airflow-providers-microsoft-mssql   microsoft.mssql
apache-airflow-providers-mysql             mysql
apache-airflow-providers-oracle            oracle
apache-airflow-providers-postgres          postgres
apache-airflow-providers-presto            presto
apache-airflow-providers-salesforce        salesforce
apache-airflow-providers-sftp              sftp
apache-airflow-providers-ssh               ssh
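In every row listed above, the extra name is the package name with the apache-airflow-providers- prefix removed and the remaining hyphens turned into dots. The sketch below encodes that observation and builds a requirement string with several extras. Both function names are hypothetical, and the naming pattern is inferred from this list only, not a documented guarantee.

```python
from typing import Iterable

PREFIX = "apache-airflow-providers-"
PROVIDER = "apache-airflow-providers-google"


def extra_for(package: str) -> str:
    """Derive the extra name from a provider package name.

    Pattern inferred from the list above (an assumption, not a documented rule).
    """
    if not package.startswith(PREFIX):
        raise ValueError(f"not a provider package: {package!r}")
    return package[len(PREFIX):].replace("-", ".")


def requirement_with_extras(extras: Iterable[str]) -> str:
    """Build a requirement string such as apache-airflow-providers-google[amazon]."""
    return f"{PROVIDER}[{','.join(extras)}]"


if __name__ == "__main__":
    wanted = ["apache-airflow-providers-amazon", "apache-airflow-providers-apache-beam"]
    print("pip install " + requirement_with_extras(extra_for(name) for name in wanted))
```

When pasting the resulting command into a shell, quoting the requirement avoids glob expansion of the square brackets in shells such as zsh.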

Changelog

2.1.0

Features

  • Corrects order of argument in docstring in GCSHook.download method (#14497)

  • Refactor SQL/BigQuery/Qubole/Druid Check operators (#12677)

  • Add GoogleDriveToLocalOperator (#14191)

  • Add 'exists_ok' flag to BigQueryCreateEmptyTable(Dataset)Operator (#14026)

  • Add materialized view support for BigQuery (#14201)

  • Add BigQueryUpdateTableOperator (#14149)

  • Add param to CloudDataTransferServiceOperator (#14118)

  • Add gdrive_to_gcs operator, drive sensor, additional functionality to drive hook (#13982)

  • Improve GCSToSFTPOperator paths handling (#11284)

Bug Fixes

  • Fixes to dataproc operators and hook (#14086)

  • #9803 fix bug in copy operation without wildcard (#13919)

2.0.0

Breaking changes

Updated google-cloud-* libraries

This release of the provider package contains third-party library updates, which may require changes to your DAG files or custom hooks and operators if you use objects from those libraries. The updates are necessary to use new features and to obtain bug fixes that are only available in newer versions of the libraries.

Details are covered in the UPGRADING.md file of each library, but there are some points you should pay particular attention to.

Library name                        Previous constraints  Current constraints  Upgrade Documentation
----------------------------------  --------------------  -------------------  --------------------------------------------
google-cloud-automl                 >=0.4.0,<2.0.0        >=2.1.0,<3.0.0       Upgrading google-cloud-automl
google-cloud-bigquery-datatransfer  >=0.4.0,<2.0.0        >=3.0.0,<4.0.0       Upgrading google-cloud-bigquery-datatransfer
google-cloud-datacatalog            >=0.5.0,<0.8          >=3.0.0,<4.0.0       Upgrading google-cloud-datacatalog
google-cloud-dataproc               >=1.0.1,<2.0.0        >=2.2.0,<3.0.0       Upgrading google-cloud-dataproc
google-cloud-kms                    >=1.2.1,<2.0.0        >=2.0.0,<3.0.0       Upgrading google-cloud-kms
google-cloud-logging                >=1.14.0,<2.0.0       >=2.0.0,<3.0.0       Upgrading google-cloud-logging
google-cloud-monitoring             >=0.34.0,<2.0.0       >=2.0.0,<3.0.0       Upgrading google-cloud-monitoring
google-cloud-os-login               >=1.0.0,<2.0.0        >=2.0.0,<3.0.0       Upgrading google-cloud-os-login
google-cloud-pubsub                 >=1.0.0,<2.0.0        >=2.0.0,<3.0.0       Upgrading google-cloud-pubsub
google-cloud-tasks                  >=1.2.1,<2.0.0        >=2.0.0,<3.0.0       Upgrading google-cloud-tasks

The field names use the snake_case convention

If your DAG uses an object from the above-mentioned libraries passed by XCom, you must update the names of the fields that are read. Previously the fields used the camelCase convention; they now use the snake_case convention.

Before:

set_acl_permission = GCSBucketCreateAclEntryOperator(
    task_id="gcs-set-acl-permission",
    bucket=BUCKET_NAME,
    entity="user-{{ task_instance.xcom_pull('get-instance')['persistenceIamIdentity']"
    ".split(':', 2)[1] }}",
    role="OWNER",
)

After:

set_acl_permission = GCSBucketCreateAclEntryOperator(
    task_id="gcs-set-acl-permission",
    bucket=BUCKET_NAME,
    entity="user-{{ task_instance.xcom_pull('get-instance')['persistence_iam_identity']"
    ".split(':', 2)[1] }}",
    role="OWNER",
)
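When many templates need updating, a small helper can suggest the new field names. The sketch below is illustrative only: camel_to_snake and snake_case_keys are hypothetical names, and the mechanical conversion is an assumption that may not match every field, so check the result against each library's upgrade documentation.

```python
import re
from typing import Any

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(name: str) -> str:
    """Convert a camelCase field name to snake_case."""
    return _BOUNDARY.sub("_", name).lower()


def snake_case_keys(value: Any) -> Any:
    """Recursively rename string dictionary keys; other values pass through unchanged."""
    if isinstance(value, dict):
        return {camel_to_snake(key): snake_case_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [snake_case_keys(item) for item in value]
    return value


if __name__ == "__main__":
    # Field name taken from the "Before" example above.
    print(camel_to_snake("persistenceIamIdentity"))
```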

Features

  • Add Apache Beam operators (#12814)

  • Add Google Cloud Workflows Operators (#13366)

  • Replace 'google_cloud_storage_conn_id' by 'gcp_conn_id' when using 'GCSHook' (#13851)

  • Add How To Guide for Dataflow (#13461)

  • Generalize MLEngineStartTrainingJobOperator to custom images (#13318)

  • Add Parquet data type to BaseSQLToGCSOperator (#13359)

  • Add DataprocCreateWorkflowTemplateOperator (#13338)

  • Add OracleToGCS Transfer (#13246)

  • Add timeout option to gcs hook methods. (#13156)

  • Add regional support to dataproc workflow template operators (#12907)

  • Add project_id to client inside BigQuery hook update_table method (#13018)

Bug fixes

  • Fix four bugs in StackdriverTaskHandler (#13784)

  • Decode Remote Google Logs (#13115)

  • Fix and improve GCP BigTable hook and system test (#13896)

  • updated Google DV360 Hook to fix SDF issue (#13703)

  • Fix insert_all method of BigQueryHook to support tables without schema (#13138)

  • Fix Google BigQueryHook method get_schema() (#13136)

  • Fix Data Catalog operators (#13096)

1.0.0

Initial version of the provider.
