apache-airflow-providers-apache-beam
This is a provider package for the apache.beam provider. All classes for this provider package are in the airflow.providers.apache.beam Python package.
You can install this package on top of an existing Airflow 2 installation (see Requirements below for the minimum Airflow version supported) via
pip install apache-airflow-providers-apache-beam
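After installing, it can be useful to confirm that the distribution is actually present in the active environment. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of the provider.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Distribution name as given in the pip command above.
PROVIDER_DIST = "apache-airflow-providers-apache-beam"


def installed_version(dist_name: str = PROVIDER_DIST) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"{PROVIDER_DIST}: {found or 'not installed'}")
```

Returning None instead of raising keeps the check usable in scripts that only need a yes/no answer.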
PIP package | Version required
---|---
The cross-provider dependencies listed below might be needed to use all the features of the package. You need to install the specified provider packages in order to use them.
You can install such cross-provider dependencies when installing from PyPI. For example:
pip install apache-airflow-providers-apache-beam[google]
Dependent package | Extra
---|---
apache-airflow-providers-google | google
You can download officially released packages and verify their checksums and signatures from the Official Apache Download site.
The apache-airflow-providers-apache-beam 4.1.0 sdist package (asc, sha512)
The apache-airflow-providers-apache-beam 4.1.0 wheel package (asc, sha512)
This release of the provider is only available for Airflow 2.3+ as explained in the Apache Airflow providers support policy.
Move min airflow version to 2.3.0 for all providers (#27196)
Add backward compatibility with old versions of Apache Beam (#27263)
This release of the provider is only available for Airflow 2.2+ as explained in the Apache Airflow providers support policy: https://github.com/apache/airflow/blob/main/README.md#support-for-providers
Added missing project_id to the wait_for_job (#24020)
Support impersonation service account parameter for Dataflow runner (#23961)
chore: Refactoring and Cleaning Apache Providers (#24219)
Add recipe for BeamRunGoPipelineOperator (#22296)
Fix mistakenly added install_requires for all providers (#22382)
Auto-apply apply_default decorator (#15667)
Warning
Due to the removal of the apply_default decorator, this version of the provider requires Airflow 2.1.0+. If your Airflow version is < 2.1.0 and you want to install this provider version, first upgrade Airflow to at least version 2.1.0. Otherwise your Airflow package version will be upgraded automatically and you will have to manually run airflow db upgrade to complete the migration.
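The version requirement in this warning can be checked before installing. The sketch below is an illustration only: release_tuple and meets_minimum are hypothetical helpers, and the comparison looks only at the numeric release components, so a pre-release such as 2.1.0rc1 is treated like 2.1.0.

```python
import re
from typing import Tuple

MIN_AIRFLOW = "2.1.0"  # minimum stated in the warning above


def release_tuple(version_string: str) -> Tuple[int, ...]:
    """Turn '2.1.0' into (2, 1, 0), keeping only leading digits of each part."""
    parts = []
    for piece in version_string.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_AIRFLOW) -> bool:
    """Compare release numbers numerically, padding the shorter one with zeros."""
    have, need = release_tuple(installed), release_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

In a real environment the installed version string could be read with importlib.metadata.version("apache-airflow") and passed to meets_minimum. The comparison is numeric rather than lexical, so 2.10.0 correctly counts as newer than 2.1.0.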
google provider
In version 2.0.0 of the provider we've changed the way of integrating with the google provider.
The previous versions of both providers caused conflicts when trying to install them together using PIP > 20.2.4. The conflict is not detected by PIP 20.2.4 and below, but it was there, and the versions of the Google BigQuery Python client did not match on both sides. As a result, when both the apache.beam and google providers were installed, some features of the BigQuery operators might not work properly. This was caused by the apache-beam client not yet supporting the new Google Python clients when the apache-beam[gcp] extra was used. The apache-beam[gcp] extra is used by the Dataflow operators, and while they might work with the newer version of the Google BigQuery Python client, it is not guaranteed.
This version introduces an additional extra requirement for the apache.beam extra of the google provider and, symmetrically, an additional requirement for the google extra of the apache.beam provider. Neither the google nor the apache.beam provider uses those extras by default, but you can specify them when installing the providers. The consequence is that, without them, some functionality of the Dataflow operators might not be available.
Unfortunately, the only complete solution to the problem is for apache.beam to migrate to the new (>=2.0.0) Google Python clients.
This is the extra for the google provider:
extras_require = (
    {
        # ...
        "apache.beam": ["apache-airflow-providers-apache-beam", "apache-beam[gcp]"],
        # ...
    },
)
And likewise this is the extra for the apache.beam provider:
extras_require = ({"google": ["apache-airflow-providers-google", "apache-beam[gcp]"]},)
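To see what an extra such as google pulls in, one can filter the requirement strings that installed package metadata exposes. The sketch below assumes the usual "name; extra == "..."" marker form; requirements_for_extra is a hypothetical helper and the sample lines are illustrative, modelled on the extra shown above rather than copied from a real release.

```python
from typing import Iterable, List


def requirements_for_extra(requires: Iterable[str], extra: str) -> List[str]:
    """Return the requirements that are only pulled in by the given extra."""
    selected = []
    for line in requires:
        requirement, _, marker = line.partition(";")
        # Normalize quoting and spacing so both 'google' and "google" match.
        normalized = marker.replace("'", '"').replace(" ", "")
        if f'extra=="{extra}"' in normalized:
            selected.append(requirement.strip())
    return selected


# Hypothetical metadata lines for illustration only.
sample = [
    "apache-airflow>=2.3.0",
    'apache-airflow-providers-google; extra == "google"',
    'apache-beam[gcp]; extra == "google"',
]
print(requirements_for_extra(sample, "google"))
# ['apache-airflow-providers-google', 'apache-beam[gcp]']
```

In an environment where the provider is installed, importlib.metadata.requires("apache-airflow-providers-apache-beam") returns a list of such strings (or None) that could be passed in place of the sample.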
You can still run this with PIP version <= 20.2.4 and go back to the previous behaviour:
pip install apache-airflow-providers-google[apache.beam]
or
pip install apache-airflow-providers-apache-beam[google]
But be aware that some functionality of the BigQuery operators might not be available in this case.
Improve Apache Beam operators - refactor operator - common Dataflow logic (#14094)
Corrections in docs and tools after releasing provider RCs (#14082)
Remove WARNINGs from BeamHook (#14554)
Initial version of the provider.