Object Storage XCom Backend
The default XCom backend is the BaseXCom class, which stores XComs in the Airflow database. This is fine for small values, but can be problematic for large values or for large numbers of XComs.
To store XComs in an object store, set the xcom_backend configuration option to airflow.providers.common.io.xcom.backend.XComObjectStorageBackend. You also need to set xcom_objectstorage_path to the desired location. The connection id is taken from the user part of the URL you provide, e.g. xcom_objectstorage_path = s3://conn_id@mybucket/key. Furthermore, xcom_objectstorage_threshold must be set to a value larger than -1. Any object smaller than the threshold (in bytes) is stored in the database, and anything larger is put in object storage, which allows a hybrid setup. When an XCom is stored in object storage, a reference to it is saved in the database. Finally, you can set xcom_objectstorage_compression to an fsspec-supported compression method such as zip or snappy to compress the data before it is stored in object storage.
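The two rules described above (the connection id comes from the user part of the URL, and the threshold decides where a value goes) can be illustrated with a minimal sketch. This is not the backend's actual code: the function names split_objectstorage_path and goes_to_object_storage are hypothetical, and the handling of a value whose size is exactly equal to the threshold is an assumption, since the text only covers "smaller" and "larger".

```python
from __future__ import annotations

from urllib.parse import urlsplit


def split_objectstorage_path(path: str) -> tuple[str | None, str]:
    """Split e.g. s3://conn_id@mybucket/key into (conn_id, URL without the user part).

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(path)
    conn_id = parts.username  # user part of the URL, None if absent
    host = parts.netloc.rpartition("@")[2]  # drop "conn_id@" if present
    return conn_id, f"{parts.scheme}://{host}{parts.path}"


def goes_to_object_storage(size_in_bytes: int, threshold: int) -> bool:
    """Return True if a serialized value of this size would leave the database.

    Assumption: a value exactly at the threshold is sent to object storage.
    """
    return threshold > -1 and size_in_bytes >= threshold


conn_id, location = split_objectstorage_path("s3://conn_id@mybucket/key")
print(conn_id, location)
print(goes_to_object_storage(10, 1048576))
print(goes_to_object_storage(2_000_000, 1048576))
```

With a threshold of 0, every value would be routed to object storage under this sketch, which matches the idea that the threshold only needs to be larger than -1.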
For example, the following configuration stores anything above 1 MiB (1048576 bytes) in S3 and compresses it using gzip:
[core]
xcom_backend = airflow.providers.common.io.xcom.backend.XComObjectStorageBackend
[common.io]
xcom_objectstorage_path = s3://conn_id@mybucket/key
xcom_objectstorage_threshold = 1048576
xcom_objectstorage_compression = gzip
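To get a feel for what the threshold and compression settings in the configuration above mean for a concrete value, here is a rough sketch using only the Python standard library. It does not call the backend. The payload is a made-up example, and serializing it as JSON is an assumption made for illustration.

```python
import gzip
import json

THRESHOLD = 1024 * 1024  # 1048576 bytes, the threshold from the configuration above

# Hypothetical XCom value, large enough to exceed the threshold once serialized.
value = {"rows": list(range(200_000))}
encoded = json.dumps(value).encode("utf-8")

# Compare the serialized size with the threshold.
exceeds_threshold = len(encoded) > THRESHOLD

# Compress with gzip and check that the value survives a round trip.
compressed = gzip.compress(encoded)
restored = json.loads(gzip.decompress(compressed))

print(f"serialized: {len(encoded)} bytes, exceeds threshold: {exceeds_threshold}")
print(f"gzip-compressed: {len(compressed)} bytes, round trip ok: {restored == value}")
```

Note that the sketch compares the uncompressed serialized size with the threshold. Whether the backend measures size before or after compression is not stated here, so treat that ordering as an assumption.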
Note
Compression requires that support for the chosen method is installed in your Python environment. For example, to use snappy compression, you need to install python-snappy. Zip, gzip and bz2 work out of the box.
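One way to check up front whether the Python module behind a compression method can be imported is sketched below. The helper name compression_module_available is hypothetical. The sketch assumes that python-snappy is imported under the module name snappy, and it only tests importability, not whether fsspec has registered the method.

```python
import importlib.util


def compression_module_available(module_name: str) -> bool:
    """Return True if the named top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# gzip ships with the standard library; snappy needs the python-snappy package.
for name in ("gzip", "snappy"):
    status = "available" if compression_module_available(name) else "missing"
    print(f"{name}: {status}")
```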