Apache Cassandra Operators

Apache Cassandra is an open source distributed NoSQL database that can be used when you need scalability and high availability without compromising performance. It offers linear scalability and fault-tolerance on commodity hardware or cloud infrastructure which makes it the perfect platform for mission-critical data. It supports multi-datacenter replication with lower latencies.

Prerequisite

To use operators, you must configure a Cassandra Connection.

Waiting for a Table to be created

The CassandraTableSensor operator is used to check for the existence of a table in a Cassandra cluster.

Use the table parameter to poke until the provided table is found. Use dot notation to target a specific keyspace.

airflow/providers/apache/cassandra/example_dags/example_cassandra_dag.pyView Source

table_sensor = CassandraTableSensor(
    task_id="cassandra_table_sensor",
    cassandra_conn_id="cassandra_default",
    table="keyspace_name.table_name",
)

Waiting for a Record to be created

The CassandraRecordSensor operator is used to check for the existence of a record of a table in the Cassandra cluster.

Use the table parameter to mention the keyspace and table for the record. Use dot notation to target a specific keyspace.

Use the keys parameter to poke until the provided record is found. The existence of record is identified using key value pairs. In the given example, we're are looking for value v1 in column p1 and v2 in column p2.

airflow/providers/apache/cassandra/example_dags/example_cassandra_dag.pyView Source

record_sensor = CassandraRecordSensor(
    task_id="cassandra_record_sensor",
    cassandra_conn_id="cassandra_default",
    table="keyspace_name.table_name",
    keys={"p1": "v1", "p2": "v2"},
)

Reference

For further information, look at Cassandra Query Language (CQL) SELECT statement.

Was this entry helpful?