airflow.contrib.operators.cassandra_to_gcs

Module Contents

class airflow.contrib.operators.cassandra_to_gcs.CassandraToGoogleCloudStorageOperator(cql, bucket, filename, schema_filename=None, approx_max_file_size_bytes=1900000000, cassandra_conn_id='cassandra_default', google_cloud_storage_conn_id='google_cloud_default', delegate_to=None, *args, **kwargs)

Bases: airflow.models.BaseOperator

Copies data from Cassandra to Google Cloud Storage in JSON format.

Note: Arrays of arrays are not supported.

template_fields = ['cql', 'bucket', 'filename', 'schema_filename']
template_ext = ['.cql']
ui_color = #a0e08c
CQL_TYPE_MAP
execute(self, context)
_query_cassandra(self)

Queries Cassandra and returns a cursor to the results.

_write_local_data_files(self, cursor)

Takes a cursor and writes the results to one or more local files.

Returns

A dictionary where keys are filenames to be used as object names in GCS, and values are file handles to local files that contain the data for the GCS objects.
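
The return shape described above can be illustrated with a minimal, standalone sketch. It is not the operator's actual implementation: the function name `write_rows_to_local_files`, the newline-delimited JSON layout, the `{}` placeholder in the filename, and the exact rollover rule are all assumptions made for illustration.

```python
import json
import tempfile


def write_rows_to_local_files(rows, filename_template, approx_max_file_size_bytes):
    """Hypothetical helper: write rows as newline-delimited JSON to temp files.

    Returns a dict mapping GCS object names to open local file handles.
    """
    file_no = 0
    handle = tempfile.NamedTemporaryFile(delete=True)
    files = {filename_template.format(file_no): handle}

    for row in rows:
        # Assumption: start a new file once the current one has reached the limit.
        # The limit is approximate, so a file can exceed it by up to one row.
        if handle.tell() >= approx_max_file_size_bytes:
            file_no += 1
            handle = tempfile.NamedTemporaryFile(delete=True)
            files[filename_template.format(file_no)] = handle
        handle.write(json.dumps(row).encode("utf-8") + b"\n")

    return files
```

Each key of the returned dictionary would then serve as the object name when the matching file handle is uploaded.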

_write_local_schema_file(self, cursor)

Takes a cursor and writes the BigQuery schema for the results to a file on the local file system.

Returns

A dictionary where the key is a filename to be used as an object name in GCS, and the value is a file handle to a local file that contains the BigQuery schema fields in .json format.
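
As a rough illustration of that return value, the sketch below writes a list of BigQuery field definitions to a temporary file and returns a single-entry dictionary. The helper name `write_schema_to_local_file`, the example fields, and the object name `schema/users.json` are hypothetical. The `name`/`type`/`mode` keys follow the standard BigQuery JSON schema layout.

```python
import json
import tempfile


def write_schema_to_local_file(fields, schema_filename):
    """Hypothetical helper: dump BigQuery field dicts to a temp file.

    Returns a dict with one entry: GCS object name -> open local file handle.
    """
    handle = tempfile.NamedTemporaryFile(delete=True)
    handle.write(json.dumps(fields).encode("utf-8"))
    handle.flush()  # make the content visible to readers of the file path
    return {schema_filename: handle}


# Illustrative schema only; a real schema would be derived from the query's columns.
example_fields = [
    {"name": "id", "type": "STRING", "mode": "NULLABLE"},
    {"name": "tags", "type": "STRING", "mode": "REPEATED"},
]
schema_files = write_schema_to_local_file(example_fields, "schema/users.json")
```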

_upload_to_gcs(self, files_to_upload)
classmethod generate_data_dict(cls, names, values)
classmethod convert_value(cls, name, value)
classmethod convert_array_types(cls, name, value)
classmethod convert_user_type(cls, name, value)

Converts a user type to a RECORD that contains n fields, where n is the number of attributes. Each attribute of the user type is converted to its corresponding data type in BQ.
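
A small sketch of the idea, assuming the user type is exposed as a namedtuple-like object with a `_fields` attribute. The `Address` type and the function `sketch_convert_user_type` are hypothetical, and attribute values are passed through unchanged here, whereas the operator would presumably convert each one to its BQ type.

```python
from collections import namedtuple

# Hypothetical stand-in for a Cassandra user-defined type instance.
Address = namedtuple("Address", ["street", "zip_code"])


def sketch_convert_user_type(value):
    """Map each attribute of a user-type instance to one field of a RECORD-like dict."""
    return {name: getattr(value, name) for name in value._fields}


record = sketch_convert_user_type(Address(street="1 Main St", zip_code="94000"))
```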

classmethod convert_tuple_type(cls, name, value)

Converts a tuple to a RECORD that contains n fields. Each field is converted to its corresponding data type in BQ and named ‘field_<index>’, where index is determined by the order of the tuple elements defined in Cassandra.
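
The naming scheme can be sketched as follows. The function `sketch_convert_tuple` is hypothetical, the index is assumed to start at 0, and element values are left unconverted for brevity.

```python
def sketch_convert_tuple(value):
    """Name each tuple element 'field_<index>' by position (zero-based, by assumption)."""
    return {"field_{}".format(index): element for index, element in enumerate(value)}


tuple_record = sketch_convert_tuple(("alice", 30))
```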

classmethod convert_map_type(cls, name, value)

Converts a map to a repeated RECORD that contains two fields, ‘key’ and ‘value’, each of which is converted to its corresponding data type in BQ.
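
One way to picture the resulting structure is a list with one two-field record per map entry. The function `sketch_convert_map` is hypothetical, and keys and values are passed through without the per-type conversion the operator would apply.

```python
def sketch_convert_map(value):
    """Turn a mapping into a repeated RECORD: one {'key', 'value'} dict per entry."""
    return [{"key": key, "value": val} for key, val in value.items()]


map_records = sketch_convert_map({"color": "red", "size": "L"})
```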

classmethod generate_schema_dict(cls, name, type)
classmethod get_bq_fields(cls, name, type)
classmethod is_simple_type(cls, type)
classmethod is_array_type(cls, type)
classmethod is_record_type(cls, type)
classmethod get_bq_type(cls, type)
classmethod get_bq_mode(cls, type)