Apache Spark Operators¶
ConsumeFromTopicOperator¶
An operator that consumes from Kafka one or more Kafka topic(s) and processes the messages. The operator creates a Kafka consumer that reads a batch of messages from the cluster and processes them using the user supplied callable function. The consumer will continue to read in batches until it reaches the end of the log or reads a maximum number of messages is reached.
For parameter definitions take a look at ConsumeFromTopicOperator
.
Using the operator¶
t2 = ConsumeFromTopicOperator(
kafka_config_id="t2",
task_id="consume_from_topic",
topics=["test_1"],
apply_function="hello_kafka.consumer_function",
apply_function_kwargs={"prefix": "consumed:::"},
commit_cadence="end_of_batch",
max_messages=10,
max_batch_size=2,
)
Reference¶
For further information, see the Apache Kafka Consumer documentation.
ProduceToTopicOperator¶
An operator that produces messages to a Kafka topic. The operator creates a Kafka consumer that reads a batch of messages from the cluster and processes them using the user supplied callable function. The consumer will continue to read in batches until it reaches the end of the log or reads a maximum number of messages is reached.
For parameter definitions take a look at ProduceToTopicOperator
.
Using the operator¶
t1 = ProduceToTopicOperator(
kafka_config_id="t1-3",
task_id="produce_to_topic",
topic="test_1",
producer_function="hello_kafka.producer_function",
)
Reference¶
For further information, see the Apache Kafka Producer documentation.