I am playing with a demo application in Kafka (0.11.0.1) Streams:
// Serializers/deserializers (serde) for String and Long types
final Serde<String> stringSerde = Serdes.String();
final Serde<Long> longSerde = Serdes.Long();
// Construct a `KStream` from the input topic "streams-plaintext-input", where message values
// represent lines of text (for the sake of this example, we ignore whatever may be stored
// in the message keys).
KStream<String, String> textLines = builder.stream(stringSerde, stringSerde, "streams-plaintext-input");
KTable<String, Long> wordCounts = textLines
// Split each text line, by whitespace, into words.
.flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
// Group the text words as message keys
.groupBy((key, value) -> value)
// Count the occurrences of each word (message key).
.count("Counts");
// Store the running counts as a changelog stream to the output topic.
wordCounts.to(stringSerde, longSerde, "streams-wordcount-output");
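For context, here is a minimal sketch of how such a topology is typically wired into a running application with the 0.11 KStreamBuilder API; the application id and bootstrap server below are assumed values, not taken from the question:

import java.util.Properties;

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStreamBuilder;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-wordcount");   // assumed application id
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed broker address

// The `builder` referenced in the snippet above
KStreamBuilder builder = new KStreamBuilder();

// ... the KStream/KTable topology from the snippet above is defined against `builder` here ...

KafkaStreams streams = new KafkaStreams(builder, props);
streams.start();

// Close the Streams instance cleanly on JVM shutdown
Runtime.getRuntime().addShutdownHook(new Thread(streams::close));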
In step 5, after processing some data, we can see the compacted KV pairs (e.g. streams 2) in the sink topic streams-wordcount-output:
> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic streams-wordcount-output \
--from-beginning \
--formatter kafka.tools.DefaultMessageFormatter \
--property print.key=true \
--property print.value=true \
--property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \
--property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer
all 1
streams 1
lead 1
to 1
kafka 1
hello 1
kafka 2
streams 2
The question is: how does the KTable wordCounts write its data as key-value pairs to the topic streams-wordcount-output, as shown in the output above?
The cleanup.policy option of the topic streams-wordcount-output seems to be the default, delete, rather than compact (checked via bin/kafka-configs.sh; see the command below).
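For example, the topic configuration can be inspected as follows (assuming the single-node quickstart setup with ZooKeeper on localhost:2181):

> bin/kafka-configs.sh --zookeeper localhost:2181 \
--entity-type topics \
--entity-name streams-wordcount-output \
--describe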
All input and output topics are "outside" the scope of Kafka Streams: it is the user's responsibility to create and configure these topics. Thus, your topic streams-wordcount-output will have whatever configuration you specified when you created the topic.
Cf. https://docs.confluent.io/current/streams/developer-guide.html#managing-topics-of-a-kafka-streams-application
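For example, since topic management is the user's responsibility, the output topic could be created up front as a compacted topic, or the cleanup.policy of an existing topic could be changed afterwards. The ZooKeeper address, partition count and replication factor below are placeholder values for a single-node setup:

> bin/kafka-topics.sh --create --zookeeper localhost:2181 \
--topic streams-wordcount-output \
--partitions 1 --replication-factor 1 \
--config cleanup.policy=compact

> bin/kafka-configs.sh --zookeeper localhost:2181 \
--entity-type topics --entity-name streams-wordcount-output \
--alter --add-config cleanup.policy=compact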