# Ubuntu 22.04
# Java 1.8
$ sudo apt install openjdk-8-jdk
$ java -version
# Kafka 3.2.0
$ curl "https://dlcdn.apache.org/kafka/3.2.0/kafka_2.13-3.2.0.tgz" -o ~/Downloads/kafka.tgz
$ mkdir ~/kafka && cd ~/kafka
$ tar -xvzf ~/Downloads/kafka.tgz --strip 1
$ git clone https://github.com/Chaoyism/Kafka-Demo.git
# Start the ZooKeeper service in a terminal session.
# Note: Soon, ZooKeeper will no longer be required by Apache Kafka.
$ bin/zookeeper-server-start.sh config/zookeeper.properties
# Start the Kafka broker service in another terminal session.
$ bin/kafka-server-start.sh config/server.properties
# You need a running Kafka topic with three or more partitions.
# If you don't have one, create one with the following command.
# topicDemo is the topic name; if you choose a different name, update the corresponding entry in application.yaml.
$ bin/kafka-topics.sh --create --topic topicDemo --partitions 3 --bootstrap-server localhost:9092
# Expected output: Created topic topicDemo.
Open the folder as a project in IntelliJ IDEA, then build and run it locally.
Visit port 8080 and begin sending messages:
http://localhost:8080/kafka/send
Then observe the throughput (for the range 0~99999).
This part responds to the requirements in the PRD.
Write the numbers 0 to 9,999,999 to Kafka. The range can be adjusted in application.yaml.
In the producer, acks=all makes the producer wait until all in-sync replicas have acknowledged a message, and resend it (via retries) when no acknowledgement arrives, ensuring that every number is eventually stored.
In the producer, enable.idempotence=true assigns a sequence number to each message. When the producer does not receive an ACK and resends a message, the broker checks whether it has already seen a message with the same producer id and sequence number; if so, it discards the duplicate and sends the ACK back to the producer, so duplication is avoided.
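A minimal sketch of the reliability-related producer settings described above, using the standard Kafka producer config keys. In this project these values would normally live in application.yaml under Spring Boot's spring.kafka properties; the class and method names here are illustrative, not from the repo.

```java
import java.util.Properties;

public class ProducerReliabilityConfig {
    // Builds the producer settings described above.
    public static Properties reliableProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Wait until all in-sync replicas have acknowledged the write.
        props.setProperty("acks", "all");
        // Let the broker deduplicate retried sends by producer id + sequence number.
        props.setProperty("enable.idempotence", "true");
        // Retries make the producer resend when an acknowledgement never arrives.
        props.setProperty("retries", Integer.toString(Integer.MAX_VALUE));
        return props;
    }
}
```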
The kafkaTemplate.send() function can specify the target partition id. In this application, number n is sent to partition (n-1) % 3, balancing the load across the three partitions.
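A minimal sketch of that mapping. Math.floorMod is used here (an assumption, not necessarily what the repo does) so that n = 0 still lands on a valid partition; Java's plain % operator would yield -1 for (0-1) % 3.

```java
public class PartitionMapper {
    static final int PARTITIONS = 3; // matches the topic created above

    // Maps number n to partition (n-1) mod 3, as described above.
    // Math.floorMod keeps the result in [0, PARTITIONS), so n = 0
    // maps to partition 2 rather than the invalid partition -1.
    public static int partitionFor(long n) {
        return (int) Math.floorMod(n - 1, PARTITIONS);
    }
}
```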
Because the numbers are sent in ascending order, each number is sent only after the previous one has been acknowledged as received and stored (ACK).
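One way to enforce this ordering is to block on the future returned by each send before issuing the next. The sketch below uses a hypothetical send() backed by a plain CompletableFuture so it runs without a broker; in the real application the future would come from kafkaTemplate.send().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class OrderedSender {
    // Hypothetical send: stands in for kafkaTemplate.send(...), whose
    // returned future completes when the broker acknowledges the record.
    static CompletableFuture<Void> send(int n, List<Integer> acknowledged) {
        return CompletableFuture.runAsync(() -> acknowledged.add(n));
    }

    // Sends numbers in ascending order, waiting for each acknowledgement
    // before sending the next, as described above.
    public static List<Integer> sendInOrder(int upTo) throws Exception {
        List<Integer> acknowledged = new ArrayList<>();
        for (int n = 0; n <= upTo; n++) {
            send(n, acknowledged).get(); // block until the ACK arrives
        }
        return acknowledged;
    }
}
```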
Refer to consumeMsg() in Controller.java.
In the consumer, enable-auto-commit=false disables automatic offset commits.
In the consumer, a HashSet records the numbers already committed, so when a batch is resent, the messages already in the set are discarded.
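A minimal sketch of that deduplication idea, with batches modeled as plain lists (the class and method names are illustrative; see consumeMsg() in Controller.java for the real logic):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupConsumer {
    private final Set<Integer> committed = new HashSet<>();
    private final List<Integer> processed = new ArrayList<>();

    // Processes a (possibly resent) batch: numbers already in the committed
    // set are discarded, new ones are processed and recorded.
    public void consumeBatch(List<Integer> batch) {
        for (int n : batch) {
            if (committed.add(n)) { // add() returns false for duplicates
                processed.add(n);
            }
        }
        // with auto-commit disabled, the manual offset commit would go here
    }

    public List<Integer> processed() {
        return processed;
    }
}
```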
It takes 2792ms to produce and consume 100,000 messages.
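That figure works out to roughly 35,800 messages per second; a tiny helper makes the arithmetic explicit:

```java
public class Throughput {
    // Messages per second from a message count and elapsed milliseconds.
    public static double messagesPerSecond(long messages, long elapsedMs) {
        return messages / (elapsedMs / 1000.0);
    }
}
```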
https://kafka.apache.org/quickstart
https://developers.redhat.com/articles/2022/04/05/developers-guide-using-kafka-java-part-1#kafka_architecture
https://www.cnblogs.com/lovesqcc/p/14379440.html
https://www.youtube.com/watch?v=MFMzxdpn6v4
https://stackoverflow.com/questions/56667985/could-not-resolve-placeholder-kafka-bootstrap-servers-in-string-value-kafka
https://spring.io/projects/spring-kafka
https://www.tutorialspoint.com/spring_boot/spring_boot_apache_kafka.htm
https://github.com/spring-projects/spring-kafka
https://ohmyweekly.github.io/notes/2021-04-14-getting-started-with-kafka-and-rust-part1/
https://www.confluent.io/blog/getting-started-with-rust-and-kafka/
https://docs.rs/kafka/latest/kafka/