vagrant-kafka's Introduction

Vagrant - Kafka

Vagrant configuration to set up a partitioned Apache Kafka installation with clustered Apache Zookeeper.

This configuration will start and provision six CentOS 6 VMs:

  • Three hosts forming a three node Apache Zookeeper Quorum (Replicated ZooKeeper)
  • Three Apache Kafka nodes with one broker each

Each host is a CentOS 6.9 64-bit VM provisioned with JDK 8 and Kafka 1.1.0.

We use the version of Zookeeper that comes pre-packaged with Kafka, which is Zookeeper 3.4.10 for the version of Kafka we use.

Prerequisites

  • Vagrant (tested with 2.0.2; make sure you are on a 2.x.x version of Vagrant)
  • VirtualBox (tested with 5.1.12)

Setup

To start it up, just git clone this repo and execute vagrant up. This will take a while the first time as it downloads all required dependencies for you.

Kafka is installed on all hosts and can be accessed through the environment variable $KAFKA_HOME.

Here is the mapping of VMs to their private IPs:

VM Name      Host Name   IP Address
zookeeper1   vkc-zk1     10.30.3.2
zookeeper2   vkc-zk2     10.30.3.3
zookeeper3   vkc-zk3     10.30.3.4
broker1      vkc-br1     10.30.3.30
broker2      vkc-br2     10.30.3.20
broker3      vkc-br3     10.30.3.10

Hosts file entries:

10.30.3.2	vkc-zk1
10.30.3.3 	vkc-zk2
10.30.3.4 	vkc-zk3
10.30.3.30 	vkc-br1
10.30.3.20 	vkc-br2
10.30.3.10 	vkc-br3
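If you want to resolve these host names outside the VMs, the entries above can be appended to a hosts file. The following is a minimal sketch, not something this repo ships: `add_host_entry` is a hypothetical helper name, and the target file is passed explicitly because writing to /etc/hosts would need root.

```shell
# add_host_entry FILE IP NAME
# Appends "IP<TAB>NAME" to FILE unless FILE already mentions NAME.
# Hypothetical helper for illustration only.
add_host_entry() {
  # -F: treat NAME as a fixed string, -w: whole-word match (vkc-br1 will not
  # match vkc-br10), -s: stay quiet if FILE does not exist yet.
  if ! grep -qswF -- "$3" "$1"; then
    printf '%s\t%s\n' "$2" "$3" >> "$1"
  fi
}

# Example (writes to a scratch file, not /etc/hosts):
#   add_host_entry ./hosts.local 10.30.3.2  vkc-zk1
#   add_host_entry ./hosts.local 10.30.3.30 vkc-br1
```

Calling the helper twice with the same name leaves the file unchanged, so it is safe to re-run.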

Zookeeper servers bind to port 2181. Kafka brokers bind to port 9092.

Let's test it!

First, check that all nodes are up with vagrant status. The result should be similar to this:

Current machine states:

zookeeper1                running (virtualbox)
zookeeper2                running (virtualbox)
zookeeper3                running (virtualbox)
broker1                   running (virtualbox)
broker2                   running (virtualbox)
broker3                   running (virtualbox)


This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run 'vagrant status NAME''.
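With six VMs it is easy to overlook one that did not come up. As a rough sketch, the human-readable output above can be filtered for machines that are not running. `not_running` is a hypothetical helper name, and the parsing assumes each machine line ends with the provider in parentheses, as in the listing above.

```shell
# not_running: reads `vagrant status` text on stdin and prints the name of
# every machine whose state is not "running". Hypothetical helper.
not_running() {
  awk '
    # Machine lines end with the provider in parentheses, e.g. "(virtualbox)".
    NF >= 3 && $NF ~ /^\(.*\)$/ {
      state = $2
      # Join multi-word states such as "not created".
      for (i = 3; i < NF; i++) state = state " " $i
      if (state != "running") print $1
    }
  '
}

# Example:
#   vagrant status | not_running
```

An empty result means every listed machine reported the running state.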

Log in to any host, e.g. with vagrant ssh broker1. Some scripts have been included for convenience:

  • Create a new topic /vagrant/scripts/create-topic.sh <topic name> (create as many as you see fit)

    Note: If this step fails, exit the VM and run vagrant up --provision (if error persists, please file an issue)

  • Topics can be listed with /vagrant/scripts/list-topics.sh

  • Start a console producer /vagrant/scripts/producer.sh <topic name>. Type a few messages and separate them with new lines (Ctrl-C to exit).

  • /vagrant/scripts/consumer.sh <topic name>: this will create a console consumer, getting messages from the topic created before. It will read all the messages each time starting from the beginning.

Now anything you type in the producer will show up in the consumer.

Teardown

To destroy all the VMs:

vagrant destroy -f

Insights

Zookeeper (ZK)

Kafka uses ZK for its coordination, bookkeeping, and configuration. Here are some commands you can run on any of the nodes to see some of the internal ZK structures created by Kafka.

Open a ZK shell

$KAFKA_HOME/bin/zookeeper-shell.sh 10.30.3.2:2181

(you can use the IP of any of the ZK servers)

Inside the shell we can browse the zNodes similar to a Linux filesystem:

ls /
[cluster, controller, controller_epoch, brokers, zookeeper, admin, isr_change_notification, consumers, log_dir_event_notification, latest_producer_id_block, config]

ls /brokers/topics
[t1, t2, __consumer_offsets]

ls /brokers/ids
[1, 2, 3]

We can see that two topics have been created (t1, t2), alongside Kafka's internal __consumer_offsets topic, and that the three brokers are registered with ids 1, 2, 3.

After you have had enough fun browsing ZK, press Ctrl-C to exit the shell.

Get ZK version

First we need to install nc:

sudo yum install nc -y

To get the version of ZK type:

echo status | nc 10.30.3.2 2181

You can replace 10.30.3.2 with any ZK IP 10.30.3.<2,3,4> and execute the above command from any node within the cluster.
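To print only the version number instead of the full status report, the reply can be filtered. This is a sketch under an assumption: that the first line of the reply has the form `Zookeeper version: 3.4.10-<build hash>, built on <date>`. `zk_version` is a hypothetical helper name.

```shell
# zk_version: reads the ZK status reply on stdin and prints just the version
# number (the text between "Zookeeper version: " and the first "-" or ",").
# Hypothetical helper; the reply format is an assumption.
zk_version() {
  sed -n 's/^Zookeeper version: \([^-,]*\).*/\1/p'
}

# Example, run from any node in the cluster:
#   for i in 2 3 4; do
#     echo "10.30.3.$i runs ZK $(echo status | nc 10.30.3.$i 2181 | zk_version)"
#   done
```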

Q: Which Zookeeper server is the leader?

Here is a simple script that asks each server for its mode:

for i in 2 3 4; do
   echo "10.30.3.$i is a "$(echo status | nc 10.30.3.$i 2181 | grep ^Mode | awk '{print $2}')
done

Kafka

Let's explore other ways to ingest data into Kafka from the command line.

Log in to any of the 6 nodes

vagrant ssh zookeeper1

Create a topic

 /vagrant/scripts/create-topic.sh test-one

Send data to the Kafka topic

echo "Yet another line from stdin" | $KAFKA_HOME/bin/kafka-console-producer.sh \
   --topic test-one --broker-list vkc-br1:9092,vkc-br2:9092,vkc-br3:9092

You can then test that the line was added by running the consumer

/vagrant/scripts/consumer.sh test-one

Add a continuous stream of data

Running vmstat will periodically export stats about the VM you are attached to.

>vmstat -a 1 -n 100

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 0  0    960 113312 207368 130500    0    0    82   197  130  176  0  1 99  0  0
 0  0    960 113312 207368 130500    0    0     0     0   60   76  0  0 100  0  0
 0  0    960 113304 207368 130540    0    0     0     0   58   81  0  0 100  0  0
 0  0    960 113304 207368 130540    0    0     0     0   53   76  0  1 99  0  0
 0  0    960 113304 207368 130540    0    0     0     0   53   78  0  0 100  0  0
 0  0    960 113304 207368 130540    0    0     0    16   64   90  0  0 100  0  0

Redirecting this output to Kafka creates a basic form of a streaming producer.

vmstat -a 1 -n 100 | $KAFKA_HOME/bin/kafka-console-producer.sh \
   --topic test-one --broker-list vkc-br1:9092,vkc-br2:9092,vkc-br3:9092 &

While the producer runs in the background you can start the consumer to see what happens

/vagrant/scripts/consumer.sh test-one

You should be seeing the output of vmstat in the consumer console.

When you are all done, stop the consumer with Ctrl-C. The producer will terminate by itself after 100 seconds.
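Note that the two vmstat header lines are sent to the topic as ordinary messages. If you only want the data rows, a small filter can sit between vmstat and the producer. This is a sketch; `strip_vmstat_headers` is a hypothetical helper name, and it assumes data rows are the only lines that start with a digit after optional spaces.

```shell
# strip_vmstat_headers: passes through only the numeric data rows of vmstat.
# Hypothetical helper for illustration.
strip_vmstat_headers() {
  # Header rows start with "procs" or " r  b"; data rows start with a digit.
  # fflush() pushes each row downstream right away instead of waiting for the
  # pipe buffer to fill.
  awk '/^ *[0-9]/ { print; fflush() }'
}

# Example:
#   vmstat -a 1 -n 100 | strip_vmstat_headers | $KAFKA_HOME/bin/kafka-console-producer.sh \
#      --topic test-one --broker-list vkc-br1:9092,vkc-br2:9092,vkc-br3:9092 &
```

With the filter in place only the data rows reach the topic, so the offsets you see later will be lower by the number of header lines dropped.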

Offsets

The create-topic.sh script creates a topic with a replication factor of 3 and a single partition.

Assuming you have completed the vmstat example above using topic test-one:

/vagrant/scripts/get-offset-info.sh test-one
test-one:0:102

There is one partition (id 0) and the last offset was 102 (from vmstat: 100 lines of reports + 2 header lines). We asked Kafka for the last offset written so far using --time -1 (as seen in get-offset-info.sh). You can change the time to -2 to get the first offset.
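The number of messages currently held in each partition is the latest offset minus the earliest one. Here is a minimal sketch of that subtraction, assuming you saved the `--time -2` and `--time -1` outputs (in the `topic:partition:offset` format shown above) to two files. `offset_diff` is a hypothetical helper name, not a script in this repo.

```shell
# offset_diff EARLIEST_FILE LATEST_FILE
# Both files hold lines of the form topic:partition:offset.
# Prints topic:partition:count, where count = latest - earliest.
# Hypothetical helper for illustration.
offset_diff() {
  awk -F: '
    # First file: remember the earliest offset per topic:partition.
    NR == FNR { first[$1 ":" $2] = $3; next }
    # Second file: subtract the remembered earliest offset from the latest.
    { print $1 ":" $2 ":" ($3 - first[$1 ":" $2]) }
  ' "$1" "$2"
}

# Example (file names are placeholders):
#   offset_diff earliest.txt latest.txt
```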

vagrant-kafka's People

Contributors

caddac, eladleev, eucuepo, imarios, luiscarlin, padawin, xtruan

vagrant-kafka's Issues

Getting error when create or listing topics

Running the create_topic.sh or list-topics.sh script gives me the following error over and over until I press Ctrl+C to quit:

[vagrant@broker1 ~]$ /vagrant/scripts/create_topic.sh myTopic
[2017-01-15 18:49:40,997] WARN Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.net.NoRouteToHostException: No route to host
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)

Error 404 when downloading Kafka

The link https://poole.im/files/kafka-0.8.0-9.x86_64.rpm no longer works (that's right, https - since the http redirects here). Moreover, when it is https then it should have the --no-check-certificate parameter.

Error during jdk install

Dear Eugenio,
first of all, thank you for this vagrant configuration.

The following issue occurred during installation on each broker:

Example broker:

==> broker2: iptables: Unloading modules:
==> broker2: [  OK  ]
==> broker2: installing jdk and kafka ...
==> broker2: error:
==> broker2: not an rpm package
==> broker2: error:
==> broker2: /vagrant/rpm/jdk-8u73-linux-x64.rpm: not an rpm package (or package manifest):
==> broker2: kafka_2.10-0.9.0.1/

Due to the error in the JDK installation, a new Kafka topic can't be created:

[vagrant@broker1 scripts]$ ./create_topic.sh test
/home/vagrant/kafka_2.10-0.9.0.1/bin/kafka-run-class.sh: line 167: exec: java: not found

Vagrant Version: 1.9.5
Windows 8.1
Oracle VirtualBox 5.1.22

Can't run scripts to create or list topics

I've successfully ssh'd into one of the brokers, but when trying to run the script (or any of the scripts) /vagrant/scripts/create_topic.sh I get the error -bash: /vagrant/scripts/create_topic.sh: /bin/bash^M: bad interpreter: No such file or directory

I'm running this on Windows 10 in Powershell, but I've also tried in Cmd. Looks like a line ending issue?
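The ^M in the error message is consistent with Windows (CRLF) line endings in the scripts. If that is the cause, stripping the carriage returns should make the scripts runnable again. The following is a sketch, not a fix shipped by this repo. `strip_cr` is a hypothetical helper name, and it removes every carriage return in the file, which is fine for these shell scripts but not for binary files.

```shell
# strip_cr FILE...
# Removes all carriage-return characters from each FILE in place.
# Hypothetical helper; the body runs in a subshell so f and tmp stay local.
strip_cr() (
  for f in "$@"; do
    tmp=$(mktemp) || exit 1
    # Write back with cat so the original file keeps its permission bits.
    tr -d '\r' < "$f" > "$tmp" && cat "$tmp" > "$f"
    rm -f "$tmp"
  done
)

# Example, from inside a VM:
#   strip_cr /vagrant/scripts/*.sh
```

If the files were checked out with Git on Windows, the core.autocrlf setting may be what introduced the CRLF endings in the first place.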

Cannot connect to zookeeper node with openjdk-jre

I was getting this error when trying to create a topic after starting the VMs.

[2017-01-15 18:49:40,997] WARN Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.net.NoRouteToHostException: No route to host
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)

This initial issue was caused by Zookeeper not being started when the VM came up. I fixed this by manually starting ZK on each VM. However, this led to a new issue: I was timing out after 30 seconds trying to create a topic.

After perusing some of the previously closed issues (specifically #5 and #6) I found that some fiddling had been done to the version of Java being installed. In issue #6 it was suggested to query the JVM using jps, a tool that is available with the JDK but not the JRE. I found that the JRE is installed when issuing -su -c "yum -y install java-1.8.0-openjdk". To get the JDK, you must append -devel.

After changing the yum installation to install the JDK, running a vagrant destroy and a vagrant up, and lastly manually starting the ZK nodes again, topic creation worked and I was able to query the JVM and see QuorumPeerMain.

zookeeper1: chmod: cannot access `/vagrant/scripts/*.sh': No such file or directory

Ran into this on Windows (two different machines). This one is 1809.

    zookeeper1: Installed:
    zookeeper1:   java-1.8.0-openjdk-devel.x86_64 1:1.8.0.222.b10-0.el6_10
    zookeeper1:
    zookeeper1: Dependency Installed:
    zookeeper1:   atk.x86_64 0:1.30.0-1.el6
[…]
    zookeeper1:   xorg-x11-fonts-Type1.noarch 0:7.2-11.el6
    zookeeper1:
    zookeeper1: Complete!
    zookeeper1: kafka_2.11-1.1.0/
    zookeeper1: kafka_2.11-1.1.0/LICENSE
    zookeeper1: kafka_2.11-1.1.0/NOTICE
    zookeeper1: kafka_2.11-1.1.0/bin/
    zookeeper1: kafka_2.11-1.1.0/bin/connect-distributed.sh
    zookeeper1: kafka_2.11-1.1.0/bin/connect-standalone.sh
    zookeeper1: kafka_2.11-1.1.0/bin/kafka-acls.sh
    zookeeper1: kafka_2.11-1.1.0/bin/kafka-broker-api-versions.sh
    zookeeper1: kafka_2.11-1.1.0/bin/kafka-configs.sh
[…]
    zookeeper1: kafka_2.11-1.1.0/libs/kafka-streams-examples-1.1.0.jar
    zookeeper1: done installing JDK and Kafka...
    zookeeper1: chmod: cannot access `/vagrant/scripts/*.sh': No such file or directory
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.
C:\oss\vagrant-kafka>

Oracle license not accepted when downloading JDK

The license needs to be accepted in order to download the JDK from Oracle.

To make that happen, just add this cookie to wget:
Cookie: oraclelicense=accept-securebackup-cookie

Resulting in:

wget -O /vagrant/rpm/jdk-7u45-linux-x64.rpm --no-cookies --no-check-certificate --header "Cookie: oraclelicense=accept-securebackup-cookie" "http://download.oracle.com/otn-pub/java/jdk/7u45-b18/jdk-7u45-linux-x64.rpm"

The java rpm is an empty file

I installed vagrant-kafka, but it logged

    broker3: installing JDK and Kafka...
    broker3: error: 
    broker3: not an rpm package
    broker3: error: 
    broker3: /vagrant/rpm/jdk-8u151-linux-x64.rpm: not an rpm package (or package manifest): 

None of the vms had java, so I got

$ vagrant ssh zookeeper1
[vagrant@zookeeper1 ~]$ /vagrant/scripts/create-topic.sh test-one
/home/vagrant/kafka_2.11-1.0.0/bin/kafka-run-class.sh: line 270: exec: java: not found
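The "not an rpm package" error suggests the download step left something other than an RPM in /vagrant/rpm, such as an empty file. One way to catch this before installation, offered as a sketch: RPM files begin with the four magic bytes ed ab ee db, so the file can be checked before rpm is called. `is_rpm` is a hypothetical helper name, not part of this repo.

```shell
# is_rpm FILE
# Succeeds (exit 0) only if FILE starts with the RPM magic bytes ed ab ee db.
# Hypothetical helper for illustration.
is_rpm() {
  # Read the first 4 bytes, dump them as hex, and strip od's spacing.
  [ "$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')" = "edabeedb" ]
}

# Example:
#   is_rpm /vagrant/rpm/jdk-8u151-linux-x64.rpm || echo "download is not a valid RPM"
```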

broker2 installation fails

Tried installing vagrant-kafka on macOS Sierra using VirtualBox 5.1.30 r118389 (Qt5.6.3) and Guest Additions 5.0.16. Installation went smoothly until broker2. Seems to be a case of an incorrect or missing RPM. Pasting pertinent log section below:

==> broker2: Importing base box 'puppetlabs/centos-6.6-64-puppet'...
==> broker2: Matching MAC address for NAT networking...
==> broker2: Checking if box 'puppetlabs/centos-6.6-64-puppet' is up to date...
==> broker2: Setting the name of the VM: vagrant-kafka-master_broker2_1513063725650_57316
==> broker2: Fixed port collision for 22 => 2222. Now on port 2203.
==> broker2: Clearing any previously set network interfaces...
==> broker2: Preparing network interfaces based on configuration...
    broker2: Adapter 1: nat
    broker2: Adapter 2: intnet
==> broker2: Forwarding ports...
    broker2: 22 (guest) => 2203 (host) (adapter 1)
==> broker2: Running 'pre-boot' VM customizations...
==> broker2: Booting VM...
==> broker2: Waiting for machine to boot. This may take a few minutes...
    broker2: SSH address: 127.0.0.1:2203
    broker2: SSH username: vagrant
    broker2: SSH auth method: private key
==> broker2: Machine booted and ready!
==> broker2: Checking for guest additions in VM...
    broker2: The guest additions on this VM do not match the installed version of
    broker2: VirtualBox! In most cases this is fine, but in rare cases it can
    broker2: prevent things such as shared folders from working properly. If you see
    broker2: shared folder errors, please make sure the guest additions within the
    broker2: virtual machine match the version of VirtualBox you have installed on
    broker2: your host and reload your VM.
    broker2:
    broker2: Guest Additions Version: 5.0.16
    broker2: VirtualBox Version: 5.1
==> broker2: Setting hostname...
==> broker2: Configuring and enabling network interfaces...
    broker2: SSH address: 127.0.0.1:2203
    broker2: SSH username: vagrant
    broker2: SSH auth method: private key
==> broker2: Mounting shared folders...
    broker2: /vagrant => /Users/bdiego/CO/Projects/Kafka/vagrant-kafka-master
==> broker2: Running provisioner: shell...
    broker2: Running: /var/folders/rd/m7t7bhts30qf7dn782_s6lh80000gp/T/vagrant-shell20171212-1689-1nb2rx3.sh
    broker2: Downloading kafka...0.9.0.1
    broker2: iptables: Setting chains to policy ACCEPT: filter
    broker2: [  OK  ]
    broker2: iptables: Flushing firewall rules:
    broker2: [  OK  ]
    broker2: iptables: Unloading modules:
    broker2: [  OK  ]
    broker2: installing jdk and kafka ...
    broker2: error:
    broker2: not an rpm package
    broker2: error:
    broker2: /vagrant/rpm/jdk-8u73-linux-x64.rpm: not an rpm package (or package manifest):
    broker2:
    broker2: gzip:
    broker2: stdin: unexpected end of file
    broker2: tar: Child returned status 1
    broker2: tar: Error is not recoverable: exiting now
    broker2: chown:
    broker2: cannot access `kafka_2.10-0.9.0.1'
    broker2: : No such file or directory
    broker2: done installing jdk and Kafka
==> broker2: Running provisioner: shell...
    broker2: Running: /var/folders/rd/m7t7bhts30qf7dn782_s6lh80000gp/T/vagrant-shell20171212-1689-nqf6xf.sh
    broker2: /tmp/vagrant-shell: line 5: /home/vagrant/kafka_2.10-0.9.0.1/bin/kafka-server-start.sh: No such file or directory
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.
