logkafka - Collect logs and send lines to Apache Kafka 0.8+

Introduction (Chinese documentation)

logkafka sends log file contents to Kafka 0.8+ line by line, treating each line of a file as one Kafka message.

See the FAQ if you want to deploy it in a production environment.

Features

  • log collecting configuration management with zookeeper
  • log path with time format (collect files chronologically); see "Log Path Pattern" in docs/Features.md
  • log file rotating
  • batching messages
  • compression (none, gzip, snappy)
  • message regex filter; see "Regex Filter" in docs/Features.md
  • user-defined line delimiter; see "Line Delimiter" in docs/Features.md
  • user-defined monitor
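
The log path feature takes a path containing time conversions such as %Y%m%d (see the example path under Usage). As a rough illustration, and assuming the pattern follows the same strftime-style conventions as date(1), the shell can show which file name a pattern resolves to for the current day. The helper name `expand_log_path` is hypothetical and is not part of logkafka.

```shell
#!/bin/sh
# Hypothetical helper (not part of logkafka): expand a strftime-style
# log path pattern using the current local time.
expand_log_path() {
    # date(1) interprets %Y, %m, %d, ... in the format string after '+'
    date "+$1"
}

# Prints the path that today's log file would have under this pattern.
expand_log_path "/usr/local/apache2/logs/access_log.%Y%m%d"
```

This only shows how a pattern maps to a concrete file name; how logkafka itself resolves and tracks the pattern is described in docs/Features.md.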

Differences from other log aggregation and monitoring tools

The main differences from flume, fluentd, and logstash are:

  • Management of log collecting configs and state:

    flume, fluentd, and logstash keep log file configs and state locally, and start a local server to manage them.

    logkafka keeps log file configs and state in zookeeper: it watches a zookeeper node for config changes, records file positions in a local position file, and periodically pushes position info to zookeeper.

  • Order of log collecting

    flume, fluentd, and logstash all have an INPUT type 'tail'; they collect all files simultaneously, without considering the chronological order of the log files.

    logkafka collects files chronologically.
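
To make "chronologically" concrete for dated log files: when file names end in a zero-padded %Y%m%d suffix, plain lexicographic sorting already yields date order. The sketch below demonstrates this with made-up file names in a temporary directory. The function name `list_chronologically` is hypothetical, and this is not how logkafka is implemented internally.

```shell
#!/bin/sh
# Illustration only: the function and file names are made up.
list_chronologically() {
    # $1: directory, $2: file name prefix
    # A zero-padded %Y%m%d suffix sorts lexicographically in date order.
    ls "$1" | grep "^$2\.[0-9]\{8\}$" | sort
}

dir=$(mktemp -d)
touch "$dir/access_log.20150302" "$dir/access_log.20150228" "$dir/access_log.20150301"
list_chronologically "$dir" access_log   # oldest file is listed first
rm -rf "$dir"
```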

Users of logkafka

details...

Requirements

  • librdkafka

  • libzookeeper_mt

  • libuv

  • libpcre2

  • PHP 5.3 and above (with zookeeper extension)

Build

There are two methods; choose whichever suits your environment.

  1. Install librdkafka (>0.8.6), libzookeeper_mt, libuv (>v1.6.0), and libpcre2 (>10.20) manually, then run:

    cmake -H. -B_build -DCMAKE_INSTALL_PREFIX=_install
    cd _build
    make -j4
    make install
    
  2. Let cmake handle the dependencies (requires cmake version >= 3.0.2):

    cmake -H. -B_build -DCMAKE_INSTALL_PREFIX=_install \
                       -DINSTALL_LIBRDKAFKA=ON \
                       -DINSTALL_LIBZOOKEEPER_MT=ON \
                       -DINSTALL_LIBUV=ON \
                       -DINSTALL_LIBPCRE2=ON
    cd _build
    make -j4
    make install
    

Usage

Note: If you already have Kafka and Zookeeper installed, start from step 2 and replace the zk connection string in the following steps with your own. The default is 127.0.0.1:2181.

  1. Deploy Kafka and Zookeeper on the local host

    tools/grid bootstrap
    
  2. Start logkafka

    • local conf

    Customize _install/conf/logkafka.conf to your needs:

     zookeeper.connect = 127.0.0.1:2181
     pos.path       = ../data/pos.myClusterName
     line.max.bytes = 1048576
     ...
    
    • run

    Run in the foreground

    _install/bin/logkafka -f _install/conf/logkafka.conf -e _install/conf/easylogging.conf
    

    Or as a daemon

    _install/bin/logkafka --daemon -f _install/conf/logkafka.conf -e _install/conf/easylogging.conf
    
  3. Configs Management

Use UI or command line tools.

3.1 UI (with kafka-manager)

logkafka is available as a kafka-manager extension. Install and start kafka-manager, add a cluster with logkafka enabled, and then manage logkafka through the 'Logkafka' menu.

  • How to add cluster with logkafka enabled

    [screenshot]

  • How to create new config

    [screenshot]

  • How to delete configs

    [screenshot]

  • How to list configs and monitor sending progress

    [screenshot]

3.2 Command line tools

A PHP script (tools/log_config.php) creates, deletes, and lists collecting configurations in zookeeper nodes.

If you do not know how to install the PHP zookeeper module, check this.

  • How to create configs

    Example:

    Collect the apache access log on host "test.qihoo.net" and send it to the kafka brokers identified by the zk connection string "127.0.0.1:2181". The topic is "apache_access_log".

    php tools/log_config.php --create \
                             --zookeeper_connect=127.0.0.1:2181 \
                             --logkafka_id=test.qihoo.net \
                             --log_path=/usr/local/apache2/logs/access_log.%Y%m%d \
                             --topic=apache_access_log
    

    Note:

    • [hostname, log_path] is the key of one config.
  • How to delete configs

    php tools/log_config.php --delete \
                             --zookeeper_connect=127.0.0.1:2181 \
                             --logkafka_id=test.qihoo.net \
                             --log_path=/usr/local/apache2/logs/access_log.%Y%m%d
    
  • How to list configs and monitor sending progress

    php tools/log_config.php --list --zookeeper_connect=127.0.0.1:2181
    

    shows

    logkafka_id: test.qihoo.net
     log_path: /usr/local/apache2/logs/access_log.%Y%m%d
     Array
     (
         [conf] => Array
             (
                 [logkafka_id] => test.qihoo.net
                 [log_path] => /usr/local/apache2/logs/access_log.%Y%m%d
                 [topic] => apache_access_log
                 [partition] => -1
                 [key] =>
                 [required_acks] => 1
                 [compression_codec] => none
                 [batchsize] => 1000
                 [message_timeout_ms] => 0
                 [follow_last] => 1
                 [valid] => 1
             )
    
     )
    
For more details about configuration management, see `php tools/log_config.php --help`.
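
When several log paths on one host need configs, the `--create` call shown earlier can be generated in a loop. The following is a minimal dry-run sketch: it only prints the commands so they can be reviewed before running. It uses only the flags that appear in the examples above; the function name `print_create_cmds` is hypothetical, and the host, topic, and paths are placeholders.

```shell
#!/bin/sh
# Sketch: print (dry run) one --create command per log path pattern.
# print_create_cmds is a hypothetical helper, not part of logkafka.
print_create_cmds() {
    zk=$1; id=$2; topic=$3
    shift 3
    for log_path in "$@"; do
        # Arguments are passed as data, so '%' in the path is not
        # interpreted by printf.
        printf 'php tools/log_config.php --create --zookeeper_connect=%s --logkafka_id=%s --log_path=%s --topic=%s\n' \
            "$zk" "$id" "$log_path" "$topic"
    done
}

print_create_cmds 127.0.0.1:2181 test.qihoo.net apache_access_log \
    '/usr/local/apache2/logs/access_log.%Y%m%d'
```

Piping the reviewed output to `sh` would execute the commands. Check `php tools/log_config.php --help` first for the options your version supports.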

Benchmark

We tested with 2 brokers and 2 partitions.

Name                    Description
rtt min/avg/max/mdev    0.478/0.665/1.004/0.139 ms
message average size    1000 bytes
batchsize               1000
required_acks           1
compression_codec       none
message_timeout_ms      0
peak rates              20.5 Mb/s
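
The peak rate can be converted into an approximate message rate using the average message size of 1000 bytes. The unit "Mb/s" is ambiguous here (megabytes or megabits per second), so the sketch below computes both readings instead of assuming one. The function name `msgs_per_sec` is hypothetical.

```shell
#!/bin/sh
# msgs_per_sec RATE UNIT_BYTES MSG_BYTES
#   RATE        numeric rate from the table (20.5)
#   UNIT_BYTES  bytes per rate unit: 1000000 for megabytes,
#               125000 for megabits (10^6 bits / 8)
#   MSG_BYTES   average message size in bytes
msgs_per_sec() {
    LC_ALL=C awk -v rate="$1" -v unit="$2" -v size="$3" \
        'BEGIN { printf "%.1f\n", rate * unit / size }'
}

msgs_per_sec 20.5 1000000 1000   # if Mb/s means megabytes/s: 20500.0
msgs_per_sec 20.5 125000 1000    # if Mb/s means megabits/s:  2562.5
```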

Third Party

The most significant third party packages are:

  • confuse

  • easylogging

  • tclap

  • rapidjson

Thanks to the creators of these packages.

Developers

Make sure you have lcov installed (check this).

Compile with unit tests enabled and the Debug build type:

    cmake -H. -B_build -DCMAKE_INSTALL_PREFIX=_install \
                       -Dtest=ON \
                       -DCMAKE_BUILD_TYPE=Debug
    cd _build
    make
    make logkafka_coverage  # run unit tests

TODO

  1. Multi-line mode

Contributors

zheolong, chancey, henriquemcastro
