Comments (2)

viktort commented on June 28, 2024

Hi @salihkardan

I don't think this is the same issue I had to resolve on the Flume side. It looks like you need to define your own regex extractor interceptor to get the timestamp out of the log data.

For example, this is the log data from one of my services:

 {"service":"example_service","event":"server_restart","timestamp":"1386090510581","uuid":"5kneh567-4bd8-49a1-8cd8-4cf142fb0bff","port":"8091","source_ip":"127.0.0.1","info":"example_service is alive on port 8989"}

I then extract the required fields into event headers, ready for the flume-ng-cassandra-sink, with this configuration:

orion.sources.spoolDir.type = spooldir
orion.sources.spoolDir.spoolDir = /mnt/spoolingDirLocation
orion.sources.spoolDir.inputCharset = UTF-8
orion.sources.spoolDir.deserializer.maxLineLength = 209715200
orion.sources.spoolDir.deletePolicy = never
orion.sources.spoolDir.interceptors = addSrc addHost addTimestamp addUUID

orion.sources.spoolDir.interceptors.addSrc.type = regex_extractor
orion.sources.spoolDir.interceptors.addSrc.regex = \"service\"\:\"([^"]*)
orion.sources.spoolDir.interceptors.addSrc.serializers = s1
orion.sources.spoolDir.interceptors.addSrc.serializers.s1.name = src

orion.sources.spoolDir.interceptors.addUUID.type = regex_extractor
orion.sources.spoolDir.interceptors.addUUID.regex = \"uuid\"\:\"([^"]*)
orion.sources.spoolDir.interceptors.addUUID.serializers = s1
orion.sources.spoolDir.interceptors.addUUID.serializers.s1.name = key

orion.sources.spoolDir.interceptors.addHost.type = org.apache.flume.interceptor.HostInterceptor$Builder
orion.sources.spoolDir.interceptors.addHost.preserveExisting = false
orion.sources.spoolDir.interceptors.addHost.useIP = true
orion.sources.spoolDir.interceptors.addHost.hostHeader = host

orion.sources.spoolDir.interceptors.addTimestamp.type = regex_extractor
orion.sources.spoolDir.interceptors.addTimestamp.regex = \"timestamp\"\:\"([^"]*)
orion.sources.spoolDir.interceptors.addTimestamp.serializers = s1
orion.sources.spoolDir.interceptors.addTimestamp.serializers.s1.name = timestamp
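
To check what these patterns capture before deploying, the extraction can be approximated outside Flume. The following is a minimal sketch, not Flume's own code: it assumes the properties loader strips the backslashes in front of the quote and colon characters, and that the interceptor uses find-style matching on the event body. The class name, the method name, and the shortened sample line are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: mimics the three regex_extractor interceptors above.
final class HeaderExtractionSketch {

    // Header name -> pattern, in the form the regexes take once the
    // properties-file escapes (\" and \:) are resolved (assumption).
    private static final Map<String, Pattern> EXTRACTORS = new LinkedHashMap<>();
    static {
        EXTRACTORS.put("src", Pattern.compile("\"service\":\"([^\"]*)"));
        EXTRACTORS.put("key", Pattern.compile("\"uuid\":\"([^\"]*)"));
        EXTRACTORS.put("timestamp", Pattern.compile("\"timestamp\":\"([^\"]*)"));
    }

    // Returns the headers that would be set for one log line.
    // A header is simply absent when its pattern does not match.
    static Map<String, String> extractHeaders(String body) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> e : EXTRACTORS.entrySet()) {
            Matcher m = e.getValue().matcher(body);
            if (m.find()) {
                headers.put(e.getKey(), m.group(1));
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        // Shortened version of the sample log line from this comment.
        String body = "{\"service\":\"example_service\",\"event\":\"server_restart\","
                + "\"timestamp\":\"1386090510581\","
                + "\"uuid\":\"5kneh567-4bd8-49a1-8cd8-4cf142fb0bff\",\"port\":\"8091\"}";
        System.out.println(extractHeaders(body));
    }
}
```

Note that a line without a "timestamp" field yields no timestamp header at all, which is worth ruling out when the sink later reports a null timestamp.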

Expanding on #8, which I reported: I believe this was a bug in the Spooling Directory Source in Flume 1.4.

It occurred when my logs contained a special character that UTF-8 encodes as 2 bytes, in my case the £ sign (see http://www.utf8-chartable.de/ for the rest). The data is read in from the spooling directory in chunks, and when one of those chunks happened to end on the first byte of such a character, the rest of the data in the file was dropped.

This is now fixed in Flume 1.5, which is still under development. I've cloned the source and I'm compiling and packaging the code from that until 1.5 is released. This fixed the issue above and I haven't seen any other bugs, so I'm happy to go to production with it.
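
The general failure class described here, a multi-byte character split across two reads, can be reproduced with plain JDK classes. This is only an illustrative sketch and not the Flume 1.4 code path: the class and method names are invented, and the naive variant shown here corrupts the character rather than dropping the remainder of the file. It contrasts decoding each chunk on its own with carrying the incomplete byte over to the next chunk.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of a chunk boundary falling inside a 2-byte UTF-8 character.
final class SplitCharSketch {

    // Decodes the two chunks independently; a split character is mangled.
    static String decodeNaively(byte[] data, int split) {
        String first = new String(data, 0, split, StandardCharsets.UTF_8);
        String second = new String(data, split, data.length - split, StandardCharsets.UTF_8);
        return first + second;
    }

    // Decodes the two chunks with one decoder, keeping any incomplete
    // trailing bytes of the first chunk in the buffer for the second pass.
    static String decodeWithCarry(byte[] data, int split) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer in = ByteBuffer.allocate(data.length);
        CharBuffer out = CharBuffer.allocate(data.length);

        in.put(data, 0, split);
        in.flip();
        decoder.decode(in, out, false); // false: more input follows
        in.compact();                   // keep bytes the decoder did not consume

        in.put(data, split, data.length - split);
        in.flip();
        decoder.decode(in, out, true);  // true: this is the last chunk
        decoder.flush(out);

        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "price: \u00A35"; // \u00A3 is the pound sign, 2 bytes in UTF-8
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        int split = 8; // boundary between the two bytes of the pound sign

        System.out.println(text.equals(decodeNaively(data, split)));
        System.out.println(text.equals(decodeWithCarry(data, split)));
    }
}
```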

Let me know if you require further info.

Viktor

from flume-ng-cassandra-sink.

viktort commented on June 28, 2024

I have started to see this error in production as well. It seems odd, as the timestamp event header is present but ends up as null at the point where it's parsed to a long in FlumeLogEvent.java:

public long getTimestamp() {
    return Long.parseLong(getHeader(HEADER_TIMESTAMP));
}
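
Long.parseLong throws a NumberFormatException when it is handed null, which matches the symptom. While the root cause is unknown, a guard along these lines could at least keep a single bad event from failing the whole path. This is a sketch under assumptions, not a patch to the sink: the class, the method, and the fallback parameter are hypothetical, and the header name "timestamp" is taken from the interceptor configuration rather than from the sink's source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical guard around timestamp parsing; not part of flume-ng-cassandra-sink.
final class TimestampHeaderSketch {

    // Assumed header name, matching serializers.s1.name in the interceptor config.
    static final String HEADER_TIMESTAMP = "timestamp";

    // Returns the parsed header, or the fallback when the header is
    // missing, blank, or not a valid long.
    static long timestampOrDefault(Map<String, String> headers, long fallback) {
        String raw = headers.get(HEADER_TIMESTAMP);
        if (raw == null) {
            return fallback;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        long now = System.currentTimeMillis();

        // Header absent: the fallback is used instead of throwing.
        System.out.println(timestampOrDefault(headers, now) == now);

        headers.put(HEADER_TIMESTAMP, "1386090510581");
        System.out.println(timestampOrDefault(headers, now));
    }
}
```

Whether falling back to the current time is acceptable depends on how the sink uses the timestamp; logging the offending event body alongside the fallback would help track down which lines arrive without the header.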

@salihkardan - did you manage to get to the bottom of this?

I'm getting this error in production very sporadically. I'm investigating further now, but any info would be great. cc @btoddb
