
cloudwatchfh2hec's People

Contributors

pauld-splunk, ptdavies17

cloudwatchfh2hec's Issues

gzip file error

I'm getting an error when the Lambda function runs the code after being invoked. Can someone help me?

Not a gzipped file: IOError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 231, in handler
records = list(processRecords(event['records'],streamARN))
File "/var/task/lambda_function.py", line 115, in processRecords
data = json.loads(f.read())
File "/usr/lib64/python2.7/gzip.py", line 260, in read
self._read(readsize)
File "/usr/lib64/python2.7/gzip.py", line 302, in _read
self._read_gzip_header()
File "/usr/lib64/python2.7/gzip.py", line 196, in _read_gzip_header
raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file

That is the error I'm seeing in CloudWatch. Maybe I need to change something in the script?
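
The traceback shows the gzip reader rejecting a payload that has no gzip header. CloudWatch Logs subscriptions deliver gzip-compressed data to Firehose, so one possible cause (an assumption, not confirmed in this thread) is that the records came from a test event or another producer that sends uncompressed data. A minimal Python 3 sketch of a tolerant decoder follows; `decode_record_data` is a hypothetical helper, not a function from the script, and the traceback above is from Python 2.7, where `gzip.decompress` is not available.

```python
import base64
import gzip
import json

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decode_record_data(b64_data):
    """Decode one base64 Firehose record payload into a Python object.

    Hypothetical helper: gunzips only when the payload looks like gzip,
    otherwise treats the bytes as plain JSON.
    """
    raw = base64.b64decode(b64_data)
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    return json.loads(raw)
```

Checking the magic bytes first avoids relying on an exception to detect uncompressed input. A payload that is neither gzip nor valid JSON still raises, which keeps genuinely malformed records visible.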

Proper way to support non VPC and cloudtrail sourcetypes

RE: How to Ingest Any Log from AWS Cloudwatch Logs via Firehose

I was wondering how best to use CloudwatchFH2HEC.py to ship log sourcetypes other than VPC and CloudTrail logs (the only two sourcetypes defined in the example script). Which of the approaches below would you recommend, if any? Ideally I could use the same transform function for all Firehose to HEC log shipping.

  1. Add a case statement to match additional CloudWatch log group names to their destination sourcetypes
  2. Don't set the sourcetype at all and let Splunk handle it somehow
  3. Set SPLUNK_SOURCETYPE=aws:firehose:json

Alternatively I could create separate lambda functions for each sourcetype and pass different values for SPLUNK_SOURCETYPE in the environment variable configuration... but that feels like an anti-pattern.

List of example sourcetypes/use-cases from cloudwatch logs
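
Option 1 above could be sketched as a prefix table rather than a chain of conditionals, with option 3 as the fallback. Everything here is illustrative: `SOURCETYPE_BY_PREFIX`, `sourcetype_for`, and the log group prefixes are hypothetical, and the sourcetype names should be checked against what your Splunk add-ons expect.

```python
import os

# Hypothetical mapping from log group name prefix to Splunk sourcetype.
# Order matters: the first matching prefix wins.
SOURCETYPE_BY_PREFIX = [
    ("/aws/lambda/", "aws:cloudwatchlogs:lambda"),
    ("/aws/vpc/", "aws:cloudwatchlogs:vpcflow"),
    ("/aws/cloudtrail/", "aws:cloudtrail"),
]


def sourcetype_for(log_group, default=None):
    """Return the sourcetype for a CloudWatch log group name."""
    for prefix, sourcetype in SOURCETYPE_BY_PREFIX:
        if log_group.startswith(prefix):
            return sourcetype
    if default is not None:
        return default
    # Fall back to the environment variable, then to a generic sourcetype.
    return os.environ.get("SPLUNK_SOURCETYPE", "aws:firehose:json")
```

This keeps a single transform function for every delivery stream: adding a log source means adding one table row, and unmatched log groups still arrive in Splunk under the generic sourcetype.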

multiple host/source/sourcetype values being set

I might have missed some settings during configuration, and I'm ending up with multiple values for host/source/sourcetype:

    host       = http-inputs-company.splunkcloud.com
    host       = arn:aws:firehose:us-east-1:123456789:deliverystream/splunk
    source     = http:HEC
    source     = Destination:/aws/lambda/mylanbda
    sourcetype = httpevent
    sourcetype = aws:cloudwatchlogs:lambda

This prevents searching on host/source/sourcetype (perhaps because the search terms don't match multi-valued fields?).

Is this the expected behaviour?

New py3 version is missing import, returns wrong string

There are a couple of issues with the new Python 3 version of Cloudwatch2FH2HEC.py. import os was removed even though os is still used. Also, the return value of transformLogEvent was changed such that the return_message variable is now unused.
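
The two fixes described above can be illustrated with a simplified stand-in. This is a sketch, not the script's actual function: `transform_log_event` and its two-argument signature are hypothetical, and the fallback sourcetype is an assumption.

```python
import json
import os  # needed because the sourcetype is read from the environment


def transform_log_event(log_event, source):
    """Build one newline-terminated HEC event from a CloudWatch log event.

    Hypothetical simplified version of the transform step.
    """
    sourcetype = os.environ.get("SPLUNK_SOURCETYPE", "aws:firehose:json")
    return_message = json.dumps({
        # CloudWatch timestamps are in milliseconds; HEC documents epoch seconds.
        "time": log_event["timestamp"] / 1000.0,
        "source": source,
        "sourcetype": sourcetype,
        "event": log_event["message"],
    })
    # Return the variable that was built, so it is not left unused.
    return return_message + "\n"
```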

Some records failed while calling PutRecordBatch to Firehose stream, retrying. Individual error codes: ServiceUnavailableException

Hi there,
Thanks for your work on putting this script together, it's helping us hugely!
We are seeing an error when processing 3,000 to 4,000 records of various sizes. Presumably we are trying to send too much data to the Firehose (metrics seem to show throttling when the bytes-per-second limit is hit). The function keeps hitting the limit, retries 20 times, and then errors out.

We see lots of "Some records failed while calling PutRecordBatch to Firehose stream, retrying. Individual error codes: ServiceUnavailableException," in the CloudWatch logs for the function.

Sometimes it seems to get through eventually, but sometimes it hits the 20 retries limit and errors with:

"[ERROR] RuntimeError: Could not put records after 20 attempts. Individual error codes: ServiceUnavailableException"

Looking online, it seems the general fix for such issues is to implement a back-off and retry process as per: https://docs.aws.amazon.com/firehose/latest/APIReference/API_PutRecordBatch.html and https://docs.aws.amazon.com/general/latest/gr/api-retries.html.

I was planning on implementing this into your code, but before doing so wondered if there was an easier fix you knew of?

Thanks in advance
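
The back-off and retry approach mentioned above could look roughly like the sketch below, which uses exponential backoff with full jitter and resends only the records that failed. `put_with_backoff` and its parameters are hypothetical and not part of the script; `client` is assumed to be a boto3 Firehose client (or anything with a compatible `put_record_batch` method), and the caller is assumed to keep each batch within the PutRecordBatch limits.

```python
import random
import time


def put_with_backoff(client, stream_name, records, max_attempts=8,
                     base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Send records with PutRecordBatch, retrying only the failed ones.

    Returns the number of attempts used. Raises RuntimeError if records
    are still failing after max_attempts.
    """
    pending = list(records)
    for attempt in range(max_attempts):
        response = client.put_record_batch(
            DeliveryStreamName=stream_name, Records=pending)
        if response.get("FailedPutCount", 0) == 0:
            return attempt + 1
        # Responses are in request order; failed entries carry an ErrorCode.
        pending = [rec for rec, res
                   in zip(pending, response["RequestResponses"])
                   if "ErrorCode" in res]
        if attempt < max_attempts - 1:
            # Full jitter: sleep a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    raise RuntimeError(
        "Could not put %d records after %d attempts"
        % (len(pending), max_attempts))
```

The default delays are placeholders. In a Lambda transform the total wait also has to fit inside the function timeout, so the attempt count and `max_delay` would need tuning against that budget.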

Application Load Balancer

Is it required to use a Classic Load Balancer? We were unable to get an ALB to work (sticky session error); however, the Classic Load Balancer did work.

Thanks
