girishg4t / dd-downloader

A tool to download large amounts of Datadog logs in parallel, with a custom header. Give it a star if you like it.

Home Page: https://dev.to/girishg4t/how-to-download-large-amount-of-datadog-logs-in-parallel-471m

License: Apache License 2.0

Makefile 4.35% Go 91.07% Shell 4.58%
cli datadog datadog-api datadog-logs downloader

dd-downloader's Introduction

dd-log-downloader

Download large amounts of Datadog logs in CSV format according to a template. The data can then be inserted into a database or any other tool for analysis. Datadog limits a download to 100k logs, hence this tool. I have also written a blog post about it.

Usage

Step 1: Clone the repo

$ git clone https://github.com/girishg4t/dd-downloader.git

Step 2: Build binary using make

$ make

Step 3: Run the command

$ dd-downloader generate config --name=config.yaml # generates a sample YAML file with a date range of 10 minutes
$ dd-downloader validate --config-file=./sample_templates/event_sent.yaml # only validates that the mapping and template are correct
$ dd-downloader run sync --config-file=templates/queued_event.yaml --file=output.csv # downloads logs one after the other in chunks of 5000
$ dd-downloader run parallel --config-file=templates/private_event.yaml --file=output.csv  # runs 10 parallel threads to reduce the download time
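
The parallel mode above is described as running 10 workers. One way such a download could be parallelized is to split the from/to window into equal slices and fetch each slice in its own goroutine. The sketch below illustrates only that idea; `splitRange`, `fetchSlice` and the slicing strategy are hypothetical and are not taken from the tool's source.

```go
package main

import (
	"fmt"
	"sync"
)

// window is a half-open time range [From, To) in Unix milliseconds.
type window struct {
	From, To int64
}

// splitRange divides [from, to) into n contiguous windows of near-equal size.
// Hypothetical helper: the real tool may partition the work differently.
func splitRange(from, to int64, n int) []window {
	if n < 1 || to <= from {
		return nil
	}
	step := (to - from) / int64(n)
	if step == 0 {
		// Range is shorter than n milliseconds; fall back to one window.
		return []window{{From: from, To: to}}
	}
	out := make([]window, 0, n)
	for i := 0; i < n; i++ {
		start := from + int64(i)*step
		end := start + step
		if i == n-1 {
			end = to // the last window absorbs any remainder
		}
		out = append(out, window{From: start, To: end})
	}
	return out
}

// fetchSlice stands in for the code that would query Datadog for one window.
func fetchSlice(w window) string {
	return fmt.Sprintf("logs for %d-%d", w.From, w.To)
}

func main() {
	windows := splitRange(1686306900000, 1686306960000, 10)
	results := make([]string, len(windows))

	var wg sync.WaitGroup
	for i, w := range windows {
		wg.Add(1)
		go func(i int, w window) {
			defer wg.Done()
			results[i] = fetchSlice(w) // each goroutine writes only its own index
		}(i, w)
	}
	wg.Wait()

	for _, r := range results {
		fmt.Println(r)
	}
}
```

Writing each result to its own slice index keeps the output ordered by time window without needing a mutex.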

Flow:

[Flow diagram]

Prerequisite

You need to create a YAML config file modeled on the examples.

Things to keep in mind

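auth:
- dd_site => the Datadog site to query, e.g. "datadoghq.com" as in the sample YAML file
- dd_api_key => your Datadog API key
- dd_app_key => your Datadog application key

More details are available in the Datadog documentation.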
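
For orientation, here is a minimal sketch of how these values typically end up in a Datadog API call: the site selects the API host, and the two keys travel as the `DD-API-KEY` and `DD-APPLICATION-KEY` request headers. The helper name `newSearchRequest` is hypothetical, and the tool itself may well use an official client library instead of raw HTTP. The request is only built here, not sent.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newSearchRequest builds (but does not send) a request to the Datadog
// Logs Search endpoint. Hypothetical helper for illustration only.
func newSearchRequest(site, apiKey, appKey, body string) (*http.Request, error) {
	url := "https://api." + site + "/api/v2/logs/events/search"
	req, err := http.NewRequest(http.MethodPost, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("DD-API-KEY", apiKey)         // from auth.dd_api_key
	req.Header.Set("DD-APPLICATION-KEY", appKey) // from auth.dd_app_key
	return req, nil
}

func main() {
	// Placeholder keys, as in the sample YAML file.
	req, err := newSearchRequest("datadoghq.com", "xxxxxxxxxx", "xxxxxxxxxx",
		`{"filter":{"query":"service:frontend"}}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```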

datadog_filter:
- query => logs are filtered by this query; verify it in Datadog before using it
- from => start of the time range to download; only Unix milliseconds are supported
- to => end of the time range; only Unix milliseconds are supported

More details are available in the Datadog documentation.
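
Since from and to only accept Unix milliseconds, a small conversion helper can be handy when preparing a config. The sketch below is not part of the tool; `toUnixMillis` is a hypothetical name, and it assumes the input timestamps are written in RFC 3339 form.

```go
package main

import (
	"fmt"
	"time"
)

// toUnixMillis parses an RFC 3339 timestamp and returns Unix milliseconds.
func toUnixMillis(rfc3339 string) (int64, error) {
	t, err := time.Parse(time.RFC3339, rfc3339)
	if err != nil {
		return 0, err
	}
	return t.UnixMilli(), nil
}

func main() {
	from, err := toUnixMillis("2023-06-09T10:35:00Z")
	if err != nil {
		panic(err)
	}
	to, err := toUnixMillis("2023-06-09T10:36:00Z")
	if err != nil {
		panic(err)
	}
	fmt.Println("from:", from) // from: 1686306900000
	fmt.Println("to:", to)     // to: 1686306960000
}
```

These two values match the from and to entries used in the sample YAML file.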

mapping:
- field: the column header in the CSV file
- dd_field: the Datadog log field mapped to that CSV header (check the logs in Datadog to find the fields you want to map)
- inner_field: plain fields can be mapped directly, but to map fields inside an array you need to use this key

e.g.
In the YAML mapping below, the field 'date' is taken from 'log.Attributes.Attributes', and so is 'session_id'.
Use '.' to reach into an inner object and '-' as the field name for an array.
In the Datadog log below, we need to map reqId, which is inside the 'data' array:
{
  "data": [
    {
      "event": {
        "snid": "dasgadsgasdgasd",
        "data": {
          "act": "ASDGASDGDDD",
          "srcId": "dsgdsgdgsdg",
          "dstid": "dasgdasdgdgdgg",
          "pid": "adgasdhsdhh",
          "quality": "DAG",
          "sid": "dddadahsdfhdfh"
        },
        "ets": 1686307199869,
        "etyp": "ASDGASD",
        "rqid": "AAAAA"
      },
      "reqId": "AAAAA"
    }
  ]
}

It is mapped like this:

- field: "-"
  dd_field: "data"
  inner_field:
    - field: "req_id"
      dd_field: "reqId"
    - field: "event_ts"
      dd_field: "event.ets"
    - field: "event_type"
      dd_field: "event.etyp"
    - field: "dest_id"
      dd_field: "event.data.dstid"
    - field: "source_id"
      dd_field: "event.data.srcId"
Sample YAML file:

apiVersion: datadog/v1
kind: DataDog
spec:
  auth:
    dd_site: "datadoghq.com"
    dd_api_key: "xxxxxxxxxx"
    dd_app_key: "xxxxxxxxxx"
  datadog_filter:
    query: 'service:frontend "socket: not able to connect to server" @type:SERVER_EVENT '
    from: 1686306900000
    to: 1686306960000
  mapping:
    - field: "date"
      dd_field: "date"
    - field: "session_id"
      dd_field: "session_id"
    - field: "-"
      dd_field: "data"
      inner_field:
        - field: "req_id"
          dd_field: "reqId"
        - field: "event_ts"
          dd_field: "event.ets"
        - field: "event_type"
          dd_field: "event.etyp"
        - field: "dest_id"
          dd_field: "event.data.dstid"
        - field: "source_id"
          dd_field: "event.data.srcId"
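
As a rough illustration of how this mapping relates to the CSV output, the sketch below derives a header row from the field names. It assumes that an entry with field "-" contributes the columns of its inner_field list rather than a column of its own; that behavior, along with the names `Mapping`, `header` and `sampleMapping`, is an assumption and not confirmed from the tool's source.

```go
package main

import (
	"encoding/csv"
	"os"
)

// Mapping mirrors one entry of the mapping section in the YAML file.
type Mapping struct {
	Field      string
	DDField    string
	InnerField []Mapping
}

// header flattens the mapping into CSV column names.
// Assumption: a "-" entry is replaced by the columns of its inner fields.
func header(ms []Mapping) []string {
	var cols []string
	for _, m := range ms {
		if m.Field == "-" {
			cols = append(cols, header(m.InnerField)...)
			continue
		}
		cols = append(cols, m.Field)
	}
	return cols
}

// sampleMapping reproduces the mapping from the sample YAML file.
func sampleMapping() []Mapping {
	return []Mapping{
		{Field: "date", DDField: "date"},
		{Field: "session_id", DDField: "session_id"},
		{Field: "-", DDField: "data", InnerField: []Mapping{
			{Field: "req_id", DDField: "reqId"},
			{Field: "event_ts", DDField: "event.ets"},
			{Field: "event_type", DDField: "event.etyp"},
			{Field: "dest_id", DDField: "event.data.dstid"},
			{Field: "source_id", DDField: "event.data.srcId"},
		}},
	}
}

func main() {
	w := csv.NewWriter(os.Stdout)
	if err := w.Write(header(sampleMapping())); err != nil {
		panic(err)
	}
	w.Flush()
	if err := w.Error(); err != nil {
		panic(err)
	}
	// prints: date,session_id,req_id,event_ts,event_type,dest_id,source_id
}
```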

Contributors

cordovinian, cpb, girishg4t