zix99 / rare

Realtime regex-extraction and aggregation into common CLI formats such as histograms, bar graphs, numerical summaries, tables, and more!

Home Page: https://rare.zdyn.net/

License: GNU General Public License v3.0

Go 100.00%
regex grep sed log-parser analyzer nginx apache awk regex-extractor histogram

rare's People

Contributors

dependabot[bot], eskriett, herbygillot, obaudys, zix99

rare's Issues

Panic in coloring logic when using nested groups in 'filter' mode

First of all, let me say, this is an awesome project. Nice work! I was in the process of writing something similar but much much worse; I think I might use this as a library instead!

I've managed to cause a panic in the following scenario:

echo '1,2,3,4,5,6,7,8,9,0' | rare filter -m "(^[^,]*)(,([^,]*)){5}"
panic: runtime error: slice bounds out of range [11:10]

goroutine 1 [running]:
rare/pkg/color.WrapIndices(0xc000500000, 0x13, 0xc0000b4590, 0x6, 0x6, 0x206860, 0x1207860)
	/Users/ondrejb/Documents/git/rare/pkg/color/coloring.go:95 +0x832
rare/cmd.filterFunction(0xc00024c840, 0x0, 0xc0004752d0)
	/Users/ondrejb/Documents/git/rare/cmd/filter.go:33 +0x276
github.com/urfave/cli.HandleAction(0x13b20c0, 0x144c828, 0xc00024c840, 0xc00024c840, 0x0)
	/Users/ondrejb/go/pkg/mod/github.com/urfave/[email protected]/app.go:523 +0xfd
github.com/urfave/cli.Command.Run(0x142c0a8, 0x6, 0x1429f04, 0x1, 0x0, 0x0, 0x0, 0x1441dfa, 0x44, 0x0, ...)
	/Users/ondrejb/go/pkg/mod/github.com/urfave/[email protected]/command.go:174 +0x58e
github.com/urfave/cli.(*App).Run(0xc0004d6000, 0xc000090040, 0x4, 0x4, 0x0, 0x0)
	/Users/ondrejb/go/pkg/mod/github.com/urfave/[email protected]/app.go:276 +0x7d4
main.cliMain(0xc000090040, 0x4, 0x4, 0x0, 0x0)
	/Users/ondrejb/Documents/git/rare/main.go:101 +0x666
main.main()
	/Users/ondrejb/Documents/git/rare/main.go:105 +0x49

To save you parsing manually, the three groups here are the 1st column of a csv, the 6th column but including the leading comma, and the 6th field without the leading comma.

When I make the second (outer) group non-capturing, i.e. (^[^,]*)(?:,([^,]*)){5}, everything works fine and I get the 1st and 6th fields (groups {1} and {3}). Obviously, if I use the --nocolor option, or if I use an expression, e.g. -e '{1} {2} {3}', everything is fine.

I haven't looked deep into the code yet, but obviously since the match groups overlap, the starting index of the inner group lies inside the outer group, and the colouring logic doesn't account for this scenario.

I'd suggest that the inner match should take precedence when colouring the matching text (ie. inner match colours "overwrite" the outer group)
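The "inner match overwrites outer" rule suggested above can be sketched as follows. This is not rare's coloring code: `groupMask` and `colorize` are hypothetical helpers, Go's standard `regexp` package stands in for rare's matcher, and the ANSI colour choice is arbitrary. The idea is to paint a per-byte mask in ascending group order, so a nested group (which always has a higher number than its parent) wins wherever the two overlap, and no slice is ever taken across overlapping ranges.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// groupMask returns, for every byte of line, the number of the innermost
// capture group covering it (0 means "no group"). Groups are numbered by
// their opening parenthesis, so painting in ascending order lets a nested
// group overwrite its parent.
func groupMask(pattern, line string) []int {
	re := regexp.MustCompile(pattern)
	mask := make([]int, len(line))
	idx := re.FindStringSubmatchIndex(line) // nil when there is no match
	for g := 1; g < len(idx)/2; g++ {
		start, end := idx[2*g], idx[2*g+1]
		if start < 0 { // group did not participate in the match
			continue
		}
		for i := start; i < end; i++ {
			mask[i] = g
		}
	}
	return mask
}

// colorize wraps each run of bytes sharing a group number in an ANSI colour.
// mask must have the same length as line.
func colorize(line string, mask []int) string {
	var sb strings.Builder
	prev := 0
	for i := 0; i < len(line); i++ {
		if mask[i] != prev {
			if prev != 0 {
				sb.WriteString("\x1b[0m") // close the previous run
			}
			if mask[i] != 0 {
				fmt.Fprintf(&sb, "\x1b[3%dm", 1+(mask[i]-1)%6)
			}
			prev = mask[i]
		}
		sb.WriteByte(line[i])
	}
	if prev != 0 {
		sb.WriteString("\x1b[0m")
	}
	return sb.String()
}

func main() {
	line := "1,2,3,4,5,6,7,8,9,0"
	mask := groupMask(`(^[^,]*)(,([^,]*)){5}`, line)
	fmt.Println(mask)
	fmt.Println(colorize(line, mask))
}
```

With the pattern from this report, the leading comma of the 6th column keeps the outer group's colour and the digit itself takes the inner group's colour.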

I'll have a crack at making a pull request to fix this myself soon.

Why not use grok?

Hello, grok is a generally common log parsing language that allows for a clear combination of regular expressions. It is used in tools like logstash and vector. I was just curious why you opted for traditional regex and match groups rather than using grok.

Thanks, Cam.

Slice bounds out of range panic

I tried the following filter for my log parsing.

rare h -b  -m "\[(INFO)|(ERROR)|(WARNING)|(CRITICAL)\]" *.log  -R

And rare threw the following panic.

Screen Shot 2019-11-18 at 17 19 41
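The thread does not state the cause of this panic, so the following is only a sketch of one property of the pattern that may be relevant. Alternation binds more loosely than concatenation, so the expression parses as four separate alternatives (`\[(INFO)`, `(ERROR)`, `(WARNING)`, `(CRITICAL)\]`), and on any given line at most one group takes part in the match. Assuming Go's standard `regexp` package as the engine, groups that did not participate are reported with index -1, which code that slices on group indices has to skip. `participatingGroups` is a hypothetical helper and the log line is made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// participatingGroups lists the capture groups that took part in the first
// match of pattern against line, formatted as "group=text". Groups reported
// with index -1 (not part of the match) are skipped instead of sliced.
func participatingGroups(pattern, line string) []string {
	re := regexp.MustCompile(pattern)
	idx := re.FindStringSubmatchIndex(line)
	var out []string
	for g := 1; g < len(idx)/2; g++ {
		start, end := idx[2*g], idx[2*g+1]
		if start < 0 || end < 0 {
			continue
		}
		out = append(out, fmt.Sprintf("%d=%s", g, line[start:end]))
	}
	return out
}

func main() {
	pattern := `\[(INFO)|(ERROR)|(WARNING)|(CRITICAL)\]`
	// Hypothetical log line; only group 2 participates in the match.
	fmt.Println(participatingGroups(pattern, "[ERROR] disk full"))
}
```

If the intent was "one of four levels inside brackets", a single group such as `\[(INFO|ERROR|WARNING|CRITICAL)\]` avoids non-participating groups altogether.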

Sort heatmap columns numerically

First off, great work with the heatmaps feature!

I've been using heatmaps for numerical data, in particular nginx response times. I usually convert them to integers in rare by matching e.g. 0.053 (seconds) with (\d+)\.(\d{3}) and then using the expression {sumi {multi {1} 1000} {3}} to convert to milliseconds.

I've found that the table and heatmap sort the column names as strings, not numerically where that would be possible. This results in meaningless heatmaps:
image

I made a small change to a local checkout of rare to basically test if the column headers could be converted to integers and then sort them numerically if they can:

index 1be0be8..040b889 100644
--- a/pkg/aggregation/table.go
+++ b/pkg/aggregation/table.go
@@ -103,8 +103,23 @@ func (s *TableAggregator) OrderedColumns() []string {
 func (s *TableAggregator) OrderedColumnsByName() []string {
        keys := s.Columns()
 
+       // check if keys can be sorted numerically:
+       numeric := true
+       for _,k := range keys {
+               if _, err := strconv.Atoi(k); err != nil {
+                       numeric = false
+                       break
+               }
+       }
+
        sort.Slice(keys, func(i, j int) bool {
-               return keys[i] < keys[j]
+               if numeric {
+                       k0, _ := strconv.Atoi(keys[i])
+                       k1, _ := strconv.Atoi(keys[j])
+                       return k0 < k1
+               } else {
+                       return keys[i] < keys[j]
+               }
        })
 
        return keys

The same heatmap is now much more meaningful:
image

Would you be interested in incorporating the above diff into rare?
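For anyone who wants to try the idea outside of rare, here is a self-contained sketch of the same approach. `sortColumns` is a hypothetical stand-in for `OrderedColumnsByName`, not the function from the repository; the one deliberate difference from the diff is that each key is parsed once up front rather than inside the comparison function.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// sortColumns sorts keys numerically when every key parses as an integer
// and falls back to plain string order otherwise.
func sortColumns(keys []string) []string {
	type col struct {
		name string
		num  int
	}
	cols := make([]col, len(keys))
	numeric := true
	for i, k := range keys {
		n, err := strconv.Atoi(k)
		if err != nil {
			numeric = false // one non-integer key disables numeric sorting
		}
		cols[i] = col{name: k, num: n}
	}
	sort.Slice(cols, func(i, j int) bool {
		if numeric {
			return cols[i].num < cols[j].num
		}
		return cols[i].name < cols[j].name
	})
	out := make([]string, len(cols))
	for i, c := range cols {
		out[i] = c.name
	}
	return out
}

func main() {
	fmt.Println(sortColumns([]string{"100", "20", "3"}))
	fmt.Println(sortColumns([]string{"100", "20", "x"}))
}
```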

Invalid syntax in tap

Hey,

Just tried to install from Homebrew and I get an error:

โฏ brew tap zix99/rare
==> Tapping zix99/rare
Cloning into '/usr/local/Homebrew/Library/Taps/zix99/homebrew-rare'...
remote: Enumerating objects: 45, done.
remote: Counting objects: 100% (45/45), done.
remote: Compressing objects: 100% (30/30), done.
remote: Total 45 (delta 14), reused 0 (delta 0), pack-reused 0
Receiving objects: 100% (45/45), 6.65 KiB | 3.33 MiB/s, done.
Resolving deltas: 100% (14/14), done.
Error: Invalid formula: /usr/local/Homebrew/Library/Taps/zix99/homebrew-rare/rare.rb
rare: Calling bottle :unneeded is disabled! There is no replacement.
Please report this issue to the zix99/rare tap (not Homebrew/brew or Homebrew/core):
  /usr/local/Homebrew/Library/Taps/zix99/homebrew-rare/rare.rb:9

Error: Cannot tap zix99/rare: invalid syntax in tap!

Is there anything I need to do other than

brew tap zix99/rare && brew install rare

I'm doing this on a 2019 MacBook Pro (Intel).
Thank you

Update README to require go1.16

The README currently states that Go 1.11 or higher is required to build:

Requires GO 1.11 or higher (Uses go modules)

Looks like rare is now making use of embed (#32), which requires Go 1.16.

Regression: 'rare filter' now ignores '-e' form of '--extract' flag

Before:
image

Now:
image

Now but with --extract:
image

I've taken screenshots to preserve image highlighting; here's the cut-and-paste friendly test case:
echo "a: 1 b: 2 c: 3" | ./rare f -m 'a: (\d+) b: (\d+) c: (\d+)' -e '{1} {2} {3}'

Everything seems to work fine using the short '-e' with histogram, analyze, etc; it's just filter that is impacted.

Memory and CPU usage

I'm curious whether the README could include a comparison of memory and CPU usage between standard Unix tools ga and rare.

A benchmark on an embedded device would also be useful.

Can't change polling speed/batch options when piping output to rare

Hey, I just discovered this and really like it; however, I found a bug when trying to follow large standard output.
If I have the following Python file, which just generates a bunch of numbers:

import random

BOXES = 100
BALLS = 100

def sim():
    boxes = [0 for a in range(BOXES)]
    for ball in range(BALLS):
        boxes[random.randint(0, BOXES-1)] += 1
    return len([b for b in boxes if b == 0])

for i in range(10000000):
    print(sim())

And I want to use rare to show the distribution of the output live while the simulation runs, I would do
python3 ./random_100_box_ball.py | rare histo -f -x --sortkey -n 30 --batch 1

However, the updates through the pipe come once a second, even after setting the batch size to something small like 1.

Ubuntu.2022-04-24.16-50-45.mp4

If I instead piped the output to a file, and then read from the file using rare, it produces the live following that I was after

python3 random_100_box_ball.py > test.txt
rare histo -f --sortkey -n 30 --batch 1 ./test.txt
Ubuntu.2022-04-24.16-51-49.mp4

Batch settings are respected and the updates can be tuned when not using a pipe. Do you know why this is the case, and could pipes be made to behave the same way as following a file?

N.B. I'm running ubuntu 20.04 WSL2 on Windows 10, using the prebuilt binary (rare, but rare-pcre suffers from the same issue)

04:55:37 ~  -> rare --version
rare version 0.2.1, 29f1bd5; regex: re2
04:55:39 ~  -> rare-pcre --version
rare-pcre version 0.2.1, 29f1bd5; regex: libpcre2

float not supported?

I wanted to draw a histogram for float values but I got a BAD-TYPE error. Are decimal values not supported at all, or do we need to process them with a trick?

Integer values work:

$ echo -e "1\n10\n15\n20" | rare h -e "{bucket {0}  10}"
10                  2
20                  1
0                   1

Matched: 4 / 4 (Groups: 3)
11 B (0 B/s) | <stdin>

But when I use decimal values, I get an error:

$ echo -e "1.5\n10.2\n15.5\n20.0" | rare h -e "{bucket {0}  10}"
<BAD-TYPE>          4

Matched: 4 / 4 (Groups: 1)
19 B (0 B/s) | <stdin>
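To make the expected behaviour concrete, here is a minimal sketch of bucketing that accepts decimal input. It is not rare's `bucket` helper, whose implementation is not shown in this thread; `bucketFloat` is a hypothetical function that parses with `strconv.ParseFloat` and rounds down to a multiple of the bucket size.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// bucketFloat parses s as a decimal number and rounds it down to the
// nearest multiple of size, returning the bucket's lower bound as a string.
func bucketFloat(s string, size float64) (string, error) {
	if size <= 0 {
		return "", fmt.Errorf("bucket size must be positive, got %v", size)
	}
	v, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return "", err
	}
	lower := math.Floor(v/size) * size
	return strconv.FormatFloat(lower, 'f', -1, 64), nil
}

func main() {
	for _, s := range []string{"1.5", "10.2", "15.5", "20.0"} {
		b, err := bucketFloat(s, 10)
		if err != nil {
			fmt.Println(s, "-> error:", err)
			continue
		}
		fmt.Println(s, "->", b)
	}
}
```

Under this sketch the four decimal inputs land in the same buckets (0, 10, 10, 20) as the integer inputs 1, 10, 15, 20 in the working example.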

Support using a log scale for heatmaps

It would be useful to support using a log scale for generating heatmaps with wide distributions.

Edit: by the way, thanks a ton for this tool. I've avoided learning gnuplot for years and rare looks like it will help me do that indefinitely 😄.
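As a rough illustration of the request, the sketch below compares a linear and a logarithmic mapping from a cell count to one of a fixed number of colour steps. Both functions are hypothetical and say nothing about how rare picks heatmap colours; `log1p` is used here only so that a count of zero maps cleanly to step 0.

```go
package main

import (
	"fmt"
	"math"
)

// linearStep maps v in [0, max] to a colour step in [0, steps-1] linearly.
func linearStep(v, max float64, steps int) int {
	if v <= 0 || max <= 0 || steps <= 1 {
		return 0
	}
	if v >= max {
		return steps - 1
	}
	return int(v / max * float64(steps-1))
}

// logStep maps v in [0, max] to a colour step in [0, steps-1] on a log
// scale, which spreads small values over more steps than linearStep does.
func logStep(v, max float64, steps int) int {
	if v <= 0 || max <= 0 || steps <= 1 {
		return 0
	}
	if v >= max {
		return steps - 1
	}
	return int(math.Log1p(v) / math.Log1p(max) * float64(steps-1))
}

func main() {
	for _, v := range []float64{0, 10, 100, 500, 1000} {
		fmt.Printf("%6.0f linear=%d log=%d\n", v, linearStep(v, 1000, 10), logStep(v, 1000, 10))
	}
}
```

With a maximum of 1000 and 10 steps, the linear mapping puts both 10 and 100 in step 0, while the log mapping separates them.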

Missing LICENSE

I see you have no LICENSE file for this project. I see the README.md has the GPL-3.0-or-later header. I see .goreleaser.yml says GPLv2.

I would suggest adding the GPL-3.0-or-later license as a LICENSE or COPYING file, changing GPLv2 to GPL-3.0-or-later, and adding license headers to all files. I could send a patch for the first two if you would like.

Is there a way to sort histogram in reverse order?

Here's an example snippet that shows a histogram tracking monthly changes in a git repo:

git log --first-parent --pretty=format:"%ad%x09%s" --date="format:%Y-%m" | \
awk '{ print $1 }' | rare histogram -a --sort date

It spits out something like:

2021-10             10         [ 4.6%] ████████████
2021-11             42         [19.3%] ██████████████████████████████████████████████████
2021-12             8          [ 3.7%] █████████▌
2022-01             4          [ 1.8%] ████▊
2022-02             7          [ 3.2%] ████████▍
2022-03             2          [ 0.9%] ██▍
2022-04             19         [ 8.7%] ██████████████████████▋
2022-05             3          [ 1.4%] ███▋
2022-06             9          [ 4.1%] ██████████▊
2022-07             7          [ 3.2%] ████████▍
2022-08             4          [ 1.8%] ████▊
2022-09             10         [ 4.6%] ████████████
2022-10             15         [ 6.9%] █████████████████▉
2022-11             27         [12.4%] ████████████████████████████████▏
2022-12             3          [ 1.4%] ███▋
2023-01             5          [ 2.3%] ██████
2023-02             10         [ 4.6%] ████████████
2023-03             8          [ 3.7%] █████████▌
2023-04             8          [ 3.7%] █████████▌
2023-05             11         [ 5.0%] █████████████
2023-06             6          [ 2.8%] ███████▏

Now, how can I get it sorted in descending order? I can't use cmd-line sort, it breaks the histogram.
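For reference, the ordering being asked for is simple to express in Go. The sketch below is not a rare option (the thread does not say whether one exists); `sortKeysDescending` is a hypothetical helper showing that zero-padded `YYYY-MM` keys can be reversed with a plain descending string sort, since their lexical order matches their chronological order.

```go
package main

import (
	"fmt"
	"sort"
)

// sortKeysDescending returns a copy of keys in descending lexical order,
// leaving the input slice untouched.
func sortKeysDescending(keys []string) []string {
	out := append([]string(nil), keys...)
	sort.Sort(sort.Reverse(sort.StringSlice(out)))
	return out
}

func main() {
	months := []string{"2021-10", "2023-06", "2022-01", "2022-11"}
	fmt.Println(sortKeysDescending(months))
}
```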
