
go-graphite / buckytools


Go implementation of useful tools for dealing with Graphite's Whisper DBs and Carbon hashing

License: Other

Go 98.08% Makefile 0.64% Dockerfile 0.08% Shell 0.30% Python 0.90%

buckytools's People

Contributors

alexakulov, asymmetricia, auguzun, azhiltsov, bom-d-van, civil, claudio-benfatto, deniszh, dolvany, grobian, grzkv, iain-buclaw-sociomantic, jaxonpickett, jjneely, pdbogen, thorsieger


buckytools's Issues

BuckyTools should preserve or sync mtime of metric files

Currently, every metric sync postpones cleanup of the data: when a metric is backfilled, its mtime changes to a more recent value.

The metric's mtime should be either preserved or synced across the cluster so that cleanups happen properly.

Make bucky go-carbon aware

go-carbon with carbonserver enabled does a directory walk every scan-frequency (https://github.com/lomik/go-carbon/blame/master/README.md#L305) in order to build an in-memory index.
This index is available in go-carbon through the /metrics/list handler: https://github.com/lomik/go-carbon/blob/master/carbonserver/list.go#L40

It would be nice to teach buckyd to use this functionality instead of forcing it to traverse the directory tree.
This could be implemented as a parameter at buckyd invocation giving the host:port of go-carbon, leaving the -f invocation on the bucky client intact.

compressed flag is required at buckyd invocation

Currently bucky supports compressed whisper, but by default it creates an uncompressed file when backfilling from an uncompressed whisper file. This can unexpectedly inflate disk usage when backfilling from another (uncompressed) cluster.
To bring bucky's behavior in line with go-carbon, I suggest introducing a boolean '-compressed' flag for buckyd, defaulting to 'false'.
This flag should tell buckyd what to do whenever a new whisper file needs to be created.
It should not affect how existing files are handled.
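
The proposed rule can be sketched in Go as follows. The flag does not exist in buckyd today; `createCompressed` is a hypothetical helper showing only the decision (existing files keep their format, new files follow the flag), not any whisper I/O.

```go
package main

import (
	"flag"
	"fmt"
)

// createCompressed decides which format buckyd would write.
// Existing files keep their current format; only new files follow the flag.
// Hypothetical helper illustrating the proposal.
func createCompressed(fileExists, existingIsCompressed, flagCompressed bool) bool {
	if fileExists {
		return existingIsCompressed
	}
	return flagCompressed
}

func main() {
	// A private FlagSet with explicit arguments keeps the example self-contained.
	fs := flag.NewFlagSet("buckyd", flag.ContinueOnError)
	compressed := fs.Bool("compressed", false, "create new whisper files in compressed format")
	if err := fs.Parse([]string{"-compressed"}); err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println("new file compressed:", createCompressed(false, false, *compressed))
	fmt.Println("existing uncompressed file compressed:", createCompressed(true, false, *compressed))
}
```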

Cluster in an unhealthy state

Hello everyone,

Our team has been working on running the go-graphite stack on Kubernetes. We are running the following services:

  • carbon-relay-ng
  • go-carbon
  • carbonapi

We are looking to rebalance our go-graphite cluster. We found buckytools but haven't gotten far, since the tool always reports the cluster as unhealthy:

/ $ /usr/sbin/bucky servers
Buckd daemons are using port: 4242
Hashing algorithm: [carbon: 5 nodes, 100 replicas, 500 ring members go-carbon-0.go-carbon.goal:2004=None go-carbon-1.go-carbon.goal:2004=None go-carbon-2.go-carbon.goal:2004=None go-carbon-3.go-carbon.goal:2004=None go-carbon-4.go-carbon.goal:2004=None]
Number of replicas: 100
Found these servers:
        go-carbon-0.go-carbon.goal
        go-carbon-1.go-carbon.goal
        go-carbon-2.go-carbon.goal
        go-carbon-3.go-carbon.goal
        go-carbon-4.go-carbon.goal

Is cluster healthy: false
2021/10/28 21:29:14 Cluster is inconsistent.

Could you advise how to bring the cluster to a healthy state?

/ $ bucky -version
bucky <sub-command> [options]
Copyright 2015 - 2017 42 Lines, Inc
Original Author: Jack Neely <[email protected]>
Version: 0.4.2

        Bucky is a CLI designed to work with large consistent hashing
        Graphite clusters that have the buckyd daemon installed.  Sub-
        commands will allow you to perform high level operations such
        as backups and restores of specific metrics, backfilling, and
        even rebalancing.

        Use the "help" sub-command for available commands.

Make bucky route traffic between the nodes in a distributed manner

The cluster nodes are routinely rebalanced or re-populated, e.g. after boxes are added, removed, replaced, or they die. During these operations, bucky moves the metrics around. Currently, all the traffic goes through a single master node when doing that. This makes such actions slow.

Bucky could be optimized to send data between the nodes when moving metrics around instead of bridging through a single master node.

E.g. let's say we have a cluster with nodes A, B, and C. We add node D. A serves as master. Currently the data moves like this:
A -> D
B -> A -> D
C -> A -> D
and we could make it look like this:
A -> D
B -> D
C -> D

This will reduce network load and disk load on A making the operation faster.
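
As a sketch of the idea in Go, the coordinator could compute a plan of direct source-to-destination transfers and hand each one to the nodes involved, rather than relaying the data itself. `move` and `planMoves` are hypothetical names, and the placement maps stand in for whatever the hashring computes.

```go
package main

import (
	"fmt"
	"sort"
)

// move describes one direct transfer of a metric between two nodes.
type move struct {
	Metric, Src, Dst string
}

// planMoves compares the current and desired placement of each metric and
// returns the direct src -> dst transfers needed, sorted by metric name.
func planMoves(current, desired map[string]string) []move {
	var moves []move
	for metric, src := range current {
		dst, ok := desired[metric]
		if !ok || dst == src {
			continue // metric stays where it is
		}
		moves = append(moves, move{Metric: metric, Src: src, Dst: dst})
	}
	sort.Slice(moves, func(i, j int) bool { return moves[i].Metric < moves[j].Metric })
	return moves
}

func main() {
	current := map[string]string{"m1": "A", "m2": "B", "m3": "C", "m4": "B"}
	desired := map[string]string{"m1": "D", "m2": "D", "m3": "D", "m4": "B"}
	for _, m := range planMoves(current, desired) {
		fmt.Printf("%s: %s -> %s\n", m.Metric, m.Src, m.Dst)
	}
}
```

With a plan like this, node A only needs to send instructions to B and C; the metric data itself never passes through A.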

signal SIGSEGV: segmentation violation code=0x1

When one of the nodes is down, the bucky servers command crashes:

[grph-st3 ~]# bucky servers
2019/12/27 12:27:35 Error retrieving URL: Get http://grph-st2.infra:4242/hashring: dial tcp 10.10.10.2:4242: connect: connection refused
2019/12/27 12:27:35 Cluster unhealthy: grph-st2.infra:4242: Get http://grph-st2.infra:4242/hashring: dial tcp 10.10.10.2:4242: connect: connection refused
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x6a0554]

goroutine 1 [running]:
main.isHealthy(0xc000112080, 0xc0001de100, 0x3, 0x4, 0x3)
	/root/go/src/github.com/go-graphite/buckytools/cmd/bucky/cluster.go:116 +0x184
main.GetClusterConfig(0x75e8dd, 0xe, 0x6, 0x75c300, 0x7)
	/root/go/src/github.com/go-graphite/buckytools/cmd/bucky/cluster.go:96 +0x541
main.serversCommand(0x75c836, 0x7, 0x777488, 0x75cfa6, 0x9, 0x769090, 0x29, 0x76fc8c, 0x196, 0xc0000608a0, ...)
	/root/go/src/github.com/go-graphite/buckytools/cmd/bucky/servers.go:26 +0x4b
main.main()
	/root/go/src/github.com/go-graphite/buckytools/cmd/bucky/main.go:101 +0x1de
[grph-st3 ~]# 

If all nodes are alive, everything works:

[grph-st1 ~]$ bucky servers
Buckd daemons are using port: 4242
Hashing algorithm: [fnv1a: 4 nodes, 100 replicas, 400 ring members grph-st1.infra:2003=a grph-st2.infra:2003=b grph-st3.infra:2003=c grph-st4.infra:2003=d]
Number of replicas: 100
Found these servers:
	grph-st1.infra
	grph-st2.infra
	grph-st3.infra
	grph-st4.infra

Is cluster healthy: true
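
The trace above ends in main.isHealthy right after a failed /hashring request. One plausible cause (an assumption; the source of cluster.go is not shown here) is that the result for the unreachable node is nil and is dereferenced during the comparison. A minimal Go sketch of a guarded check, with `ringInfo` and `ringsConsistent` as hypothetical stand-ins for the real types:

```go
package main

import "fmt"

// ringInfo is a hypothetical stand-in for the hashring data one buckyd returns.
type ringInfo struct {
	Name  string
	Nodes []string
}

// ringsConsistent reports whether every node answered and all rings match.
// A nil entry represents a node whose /hashring request failed.
func ringsConsistent(rings []*ringInfo) bool {
	if len(rings) == 0 {
		return false
	}
	for _, r := range rings {
		if r == nil {
			return false // unreachable node: unhealthy, but no panic
		}
	}
	first := rings[0]
	for _, r := range rings[1:] {
		if len(r.Nodes) != len(first.Nodes) {
			return false
		}
		for i := range r.Nodes {
			if r.Nodes[i] != first.Nodes[i] {
				return false
			}
		}
	}
	return true
}

func main() {
	ring := []string{"grph-st1.infra", "grph-st2.infra"}
	ok := []*ringInfo{{Name: "grph-st1.infra", Nodes: ring}, {Name: "grph-st2.infra", Nodes: ring}}
	down := []*ringInfo{{Name: "grph-st1.infra", Nodes: ring}, nil}

	fmt.Println("all nodes up:", ringsConsistent(ok))
	fmt.Println("one node down:", ringsConsistent(down))
}
```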

buckyd and bucky don't agree on the carbon hashring

Hello,

I raised an issue on the wrong repo and would like to bring it to the right one. My original post on jjneely/buckytools (jjneely/buckytools#38) is reproduced below for context.


I've found 2 issues which I would love to discuss:

BuckyD and bucky configuration

buckyd accepts the members of the hashring as non-option CLI arguments: buckyd <graphite1:port> <graphite2:port> ....
bucky asks for the cluster configuration and gets graphite1:hashringport instead of graphite1:4242. Because of this mismatch, bucky can't reach the buckyd members.

/usr/sbin/bucky servers -h go-carbon-0.go-carbon.graphite:4242
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-0.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:45902->172.16.76.119:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-0.go-carbon.graphite:2004: Get "http://go-carbon-0.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:45902->172.16.76.119:2004: read: connection reset by peer

2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-1.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:47772->172.16.27.82:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-1.go-carbon.graphite:2004: Get "http://go-carbon-1.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:47772->172.16.27.82:2004: read: connection reset by peer

2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-2.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:40946->172.16.58.32:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-2.go-carbon.graphite:2004: Get "http://go-carbon-2.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:40946->172.16.58.32:2004: read: connection reset by peer

2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-3.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:43570->172.16.127.44:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-3.go-carbon.graphite:2004: Get "http://go-carbon-3.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:43570->172.16.127.44:2004: read: connection reset by peer

2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-4.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:51674->172.16.27.91:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-4.go-carbon.graphite:2004: Get "http://go-carbon-4.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:51674->172.16.27.91:2004: read: connection reset by peer

2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-5.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:56712->172.16.14.79:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-5.go-carbon.graphite:2004: Get "http://go-carbon-5.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:56712->172.16.14.79:2004: read: connection reset by peer

Buckd daemons are using port: 4242
Hashing algorithm: [carbon: 6 nodes, 100 replicas, 600 ring members go-carbon-0.go-carbon.graphite:2004=None go-carbon-1.go-carbon.graphite:2004=None go-carbon-2.go-carbon.graphite:2004=None go-carbon-3.go-carbon.graphite:2004=None go-carbon-4.go-carbon.graphite:2004=None go-carbon-5.go-carbon.graphite:2004=None]
Number of replicas: 100
Found these servers:
        go-carbon-0.go-carbon.graphite:2004
        go-carbon-1.go-carbon.graphite:2004
        go-carbon-2.go-carbon.graphite:2004
        go-carbon-3.go-carbon.graphite:2004
        go-carbon-4.go-carbon.graphite:2004
        go-carbon-5.go-carbon.graphite:2004

Is cluster healthy: false
2021/11/10 01:01:57 Cluster is inconsistent.

I've tracked the issue to https://github.com/go-graphite/buckytools/blob/master/cmd/bucky/cluster.go#L88. The port value for each cluster member is set to the hashring port instead of 4242 (or whichever port the user specifies).

To test this theory, I forked and patched the code to default to 4242. The cluster is then reported as healthy with the correct hashring values, as below:

/ # /usr/sbin/bucky servers -h go-carbon-5.go-carbon.graphite:4242
Buckd daemons are using port: 4242
Hashing algorithm: [carbon: 6 nodes, 100 replicas, 600 ring members go-carbon-0.go-carbon.graphite:2004=None go-carbon-1.go-carbon.graphite:2004=None go-carbon-2.go-carbon.graphite:2004=None go-carbon-3.go-carbon.graphite:2004=None go-carbon-4.go-carbon.graphite:2004=None go-carbon-5.go-carbon.graphite:2004=None]
Number of replicas: 100
Found these servers:
        go-carbon-0.go-carbon.graphite:4242
        go-carbon-1.go-carbon.graphite:4242
        go-carbon-2.go-carbon.graphite:4242
        go-carbon-3.go-carbon.graphite:4242
        go-carbon-4.go-carbon.graphite:4242
        go-carbon-5.go-carbon.graphite:4242

Is cluster healthy: true

Is this a real issue or just a misconfiguration on my side?
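
The kind of patch described above can be sketched in Go with the standard `net.SplitHostPort` and `net.JoinHostPort` functions: keep the host from the hashring member and substitute the buckyd port. `buckydAddr` is a hypothetical helper, not the actual change made in the fork, and it only handles plain host:port members (no instance suffix).

```go
package main

import (
	"fmt"
	"net"
)

// buckydAddr takes a hashring member such as "host:2004" and returns the
// address of the buckyd daemon on the same host, e.g. "host:4242".
// Hypothetical helper; only plain host or host:port input is handled.
func buckydAddr(member, buckydPort string) string {
	host, _, err := net.SplitHostPort(member)
	if err != nil {
		// No port present: treat the whole string as the host.
		host = member
	}
	return net.JoinHostPort(host, buckydPort)
}

func main() {
	members := []string{
		"go-carbon-0.go-carbon.graphite:2004",
		"go-carbon-1.go-carbon.graphite:2004",
	}
	for _, m := range members {
		fmt.Println(buckydAddr(m, "4242"))
	}
}
```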

Inconsistent metric count will almost match active metric count

bucky is reporting metrics as inconsistent on our cluster, and the number is nearly the same as the number of active metrics, which is very odd. Taking a closer look, this line https://github.com/go-graphite/buckytools/blob/master/cmd/bucky/inconsistent.go#L69 checks the port values, and these don't match because one is 2004 and the other is 4242.

The original code does not take the ports into account, just the hostnames
https://github.com/jjneely/buckytools/blob/master/cmd/bucky/inconsistent.go#L64

Is my assumption that these rings won't match because of this correct?
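
To illustrate the difference between the two comparisons, here is a small Go sketch that compares ring members by hostname only, ignoring ports. `hostOnly` and `sameHosts` are hypothetical names and do not reproduce the code in either repository.

```go
package main

import (
	"fmt"
	"net"
)

// hostOnly strips a trailing ":port" from a ring member if one is present.
func hostOnly(member string) string {
	host, _, err := net.SplitHostPort(member)
	if err != nil {
		return member // no port to strip
	}
	return host
}

// sameHosts reports whether two rings list the same hosts in the same order,
// ignoring any port numbers.
func sameHosts(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if hostOnly(a[i]) != hostOnly(b[i]) {
			return false
		}
	}
	return true
}

func main() {
	ring := []string{"go-carbon-0.go-carbon.graphite:2004", "go-carbon-1.go-carbon.graphite:2004"}
	servers := []string{"go-carbon-0.go-carbon.graphite:4242", "go-carbon-1.go-carbon.graphite:4242"}

	fmt.Println("equal including ports:", ring[0] == servers[0])
	fmt.Println("equal by hostname only:", sameHosts(ring, servers))
}
```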

bucky modify panic : truncate : invalid argument

Hello everyone,

I use buckytools for managing a graphite infrastructure (carbon-c-relay -> go-carbon -> carbonapi).
I'm trying to use the modify command to resize whisper files, but the program panics.

General information:

  • Buckytools version : 0.4.2-gg

  • Go : go version go1.17.2 linux/amd64

Output of the command :

/go $ bucky modify -index 1 -retention 60s:10d -f /opt/graphite/storage/whisper/haggar/agent/0/metrics/1.wsp 
panic: truncate /opt/graphite/storage/whisper/haggar/agent/0/metrics/1.wsp: invalid argument

goroutine 1 [running]:
main.modifyCommand({{0x6cc5ca, 0x6}, 0x6e8878, {0x6cd25c, 0x9}, {0x6dafe1, 0x2c}, {0x6e1e1c, 0x36e}, 0xc000100720})
	/go/src/github.com/go-graphite/buckytools/cmd/bucky/modify.go:84 +0x44e
main.main()
	/go/src/github.com/go-graphite/buckytools/cmd/bucky/main.go:101 +0x198
