go-graphite / buckytools
Go implementation of useful tools for dealing with Graphite's Whisper DBs and Carbon hashing
License: Other
Currently, every metric sync postpones cleanup of the data: when a metric is backfilled, its mtime is updated to a more recent time.
Metric mtime should be either preserved or synced across the cluster so that cleanups happen properly.
go-carbon with carbonserver enabled does a directory walk every scan-frequency (https://github.com/lomik/go-carbon/blame/master/README.md#L305) in order to build an in-memory index.
This index is available in go-carbon via the /metrics/list handler https://github.com/lomik/go-carbon/blob/master/carbonserver/list.go#L40
It would be nice to teach buckyd to use this functionality instead of forcing it to traverse the directory tree.
This could be implemented as a parameter at buckyd invocation taking the host:port of go-carbon, leaving the -f invocation on the bucky client intact.
Currently bucky supports compressed whisper, but by default it creates an uncompressed file when backfilling from an uncompressed whisper file. This can unexpectedly inflate disk usage when backfilling from other (uncompressed) clusters.
To bring bucky's behavior in line with go-carbon, I suggest introducing a boolean '-compressed' flag for buckyd, defaulting to 'false'.
This flag should tell buckyd how to handle any situation in which a new whisper file needs to be created.
It should not affect how it handles existing files.
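The proposed rule can be sketched as a small decision function: the flag only matters when a file has to be created, and an existing file keeps whatever format it already has. `useCompressed` and its parameters are hypothetical names for illustration; only the `-compressed` flag name and its `false` default come from the proposal.

```go
package main

import (
	"flag"
	"fmt"
)

// useCompressed (hypothetical) returns whether a write should use the
// compressed format. Existing files keep their current format; the flag
// only decides the format of newly created files.
func useCompressed(fileExists, existingIsCompressed, compressedFlag bool) bool {
	if fileExists {
		return existingIsCompressed
	}
	return compressedFlag
}

func main() {
	compressed := flag.Bool("compressed", false, "create new whisper files in compressed format")
	flag.Parse()

	fmt.Println("new file:                   ", useCompressed(false, false, *compressed))
	fmt.Println("existing uncompressed file: ", useCompressed(true, false, *compressed))
	fmt.Println("existing compressed file:   ", useCompressed(true, true, *compressed))
}
```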
Hello everyone,
Our team has been working on running the go-graphite stack on Kubernetes. We are running the following services:
We are looking to rebalance our go-graphite cluster. We found buckytools but haven't gotten far, since the tool always reports the cluster as unhealthy:
/ $ /usr/sbin/bucky servers
Buckd daemons are using port: 4242
Hashing algorithm: [carbon: 5 nodes, 100 replicas, 500 ring members go-carbon-0.go-carbon.goal:2004=None go-carbon-1.go-carbon.goal:2004=None go-carbon-2.go-carbon.goal:2004=None go-carbon-3.go-carbon.goal:2004=None go-carbon-4.go-carbon.goal:2004=None]
Number of replicas: 100
Found these servers:
go-carbon-0.go-carbon.goal
go-carbon-1.go-carbon.goal
go-carbon-2.go-carbon.goal
go-carbon-3.go-carbon.goal
go-carbon-4.go-carbon.goal
Is cluster healthy: false
2021/10/28 21:29:14 Cluster is inconsistent.
Could you advise on how to bring the cluster to a healthy state?
/ $ bucky -version
bucky <sub-command> [options]
Copyright 2015 - 2017 42 Lines, Inc
Original Author: Jack Neely <[email protected]>
Version: 0.4.2
Bucky is a CLI designed to work with large consistent hashing
Graphite clusters that have the buckyd daemon installed. Sub-
commands will allow you to perform high level operations such
as backups and restores of specific metrics, backfilling, and
even rebalancing.
Use the "help" sub-command for available commands.
The cluster nodes are routinely rebalanced or re-populated, e.g. after boxes are added, removed, replaced, or they die. During these operations, bucky moves the metrics around. Currently, all the traffic goes through a single master node when doing that. This makes such actions slow.
Bucky could be optimized to send data between the nodes when moving metrics around instead of bridging through a single master node.
E.g. let's say we have a cluster with nodes A, B, and C. We add node D. A serves as master. Now we would do data movement like this:
A -> D
B -> A -> D
C -> A -> D
and we could make it look like this:
A -> D
B -> D
C -> D
This will reduce network load and disk load on A, making the operation faster.
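To make the difference concrete, the two transfer plans from the example can be counted hop by hop. This is only an illustration of the topology described above, with one transfer per source node; the `hop` type and the function names are made up for this sketch and do not reflect bucky's internals.

```go
package main

import "fmt"

// hop is a single node-to-node transfer (illustrative type).
type hop struct{ From, To string }

// bridged routes every transfer through the master node.
func bridged(sources []string, master, dest string) []hop {
	var hops []hop
	for _, s := range sources {
		if s != master {
			hops = append(hops, hop{s, master})
		}
		hops = append(hops, hop{master, dest})
	}
	return hops
}

// direct sends data straight from each source to the destination.
func direct(sources []string, dest string) []hop {
	var hops []hop
	for _, s := range sources {
		hops = append(hops, hop{s, dest})
	}
	return hops
}

// load counts the hops that read from or write to node.
func load(hops []hop, node string) int {
	n := 0
	for _, h := range hops {
		if h.From == node || h.To == node {
			n++
		}
	}
	return n
}

func main() {
	sources := []string{"A", "B", "C"}
	b := bridged(sources, "A", "D")
	d := direct(sources, "D")
	fmt.Printf("bridged: %d hops, %d touch A\n", len(b), load(b, "A"))
	fmt.Printf("direct:  %d hops, %d touch A\n", len(d), load(d, "A"))
}
```

For the three-node example this gives 5 hops (all touching A) in the bridged plan versus 3 hops (1 touching A) in the direct plan.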
When one of the nodes is down, the bucky servers command crashes:
[grph-st3 ~]# bucky servers
2019/12/27 12:27:35 Error retrieving URL: Get http://grph-st2.infra:4242/hashring: dial tcp 10.10.10.2:4242: connect: connection refused
2019/12/27 12:27:35 Cluster unhealthy: grph-st2.infra:4242: Get http://grph-st2.infra:4242/hashring: dial tcp 10.10.10.2:4242: connect: connection refused
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x6a0554]
goroutine 1 [running]:
main.isHealthy(0xc000112080, 0xc0001de100, 0x3, 0x4, 0x3)
/root/go/src/github.com/go-graphite/buckytools/cmd/bucky/cluster.go:116 +0x184
main.GetClusterConfig(0x75e8dd, 0xe, 0x6, 0x75c300, 0x7)
/root/go/src/github.com/go-graphite/buckytools/cmd/bucky/cluster.go:96 +0x541
main.serversCommand(0x75c836, 0x7, 0x777488, 0x75cfa6, 0x9, 0x769090, 0x29, 0x76fc8c, 0x196, 0xc0000608a0, ...)
/root/go/src/github.com/go-graphite/buckytools/cmd/bucky/servers.go:26 +0x4b
main.main()
/root/go/src/github.com/go-graphite/buckytools/cmd/bucky/main.go:101 +0x1de
[grph-st3 ~]#
If all nodes are alive, everything works:
[grph-st1 ~]$ bucky servers
Buckd daemons are using port: 4242
Hashing algorithm: [fnv1a: 4 nodes, 100 replicas, 400 ring members grph-st1.infra:2003=a grph-st2.infra:2003=b grph-st3.infra:2003=c grph-st4.infra:2003=d]
Number of replicas: 100
Found these servers:
grph-st1.infra
grph-st2.infra
grph-st3.infra
grph-st4.infra
Is cluster healthy: true
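The trace above points at a nil pointer dereference inside isHealthy right after a failed hashring fetch, which suggests that the entry for the unreachable node is nil when the health check reads it. The sketch below shows the kind of guard that would turn the panic into a plain "unhealthy" result. The `hashring` type and this `isHealthy` signature are hypothetical and do not mirror the actual code in cluster.go.

```go
package main

import "fmt"

// hashring is a hypothetical stand-in for the reply fetched from one buckyd.
type hashring struct {
	Nodes []string
}

// sameNodes reports whether two node lists are identical, in order.
func sameNodes(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// isHealthy returns true only if every server answered and all answers agree.
// A nil entry stands for a server that could not be reached.
func isHealthy(rings []*hashring) bool {
	if len(rings) == 0 {
		return false
	}
	var ref *hashring
	for _, r := range rings {
		if r == nil {
			return false // unreachable server: unhealthy, but no panic
		}
		if ref == nil {
			ref = r
			continue
		}
		if !sameNodes(ref.Nodes, r.Nodes) {
			return false
		}
	}
	return true
}

func main() {
	ring := &hashring{Nodes: []string{"grph-st1.infra", "grph-st2.infra"}}
	fmt.Println("all alive:", isHealthy([]*hashring{ring, ring}))
	fmt.Println("one down: ", isHealthy([]*hashring{ring, nil}))
}
```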
Hello,
I raised an issue on the wrong repo and would like to bring it to the right one. Below is my original post on jjneely/buckytools, added so that everyone has context on my initial issue jjneely/buckytools#38
I've found 2 issues which I would love to discuss:
buckyd will accept the members of the hashring via non-option CLI arguments as buckyd <graphite1:port> <graphite2:port> ....
bucky calls for the cluster configuration and it will get graphite1:hashringport instead of graphite1:4242.
Because of this mismatch, bucky won't be able to reach the buckyd members:
/usr/sbin/bucky servers -h go-carbon-0.go-carbon.graphite:4242
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-0.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:45902->172.16.76.119:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-0.go-carbon.graphite:2004: Get "http://go-carbon-0.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:45902->172.16.76.119:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-1.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:47772->172.16.27.82:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-1.go-carbon.graphite:2004: Get "http://go-carbon-1.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:47772->172.16.27.82:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-2.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:40946->172.16.58.32:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-2.go-carbon.graphite:2004: Get "http://go-carbon-2.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:40946->172.16.58.32:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-3.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:43570->172.16.127.44:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-3.go-carbon.graphite:2004: Get "http://go-carbon-3.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:43570->172.16.127.44:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-4.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:51674->172.16.27.91:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-4.go-carbon.graphite:2004: Get "http://go-carbon-4.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:51674->172.16.27.91:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-5.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:56712->172.16.14.79:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-5.go-carbon.graphite:2004: Get "http://go-carbon-5.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:56712->172.16.14.79:2004: read: connection reset by peer
Buckd daemons are using port: 4242
Hashing algorithm: [carbon: 6 nodes, 100 replicas, 600 ring members go-carbon-0.go-carbon.graphite:2004=None go-carbon-1.go-carbon.graphite:2004=None go-carbon-2.go-carbon.graphite:2004=None go-carbon-3.go-carbon.graphite:2004=None go-carbon-4.go-carbon.graphite:2004=None go-carbon-5.go-carbon.graphite:2004=None]
Number of replicas: 100
Found these servers:
go-carbon-0.go-carbon.graphite:2004
go-carbon-1.go-carbon.graphite:2004
go-carbon-2.go-carbon.graphite:2004
go-carbon-3.go-carbon.graphite:2004
go-carbon-4.go-carbon.graphite:2004
go-carbon-5.go-carbon.graphite:2004
Is cluster healthy: false
2021/11/10 01:01:57 Cluster is inconsistent.
I've tracked the issue to line https://github.com/go-graphite/buckytools/blob/master/cmd/bucky/cluster.go#L88. The port value for the cluster member is set to the same port as the hashring one instead of 4242 (or whichever port is specified by the user).
To test this theory, I forked and patched the code to set it to the default 4242, and the cluster is now reported as healthy with the correct hashring values, as shown here:
/ # /usr/sbin/bucky servers -h go-carbon-5.go-carbon.graphite:4242
Buckd daemons are using port: 4242
Hashing algorithm: [carbon: 6 nodes, 100 replicas, 600 ring members go-carbon-0.go-carbon.graphite:2004=None go-carbon-1.go-carbon.graphite:2004=None go-carbon-2.go-carbon.graphite:2004=None go-carbon-3.go-carbon.graphite:2004=None go-carbon-4.go-carbon.graphite:2004=None go-carbon-5.go-carbon.graphite:2004=None]
Number of replicas: 100
Found these servers:
go-carbon-0.go-carbon.graphite:4242
go-carbon-1.go-carbon.graphite:4242
go-carbon-2.go-carbon.graphite:4242
go-carbon-3.go-carbon.graphite:4242
go-carbon-4.go-carbon.graphite:4242
go-carbon-5.go-carbon.graphite:4242
Is cluster healthy: true
Is this a real issue or just a misconfiguration on my side?
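The patch described above amounts to swapping the hashring port for the buckyd port when building the address bucky connects to. A minimal sketch with the standard `net` package follows; `buckydAddr` is a hypothetical helper, not the function at the linked line.

```go
package main

import (
	"fmt"
	"net"
)

// buckydAddr (hypothetical) takes a hashring member such as "host:2004"
// and returns the address of the buckyd daemon on the same host.
func buckydAddr(member, buckydPort string) string {
	host, _, err := net.SplitHostPort(member)
	if err != nil {
		// No port in the member string: treat the whole value as the host.
		host = member
	}
	return net.JoinHostPort(host, buckydPort)
}

func main() {
	fmt.Println(buckydAddr("go-carbon-0.go-carbon.graphite:2004", "4242"))
	fmt.Println(buckydAddr("grph-st1.infra", "4242"))
}
```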
bucky is reporting metrics as inconsistent on our cluster, and the number is nearly the same as the number of active metrics, which is very odd. Taking a closer look, this line https://github.com/go-graphite/buckytools/blob/master/cmd/bucky/inconsistent.go#L69 does check the port values, and these don't match because one is 2004 and the other is 4242. The original code does not take the ports into account, just the hostnames:
https://github.com/jjneely/buckytools/blob/master/cmd/bucky/inconsistent.go#L64
Is my assumption correct that these rings won't match because of this?
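The difference between the two comparisons can be illustrated as follows: a full string match treats host:2004 and host:4242 as different servers, while a hostname-only match does not. `hostOnly` and `sameServer` are names made up for this sketch and are not taken from either repository.

```go
package main

import (
	"fmt"
	"net"
)

// hostOnly strips the port from addr, if one is present.
func hostOnly(addr string) string {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return addr
	}
	return host
}

// sameServer compares two addresses by hostname only, ignoring ports.
func sameServer(a, b string) bool {
	return hostOnly(a) == hostOnly(b)
}

func main() {
	ringMember := "go-carbon-0.go-carbon.graphite:2004"
	buckyd := "go-carbon-0.go-carbon.graphite:4242"
	fmt.Println("string match:   ", ringMember == buckyd)
	fmt.Println("hostname match: ", sameServer(ringMember, buckyd))
}
```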
Hello everyone,
I use buckytools for managing a graphite infrastructure (carbon-c-relay -> go-carbon -> carbonapi).
I'm trying to use the modify command to resize whisper files, but the program fails.
General information:
Buckytools version: 0.4.2-gg
Go: go version go1.17.2 linux/amd64
Output of the command:
/go $ bucky modify -index 1 -retention 60s:10d -f /opt/graphite/storage/whisper/haggar/agent/0/metrics/1.wsp
panic: truncate /opt/graphite/storage/whisper/haggar/agent/0/metrics/1.wsp: invalid argument
goroutine 1 [running]:
main.modifyCommand({{0x6cc5ca, 0x6}, 0x6e8878, {0x6cd25c, 0x9}, {0x6dafe1, 0x2c}, {0x6e1e1c, 0x36e}, 0xc000100720})
/go/src/github.com/go-graphite/buckytools/cmd/bucky/modify.go:84 +0x44e
main.main()
/go/src/github.com/go-graphite/buckytools/cmd/bucky/main.go:101 +0x198
So happy to see someone has picked up the project. Wondering if supporting replication is on the roadmap.