We should support the StatsD protocol and aggregation. However, unlike StatsD, the met

Add support for StatsD style aggregator,about influxdata/telegraf

Comments (41)

skyrocknroll commented on May 8, 2024

@sparrc Wherever I have used them, counters are always associated with time, like requests per second or some actions per second. So it would be better if we cleared counter values after each flush. For gauges, maintaining values across flushes does make sense.

So the default behavior would be:

counter -> reset to 0 after each flush.
gauge -> maintain the value between flushes.

But providing everything as configurable is awesome :)
For more details:
https://github.com/etsy/statsd/blob/master/exampleConfig.js
https://github.com/etsy/statsd/blob/master/docs/metric_types.md
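
A minimal sketch of the flush behavior proposed in this comment, assuming a toy in-memory aggregator. The `Aggregator` class and its `reset_counters` switch are hypothetical names for illustration, not Telegraf's or StatsD's actual code:

```python
from collections import defaultdict


class Aggregator:
    """Toy StatsD-style aggregator (illustrative only)."""

    def __init__(self, reset_counters=True):
        # Hypothetical config switch: clear counters after each flush or not.
        self.reset_counters = reset_counters
        self.counters = defaultdict(float)
        self.gauges = {}

    def record(self, name, value, metric_type):
        if metric_type == "c":
            self.counters[name] += value  # counters accumulate
        elif metric_type == "g":
            self.gauges[name] = value     # gauges keep the last value seen
        else:
            raise ValueError(f"unsupported metric type: {metric_type!r}")

    def flush(self):
        """Return a snapshot of current values, then apply the reset policy."""
        snapshot = {"counters": dict(self.counters), "gauges": dict(self.gauges)}
        if self.reset_counters:
            for name in self.counters:
                self.counters[name] = 0.0
        return snapshot
```

With `reset_counters=True` a counter reports only what arrived since the previous flush, while a gauge keeps reporting its last value until it is overwritten.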

from telegraf.

nstott commented on May 8, 2024

Looking at the statsd spec from here:

https://github.com/b/statsd_spec

@pauldix are you thinking of a line format something like this?

cpu_load_short,host=server01,region=us-west:2.34|g
cpu_load_short,host=server01,region=us-west:3.42|g
errors,host=server01,region=us-west:1|c

where the server adds the timestamp either when it receives the message, or perhaps in the case of counters, adding the timestamp when it flushes to a sink might be more appropriate
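
As a rough illustration of the line format sketched above, the following parser splits one line into name, tags, value, and type. `parse_line` is a hypothetical helper written for this example, and it assumes tag values contain no `:`, `,`, or `|` characters:

```python
def parse_line(line):
    """Parse 'name[,tag=value...]:value|type' into a dict of its parts."""
    head, _, metric_type = line.rpartition("|")
    identifier, _, raw_value = head.rpartition(":")
    if not identifier or not metric_type:
        raise ValueError(f"malformed line: {line!r}")
    # The identifier carries the metric name followed by optional tags.
    name, *tag_parts = identifier.split(",")
    tags = dict(part.split("=", 1) for part in tag_parts)
    return {
        "name": name,
        "tags": tags,
        "value": float(raw_value),
        "type": metric_type,
    }
```

The line carries no timestamp, so the server would attach one on receipt or at flush time, as discussed.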


liyichao commented on May 8, 2024

It may be good if telegraf can add the hostname as a tag instead of the application sending it, because the application may run in a container.


pauldix commented on May 8, 2024

@nstott yeah, that's exactly what I was thinking. Telegraf should specify timestamps when it flushes. In general when writing to InfluxDB it's best to specify timestamps. That way if there is a partial write in a cluster, you can just write again and it's idempotent.

@liyichao the issue is that you'd have one telegraf server collecting all the metrics for all of your hosts (like what you do with StatsD). Essentially one of your telegraf installs would become your statsd server.


nstott commented on May 8, 2024

I'll see if I can knock something out in the next few days for this.


alvaromorales commented on May 8, 2024


skyrocknroll commented on May 8, 2024

This is an awesome feature to have 👍


zp-markusp commented on May 8, 2024


rvrignaud commented on May 8, 2024


caquino commented on May 8, 2024

+1, having a replacement for StatsD/datadog-agent-statsd will make the migration from other services way easier.


ranjib commented on May 8, 2024

@pauldix is anyone actively working on it? If not, I can take a stab at it. This will be a really useful feature. I'm currently running an additional statsd agent (statsdaemon) alongside telegraf for this.
@sparrc comments?


sparrc commented on May 8, 2024

@ranjib I am hoping to work on this today


pauldix commented on May 8, 2024

With the 0.9.5 release coming we'll have support for many fields and we'll stop pushing people to only have a single field per measurement. We should support writing data to multiple fields. I'm thinking that we can support the StatsD protocol like I mentioned above, but we should also make it possible to write values into different fields. I'm thinking it should look exactly like the line protocol.


skyrocknroll commented on May 8, 2024

+1 @pauldix #39 (comment)


skyrocknroll commented on May 8, 2024

Is somebody working on this? Is there any ETA or target release?


ranjib commented on May 8, 2024

@skyrocknroll #237


sparrc commented on May 8, 2024

It's something I'm working on right now. At the moment I have counters, gauges, and sets working. I still have a ways to go with timers, as they're a bit more complicated.

I'm hoping to have timers working by the end of the week, life permitting ;-)


skyrocknroll commented on May 8, 2024

Thank you @ranjib

@sparrc
Thank you for your kind update. Right now, just to maintain the count, we are inserting a lot of records. If InfluxDB statsd is there, our number of records will drop to 1/1000th :) and performance will improve a lot.

Eagerly waiting for the release :)


sparrc commented on May 8, 2024

@skyrocknroll Since InfluxDB is a bit more powerful than Graphite, the default behavior is going to be a little different than a typical statsd server.

To give you a little preview, counters would look something like this:

Metrics sent:

$ echo "deploys.test.myservice:1|c" | nc -C -w 1 -u localhost 8125
[10s later...]
$ echo "deploys.test.myservice:1|c" | nc -C -w 1 -u localhost 8125

Telegraf debug output:

> [] statsd_deploys_test_myservice_counter value=1
2015/10/05 11:49:25 Cranking default (10s) interval, gathered 1 metrics from 1 plugins in 142.169µs
> [] statsd_deploys_test_myservice_counter value=2
2015/10/05 11:49:35 Cranking default (10s) interval, gathered 1 metrics from 1 plugins in 99.549µs
> [] statsd_deploys_test_myservice_counter value=2
2015/10/05 11:49:45 Cranking default (10s) interval, gathered 1 metrics from 1 plugins in 59.998µs

As you can see, counters will be maintained and reported at each collection interval, and they will not be cleared by default.

Since I've never used statsd in production, I'd love to hear what you (and anyone else in this thread) think of that behavior.

Thanks a bunch!


sparrc commented on May 8, 2024

My problem with resetting the counter is this: InfluxDB provides you with the ability to calculate rates of change on counters that are always increasing (like this: SELECT non_negative_derivative(value, 1s) FROM statsd_deploys_test_myservice_counter)

If the counter reset, this obviously wouldn't work, and calculating rates of change on the counter requires knowledge of the flushing interval. This also means that the flushing interval can never be changed once the data starts being collected. With an ever-increasing counter, you are able to change the collection interval completely arbitrarily, because you simply have timestamps associated with different points in the counters' upward trajectory.

To me this makes more sense because it is also generally how OS-level counters work, ie: network bytes & packets received and sent, CPU ticks, etc.

Let me know what you think. The general idea here is that working with InfluxDB is less limited than working with Graphite, since its query language is more fully featured. Statsd was a protocol built with Graphite in mind, and I'd like our implementation to support InfluxDB better.
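
To illustrate why an ever-increasing counter tolerates changes in collection interval, here is a small sketch that derives per-second rates from timestamped cumulative samples. `non_negative_rate` is a hypothetical helper in the spirit of `non_negative_derivative(value, 1s)`; it is not InfluxDB's implementation:

```python
def non_negative_rate(points):
    """Per-second rates between consecutive (timestamp_seconds, value) samples.

    `points` must be sorted by timestamp. Pairs with a negative delta
    (for example after a counter reset) are skipped.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        delta = v1 - v0
        if t1 > t0 and delta >= 0:
            rates.append((t1, delta / (t1 - t0)))
    return rates
```

Because each sample carries its own timestamp, samples taken 10 seconds apart and 30 seconds apart yield the same rate when the underlying activity is steady; no knowledge of a fixed flush interval is needed.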


skyrocknroll commented on May 8, 2024

@sparrc I agree with you. One more question: how are we planning to write data using this?
By pointing an InfluxDB client at the telegraf statsd listener, or should we use a separate influx-statsd client which supports tags & fields along with the measurement?


sparrc commented on May 8, 2024

It will be a "plugin" on one of your telegraf instances. That telegraf instance will open up a port and listen for UDP packets, where you can send your normal statsd-style packets. On the regular telegraf interval, the statsd server will be flushed and all data will be sent to InfluxDB.
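
The listening side described here can be sketched with a plain UDP socket. This is only an illustration under assumed names (`open_listener`, `receive_statsd_lines`); it omits the periodic flush and says nothing about how the Telegraf plugin is actually written:

```python
import socket


def open_listener(host="127.0.0.1", port=8125):
    """Bind a UDP socket on the given address (8125 is the port used above)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_statsd_lines(sock, packet_count):
    """Read `packet_count` datagrams and return the metric lines they carry."""
    lines = []
    for _ in range(packet_count):
        payload, _addr = sock.recvfrom(65535)
        # One datagram may hold several newline-separated metrics.
        for line in payload.decode("utf-8").splitlines():
            if line.strip():
                lines.append(line.strip())
    return lines
```

A real server would run the receive loop continuously and hand each line to an aggregator that is flushed on the collection interval.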


skyrocknroll commented on May 8, 2024

@sparrc Will the line format support InfluxDB tags & fields? Right now we are not using any of the statsd InfluxDB writers because they don't understand InfluxDB tags & fields.


sparrc commented on May 8, 2024

yes, it will support a way to create a mapping of a statsd "bucket" to an influxdb measurement with tags: https://github.com/influxdb/telegraf/blob/statsd/plugins/statsd/README.md


zp-markusp commented on May 8, 2024

Why don't you take advantage of InfluxDB and use the line protocol syntax? That way you can define tags on the fly and don't rely on any hardcoded dot-separated order.

Regards, Markus


skyrocknroll commented on May 8, 2024

@sparrc As @zp-markusp said, we were looking for exactly the same feature. We see InfluxDB tags & fields as an unbeatable feature. If we use the same line protocol, we get all the dynamism of tags and fields, and also counters & gauges at the telegraf level.

Or maybe we need both: plain statsd for the statsd protocol, and statsd features with the line protocol.

Plain statsd strips away all the awesomeness of tags & fields.

Datadog has both plain statsd and datadog-statsd, which supports tags.


justin8 commented on May 8, 2024

It would be very useful to support both. Being able to use it as a drop in replacement for things like datadog would be really useful, with the added benefit that you can alter your apps to utilize tags afterwards. It would make the barrier for entry incredibly low.


sparrc commented on May 8, 2024

Thanks everyone for the input, especially for the datadog-statsd link, that is very useful and it seems like they have created a good system for adding tags to statsd lines.

As I see it, there are two options we can support: datadog-statsd is closer to plain statsd and simply adds a list of tags after a |# character. influx-statsd would be similar to what @nstott wrote above. It is less similar to plain statsd but more similar to the InfluxDB line protocol.

I'm leaning towards only supporting datadog-statsd because then users can more easily migrate between influxdb and datadog, and it also allows people to use existing datadog statsd clients. If we create our own statsd protocol, we're contributing to this problem

@justin8 @skyrocknroll @zp-markusp @pauldix @nathanielc What would you prefer between these two tag formatting options? should we support both?

datadog-statsd

cpu.load.short:2.34|g|#host:server01,region:us-west

influx-statsd

cpu.load.short,host=server01,region=us-west:2.34|g
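
For comparison, a sketch of parsing the datadog-statsd variant, where tags follow a `|#` marker as `key:value` pairs. `parse_datadog` is a hypothetical helper for this example and handles only the pieces shown above:

```python
def parse_datadog(line):
    """Parse 'name:value|type|#k:v,k:v' into (name, value, type, tags)."""
    parts = line.split("|")
    if len(parts) < 2:
        raise ValueError(f"malformed line: {line!r}")
    name, _, raw_value = parts[0].rpartition(":")
    metric_type = parts[1]
    tags = {}
    for section in parts[2:]:
        # Only the section introduced by '#' holds tags.
        if section.startswith("#"):
            for pair in section[1:].split(","):
                key, _, value = pair.partition(":")
                tags[key] = value
    return name, float(raw_value), metric_type, tags
```

Both formats carry the same information; the difference is whether the tags ride inside the bucket name (influx-statsd) or in a trailing section (datadog-statsd).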


skyrocknroll commented on May 8, 2024

@sparrc I would like to go with influx-statsd because it will give us consistency across the whole InfluxDB ecosystem. It looks very similar to the InfluxDB line protocol. Also, @pauldix #39 (comment) was mentioning support for multiple values. If we are going to design influx-statsd, let's provide a way to support multiple field values as well.

Right now I don't see strong importance in supporting multiple field values, but others may weigh in on this.
I am thinking of something like this if we support multiple field values:

temperature,machine=unit42,type=assembly internal=32|g,external=100|c


zp-markusp commented on May 8, 2024

From a gut feeling perspective I would prefer influx-statsd, as this could be implemented without changing the statsd library on the application side: it follows the pattern string{identifier}{value}{statsd type}, so just the identifier has to be exchanged.


skyrocknroll commented on May 8, 2024

@zp-markusp +1 One way is to parse the identifier on the telegraf side: if it has tags, use it as measurement & tags; otherwise, use the whole identifier as the measurement in InfluxDB.


zp-markusp commented on May 8, 2024

For example the standard statsd output from logstash could be used.


nathanielc commented on May 8, 2024

I say influx-statsd since it's a subset of the statsd protocol, like @zp-markusp said. It won't require a new client.

I think you should also do something similar to the graphite plugin in InfluxDB that allows you to transform a metric name into a measurement, fields, and tags set. See https://github.com/influxdb/influxdb/tree/master/services/graphite#templates

This will allow for users that already have lots of tag data in the metric name,
e.g. us-west.server01.cpu.short.load:2.34|g
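
A simplified sketch of the template idea: each dot-separated part of the bucket name is mapped to a tag name or to the measurement. `apply_template` and the template string used below are assumptions for illustration; the actual graphite templates in InfluxDB support more features (filters, default tags) than shown here:

```python
def apply_template(template, bucket):
    """Map dot-separated bucket parts onto template parts.

    A template part is either a tag name, 'measurement' (one part),
    or 'measurement*' (all remaining parts).
    """
    keys = template.split(".")
    parts = bucket.split(".")
    tags = {}
    measurement = []
    for index, key in enumerate(keys):
        if index >= len(parts):
            break
        if key == "measurement*":
            measurement.extend(parts[index:])
            break
        if key == "measurement":
            measurement.append(parts[index])
        else:
            tags[key] = parts[index]
    return ".".join(measurement), tags
```

Applied to the bucket above with the template `region.host.measurement*`, the first two parts become the `region` and `host` tags and the rest becomes the measurement name.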


sparrc commented on May 8, 2024

okay, good point @skyrocknroll about supporting multiple fields, how about this:

measurement[,tag1=key1,tag2=key2]:[field=]value[,field2=value2]|type

so an example would look like:

cpu.usage,host=server01,region=us-west:idle=10.0,user=50.0,system=40.0|g
=> statsd_cpu_usage_gauge,host='server01',region='us-west' idle=10,user=50,system=40

field names and tags are optional, so you could also just do this:

cpu.usage.idle:10.0|g
=> statsd_cpu_usage_idle_gauge value=10
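
The proposed format and its mapping can be sketched as follows. `to_point` and `parse_fields` are hypothetical helpers; the `statsd_` prefix and `_gauge`/`_counter` suffixes mirror the examples in this thread at this point in the discussion, and other metric types are deliberately left out:

```python
# Suffixes taken from the examples in this thread; other types are omitted.
TYPE_SUFFIX = {"c": "counter", "g": "gauge"}


def parse_fields(raw):
    """'idle=10.0,user=50.0' -> named fields; a bare '10.0' -> {'value': 10.0}."""
    fields = {}
    for item in raw.split(","):
        key, sep, val = item.partition("=")
        if sep:
            fields[key] = float(val)
        else:
            fields["value"] = float(key)
    return fields


def to_point(line):
    """Parse 'measurement[,tag=v...]:[field=]value[,field2=value2]|type'."""
    head, _, metric_type = line.rpartition("|")
    # Split on the first ':' because tags use '=' and never contain ':' here.
    identifier, _, raw_fields = head.partition(":")
    name, *tag_parts = identifier.split(",")
    tags = dict(part.split("=", 1) for part in tag_parts)
    measurement = "statsd_{}_{}".format(name.replace(".", "_"), TYPE_SUFFIX[metric_type])
    return measurement, tags, parse_fields(raw_fields)
```

A line without tags or field names falls back to a single field called `value`, which keeps plain statsd clients working unchanged.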

@nathanielc thanks for pointing me to that, I did not realize that we already had a graphite template transformation setup, I was going to have telegraf have a configuration table for transforming the statsd bucket into tags like this: https://github.com/influxdb/telegraf/blob/statsd/plugins/statsd/README.md#statsd-bucket---influxdb-mapping, but I may want to borrow from the influxdb graphite template instead.


zp-markusp commented on May 8, 2024

@sparrc does it make sense to hard code the gauge as suffix to the name?
I would propose to either ignore it or add it as a tag (statsd-type=gauge)


nathanielc commented on May 8, 2024

I don't see a strong need to support multiple fields either. StatsD is an event counter, seems odd to want to send multiple fields for a single event. But as long as it is backwards compatible with the StatsD protocol (like your example) I don't see an issue supporting it.


sparrc commented on May 8, 2024

@zp-markusp I like that idea more too, I'll change the behavior to add a metric_type tag 👍


justin8 commented on May 8, 2024

Bit late to reply to this one now; but the way it seems to be heading sounds great! Backwards compatible with extra features/tags 👍


sparrc commented on May 8, 2024

This is now in master and can be gotten by building from source, see README here for documentation and usage details: https://github.com/influxdb/telegraf/tree/master/plugins/statsd

more feedback is much appreciated, thanks all


penguincp commented on May 8, 2024

According to #1876 (commented by sparrc on Oct 11, 2016), multiple field support (e.g. cpu.usage,host=server01,region=us-west:idle=10.0,user=50.0,system=40.0|g) was removed and will not be supported in the future, why?


danielnelson commented on May 8, 2024

@penguincp The statsd protocol is incompatible with multiple fields. We do support multiple tags, and you can use a stat for each field. If you would like to discuss this further, please open a new issue or ask a question at the InfluxData Community site.

