

proposal: runtime/metrics: define a recommended set of metrics about go HOT 14 OPEN

mknyszek commented on May 28, 2024

proposal: runtime/metrics: define a recommended set of metrics

from go.

Comments (14)

dashpole commented on May 28, 2024

Out of curiosity, would OpenTelemetry use the Tags []string field at all, or is there a desire to tightly control the metrics exported by default?

We would likely use it as part of the tests for the package as a way to verify that we are exposing all of the recommended metrics to users. We would probably not use it for programmatic generation of new metrics.
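A test along those lines might look like the sketch below. The proposed Tags field does not exist in runtime/metrics today, so this is only an approximation: the `recommended` list is copied by hand from the comparison table later in this thread, and the helper name `missing` is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// recommended is hand-copied from the table in this thread (an assumption,
// not an official list). Nothing here is discovered programmatically.
var recommended = []string{
	"/gc/gogc:percent",
	"/gc/gomemlimit:bytes",
	"/gc/heap/allocs:bytes",
	"/gc/heap/allocs:objects",
	"/gc/heap/goal:bytes",
	"/memory/classes/heap/released:bytes",
	"/memory/classes/heap/stacks:bytes",
	"/memory/classes/total:bytes",
	"/sched/gomaxprocs:threads",
	"/sched/goroutines:goroutines",
	"/sched/latencies:seconds",
}

// missing returns the names in want that the running Go runtime
// does not list in metrics.All().
func missing(want []string) []string {
	supported := make(map[string]bool)
	for _, d := range metrics.All() {
		supported[d.Name] = true
	}
	var out []string
	for _, name := range want {
		if !supported[name] {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	fmt.Println("not supported by this runtime:", missing(recommended))
}
```

An SDK test could fail when `missing` returns anything, or compare the same list against the metrics the SDK actually exports.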


ArthurSens commented on May 28, 2024

Hello hello, I'm Arthur from prometheus/client_golang team 👋

We have a different set of default metrics, and I believe we can't just change the default exposed metrics without a major version bump. One approach would be a configuration option in client_golang, "ExposeRecommendedMetrics", but I predict we'll get questions like "why aren't the recommended metrics the default?".

With that said, I like the idea of the Go team providing instructions about which metrics are worth paying the price to collect and store. I also like the idea of those instructions being programmatically available. We just need to evaluate whether a major version bump in client_golang is needed and whether it's worth the effort.


dashpole commented on May 28, 2024

I'm asking about general stability (e.g. across Go versions, or across multiple instances of an application).


mknyszek commented on May 28, 2024

@dashpole No, they're not guaranteed to remain stable and have changed across Go versions. We've definitely removed buckets before.


mknyszek commented on May 28, 2024

@dashpole Out of curiosity, would OpenTelemetry use the Tags []string field at all, or is there a desire to tightly control the metrics exported by default? Same question for @bwplotka for Prometheus. The purpose of this field is to stay in line with the spirit of this package, which is to make as much information programmatically available as possible.


mknyszek commented on May 28, 2024

CC @rhysh @bboreham @felixge @prattmic


MikeMitchellWebDev commented on May 28, 2024

As @rsc said in 2021 (#43555), expvar is a bit left behind at this point. JSON is a very popular format for developers. Can any decision about runtime/metrics take expvar and runtime.MemStats into consideration?


mknyszek commented on May 28, 2024

@MikeMitchellWebDev It's true that expvar is missing runtime/metrics data, but it's unclear which metrics should be added and how. Please reply on #43555; this proposal is not the right place to discuss changing expvar. See also #61638 which isn't directly related, but maybe should be considered as well in any rethink of expvar.

Lastly, note that runtime.MemStats is generally a subset of runtime/metrics, and the latter should be preferred in general (for a number of reasons, including additional metrics as well as better performance). The only reason it is not officially deprecated is because it provides stronger guarantees than runtime/metrics. I admit I haven't been very good about updating the MemStats documentation to make this clear. I'll try to find some time this week to fix that.

EDIT: To be clear, I don't mean deprecating runtime.MemStats. That requires a proper proposal. Just documentation comparing and contrasting runtime.MemStats with the runtime/metrics package.
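As a rough illustration of the overlap, the sketch below reads the same quantity both ways. The pairing of MemStats.Sys with /memory/classes/total:bytes follows the comparison table later in this thread; the helper name `totalMemory` is made up, and the two numbers are sampled at different moments, so they are not expected to match exactly.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/metrics"
)

// totalMemory reads /memory/classes/total:bytes via runtime/metrics.
// The second result is false if this runtime does not support the metric.
func totalMemory() (uint64, bool) {
	s := []metrics.Sample{{Name: "/memory/classes/total:bytes"}}
	metrics.Read(s)
	if s[0].Value.Kind() != metrics.KindUint64 {
		return 0, false
	}
	return s[0].Value.Uint64(), true
}

func main() {
	// The older API fills in the whole MemStats struct in one call.
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	// The newer API reads only the samples that were asked for.
	total, ok := totalMemory()

	fmt.Println("MemStats.Sys:", ms.Sys)
	fmt.Println("/memory/classes/total:bytes:", total, "supported:", ok)
}
```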


bboreham commented on May 28, 2024

@ArthurSens for some historical perspective see prometheus/client_golang#955 and prometheus/client_golang#1033

My suggestion was that client_golang would offer three main options for what is exported: "exactly what it did in the previous version", "the Go recommended metrics" and "everything that Go exposes". I imagine we can cope with the questions.


ArthurSens commented on May 28, 2024

@ArthurSens for some historical perspective see prometheus/client_golang#955 and prometheus/client_golang#1033

My suggestion was that client_golang would offer three main options for what is exported: "exactly what it did in the previous version", "the Go recommended metrics" and "everything that Go exposes". I imagine we can cope with the questions.

Thanks for the extra context! Yeah, I agree we can offer the recommended metrics in some way :)


dashpole commented on May 28, 2024

@mknyszek for "Recommended" histogram metrics (currently just /sched/latencies:seconds), are the bucket boundaries on histograms guaranteed to remain stable (i.e. no buckets removed)?


arl commented on May 28, 2024

@mknyszek for "Recommended" histogram metrics (currently just /sched/latencies:seconds), are the bucket boundaries on histograms guaranteed to remain stable (i.e. no buckets removed)?

The proposal states that the 'Recommended' set follows the guarantees of the "runtime/metrics" package:

	// For a given metric name, the value of Buckets is guaranteed not to change
	// between calls until program exit.

from: https://cs.opensource.google/go/go/+/refs/tags/go1.22.3:src/runtime/metrics/histogram.go;l=26-27
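That guarantee covers a single program run only; as noted elsewhere in this thread, bucket boundaries have changed between Go versions. A minimal sketch of checking the in-process guarantee follows, with the helper names `readLatencies` and `sameBuckets` made up for illustration:

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// readLatencies samples /sched/latencies:seconds and returns the histogram,
// or nil if this runtime does not support the metric.
func readLatencies() *metrics.Float64Histogram {
	s := []metrics.Sample{{Name: "/sched/latencies:seconds"}}
	metrics.Read(s)
	if s[0].Value.Kind() != metrics.KindFloat64Histogram {
		return nil
	}
	return s[0].Value.Float64Histogram()
}

// sameBuckets reports whether two bucket boundary slices are identical.
func sameBuckets(a, b []float64) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	first, second := readLatencies(), readLatencies()
	if first == nil || second == nil {
		fmt.Println("/sched/latencies:seconds is not supported by this runtime")
		return
	}
	fmt.Println("bucket boundaries:", len(first.Buckets))
	fmt.Println("unchanged between reads:", sameBuckets(first.Buckets, second.Buckets))
}
```

An exporter that wants boundaries stable across Go versions would have to re-bucket the counts itself; nothing in this sketch does that.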


bwplotka commented on May 28, 2024

Thanks for this, great work!

What's the end goal?

This is especially true for projects like open-telemetry/semantic-conventions#535 and Prometheus which want to export some broadly-applicable Go runtime metrics by default, but the full set is overwhelming and not particularly user-friendly.

I wonder what is the exact intention and the end-goal behind this proposal. Is it to:

A. Convince the common instrumentation SDKs to give the Go team control over the default published metrics for the Go runtime? So the largest possible number of Go applications have those common metrics OOTB, and adopt potential metrics changes as soon as they are rebuilt with a new Go version?

or...

B. To support the users who want to stay with the Go runtime "default" metrics, which might change from one Go version to the next, and who are fine with that.
C. Suggest what SDKs should add manually to the default set of metrics.

A healthy, limited "recommended/default" set picked by the Go team definitely helps with all of those. I love the recommendation mechanism too; it looks easy to use. As co-maintainer of the Prometheus client_golang I fully support @ArthurSens's words about adding a programmatic option, e.g. WithGoCollectorRecommendedMetrics(), that uses this. However, that only gets you to B.

I wonder if A is realistic. If A is not possible at the moment, because e.g. OpenTelemetry and/or Prometheus client_golang (potentially popular metric SDKs) want to keep their influence over what's default (the current status quo), then is this proposal still viable?

I think that to motivate SDKs to pursue A with the Go team, we need to learn more about the pros & cons here. What will users get out of it, compared with an SDK manually adding some Go runtime metrics to its default set based on user feedback and recent changes to the recommended set? One con would be potentially different stability guarantees across the Go team vs. OTel vs. Prometheus.

To sum up, is it A? Can we unpack pros & cons here for SDKs to assess those?

Recommended Metrics

TL;DR: Those make sense. /sched/latencies:seconds feels the most controversial for Prometheus (usefulness vs. cardinality), but only until we can put it in the new type (native histogram); then it should be fine.

Just to evaluate your proposed metrics and contribute to pros & cons of using Go recommended metrics as default, I diffed what client_golang has now vs recommended.

NOTE: All _memstats_ metrics actually come from the new Go runtime metrics; we just kept the names for stability (renaming a metric is hard on users).

Default runtime metrics from client_golang	Recommended Go runtime
go_gc_duration_seconds
go_goroutines	/sched/goroutines:goroutines
go_info
go_memstats_last_gc_time_seconds
go_threads
go_memstats_alloc_bytes
go_memstats_alloc_bytes_total	/gc/heap/allocs:bytes
go_memstats_sys_bytes	/memory/classes/total:bytes
go_memstats_lookups_total
go_memstats_mallocs_total	kind of /gc/heap/allocs:objects but with tiny allocs
go_memstats_frees_total
go_memstats_heap_alloc_bytes
go_memstats_heap_sys_bytes
go_memstats_heap_idle_bytes
go_memstats_heap_inuse_bytes
go_memstats_heap_released_bytes	/memory/classes/heap/released:bytes
go_memstats_heap_objects
go_memstats_stack_inuse_bytes	/memory/classes/heap/stacks:bytes
go_memstats_stack_sys_bytes
go_memstats_mspan_inuse_bytes
go_memstats_mspan_sys_bytes
go_memstats_mcache_inuse_bytes
go_memstats_mcache_sys_bytes
go_memstats_buck_hash_sys_bytes
go_memstats_gc_sys_bytes
go_memstats_other_sys_bytes
go_memstats_next_gc_bytes	/gc/heap/goal:bytes
	/gc/gogc:percent
	/gc/gomemlimit:bytes
	/sched/gomaxprocs:threads
	/sched/latencies:seconds

To sum up, I think Prometheus is really close to the recommended ones, plus I would propose adding /gc/gogc:percent, /gc/gomemlimit:bytes and /sched/gomaxprocs:threads to the Prometheus Go collector runtime defaults, as those are important runtime variables to consider.

With that, only /sched/latencies:seconds is left, so using the current default PLUS the (deduplicated) recommended set as our default might be an option to consider, depending on: the pros & cons discussion I proposed above, stability guarantees, and other team members' sentiments.
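For reference, the three runtime settings mentioned above can be read with a few lines of runtime/metrics code. This is a sketch only: the helper name `readUint64` is made up, and it silently skips any name the running Go version does not support (these three metrics are absent from older releases).

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// readUint64 samples the named metrics and returns those that the runtime
// supports and reports as uint64 values. Unsupported names are omitted.
func readUint64(names ...string) map[string]uint64 {
	samples := make([]metrics.Sample, len(names))
	for i, name := range names {
		samples[i].Name = name
	}
	metrics.Read(samples)

	out := make(map[string]uint64)
	for _, s := range samples {
		if s.Value.Kind() == metrics.KindUint64 {
			out[s.Name] = s.Value.Uint64()
		}
	}
	return out
}

func main() {
	settings := readUint64(
		"/gc/gogc:percent",
		"/gc/gomemlimit:bytes",
		"/sched/gomaxprocs:threads",
	)
	for name, value := range settings {
		fmt.Println(name, "=", value)
	}
}
```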


mknyszek commented on May 28, 2024

I wonder what is the exact intention and the end-goal behind this proposal. Is it to:

A. Convince the common instrumentation SDKs to give the Go team control over the default published metrics for the Go runtime? So the largest possible number of Go applications have those common metrics OOTB, and adopt potential metrics changes as soon as they are rebuilt with a new Go version?

or...

B. To support the users who want to stay with the Go runtime "default" metrics, which might change from one Go version to the next, and who are fine with that.
C. Suggest what SDKs should add manually to the default set of metrics.

It's really C, in practice. B is nice for those that want it, but I don't think A is practical. Everyone is always going to be free to choose what metrics they collect and/or expose at any layer.

Really I think we're just trying to set a better foundation here than the existing, somewhat haphazard, "collect MemStats and a few other metrics from some of the other runtime and runtime/debug functions" that is fairly widespread at this point. It's a starting point for what I hope will be a slow-but-virtuous cycle of the ecosystem informing the recommended set, and the recommended set informing the ecosystem, so we get a high signal-to-noise ratio for observability.

[...] Those [recommended metrics] make sense. /sched/latencies:seconds feels the most controversial for Prometheus, (usefulness vs cardinality) but only until we can put it in the new type (native histogram), then it should be fine.

FWIW, my thought was that SDKs can just choose to skip inherently high cardinality types programmatically, like Float64Histogram.
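A minimal sketch of that kind of filter, using only the existing metrics.All() descriptions (the helper name `scalarMetrics` is made up for illustration):

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// scalarMetrics returns the names of all supported metrics except those
// whose values are Float64Histogram, which tend to be high cardinality.
func scalarMetrics() []string {
	var names []string
	for _, d := range metrics.All() {
		if d.Kind == metrics.KindFloat64Histogram {
			continue
		}
		names = append(names, d.Name)
	}
	return names
}

func main() {
	for _, name := range scalarMetrics() {
		fmt.Println(name)
	}
}
```

Filtering on Kind rather than on a hard-coded name list means any histogram metric added in a later Go release is skipped as well.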

To sum up, I think Prometheus is really close to the recommended ones, plus I would propose adding /gc/gogc:percent, /gc/gomemlimit:bytes and /sched/gomaxprocs:threads to the Prometheus Go collector runtime defaults, as those are important runtime variables to consider.

That's a good sign IMO. I'm supportive of adding those. While they're likely to stay exactly the same over time, the fact is that they can be mutated at runtime. As above, re: /sched/latencies:seconds, I think it's fine if SDKs want to leave out certain metrics because they pose problems for collection.

