tokio-rs / tokio-metrics
Utilities for collecting metrics from a Tokio application
License: MIT License
Hello,
This is a feature request for some way to get the TaskMetrics for the invocation of a single future. Something like:
let monitor = tokio_metrics::TaskMonitor::new();
let (metrics, other_return_value) = monitor.instrument_single(some_future()).await;
The API above is not intended to be the actual API; it just illustrates the idea. I want this feature so that I can record the overhead of every single execution of the some_future() future.
The ultimate reason is that I'm trying to write a program that measures the latency of remote service calls, and I want to understand what kind of overhead I'm seeing as a result of using an async runtime, as opposed to a simple blocking, threaded application. I'd like to see this on a per-request basis so that I can confirm that high-latency requests are caused only by the remote system, not by a delay in scheduling the task.
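A minimal, dependency-free sketch of the idea (InstrumentSingle, SingleMetrics, and the tiny block_on executor are all hypothetical names for illustration, not the tokio-metrics API): wrap the future, record the delay between instrumentation and first poll plus the time spent inside poll, and yield the metrics alongside the output.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::time::{Duration, Instant};

/// Per-invocation metrics for one future (hypothetical shape).
#[derive(Debug)]
struct SingleMetrics {
    first_poll_delay: Duration,    // instrumentation -> first poll
    total_poll_duration: Duration, // time actually spent inside poll()
    poll_count: u64,
}

struct InstrumentSingle<F> {
    inner: Pin<Box<F>>,
    created_at: Instant,
    first_polled_at: Option<Instant>,
    total_poll_duration: Duration,
    poll_count: u64,
}

impl<F: Future> InstrumentSingle<F> {
    fn new(fut: F) -> Self {
        InstrumentSingle {
            inner: Box::pin(fut),
            created_at: Instant::now(),
            first_polled_at: None,
            total_poll_duration: Duration::ZERO,
            poll_count: 0,
        }
    }
}

impl<F: Future> Future for InstrumentSingle<F> {
    type Output = (SingleMetrics, F::Output);

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Every field is Unpin (the inner future is boxed), so get_mut is fine.
        let this = self.get_mut();
        let start = Instant::now();
        this.first_polled_at.get_or_insert(start);
        this.poll_count += 1;
        let result = this.inner.as_mut().poll(cx);
        this.total_poll_duration += start.elapsed();
        match result {
            Poll::Pending => Poll::Pending,
            Poll::Ready(out) => Poll::Ready((
                SingleMetrics {
                    first_poll_delay: this.first_polled_at.unwrap() - this.created_at,
                    total_poll_duration: this.total_poll_duration,
                    poll_count: this.poll_count,
                },
                out,
            )),
        }
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-loop executor, just enough to drive the example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```

With the existing API, creating a fresh TaskMonitor per call and reading cumulative() after the await can approximate the same thing, at the cost of one monitor per invocation.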
I think num_scheduled is going to equal num_polls - num_tasks? I need to double-check this, but if so, it doesn't need to be a field in the Metrics struct; it could be computed in a method instead.
Should it even be exposed? @carllerche points out that this metric matters much more at the runtime level, since there are multiple ways tasks may be scheduled. For task metrics, what matters more is time spent scheduled. At least internally, we need to account for num_scheduled so we can compute mean_time_scheduled, but maybe num_scheduled doesn't actually need to be exposed.
Is storing durations as u64 nanoseconds enough? A single counter lasts 584 years, but with 5000 concurrently busy tasks summed into one counter, you'll burn through it in about 42 days of uptime. That sounds plausible for a long-lived server. At minimum, we should make sure it doesn't panic on overflow/underflow.
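A back-of-the-envelope check of those numbers, plus the saturating arithmetic that would avoid an overflow panic (a sketch, not the crate's internals):

```rust
/// Rough capacity of a u64 nanosecond counter: (years for one counter,
/// days if 5000 concurrently busy tasks feed one shared counter).
fn u64_ns_budget() -> (u64, u64) {
    let max_secs = u64::MAX / 1_000_000_000;   // ~1.8e10 seconds total
    let years = max_secs / (365 * 24 * 3600);  // single counter: ~584 years
    // 5000 task-seconds accrue per wall-clock second:
    let days_at_5000_tasks = max_secs / 5000 / (24 * 3600);
    (years, days_at_5000_tasks)
}

/// Accumulate without ever panicking: clamp at u64::MAX instead of overflowing.
fn add_duration_ns(total: u64, delta: u64) -> u64 {
    total.saturating_add(delta)
}
```

checked_add would be the alternative if the caller should be told the counter saturated rather than silently clamping.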
Hi there!
I'd like to start exposing Tokio runtime metrics as part of my application's Prometheus metrics. Unfortunately, there are a number of conceptual differences that make tokio-metrics not really suitable for this.
Prometheus usually scrapes an application's metrics by calling an HTTP endpoint at equal time intervals. In my experience, scrape intervals range between 15 seconds and 5 minutes; the interval is a trade-off between resolution requirements and available storage. In any case, metric changes between two scrapes are not observable via Prometheus, so the usual best practice is to implement most metrics as non-decreasing counters and derive rates from them.
Also, since each metric scrape is a network interaction, it can fail and be retried, with no guarantee of whether the request actually reached the process. Because of that, it's important for a metrics endpoint to be stateless, which is violated by how the intervals iterator is implemented. Ideally, there would be no state change at all when retrieving the current state of metrics.
Do you think tokio-metrics is a good place to implement this kind of thing, or do you believe it targets a different type of metrics?
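A sketch of the counter-style alternative described above (the type is hypothetical, not part of tokio-metrics): cumulative, non-decreasing values that a scrape reads without mutating any state, so a retried scrape is harmless.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical cumulative metrics store, Prometheus-counter style.
struct CumulativeMetrics {
    total_polls: AtomicU64,
    total_scheduled_ns: AtomicU64,
}

impl CumulativeMetrics {
    fn new() -> Self {
        CumulativeMetrics {
            total_polls: AtomicU64::new(0),
            total_scheduled_ns: AtomicU64::new(0),
        }
    }

    /// The runtime side only ever increments; counters never decrease.
    fn record_poll(&self, scheduled_ns: u64) {
        self.total_polls.fetch_add(1, Ordering::Relaxed);
        self.total_scheduled_ns.fetch_add(scheduled_ns, Ordering::Relaxed);
    }

    /// A scrape is a pure read: no interval bookkeeping, safe to retry.
    fn scrape(&self) -> (u64, u64) {
        (
            self.total_polls.load(Ordering::Relaxed),
            self.total_scheduled_ns.load(Ordering::Relaxed),
        )
    }
}
```

Prometheus then computes per-interval deltas server-side with rate(), which is exactly what the intervals iterator computes client-side today, but without the client holding scrape state.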
For each task metric, it's fairly easy to write a crisp, self-contained example that reliably induces a change in a metric. For runtime metrics, it's currently not so easy to do this, because:
We could resolve the first obstacle by providing some mechanism to flush metrics on demand. For the second obstacle, I'm not sure there's much we can do.
Hey there,
Is there some kind of (maybe optionally feature-gated) integrated UI planned for this?
I'm not really good at web stuff, but I guess I'll integrate a small Chart.js-driven one without data retention into my project for now. Should I share it once it's done?
Implement Instrumented<T> for T: Stream.
~/github.com/tokio-metrics:main@e66d2ff$ RUSTFLAGS="--cfg tokio_unstable" cargo test --all-features
Finished test [unoptimized + debuginfo] target(s) in 0.15s
Running unittests (target/debug/deps/tokio_metrics-ec134d5a58bb3238)
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Doc-tests tokio-metrics
running 40 tests
test src/task.rs - task::TaskMetrics::first_poll_count (line 607) ... ok
test src/task.rs - task::TaskMetrics::instrumented_count (line 538) ... ok
test src/task.rs - task::TaskMetrics::mean_poll_duration (line 2001) ... ok
test src/task.rs - task::TaskMetrics::dropped_count (line 568) ... ok
test src/task.rs - task::TaskMetrics::total_fast_poll_count (line 1089) ... ok
test src/task.rs - task::TaskMetrics::mean_slow_poll_duration (line 2222) ... ok
test src/task.rs - task::TaskMetrics::mean_fast_poll_duration (line 2131) ... ok
test src/task.rs - task::TaskMetrics::slow_poll_ratio (line 2046) ... ok
test src/task.rs - task::TaskMetrics::mean_idle_duration (line 1881) ... ok
test src/task.rs - task::TaskMetrics::total_fast_poll_duration (line 1144) ... ok
test src/task.rs - task::TaskMetrics::total_first_poll_delay (line 648) ... ok
test src/task.rs - task::TaskMetrics::total_first_poll_delay (line 697) ... ok
test src/task.rs - task::TaskMetrics::total_first_poll_delay (line 731) ... FAILED
test src/task.rs - task::TaskMetrics::total_idle_duration (line 811) ... ok
test src/task.rs - task::TaskMetrics::total_idled_count (line 770) ... ok
test src/task.rs - task::TaskMonitor (line 306) ... ignored
test src/task.rs - task::TaskMonitor (line 321) ... ignored
test src/task.rs - task::TaskMetrics::total_poll_count (line 989) ... ok
test src/task.rs - task::TaskMetrics::total_poll_duration (line 1054) ... ok
test src/task.rs - task::TaskMetrics::total_scheduled_count (line 850) ... ok
test src/task.rs - task::TaskMetrics::mean_first_poll_delay (line 1811) ... ok
test src/task.rs - task::TaskMetrics::total_slow_poll_count (line 1211) ... ok
test src/task.rs - task::TaskMetrics::total_slow_poll_duration (line 1269) ... ok
test src/task.rs - task::TaskMonitor (line 71) - compile ... ok
test src/task.rs - task::TaskMonitor (line 362) ... FAILED
test src/task.rs - task::TaskMonitor (line 388) ... FAILED
test src/task.rs - task::TaskMonitor (line 413) ... ok
test src/lib.rs - (line 12) ... ok
test src/task.rs - task::TaskMonitor::cumulative (line 1571) ... ok
test src/task.rs - task::TaskMonitor (line 452) ... ok
test src/task.rs - task::TaskMonitor::instrument (line 1488) ... ok
test src/task.rs - task::TaskMonitor::instrument (line 1510) ... ok
test src/task.rs - task::TaskMonitor::instrument (line 1530) ... ok
test src/task.rs - task::TaskMonitor (line 281) ... ok
test src/task.rs - task::TaskMetrics::total_scheduled_duration (line 920) ... ok
test src/task.rs - task::TaskMonitor::intervals (line 1632) ... ok
test src/task.rs - task::TaskMonitor::slow_poll_threshold (line 1467) ... ok
test src/task.rs - task::TaskMonitor::with_slow_poll_threshold (line 1406) ... ok
test src/task.rs - task::TaskMetrics::mean_scheduled_duration (line 1920) ... ok
test src/task.rs - task::TaskMonitor (line 24) ... ok
failures:
---- src/task.rs - task::TaskMetrics::total_first_poll_delay (line 731) stdout ----
Test executable failed (exit code 101).
stderr:
thread 'main' panicked at 'overflow when adding duration to instant', library/std/src/time.rs:409:33
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- src/task.rs - task::TaskMonitor (line 362) stdout ----
Test executable failed (exit code 101).
stderr:
thread 'main' panicked at 'overflow when adding duration to instant', library/std/src/time.rs:409:33
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- src/task.rs - task::TaskMonitor (line 388) stdout ----
Test executable failed (exit code 101).
stderr:
thread 'main' panicked at 'overflow when adding duration to instant', library/std/src/time.rs:409:33
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
src/task.rs - task::TaskMetrics::total_first_poll_delay (line 731)
src/task.rs - task::TaskMonitor (line 362)
src/task.rs - task::TaskMonitor (line 388)
test result: FAILED. 35 passed; 3 failed; 2 ignored; 0 measured; 0 filtered out; finished in 7.52s
error: test failed, to rerun pass '--doc'
This is a macOS environment.
The README should stick to the default, IMO.
Due to tokio-rs/tokio#5223, some metrics tests were broken. These need to be fixed.
failures:
---- src/task.rs - task::TaskMetrics::mean_scheduled_duration (line 1924) stdout ----
Test executable failed (exit status: 101).
stderr:
thread 'main' panicked at 'assertion failed: interval.mean_scheduled_duration() >= Duration::from_secs(1)', src/task.rs:34:5
stack backtrace:
0: rust_begin_unwind
at /rustc/fc594f15669680fa70d255faec3ca3fb507c3405/library/std/src/panicking.rs:575:5
1: core::panicking::panic_fmt
at /rustc/fc594f15669680fa70d255faec3ca3fb507c3405/library/core/src/panicking.rs:64:14
2: core::panicking::panic
at /rustc/fc594f15669680fa70d255faec3ca3fb507c3405/library/core/src/panicking.rs:111:5
3: rust_out::main::{{closure}}
4: <core::pin::Pin<P> as core::future::future::Future>::poll
5: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}::{{closure}}
6: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}
7: tokio::runtime::scheduler::current_thread::Context::enter
8: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}
9: tokio::runtime::scheduler::current_thread::CoreGuard::enter::{{closure}}
10: tokio::macros::scoped_tls::ScopedKey<T>::set
11: tokio::runtime::scheduler::current_thread::CoreGuard::enter
12: tokio::runtime::scheduler::current_thread::CoreGuard::block_on
13: tokio::runtime::scheduler::current_thread::CurrentThread::block_on
14: tokio::runtime::runtime::Runtime::block_on
15: rust_out::main
16: core::ops::function::FnOnce::call_once
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
---- src/task.rs - task::TaskMetrics::total_scheduled_duration (line 922) stdout ----
Test executable failed (exit status: 101).
stderr:
thread 'main' panicked at 'assertion failed: total_scheduled_duration >= Duration::from_millis(1000)', src/task.rs:30:5
stack backtrace:
0: rust_begin_unwind
at /rustc/fc594f15669680fa70d255faec3ca3fb507c3405/library/std/src/panicking.rs:575:5
1: core::panicking::panic_fmt
at /rustc/fc594f15669680fa70d255faec3ca3fb507c3405/library/core/src/panicking.rs:64:14
2: core::panicking::panic
at /rustc/fc594f15669680fa70d255faec3ca3fb507c3405/library/core/src/panicking.rs:111:5
3: rust_out::main::{{closure}}
4: <core::pin::Pin<P> as core::future::future::Future>::poll
5: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}::{{closure}}
6: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}
7: tokio::runtime::scheduler::current_thread::Context::enter
8: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}
9: tokio::runtime::scheduler::current_thread::CoreGuard::enter::{{closure}}
10: tokio::macros::scoped_tls::ScopedKey<T>::set
11: tokio::runtime::scheduler::current_thread::CoreGuard::enter
12: tokio::runtime::scheduler::current_thread::CoreGuard::block_on
13: tokio::runtime::scheduler::current_thread::CurrentThread::block_on
14: tokio::runtime::runtime::Runtime::block_on
15: rust_out::main
16: core::ops::function::FnOnce::call_once
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
failures:
src/task.rs - task::TaskMetrics::mean_scheduled_duration (line 1924)
src/task.rs - task::TaskMetrics::total_scheduled_duration (line 922)
test result: FAILED. 56 passed; 2 failed; 2 ignored; 0 measured; 0 filtered out; finished in 35.93s
It would be helpful to add a Debug impl for all public types, like TaskMonitor.
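For most types a derive suffices; for types holding internals that aren't (or shouldn't be) Debug, a manual impl can redact them. Monitor and Opaque below are stand-in names for illustration, not the crate's actual types.

```rust
use std::fmt;

// Derive works whenever every field implements Debug.
#[derive(Debug)]
struct Monitor {
    slow_poll_threshold_ns: u64,
}

// For internals that can't or shouldn't be printed, implement Debug by hand.
struct Opaque {
    _state: Vec<u8>,
}

impl fmt::Debug for Opaque {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Prints "Opaque { .. }" without exposing the internals.
        f.debug_struct("Opaque").finish_non_exhaustive()
    }
}
```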
I wanted to print metrics for a Tokio example built against master HEAD, but I get the error below:
error[E0308]: mismatched types
--> examples/tinyhttp.rs:40:51
|
40 | let runtime_monitor = RuntimeMonitor::new(&handle);
| ------------------- ^^^^^^^ expected struct `tokio::runtime::handle::Handle`, found struct `Handle`
| |
| arguments to this function are incorrect
|
= note: expected reference `&tokio::runtime::handle::Handle`
found reference `&Handle`
= note: perhaps two different versions of crate `tokio` are being used?
note: associated function defined here
--> /root/github/tokio-metrics/src/runtime.rs:1015:12
|
1015 | pub fn new(runtime: &runtime::Handle) -> RuntimeMonitor {
| ^^^
For more information about this error, try `rustc --explain E0308`.
error: could not compile `examples` due to previous error
Full change in Tokio:
diff --git a/.cargo/config b/.cargo/config
index df885898..71097e3c 100644
--- a/.cargo/config
+++ b/.cargo/config
@@ -1,2 +1,5 @@
+[build]
+rustflags = ["--cfg", "tokio_unstable"]
+rustdocflags = ["--cfg", "tokio_unstable"]
# [build]
-# rustflags = ["--cfg", "tokio_unstable"]
\ No newline at end of file
+# rustflags = ["--cfg", "tokio_unstable"]
diff --git a/examples/Cargo.toml b/examples/Cargo.toml
index b35c587b..e628ceb2 100644
--- a/examples/Cargo.toml
+++ b/examples/Cargo.toml
@@ -10,7 +10,7 @@ edition = "2018"
tokio = { version = "1.0.0", path = "../tokio", features = ["full", "tracing"] }
tokio-util = { version = "0.7.0", path = "../tokio-util", features = ["full"] }
tokio-stream = { version = "0.1", path = "../tokio-stream" }
-
+tokio-metrics = { version = "0.1.0", path = "../../tokio-metrics" }
tracing = "0.1"
tracing-subscriber = { version = "0.3.1", default-features = false, features = ["fmt", "ansi", "env-filter", "tracing-log"] }
bytes = "1.0.0"
@@ -24,6 +24,9 @@ httpdate = "1.0"
once_cell = "1.5.2"
rand = "0.8.3"
+
+
+
[target.'cfg(windows)'.dev-dependencies.windows-sys]
version = "0.42.0"
diff --git a/examples/tinyhttp.rs b/examples/tinyhttp.rs
index fa0bc669..0457406a 100644
--- a/examples/tinyhttp.rs
+++ b/examples/tinyhttp.rs
@@ -18,8 +18,10 @@ use futures::SinkExt;
use http::{header::HeaderValue, Request, Response, StatusCode};
#[macro_use]
extern crate serde_derive;
+use std::time::Duration;
use std::{env, error::Error, fmt, io};
use tokio::net::{TcpListener, TcpStream};
+use tokio_metrics::RuntimeMonitor;
use tokio_stream::StreamExt;
use tokio_util::codec::{Decoder, Encoder, Framed};
@@ -33,6 +35,18 @@ async fn main() -> Result<(), Box<dyn Error>> {
let server = TcpListener::bind(&addr).await?;
println!("Listening on: {}", addr);
+ let handle = tokio::runtime::Handle::current();
+ {
+ let runtime_monitor = RuntimeMonitor::new(&handle);
+ tokio::spawn(async move {
+ for interval in runtime_monitor.intervals() {
+ // pretty-print the metric interval
+ println!("{:?}", interval);
+ // wait 1s
+ tokio::time::sleep(Duration::from_secs(1)).await;
+ }
+ });
+ }
loop {
let (stream, _) = server.accept().await?;
tokio::spawn(async move {
Command:
RUSTFLAGS="--cfg tokio_unstable" cargo run --example tinyhttp
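The compiler's note about "two different versions of crate tokio" is the key: the path-based tokio in the workspace and the crates.io tokio pulled in transitively by tokio-metrics are distinct crates, so their Handle types don't match. One way to reconcile them (a sketch; the path is illustrative and depends on your checkout layout) is a [patch] section in the workspace root Cargo.toml:

```toml
# Force every dependency on crates.io tokio to resolve to the local checkout,
# so tokio-metrics and the examples share a single tokio crate.
[patch.crates-io]
tokio = { path = "tokio" }
```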