Comments (6)
Hey @shurikp,
Thank you very much for such a detailed bug report! Sorry for the slow reply on my part. I got ill right after releasing 3.0.0, and then got back in to a very busy period in my day job.
I'm going to set up an app like the example you gave and see what we can do. My immediate thoughts (in order of preference) are:
- Checking
env
for anything extra thatRack::Builder
puts in there so we can add it into the path - Comparing
PATH_INFO
with the framework-specific routes (e.g.sinatra.route
) we extract. My thought is that we could compare the number of path segments in the two. IfPATH_INFO
has more, we could take the firstn
segments (wheren
is the difference in the number of segments) fromPATH_INFO
and prefixsinatra.route
with that. My only concern is making sure thatPATH_INFO
contains a canonicalised URL path (i.e. it strips out double//
with nothing between them) so that we're making a fair comparison with the route.
My schedule is pretty busy right now, but I will try and get something together when I have a moment.
from client_ruby.
Okay, I did a little digging, and it's not quite what I expected. It looks like Rack::Builder
makes modifications to SCRIPT_NAME
and PATH_INFO
when it routes a request into an app that you specified in a map
block.
I printed those two values in two places: once in the Prometheus collector middleware, and once in the app itself. In the collector middleware, they look like this:
SCRIPT_NAME before Rack::Builder:
PATH_INFO before Rack::Builder: /app1/vapes/7
whereas from inside the Sinatra app (i.e. after Rack::Builder
has made its modifications) they look like this:
SCRIPT_NAME after Rack::Builder: /app1
PATH_INFO after Rack::Builder: /vapes/7
The reason your monkeypatch works is that when the middleware sees the request environment, everything is in PATH_INFO
, and SCRIPT_INFO
is (unfortunately for our sinatra.route
logit) empty.
My first suggestion isn't useful, as our middleware is working before Rack::Builder
gets its hands on the request environment.
My second one would work, but it does feel somewhat brittle, so I'd like to take some time to see if I can think of a less janky solution to the problem.
EDIT: I just checked, any my worry about canonicalisation was well-placed:
# From another terminal
$ curl localhost:9292/app1////////vapes/7
# Printed by a modified version of Prometheus::Middleware::Collector
SCRIPT_NAME before Rack::Builder:
PATH_INFO before Rack::Builder: /app1////////vapes/7
# Printed from within the app endpoints
# Interestingly, Rack::Builder does canonicalise the path when it modifies SCRIPT_NAME and PATH_INFO
SCRIPT_NAME after Rack::Builder: /app1
PATH_INFO after Rack::Builder: /vapes/7
EDIT2: Rack::URLMap
is the class where the relevant stuff happens (Rack::Builder
uses it).
EDIT3: Canonicalising it might be reasonably simple, if all we're focused on is removing double //
:
[3] pry(main)> "/app1////////vapes//////////////7".gsub(%r{/+}, "/")
=> "/app1/vapes/7"
but in testing that, I discovered a new problem. I decided to test out what happened with trailing slashes (single and multiple), only to find that you need to add a /?
to the end of your Sinatra route to allow an optional trailing slash. When you do that, with our new code, the paths in your metrics end up looking like this:
http_server_request_duration_seconds_count{method="get",path="/vapes/:id/?"} 4.0
which seems kinda naff.
I'm starting to think this whole framework-specific path detection thing was a mistake. Even if we can sort out all of the canonicalisation issues (which I'm increasingly dubious of, the more I find new edge cases), more complex cases of sinatra.route
can end up not really being what you'd want in the path
label (I definitely wouldn't expect that /?
in there, and I bet there's a bunch of other syntax you can use that we haven't explored).
I'm inclined to back that part of #245 out of the gem, and just keep the improvements we made to handling IDs in consecutive path segments.
@dmagliola I'd love your thoughts.
from client_ruby.
So @dmagliola and I ended up spending a couple of hours talking through this and digging further into the source code of Rack::Builder
and Sinatra. Sadly, we came to the conclusion that the framework-specific route detection just isn't feasible to support, and we're going to back it out.
There were two factors that convinced us of that course of action, and I wanted to share them here so it's clear to the community how we reached the decision, and so that we have a clear explanation of why this route didn't work out if it comes up in the future.
First factor
The first factor came from digging further into why we were seeing different values when printing SCRIPT_NAME
and PATH_INFO
from the collector middleware and from inside the Sinatra apps mounted within Rack::Builder
(as per your example).
In order to dispatch to the different apps, Rack::Builder
constructs a Rack::URLMap
(that's the class that does the heavy lifting at runtime). When dispatching to the different applications you've mounted, Rack::URLMap
puts the prefix mapped to the app into SCRIPT_NAME
and the rest of the URL into PATH_INFO
. If it didn't do this, Sinatra wouldn't be able to route the request, because it wouldn't know anything about the app prefix in the URL.
The unfortunate part, when it comes to our middleware, is that it undoes this before returning to the previous middleware.
Let's look at this in the context of the collector middleware. Our middleware runs before Rack::Builder
and the Sinatra apps in the chain. We call our own trace
method with a block that calls the next middleware, we yield
to that block and record statistics about it. Once that returns, we call our record
method, which (amongst other things) generates the path to use in metric labels.
Unfortunately, Rack::Builder
leaves no trace of the path manipulation it performed. The request was passed on to Sinatra with the application prefix (e.g. /app1
) in SCRIPT_INFO
, but any trace of that is hidden from us.
From Sinatra's point of view, the route it handled didn't include the app prefix, so it's missing from sinatra.route
, and we have no way of knowing about the env
manipulation that Rack::Builder
did.
Second factor
I mentioned in a previous comment that route definitions could end up looking quite weird in path
labels:
http_server_request_duration_seconds_count{method="get",path="/vapes/:id/?"} 4.0
We ended up experimenting with more of Sinatra's route syntax, and it turns out that there's much worse in there. For example, if you use Sinatra's regex route matching:
get %r{/vapes/([0-9]+)/?} do
...
end
then you'll end up with path labels that look like:
http_server_request_duration_seconds_count{method="get",path="\\/vapes\\/([0-9]+)\\/?"} 1.0
which neither @dmagliola nor I felt was what users would want.
What we're doing next
Between those two factors, we decided that that the best course of action was to back out the changes from #245 that introduced the framework-specific route detection, and just keep the fix to strip_ids_from_path
.
We also realised that we missed a potential change that would make it easier for users to subclass Prometheus::Middleware::Collector
and provide their own custom logic for path generation by overriding generate_path
. Currently, we call strip_ids_from_path
outside of that method, when in reality it's part of path generation. We're going to move that method call inside generate_path
so that users of the collector have a single method they can override if they want to change how path
labels are generated.
Unfortunately, following the principles we set out for ourselves in the 3.0 release when it came to changing the labels we generate, this is a breaking change, so we'll be releasing a rather empty version 4.0 to communicate the potential breakage to users.
Thank you for bearing with us while we got this all figured out. Hopefully we've found the right path to take this time round.
from client_ruby.
Wow @Sinjo, thanks for such an incredibly detailed investigation and explanation! Unfortunately, I missed all your comments.. This means we'll stick with our monkeypatch until 4.0.0 and then remove it.
from client_ruby.
@shurikp No worries at all - we wanted to understand this properly so we could make a good decision.
Good news though: we released 4.0.0 yesterday! Hopefully that does the trick for you. Let us know if you run into any issues.
from client_ruby.
@Sinjo works great, thank you!
from client_ruby.
Related Issues (20)
- specifying ip-address and portnumber HOT 6
- CPU usage growing over time with DirectFileStore HOT 1
- better path handling HOT 14
- Update README.md with changes to pushgateway client
- Update CHANGELOG.md for 3.0.0 release HOT 1
- Exporter port option doesn't work HOT 2
- Counter metric is reset when it has no labels HOT 8
- ERROR: Permission to prometheus/client_ruby.git denied HOT 2
- The rack example does not work because of `Rack::Lint::LintError` HOT 1
- Aaup HOT 1
- How to metrify ActiveRecord sql queries and Redis jobs (sidekiq/resque)? HOT 1
- pushing metrics with instance grouping_key would result: instance is reserved HOT 6
- Fix `Rack` deprecation warning
- Improving DirectFileStore for performance and scalability HOT 18
- Implement native histograms support HOT 4
- cannot load such file -- prometheus/middleware/collector (LoadError) HOT 2
- Job name for pushgateway can't be a symbol HOT 1
- UTF-8: Implement support in Ruby client library HOT 5
- How can I change how paths are labelled? HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from client_ruby.