Coder Social home page Coder Social logo

Comments (6)

Sinjo avatar Sinjo commented on June 11, 2024 2

Hey @shurikp,

Thank you very much for such a detailed bug report! Sorry for the slow reply on my part. I got ill right after releasing 3.0.0, and then got back in to a very busy period in my day job.

I'm going to set up an app like the example you gave and see what we can do. My immediate thoughts (in order of preference) are:

  • Checking env for anything extra that Rack::Builder puts in there so we can add it into the path
  • Comparing PATH_INFO with the framework-specific routes (e.g. sinatra.route) we extract. My thought is that we could compare the number of path segments in the two. If PATH_INFO has more, we could take the first n segments (where n is the difference in the number of segments) from PATH_INFO and prefix sinatra.route with that. My only concern is making sure that PATH_INFO contains a canonicalised URL path (i.e. it strips out double // with nothing between them) so that we're making a fair comparison with the route.

My schedule is pretty busy right now, but I will try and get something together when I have a moment.

from client_ruby.

Sinjo avatar Sinjo commented on June 11, 2024 2

Okay, I did a little digging, and it's not quite what I expected. It looks like Rack::Builder makes modifications to SCRIPT_NAME and PATH_INFO when it routes a request into an app that you specified in a map block.

I printed those two values in two places: once in the Prometheus collector middleware, and once in the app itself. In the collector middleware, they look like this:

SCRIPT_NAME before Rack::Builder:
PATH_INFO before Rack::Builder: /app1/vapes/7

whereas from inside the Sinatra app (i.e. after Rack::Builder has made its modifications) they look like this:

SCRIPT_NAME after Rack::Builder: /app1
PATH_INFO after Rack::Builder: /vapes/7

The reason your monkeypatch works is that when the middleware sees the request environment, everything is in PATH_INFO, and SCRIPT_INFO is (unfortunately for our sinatra.route logit) empty.

My first suggestion isn't useful, as our middleware is working before Rack::Builder gets its hands on the request environment.

My second one would work, but it does feel somewhat brittle, so I'd like to take some time to see if I can think of a less janky solution to the problem.

EDIT: I just checked, any my worry about canonicalisation was well-placed:

# From another terminal
$ curl localhost:9292/app1////////vapes/7

# Printed by a modified version of Prometheus::Middleware::Collector
SCRIPT_NAME before Rack::Builder:
PATH_INFO before Rack::Builder: /app1////////vapes/7

# Printed from within the app endpoints
# Interestingly, Rack::Builder does canonicalise the path when it modifies SCRIPT_NAME and PATH_INFO
SCRIPT_NAME after Rack::Builder: /app1
PATH_INFO after Rack::Builder: /vapes/7

EDIT2: Rack::URLMap is the class where the relevant stuff happens (Rack::Builder uses it).

EDIT3: Canonicalising it might be reasonably simple, if all we're focused on is removing double //:

[3] pry(main)> "/app1////////vapes//////////////7".gsub(%r{/+}, "/")
=> "/app1/vapes/7"

but in testing that, I discovered a new problem. I decided to test out what happened with trailing slashes (single and multiple), only to find that you need to add a /? to the end of your Sinatra route to allow an optional trailing slash. When you do that, with our new code, the paths in your metrics end up looking like this:

http_server_request_duration_seconds_count{method="get",path="/vapes/:id/?"} 4.0

which seems kinda naff.

I'm starting to think this whole framework-specific path detection thing was a mistake. Even if we can sort out all of the canonicalisation issues (which I'm increasingly dubious of, the more I find new edge cases), more complex cases of sinatra.route can end up not really being what you'd want in the path label (I definitely wouldn't expect that /? in there, and I bet there's a bunch of other syntax you can use that we haven't explored).

I'm inclined to back that part of #245 out of the gem, and just keep the improvements we made to handling IDs in consecutive path segments.

@dmagliola I'd love your thoughts.

from client_ruby.

Sinjo avatar Sinjo commented on June 11, 2024 2

So @dmagliola and I ended up spending a couple of hours talking through this and digging further into the source code of Rack::Builder and Sinatra. Sadly, we came to the conclusion that the framework-specific route detection just isn't feasible to support, and we're going to back it out.

There were two factors that convinced us of that course of action, and I wanted to share them here so it's clear to the community how we reached the decision, and so that we have a clear explanation of why this route didn't work out if it comes up in the future.

First factor

The first factor came from digging further into why we were seeing different values when printing SCRIPT_NAME and PATH_INFO from the collector middleware and from inside the Sinatra apps mounted within Rack::Builder (as per your example).

In order to dispatch to the different apps, Rack::Builder constructs a Rack::URLMap (that's the class that does the heavy lifting at runtime). When dispatching to the different applications you've mounted, Rack::URLMap puts the prefix mapped to the app into SCRIPT_NAME and the rest of the URL into PATH_INFO. If it didn't do this, Sinatra wouldn't be able to route the request, because it wouldn't know anything about the app prefix in the URL.

The unfortunate part, when it comes to our middleware, is that it undoes this before returning to the previous middleware.

Let's look at this in the context of the collector middleware. Our middleware runs before Rack::Builder and the Sinatra apps in the chain. We call our own trace method with a block that calls the next middleware, we yield to that block and record statistics about it. Once that returns, we call our record method, which (amongst other things) generates the path to use in metric labels.

Unfortunately, Rack::Builder leaves no trace of the path manipulation it performed. The request was passed on to Sinatra with the application prefix (e.g. /app1) in SCRIPT_INFO, but any trace of that is hidden from us.

From Sinatra's point of view, the route it handled didn't include the app prefix, so it's missing from sinatra.route, and we have no way of knowing about the env manipulation that Rack::Builder did.

Second factor

I mentioned in a previous comment that route definitions could end up looking quite weird in path labels:

http_server_request_duration_seconds_count{method="get",path="/vapes/:id/?"} 4.0

We ended up experimenting with more of Sinatra's route syntax, and it turns out that there's much worse in there. For example, if you use Sinatra's regex route matching:

get %r{/vapes/([0-9]+)/?} do
  ...
end

then you'll end up with path labels that look like:

http_server_request_duration_seconds_count{method="get",path="\\/vapes\\/([0-9]+)\\/?"} 1.0

which neither @dmagliola nor I felt was what users would want.

What we're doing next

Between those two factors, we decided that that the best course of action was to back out the changes from #245 that introduced the framework-specific route detection, and just keep the fix to strip_ids_from_path.

We also realised that we missed a potential change that would make it easier for users to subclass Prometheus::Middleware::Collector and provide their own custom logic for path generation by overriding generate_path. Currently, we call strip_ids_from_path outside of that method, when in reality it's part of path generation. We're going to move that method call inside generate_path so that users of the collector have a single method they can override if they want to change how path labels are generated.

Unfortunately, following the principles we set out for ourselves in the 3.0 release when it came to changing the labels we generate, this is a breaking change, so we'll be releasing a rather empty version 4.0 to communicate the potential breakage to users.

Thank you for bearing with us while we got this all figured out. Hopefully we've found the right path to take this time round.

from client_ruby.

shurikp avatar shurikp commented on June 11, 2024

Wow @Sinjo, thanks for such an incredibly detailed investigation and explanation! Unfortunately, I missed all your comments.. This means we'll stick with our monkeypatch until 4.0.0 and then remove it.

from client_ruby.

Sinjo avatar Sinjo commented on June 11, 2024

@shurikp No worries at all - we wanted to understand this properly so we could make a good decision.

Good news though: we released 4.0.0 yesterday! Hopefully that does the trick for you. Let us know if you run into any issues.

from client_ruby.

shurikp avatar shurikp commented on June 11, 2024

@Sinjo works great, thank you!

from client_ruby.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.