ibraheemdev / matchit Goto Github PK
View Code? Open in Web Editor NEWA high performance, zero-copy URL router.
Home Page: https://docs.rs/matchit
License: MIT License
A high performance, zero-copy URL router.
Home Page: https://docs.rs/matchit
License: MIT License
no method named `check_priorities` found for struct `Router` in the current scope
method not found in `Router<String>`rustc[E0599](https://doc.rust-lang.org/error-index.html#E0599)
tree.rs(112, 1): Error originated from macro call here
Line 61 in fd2b5cd
Let's suppose you have common handler for REST type API on some resource,
like /companies/
.
So usual request pattern is :
Now all of the work can be done by the same handler , if it knows if :id
is present.
I'm suggesting introduction of optional last param, e.g.
router.insert("/companies/:id?", whatever);
router.insert("/companies/*catch_all?", whatever)
Optional params should be supported only if they are in the route end ( which is obvious ) .
This would be easier than writing double insert
in code that uses the library.
Hi,
I encountered this behaviour when working with axum. I'm not sure whether this is intended or not but for me it was unexpected.
This code returns an MissingTrailingSlash
error instead of a NotFound
error:
let mut with_underscore = Router::new();
with_underscore.insert("/frontend_api", "Welcome!")?;
with_underscore.insert("/frontend/aaaa", "A User")?;
with_underscore.at("/frontend/").unwrap();
This seems to happen when adding two routes, one with an underscore at the same position where the other one has a slash.
I tried different combinations of routes and only the version described above returned an unintuitive error:
use matchit::Router;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// ExtraTrailingSlash, expected NotFound
let mut with_underscore = Router::new();
with_underscore.insert("/frontend_api", "Welcome!")?;
with_underscore.insert("/frontend/aaaa", "A User")?;
let with_underscore = with_underscore.at("/frontend/");
match with_underscore{
Ok(_) => {}
Err(e) => {
println!("with_underscore: \t {e}");
}
}
// NotFound
let mut without_underscore = Router::new();
without_underscore.insert("/frontend/api", "Welcome!")?;
without_underscore.insert("/frontend/aaaa", "A User")?;
let without_underscore = without_underscore.at("/frontend/");
match without_underscore{
Ok(_) => {}
Err(e) => {
println!("without_underscore: \t {e}");
}
}
// NotFound
let mut underscore_at_dif_pos = Router::new();
underscore_at_dif_pos.insert("/front_endapi", "Welcome!")?;
underscore_at_dif_pos.insert("/frontend/aaaa", "A User")?;
let underscore_at_dif_pos = underscore_at_dif_pos.at("/frontend/");
match underscore_at_dif_pos{
Ok(_) => {}
Err(e) => {
println!("underscore_at_dif_pos: \t {e}");
}
}
// NotFound
let mut without_second_route = Router::new();
without_second_route.insert("/frontend/aaaa", "A User")?;
let without_second_route = without_second_route.at("/frontend/");
match without_second_route{
Ok(_) => {}
Err(e) => {
println!("without_second_route: \t {e}");
}
}
// ExtraTrailingSlash
let mut only_frontend = Router::new();
only_frontend.insert("/frontend", "A User")?;
let only_frontend = only_frontend.at("/frontend/");
match only_frontend{
Ok(_) => {}
Err(e) => {
println!("only_frontend: \t\t {e}");
}
}
// NotFound
let mut different_name = Router::new();
different_name.insert("/something_else", "Welcome!")?;
different_name.insert("/frontend/aaaa", "A User")?;
let different_name = different_name.at("/frontend/");
match different_name{
Ok(_) => {}
Err(e) => {
println!("different_name: \t {e}");
}
}
Ok(())
}
Cargo.toml:
[dependencies]
matchit = "0.6.0"
The backtracking logic introduced in (c316ba6) is causing incorrect tsr
values. We should check if there is a matching leaf with/without a trailing slash before backtracking.
I am in the process of developing a system that enables the dynamic registration and removal of HTTP routes while the service is in operation.
While Matchit offers impressive functionality, it currently lacks the capability to remove and update routes. Is there a possibility of incorporating these features into the framework?
I understand that there are existing workarounds for this limitation, but having built-in support for these actions in the future would be greatly appreciated.
Thank you for your dedication and hard work!
test case:
#[test]
fn test_non_ascii() {
let mut tree = Node::new();
tree.insert("x:x", ());
tree.path_ignore_case("£", false);
}
I'm working on including a package for this crate in Fedora Linux as a dependency of axum. During the review process, the license was flagged as being potentially inaccurate and / or incomplete - the crate does include license texts for both itself and for httprouter, but the crate metadata lists only "MIT" (this crate's license), and not "BSD-3-Clause" (the httprouter license).
This is not only a "problem" for distribution packages, but it will trip up users of tools like cargo-license
or cargo-deny
, which rely on this metadata being accurate, as well.
Asking for advice from Red Hat lawyers, it seems that at least for our purposes, we need to treat this crate as being licensed under the terms of both MIT and BSD-3-Clause (i.e. license = "MIT AND BSD-3-Clause"
):
The important bits are:
(...) The upstream project
says it's based on httprouter and it could be that some of it is a
close translation from Go to Rust. While the BSD licenses are not
entirely clear on this issue I think you should assume that the Rust
crate copies from httprouter in such a way that the httprouter license
requirements are triggered.
and
(...) you should assume the License: field
should includeBSD-3-Clause
as appropriate.
It is entirely possible that you don't want to make a corresponding change for the crate metadata in Cargo.toml, but it would be great if you could clarify this either way - especially the slightly confusing situation that the license text for httprouter is included in this project, but the license itself is not taken into account in the crate metadata.
I am working on a AWS Lambda project and this is our current router:
async fn http_router(
http: ApiGatewayV2httpRequestContextHttpDescription,
) -> Result<Response<Body>, Error> {
match (http.method.as_ref(), http.path.as_deref()) {
// echo.json
("GET", Some("/api/v1/echo.json")) => http_handler::get_echo_json(http),
// echo.html
("GET", Some("/api/v1/echo.html")) => http_handler::get_echo_html(http),
// Catch all
_ => http_handler::not_found_json(),
}
}
Could I replace this with matchit and route the requests to certain functions? Maybe I need to write an additional function that uses http.path and match on the values like that?
I came here from cloudflare/workers-rs and had a question about possible support for the following scenario:
Let's say I want to configure explicit routes, while also having a fallback handler, e.g.:
/home
/foo
/:fallback
Today, this doesn't seem possible because /:fallback
collides with other routes at the same hierarchy. A workaround is to introduce a short prefix like /f/:fallback
, but it's a little clunky compared to the desired behavior of "match explicit routes when present, and fallback to others otherwise".
Interested in thoughts on the above scenario - would you classify it as outside the bounds of what this routing library is intending to support, or something that may make sense in a future release? Or perhaps I'm missing a clever workaround with wildcards that would help?
Hi,
I maintain a project that needs to conform to the docker registry API, that uses catch-all parameters in the middle of routes (see Trow-Registry/trow#352 (comment), we use axum, which uses matchit).
I wonder how feasible and how much the maintainers of matchit would accept an implementation of cath-all parameters in the middle of routes ?
I think performance would be lower than for standard paths, but I have no idea how the implementation would look like with the radix trie.
Or maybe matchit should be kept lean and we should have a common trait so users can choose their router in axum (to fix tokio-rs/axum#109) ?
Let's consider the following test:
duplicate_conflict {
"/users" => Ok(()),
"/user" => Ok(()),
"/user" => Err(InsertError::Conflict { with: "/user".into() }),
},
If I add this to the main
branch today, it fails: the InsertError
returned for the second /user
insertion has with
set to /users
rather than /user
.
My hunch is that this has something to do with the fact that /user
is a sub-string of /users
.
My use-case is that I'm writing a tower::Service
and I have a Node<Handler>
where Handler
implements tower::Service
. What I want to do is implement poll_ready
for this tower::Service
, but I can't do it due to not being able to get mutable references to all values inside the node.
I think
(whitespace) should be encoded as %20
,which will works on URL
that contains whitespaces.
I test on julienschmidt/httprouter
,the http://127.0.0.1:3000/hell o/name
works, but matchit
not.
Or there are some misusage for me.
We should be able to support routes like /static
and /*catchall
with some tweaks to insert
.
I'm using this library by proxy of Axum, and ran into an issue in that the router would not accept path segments that contained multiple :
characters, greeting me with the error: Invalid route "/Schemas/urn:ietf:params:scim:schemas:core:2.0:User": only one parameter is allowed per path segment
, corresponding to the TooManyParams
error variant defined in this crate.
Could either the syntax be altered to support escaping the :
and *
characters with a prefix like \
, or have the insert method treat the whole path section as a literal if it doesn't start with a :
or *
, or add an .insert_components()
function that takes a slice of enum Component<'a>{Literal(&'a str), Param, Wildcard, Separator}
and uses that to determine the matching criteria?
Why not use a binary search tree just like what Nginx does? In the Nginx way, the paths at the same level are sorted in a balanced binary tree, while the subtree (also balanced binary tree) stores sub nodes with the same common prefix.
In the radix tree, it seems that at the same level, it compares the node value one by one?
I've implemented some routing using astra
+ matchit
, and there's one clone that bothers me.
Basically - I always need to do req.uri().path().clone()
because otherwise Params
will hold a shared reference to req
, and I need &mut req
to read the body.
It seems much more frugal to clone the params
(as they are often empty), than the whole path
, but there's no way for Params
to hold any owned data.
Seems to me that it would be possible to avoid it with converting Params
to be:
struct Params<'k, V> {
kind: ParamKind<'k, V>
}
struct Param<'k, V> {
key: &'k [u8],
value: V,
}
So it's possible to use both Params<'k, &'v [u8]>
and Params<'k, Vec<u8>>
. Quick sanity check says it's working: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=7e61d76d43bb607053bb7a1b3fb40320
The problem I encountered while using Axum was that I submitted an issue to Axium, and the response I received was that it might be a match issue. I don't know if this is the case?
struct Node {
prefix: Arc<[u8]>
}
struct Param<'v> {
key: BytesStr,
value: &'v u8
}
struct BytesStr(Arc<[u8]>);
impl ByteStr {
fn as_str(&self) -> &str {
std::str::from_utf8(&self.0[1..]).unwrap()
}
}
A trade off between less lifetime and reference counted overhead + an extra public type. Current Param(s) type are pretty hard to use if not consumed inline. From what I see most people would do an extra heap allocation to erase the lifetime(s) so in that case this would likely be cheaper.
Example with matchit 0.7.0, rust 1.69.0:
use matchit::Router;
#[test]
fn fixed_route() {
let mut router = Router::new();
router.insert("/path/foo", "foo").unwrap();
router.insert("/path/*rest", "wildcard").unwrap();
assert_eq!(router.at("/path/foo").map(|m| *m.value), Ok("foo"));
assert_eq!(router.at("/path/bar").map(|m| *m.value), Ok("wildcard"));
//assert_eq!(router.at("/path/foo/").map(|m| *m.value), Ok("wildcard")); // Err(ExtraTrailingSlash)
}
#[test]
fn param_route() {
let mut router = Router::new();
router.insert("/path/foo/:arg", "foo").unwrap();
router.insert("/path/*rest", "wildcard").unwrap();
assert_eq!(router.at("/path/foo/myarg").map(|m| *m.value), Ok("foo"));
//assert_eq!(router.at("/path/foo/myarg/").map(|m| *m.value), Ok("wildcard")); // Err(ExtraTrailingSlash)
//assert_eq!(router.at("/path/foo/myarg/bar/baz").map(|m| *m.value), Ok("wildcard")); // Err(NotFound)
}
All of the commented out assertions fail, even though they should match the wildcard route.
As discussed in Tokio's Discord server, it would be useful to add support for wildcard matching in the middle of the path
For instance, this is required to implement Docker Registry API V2
Some examples are provided below:
/v2/<name>/
/v2/<name>/blobs/<digest>
/v2/<name>/manifests/<reference>
Note: <name> can contain any number of segments (at least 2), but is limited to 256 characters in length (including slashes)
Also, it can be useful for various git clients (something like /:repo/*filepath/action
)
{x}
syntax for route parameters (#23 (comment), #17 (comment))Hi it would be really useful for my use case to support suffixes.
e.g. routes like "/users/:id.png"
or even "/users/:id.:ext"
... or even "/users/*.png"
Although it is possible to do at the application level it quickly gets very complicated when multiple extensions are present, and it looks like it might be easier to implement it in Matchit.
According to what I understand from the docs the parameter extends up to the next slash, and so currently is not possible.
How easy would it be to change parameter names to not allow '.' and add something like ParamWithSuffix(String) to the enum NodeType.
I optimistically forked the repo and was planning to implement it but then I realized that I dont understand the code well enough... My fork just has the test case (that currently fails) See below:
use matchit::{InsertError, MatchError, Router};
fn router()->Result<Router<&'static str>,InsertError> {
let mut router = Router::new();
router.insert("/home", "Welcome!")?;
router.insert("/users/:id", "A User")?;
router.insert("/users/:id.png", "A User Photo Currently not working")?;
router.insert("/users/:id/x.png", "Another User Photo that works")?;
Ok(router)
}
#[test]
fn match_img()->Result<(),MatchError>{
let router=router().unwrap();
let matched = router.at("/users/978")?;
assert_eq!(matched.params.get("id"), Some("978"));
assert_eq!(*matched.value, "A User");
Ok(())
}
To consider: how would multiple '.' be handled e.g. schema.foo.yaml
Hi @ibraheemdev! First of all, thanks for your work!
I'm trying to match a mix of dynamic and static routes, which have overlapping parts. But not all the cases are matching.
I guess this is better explained with the following code snippet.
#[cfg(test)]
mod tests {
use matchit::Node;
#[test]
fn test_routes() {
let mut matcher = Node::new();
matcher.insert("/:object/:id", "object with id").unwrap();
matcher.insert("/secret/:id/path", "secret with id and path").unwrap();
let matched = matcher.at("/secret/978/path").unwrap();
assert_eq!(matched.params.get("id"), Some("978"));
let matched = matcher.at("/something/978").unwrap();
assert_eq!(matched.params.get("id"), Some("978"));
assert_eq!(matched.params.get("object"), Some("something"));
let _ = matcher.at("/secret/978").unwrap(); // panics
}
}
It it expected behaviour?
I discovered this while working on axum's handling of trailing slashes.
It appears if you have a node with at least one route, that isn't /
, and you make a request for /
, you get an error where tsr == true
. That would indicate that /
with a route with a trailing slash exists which I guess doesn't make much sense for /
🤔
This reproduces it:
let mut node = matchit::Node::<u64>::new();
// only happens if the node contains at least one route
node.insert("/foo", 1).unwrap();
// passes, as expected since `/not-found/` doesn't exist
let err = node.at("/not-found").unwrap_err();
assert!(!err.tsr());
// fails, I would expect it to pass since no route is defined for `/`
// and `/` with a trailing slash doesn't make sense
let err = node.at("/").unwrap_err();
assert!(!err.tsr());
Reading https://portswigger.net/daily-swig/rust-patches-sneaky-redos-bug
Seems there might be a vulnerability lurking in regex, which was only patched from 1.5.5
onwards?
In any case - any reason not to be on latest 1.10
?
Hi, I was having an issue with Cloudflare worker-rs router but then I realized that the error is coming from this dependency
Issue: cloudflare/workers-rs#362
TL;DR:
How do I create such routing:
Router::new()
.get("/abc/:path", |_req, ctx| {
Response::ok("Hello world")
})
.get("/:path", |_req, ctx| {
Response::ok("Hello world2")
})
.run(req, env).await
Right now this is the error I'm getting:
panicked at 'failed to register Get route for /:path pattern: insertion failed due to conflict with previously registered route: //', /home/gytis/.local/share/cargo/registry/src/index.crates.io-6f17d22bba15001f/worker-0.0.17/src/router.rs:327:21
I'd be okay with a solution using matchit on its own but I am not sure how to wire it with Cloudflare just yet
When I try to add routes to the router, it allows paths not starting with a forward slash, except when a catch-all is used.
This seems deliberate but I don't see why that is or why other routes not starting with a slash are allowed.
I might be missing something but I'd say that there should only be a slash right before the catch-all or it should be at the start of the route. So *all
and foo/*all
are both ok but foo/bar*all
is not.
Here are some examples of what currently works and what doesn't.
let mut router = matchit::Router::new();
router.insert("a", ()).unwrap();
router.insert(":a", ()).unwrap();
router.insert("b/:b", ()).unwrap();
// Only the following two return errors
router.insert("c/*c", ()).unwrap();
router.insert("*d", ()).unwrap();
I'd be happy to provide a PR, but I want to check first if there are some considerations I'm missing.
The catch-all path /*p
currently doesn't match /
, although the documentation states the opposite. Copying the code for catch-alls in the README, the first assertion panics.
let mut m = Router::new();
m.insert("/*p", true)?;
assert_eq!(m.at("/")?.params.get("p"), Some(""));
assert_eq!(m.at("/foo.js")?.params.get("p"), Some("foo.js"));
assert_eq!(m.at("/c/bar.css")?.params.get("p"), Some("c/bar.css"));
Hi!
I did some benchmarks of Profile-Guided Optimization (PGO) on the library and want to share my results here. Test environment is Macbook M1 Pro. The first run was with cargo bench
, and the second - with a PGO-optimized version of the library via cargo pgo optimize bench
. The PGO profile was collected via cargo pgo bench
, the results from this run does not influence the comparison (since I did them in a separate Criterion directory). Link to cargo-pgo
- https://github.com/Kobzol/cargo-pgo .
The results of "Release" vs "Release + PGO" are here: https://pastebin.com/kDtPRK2Y . The only regression was found with "Routers/path-tree", other benchmarks are improved or left untouched.
I hope someone can find these results interesting. Maybe would be a good idea to write a note about PGO in README for the users who want to extract more performance from the library.
path-tree
is another router based on radix trees. It's worth including it in the benchmarks.
https://github.com/viz-rs/path-tree
For something like fall-through routing, it would be useful to return an iterator of values that match the route.
An example might be:
let mut router = Router::new();
router.insert("/users/:id", 1);
router.insert("/users/:name", 2);
router.insert("/users/*", 3);
let mut matches = router.at("/users/ari").iter();
assert_eq!(matches.next(), Some(1));
assert_eq!(matches.next(), Some(2));
assert_eq!(matches.next(), Some(3));
assert_eq!(matches.next(), None);
Currently routes like /:a/:b
and /:a
are supported, but only if they have a common prefix (/:a
). This means a route like /:a/:b
conflicts with /:c
. We can support this by normalizing route parameters positionally when they are first inserted. /:user/:id
would become /:a/:b
, and /:id
becomes /:a
. Param nodes get an additional field representing the user-defined parameter name, which is used to construct the parameter list.
Consider a path, /path/:value/?foo=:bar
parsed with url /path/test/?foo=something
. The library currently parses only test
parameter. It should parse foo
as well.
We're planning to use matchit
in Yew (yewstack/yew#2277) and this would be a very useful improvement.
Hi! I'd like to use matchit in a webassembly module that is recycled after every request completes, and I'd prefer not to pay the cost to build the whole router on every request. Ideally I'd like to create a router at compile time and save it as a non-mut static variable, then use it by calling .at
on it directly. (As a bonus this may allow LTO to eliminate everything but the .at
method, trimming out all of the radix tree building code.)
I'm currently doing this:
static ROUTER: Lazy<Router<fn(&Request) -> String>> = Lazy::new(|| {
let mut router: Router<fn(&Request) -> String> = Router::new();
router.insert("/", index).unwrap();
router.insert("/some/page", some_page).unwrap();
router
});
I asked about this in #rust:matrix.org and presumably it's possible in theory if Vecs were switched out for slices internally (as a start, who knows what else). I think it would be a neat feature, but whether it's worth inflicting this scale of change on the library is an entirely different question. For now I will not worry about this minor 'inefficiency', but I thought I'd bring it up here to get your opinion.
What do you think? Useful? Feasible?
insert
a CatchAll node would insert an empty CatchAll node and an actual CatchAll node. Why?
For example, for "/foo/*foobar", the final node tree is:
Node {
priority: 1,
wild_child: false,
indices: [
47,
],
node_type: Root,
value: None,
prefix: [
47,
102,
111,
111,
],
children: [
Node {
priority: 1,
wild_child: true,
indices: [],
node_type: CatchAll,
value: None,
prefix: [],
children: [
Node {
priority: 1,
wild_child: false,
indices: [],
node_type: CatchAll,
value: Some(
UnsafeCell { .. },
),
prefix: [
47,
42,
102,
111,
111,
98,
97,
114,
],
children: [],
},
],
},
],
}
If we remove the empty node, the performance is better.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.