microsoft / reverse-proxy
A toolkit for developing high-performance HTTP reverse proxy applications.
Home Page: https://microsoft.github.io/reverse-proxy
License: MIT License
https://github.com/dotnet/proxy/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc
Unfortunately you can't transfer issues cross-org :(. Might be OK to just do a manual copy-paste transfer though.
We expect several features to be add-ins that plug in as connection middleware. Add a connection middleware to the sample in this repo.
Here's one example from Http2:
https://github.com/dotnet/aspnetcore/blob/09bb7b4ca5a4fbde0283c294c35fac8b485c0074/src/Servers/Kestrel/samples/Http2SampleApp/Program.cs#L41-L54
Other things we expect to need to plug into here:
We don't have to write those components as part of this task, just demonstrate that a connection middleware has access to the necessary inputs and controls (e.g. IPs, sniffing the data stream, drop connections, etc.).
Specifically, this involves being able to filter connections based on information contained in the ClientHello TLS frame. Examples include:
This will likely be useful for adding middleware to specific routes.
dotnet/aspnetcore#14514
The workaround today is creating your own app builder.
https://github.com/dotnet/proxy/commit/8be62673e25861e283c14aafcc60db2b0ed928b7#diff-a899554e36fdb9c1c365b48b296246cdR35-R37
In order to provide the detailed diagnostics proxy users will want, we need to expand the diagnostics and tracing support throughout the networking stack.
This includes both counters (using EventCounters) and tracing (using EventSource). It covers the low-level sockets API, as well as higher level APIs like HttpClient.
Another aspect to consider here is dimensions. We believe it is important for users to be able to specify the "scope" of a particular network operation and include custom dimensions like Route identifiers, backend names, etc. For v1 we believe it is sufficient to express these in the EventSource events, and keep the counters high-level.
Tracking some diagnostics improvements
Load balance by taking the "next" backend in the list for each incoming request. We wrap around to the start of the list when we run out of backends.
Since requests will likely be highly-concurrent, there's some complexity here in how the state is maintained. Also, since backends can change, we'll need to consider how this algorithm behaves in that case. Naively, this would be a simple atomic counter that the load balancer does an atomic increment-and-return, then modulus by the number of backends to figure out which backend to use.
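The naive counter approach described above could be sketched roughly like this. It is only an illustration in Python, not the proxy's implementation; `RoundRobinBalancer` and `pick` are hypothetical names, and a lock stands in for a true atomic increment.

```python
import threading


class RoundRobinBalancer:
    """Hypothetical sketch: shared counter plus modulus over the current list."""

    def __init__(self):
        self._counter = 0
        self._lock = threading.Lock()  # stand-in for an atomic increment

    def pick(self, backends):
        if not backends:
            raise ValueError("no backends available")
        # Increment-and-return under the lock so concurrent requests
        # each observe a distinct counter value.
        with self._lock:
            index = self._counter
            self._counter += 1
        # The modulus uses whatever list the caller passes in, so a changed
        # backend list shifts the rotation but never indexes out of range.
        return backends[index % len(backends)]
```

Because the modulus is taken against the list supplied per request, adding or removing a backend only perturbs the rotation order; whether that is acceptable is exactly the open question raised above.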
Seems odd. We should be running on the same kind of machine in the public one too. Maybe this is flaky?
cc @Tratcher
This is one of the simplest load-balancers. Select a random backend from the set of healthy backends and route to it. It can be surprisingly effective, so it's a useful algorithm to have. It's also trivial to implement so it gives us a way to exercise our abstractions.
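As a rough illustration of how little code this takes, here is a minimal Python sketch. The names `pick_random_backend` and `is_healthy` are made up for the example and do not correspond to any proxy abstraction.

```python
import random


def pick_random_backend(backends, is_healthy, rng=random):
    """Return a uniformly random healthy backend, or None if none are healthy."""
    healthy = [b for b in backends if is_healthy(b)]
    if not healthy:
        return None
    return rng.choice(healthy)
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible in tests.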
when a request fails - retry against a different server
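One possible shape for this, sketched in Python under the assumption that the caller already has an ordered list of candidate backends and that the request is safe to replay (for example, the body has not been partially sent). `send_with_retry` and `send` are hypothetical names.

```python
def send_with_retry(backends, send, max_attempts=2):
    """Try backends in order until one succeeds or attempts run out."""
    last_error = None
    for backend in backends[:max_attempts]:
        try:
            return send(backend)
        except ConnectionError as exc:
            # Remember the failure and move on to a different backend.
            last_error = exc
    if last_error is None:
        raise ValueError("no backends to try")
    raise last_error
```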
Collect metrics for all stages of the proxy processing, including outbound http requests
This mirrors #11
This is a proxy forwarder protocol primarily used by layer 4 proxies to forward information about the original client, such as their IP. It does so by prepending the connection stream with a text or binary formatted blob of data. It's more efficient than adding X-Forwarded-For and similar headers to every request.
In HttpClient this would be implemented at the proposed Dialer or Bedrock layer, sending the blob just after the connection was established.
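For the text (version 1) format, the blob is a single ASCII line. A minimal Python sketch of building it, assuming the v1 layout from the HAProxy PROXY protocol spec; `build_proxy_v1_header` is a hypothetical helper name, and the binary v2 format is not covered.

```python
import ipaddress


def build_proxy_v1_header(src_ip, dst_ip, src_port, dst_port):
    """Build the PROXY v1 line to write right after the connection opens."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must share an address family")
    family = "TCP4" if src.version == 4 else "TCP6"
    # Layout: PROXY <family> <src addr> <dst addr> <src port> <dst port> CRLF
    line = f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n"
    return line.encode("ascii")
```

The header is sent once per connection, before any request bytes, which is where the efficiency gain over per-request headers comes from.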
We believe there are some gains to be found in reducing allocations in HttpClient, particularly for proxy scenarios where it may be reasonable to bypass header validation and other HttpClient features designed for more general-purpose use cases.
We should install dotnet/arcade into this repo, enable syncing to AzDO and CI/CD.
TODO:
Config based proxies are common and we'll need to support at least basic proxy scenarios from config. Here are some initial considerations:
We use third-party packages (for now) so we should add a NOTICE file before going public.
Things we use:
We should plug in to the dependency flow and start adopting 5.0 nightly builds.
The current proxy routing design assumes all route data is available up front and can be loaded atomically into the route table. The EndpointDataSource routing model assumes this as well.
For more dynamically-configured services, we probably don't want to fetch the entire route table from the db, but instead use a pull-and-cache model per hostname.
How can we lazy-load route, backend, and endpoint data on a per request basis?
Assume we're working with a discrete key like Host, not a complicated best-match scenario like Path, and little or no wildcard support (maybe wildcard subdomains but only for specific depths?).
@davidni is this a scenario you'd evaluated?
@rynowak can such a system work within routing, co-exist side by side routing, or would it completely replace routing?
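To make the pull-and-cache idea concrete, here is a small Python sketch keyed on an exact Host value. Everything in it is an assumption for illustration: `HostRouteCache`, the `loader` callback (standing in for the db lookup), and the fixed TTL expiry policy.

```python
import time


class HostRouteCache:
    """Hypothetical per-host route cache that loads entries on first use."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader          # e.g. a database lookup by hostname
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}             # host -> (expires_at, route)

    def get(self, host):
        key = host.lower()             # host names are case-insensitive
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]            # still fresh, no lookup needed
        route = self._loader(key)      # pull on miss or expiry
        self._entries[key] = (now + self._ttl, route)
        return route
```

This sketch ignores concurrent misses for the same host, negative caching, and eviction, all of which a real design would need to address.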
Follow up to #2
In #35, there was a failing test, but no server-side logs. https://dev.azure.com/dnceng/public/_build/results?buildId=584078&view=logs&j=37a405a8-c7d5-566e-a69b-63119597d487
A proxy will often want to route based on the host. It's possible today but it's clunky.
dotnet/aspnetcore#19354
HTTP/2 has the concept of prioritizing streams (requests) within a connection. How do we want to think about this with respect to the proxy front end when it has an HTTP/2 or HTTP/3 connection? As we terminate the end-user request and create new requests, potentially distributed across multiple servers, any prioritization signals will be lost. As we queue requests for processing, or pack responses into the response stream, do we need to prioritize which responses get streamed first?
https://blog.cloudflare.com/adopting-a-new-approach-to-http-prioritization/
Does setting HttpCompletionOption.ResponseHeadersRead in HttpClient.SendAsync improve performance?
This testing style and library are inconsistent with our other projects.
In Island Gateway we implemented the notion of Transformations, which takes inspiration from Traefik's middlewares (though we're not calling them middlewares for obvious reasons). See e.g. Traefik's StripPrefix.
A user can specify e.g. Transformations=StripPrefix('/api'), AddPrefix('/v2'), which results in the listed processing steps being applied sequentially at runtime.
The ability to specify simple transformations like these through config is a requirement for us.
Example of a route using transformations:
Rule = Host('example.com') && Path('/myservice/{**catchall}')
Transformations = StripPrefix('/myservice')
(this is a feature we already implemented in Island Gateway, filing here as requested to track)
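The sequential behavior described above might look something like this in outline. This is a Python sketch only; `strip_prefix`, `add_prefix`, and `apply_transformations` are hypothetical names, and the segment-boundary rule for prefix matching is an assumption, not Island Gateway's actual semantics.

```python
def strip_prefix(prefix):
    """Return a step that removes `prefix` when it matches whole path segments."""
    def apply(path):
        if path == prefix or path.startswith(prefix + "/"):
            return path[len(prefix):] or "/"
        return path  # no match: leave the path untouched
    return apply


def add_prefix(prefix):
    """Return a step that prepends `prefix` to the path."""
    def apply(path):
        return prefix + path
    return apply


def apply_transformations(path, transformations):
    # Steps run in the order listed, each seeing the previous step's output.
    for transform in transformations:
        path = transform(path)
    return path
```

With these definitions, `apply_transformations("/api/users", [strip_prefix("/api"), add_prefix("/v2")])` yields `/v2/users`, while a path like `/apiary` is left alone by `strip_prefix("/api")`.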
Here are things we expect to need expanded on for kestrel's existing config system:
We should enable the IL Linker and Single-File in the Proxy "app" (whatever that ends up being) and make sure we are tracking file size. There's a desire to have a low disk footprint, and the Proxy is a more static component than an arbitrary app.
We need to get YARP running in the benchmark lab (https://msit.powerbi.com/view?r=eyJrIjoiYTZjMTk3YjEtMzQ3Yi00NTI5LTg5ZDItNmUyMGRlOTkwMGRlIiwidCI6IjcyZjk4OGJmLTg2ZjEtNDFhZi05MWFiLTJkN2NkMDExZGI0NyIsImMiOjV9).
Ability to customize headers (beyond X-Forwarded or affinity) through code extensibility, both for the request to the back end and for the response to the client.
Enable custom metadata in backend endpoint definition so a custom load balancer could do A/B switching etc.
Note: This task is about ensuring the plumbing is available to support this scenario, but does not involve creating a custom load balancer
We need a good README.
Motivation: Select a specific NIC for certain outbound requests
dotnet/runtime#28721
There are 3 ways I can see routes being defined for a proxy:
a) In config via a config file, changeable at runtime - this is what most proxies do today
b) Statically defined in code - setting up static endpoints similar to the sample startup.cs
c) Dynamically created by code. This seems to be what most of the Azure partners need. At startup, and periodically the code will need to query a backend to fetch both the Endpoints to listen to, and where each of those should be routed.
[added from discussion below]
d) When there are too many routes to load them all, we need to be able to query a database per request to load and cache routes. From our research, more Azure partners do this than (c).
This last bucket is where we need to shine and differentiate ourselves from the competition.
It will need to account for:
For some apps, they may need to do a DB lookup to discover the cert/route based on the URL being passed - so will need to have a cache and fallback mechanism.
Performance would dictate you probably need optimized structures like a trie, that I suspect are behind the existing routing tables. Either we need APIs to be able to dynamically edit the tables, and/or create and replace the tables, or be able to use the same concepts from custom app code for the proxy.
Customers behind certain firewalls or load balancers, such as Azure's SNAT, have a limit on the number of connections they can make per ip/port endpoint.
HttpClient has a MaxConnectionsPerServer setting, but the limit is not actually applied per-server, but rather per-pool, with each pool being partitioned by a number of connection properties:
It is unreasonable to apply a meaningful SNAT limit on top of HttpClient because the client will pool connections and does not expose connection lifetime events to the user.
Creating a simple out-of-band check (not part of proxied requests) to confirm the health status of back ends.
a) Polling back ends to see if they are alive
b) Eliminating dead entries from load balancing
Be able to filter connections at SSL connection time to prevent DoS attacks
The RouteValidator uses a strict regex to check the host matching pattern. I have my doubts that this level of validation is even needed. The source is developer input, not external input, and it's only used in the route matcher (HostAttribute). We should be careful about second guessing what the HostMatcherPolicy can handle. For example the regex doesn't allow IDNs (Unicode hosts).
Preferably we'd use a validation helper provided directly by the framework so they'd be in sync.
Import the code from the IslandGateway project provided by our internal partners.
Mark a back end node unhealthy when a request fails.
There will typically be a bunch of policy about how many failures are required to trip the circuit and when to retry the back end to see if it has recovered. This item is to ensure we have the plumbing to be able to do circuit breaking, but not necessarily to create an all-encompassing circuit breaking module.
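To give a feel for the kind of policy involved, here is a deliberately small Python sketch. The class name, the consecutive-failure threshold, and the fixed retry delay are all assumptions chosen for illustration, not a proposed design.

```python
import time


class CircuitBreaker:
    """Hypothetical breaker: trips after N consecutive failures, retries later."""

    def __init__(self, failure_threshold=3, retry_after_seconds=30.0,
                 clock=time.monotonic):
        self._threshold = failure_threshold
        self._retry_after = retry_after_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def record_success(self):
        # Any success resets the failure count and closes the circuit.
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()  # trip (or re-trip) the circuit

    def allows_requests(self):
        if self._opened_at is None:
            return True
        # After the delay, let traffic through again to probe for recovery.
        return self._clock() - self._opened_at >= self._retry_after
```

The plumbing question is then where `record_success` and `record_failure` get called from, and how `allows_requests` feeds into the set of backends offered to the load balancer.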
Create and use IOperationLogger<TCategoryName> instead of IOperationLogger and make the implementations use ILogger<TCategoryName> instead of ILogger.
It's easier to follow logs.
https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
This is a proxy forwarder protocol primarily used by layer 4 proxies to forward information about the original client, such as their IP. It does so by prepending the connection stream with a text or binary formatted blob of data. It's more efficient than adding X-Forwarded-For and similar headers to every request.
In Kestrel this would be implemented as a connection middleware. We helped an internal team build this middleware already and they're using it with an existing Azure network service that sends that data.
I'll file a separate issue for outbound support as it would need to integrate with HttpClient.
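On the inbound side, a connection middleware would need to peel the header off the front of the stream before handing the rest to HTTP parsing. A rough Python sketch for the text (version 1) format from the linked spec; `parse_proxy_v1` is a hypothetical name, and the binary v2 format is not handled.

```python
def parse_proxy_v1(data):
    """Split a PROXY v1 header off the front of `data`.

    Returns (info, rest), where info is None for the UNKNOWN family.
    """
    end = data.find(b"\r\n")
    # The v1 header is at most 107 bytes including the trailing CRLF.
    if end == -1 or end + 2 > 107:
        raise ValueError("missing or oversized PROXY header")
    fields = data[:end].decode("ascii").split(" ")
    rest = data[end + 2:]
    if len(fields) < 2 or fields[0] != "PROXY":
        raise ValueError("not a PROXY v1 header")
    if fields[1] == "UNKNOWN":
        return None, rest  # sender could not describe the connection
    if fields[1] not in ("TCP4", "TCP6") or len(fields) != 6:
        raise ValueError("malformed PROXY v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = fields
    info = {
        "family": family,
        "src_ip": src_ip,
        "dst_ip": dst_ip,
        "src_port": int(src_port),
        "dst_port": int(dst_port),
    }
    return info, rest
```

A real middleware would also have to deal with the header arriving split across reads and would validate the addresses and port ranges; this sketch assumes the whole line is already buffered.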
When running at scale, gateway instances must report their health state indicating whether an instance is willing or not to receive new traffic. See e.g. Azure Load Balancer health probes, which would call the Gateway periodically.
It makes sense for the Reverse Proxy core to support this since this requires careful coordination of startup / teardown activities, and it also influences runtime gateway behavior. E.g., during graceful shutdown, Gateway must keep serving new requests while reporting its upcoming teardown. It must also advise clients to close existing connections by sending a Connection: close response header when applicable during graceful shutdown, while delaying teardown until (a) all requests are drained; and (b) Load Balancer has detected the instance is not eligible for new traffic.
We have implemented this in Island Gateway and I can share more details if desired.
In general, we'd like this to adhere to the same style "guidelines" as .NET Core (which aren't formally documented).
We should also clean up things like tests to avoid using patterns we generally don't use (like FluentAssertions).
Finally we should include an .editorconfig that can enforce our style and ensure contributors can just open the solution and start working without worrying about style.
Today routing hands the request off to the ProxyInvoker. This orchestrator is a bit monolithic and has limited pluggability. It is given a list of healthy endpoints from the health service, invokes a load balancer service to resolve an endpoint, and then invokes the proxy logic.
In design meetings we've proposed the concept of a pipeline to allow developers to insert customizations into this process. However, there's a specific service contract between each of these components, not a generic contract between all of them, and it's not well suited to making this a generic pipeline.
Options:
A) Try to refactor this to a more generic object model? (e.g. middleware)
B) Insert targeted extensibility at specific points? (e.g. the authentication handler events model)
Additionally, is it possible to do this customization per route? Or is there a central pipeline that can react to endpoint specific metadata?
We should support session affinity as a native feature in the proxy. It's effectively a load balancer directive. Some attribute of the incoming request is used as a key to identify which backend to route to.
Affinity logic needs to run:
It also may need to be able to re-trigger load balancing (in the failure modes below).
Supported "modes" (please do suggest others!):
(Source IP, Source Port, Dest IP, Dest Port, Protocol)
/negotiate response, which is also present in the query string of all future requests, which we could use for affinity.
Failure modes
These are mostly just my rando thoughts. I'm interested in @davidni 's thoughts, as well as the team's (@Tratcher @halter73)
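One way to picture "some attribute of the request is used as a key" together with the need to re-trigger load balancing on failure is the following Python sketch. It assumes a simple hash-modulo mapping from key to backend; `pick_with_affinity`, `is_healthy`, and `fallback` are hypothetical names, and other schemes (cookies, consistent hashing) are equally plausible.

```python
import hashlib


def pick_with_affinity(key, backends, is_healthy, fallback):
    """Map an affinity key to a backend, re-balancing if it is unhealthy."""
    if not backends:
        return None
    # A stable hash keeps the same key on the same backend across requests.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    preferred = backends[index]
    if is_healthy(preferred):
        return preferred
    # Failure mode: the affinitized backend is down, so hand the remaining
    # healthy backends to the regular load balancer.
    return fallback([b for b in backends if is_healthy(b)])
```

Note that hash-modulo remaps most keys whenever the backend list changes size, which is one of the trade-offs a real design would need to weigh.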
We need to rename the IslandGateway projects to the new name. Similarly we need to rename the repo. We're going to hold off on renaming the repo.
We should do a pass over the logging statements and fix up a few things:
LoggerMessage.Define where reasonable.