microsoft / reverse-proxy


8.1K 279.0 795.0 21.35 MB

A toolkit for developing high-performance HTTP reverse proxy applications.

Home Page: https://microsoft.github.io/reverse-proxy

License: MIT License

C# 87.68% Batchfile 0.07% Shell 4.81% PowerShell 6.83% CMake 0.62%

reverse-proxy's Issues

Transfer/Recreate issues from dotnet/proxy

https://github.com/dotnet/proxy/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc

Unfortunately you can't transfer issues cross-org :(. Might be OK to just do a manual copy-pasta transfer though.

Proxy is hardened so it can directly face the internet

We expect several features to be add-ins that plug in as connection middleware. Add a connection middleware to the sample in this repo.

Here's one example from Http2:
https://github.com/dotnet/aspnetcore/blob/09bb7b4ca5a4fbde0283c294c35fac8b485c0074/src/Servers/Kestrel/samples/Http2SampleApp/Program.cs#L41-L54

Other things we expect to need to plug into here:

Sniff SNI and rate limit
Rate limit SSL handshakes by IP

We don't have to write those components as part of this task, just demonstrate that a connection middleware has access to the necessary inputs and controls (e.g. IPs, sniffing the data stream, drop connections, etc.).

Specifically, this involves being able to filter connections based on information contained in the ClientHello TLS frame. Examples include:

Cipher Suite selection
Protocol Version
Server Name Indication (SNI)
Application-Level Protocol Negotiation (ALPN; used for HTTP/2 and HTTP/3)

Support Sticky Sessions in Load Balance

Proxy adds standard headers related to request forwarding

X-Forwarded-Proto
X-Forwarded-Host
X-Forwarded-For
X-Client-Cert
Forwarded (yes, there's a chicken-and-egg problem but I see no reason we can't break that cycle by implementing it).
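As a rough illustration of what the proxy would emit for one hop, here is a minimal sketch that computes the X-Forwarded-* values and the equivalent RFC 7239 Forwarded value. The `ForwardingHeaders` class and its parameters are hypothetical names for this example, not the proxy's API, and X-Client-Cert is left out because it needs the client certificate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the proxy's API.
public static class ForwardingHeaders
{
    public static Dictionary<string, string> Build(
        string clientIp, string scheme, string host, string existingForwardedFor = "")
    {
        // Append to any X-Forwarded-For chain written by an earlier proxy.
        var forwardedFor = string.IsNullOrEmpty(existingForwardedFor)
            ? clientIp
            : existingForwardedFor + ", " + clientIp;

        // In Forwarded, IPv6 addresses are bracketed, and any value
        // containing ':' has to be sent as a quoted string.
        var forNode = clientIp.Contains(':') ? Quote("[" + clientIp + "]") : clientIp;
        var hostNode = host.Contains(':') ? Quote(host) : host;

        return new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["X-Forwarded-For"] = forwardedFor,
            ["X-Forwarded-Proto"] = scheme,
            ["X-Forwarded-Host"] = host,
            ["Forwarded"] = $"for={forNode};host={hostNode};proto={scheme}",
        };
    }

    private static string Quote(string value) => "\"" + value + "\"";
}
```

Calling `ForwardingHeaders.Build("203.0.113.7", "https", "example.com")` yields `X-Forwarded-For: 203.0.113.7` and `Forwarded: for=203.0.113.7;host=example.com;proto=https`.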

[Tracking] Endpoint routing with middleware

This will likely be useful for adding middleware to specific routes.
dotnet/aspnetcore#14514

The workaround today is creating your own app builder.
https://github.com/dotnet/proxy/commit/8be62673e25861e283c14aafcc60db2b0ed928b7#diff-a899554e36fdb9c1c365b48b296246cdR35-R37

Expand diagnostics and tracing in Networking Stack

In order to provide the detailed diagnostics proxy users will want, we need to expand the diagnostics and tracing support throughout the networking stack.

This includes both counters (using EventCounters) and tracing (using EventSource). It covers the low-level sockets API, as well as higher level APIs like HttpClient.

Another aspect to consider here is dimensions. We believe it is important for users to be able to specify the "scope" of a particular network operation and include custom dimensions like Route identifiers, backend names, etc. For v1 we believe it is sufficient to express these in the EventSource events, and keep the counters high-level.

Diagnostics Improvements

Tracking some diagnostics improvements:

We need the ability to collect aggregate metrics based on various dimensions (primarily the specific incoming endpoint, for the server, and outgoing backend)
We need additional metrics from HttpClient and the client networking stack
- [TBD: Specific metrics]
We need additional metrics from ASP.NET Core
- [TBD: Specific metrics]

Implement Round-Robin Load Balancing

Load balance by taking the "next" backend in the list for each incoming request. We wrap around to the start of the list when we run out of backends.

Since requests will likely be highly-concurrent, there's some complexity here in how the state is maintained. Also, since backends can change, we'll need to consider how this algorithm behaves in that case. Naively, this would be a simple atomic counter that the load balancer does an atomic increment-and-return, then modulus by the number of backends to figure out which backend to use.
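The naive counter-based approach described above could look roughly like this. It is a sketch only: `RoundRobinPicker` is a hypothetical name, and the caller is assumed to pass a snapshot of the current backend list, so a changed list simply shifts which backend a given counter value maps to.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch of the atomic-counter approach; not the proxy's implementation.
public sealed class RoundRobinPicker
{
    private int _counter = -1;

    public T Pick<T>(IReadOnlyList<T> backends)
    {
        if (backends.Count == 0)
        {
            throw new InvalidOperationException("No backends available.");
        }

        // Interlocked.Increment wraps from int.MaxValue to int.MinValue;
        // reinterpreting the value as uint keeps the index non-negative.
        var ticket = unchecked((uint)Interlocked.Increment(ref _counter));
        return backends[(int)(ticket % (uint)backends.Count)];
    }
}
```

Because the only shared state is one integer, concurrent requests never block each other. The sequence is only approximately fair while the backend count is changing.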

ProxyAsync_UpgradableRequest_Works is flaky

https://dev.azure.com/dnceng/internal/_build/results?buildId=582611&view=ms.vss-test-web.build-test-results-tab&runId=18340542&resultId=100036&paneView=debug

Seems odd. We should be running on the same kind of machine in the public one too. Maybe this is flaky?

cc @Tratcher

Implement Random Load-Balancing

This is one of the simplest load-balancers. Select a random backend from the set of healthy backends and route to it. It can be surprisingly effective, so it's a useful algorithm to have. It's also trivial to implement so it gives us a way to exercise our abstractions.
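A minimal sketch of that selection follows. The `BackendInfo` record and `RandomPicker` class are hypothetical names for illustration, and the health flag is assumed to be maintained elsewhere.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public sealed record BackendInfo(string Name, bool IsHealthy);

public static class RandomPicker
{
    // Returns a uniformly random healthy backend, or null when none is healthy.
    public static BackendInfo? Pick(IEnumerable<BackendInfo> backends, Random random)
    {
        var healthy = backends.Where(b => b.IsHealthy).ToList();
        return healthy.Count == 0 ? null : healthy[random.Next(healthy.Count)];
    }
}
```

The `Random` instance is passed in so that tests can seed it. Note that a single `Random` instance is not safe for concurrent use, so a real implementation would need a thread-safe source.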

Retry of "Safe" HTTP requests

When a request fails, retry it against a different server.

Proxy supplies metrics for monitoring performance

Collect metrics for all stages of the proxy processing, including outbound http requests

Implement outbound PPv2

This mirrors #11

This is a proxy forwarder protocol primarily used by layer 4 proxies to forward information about the original client, such as their IP. It does so by prepending a text or binary formatted blob of data to the connection stream. It's more efficient than adding X-Forwarded-For and similar headers to every request.
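To make the "binary formatted blob" concrete, here is a sketch that builds a version 2 header for a TCP-over-IPv4 connection, following the layout in the HAProxy PROXY protocol specification: a 12-byte signature, a version/command byte, a family/protocol byte, a length, then addresses and ports. `ProxyProtocolV2` is a hypothetical helper name. IPv6, UNIX sockets, and TLV extensions are not covered.

```csharp
using System;
using System.Buffers.Binary;
using System.Net;
using System.Net.Sockets;

// Hypothetical helper for illustration; handles only the PROXY command over TCP/IPv4.
public static class ProxyProtocolV2
{
    private static readonly byte[] Signature =
    {
        0x0D, 0x0A, 0x0D, 0x0A, 0x00, 0x0D, 0x0A, 0x51, 0x55, 0x49, 0x54, 0x0A,
    };

    public static byte[] BuildTcp4Header(IPEndPoint source, IPEndPoint destination)
    {
        if (source.AddressFamily != AddressFamily.InterNetwork ||
            destination.AddressFamily != AddressFamily.InterNetwork)
        {
            throw new ArgumentException("This sketch only supports IPv4 endpoints.");
        }

        var header = new byte[28];                      // 16 fixed bytes + 12 address bytes
        Signature.CopyTo(header, 0);
        header[12] = 0x21;                              // version 2, command PROXY
        header[13] = 0x11;                              // AF_INET, STREAM (TCP)
        BinaryPrimitives.WriteUInt16BigEndian(header.AsSpan(14, 2), 12); // address block length
        source.Address.GetAddressBytes().CopyTo(header, 16);
        destination.Address.GetAddressBytes().CopyTo(header, 20);
        BinaryPrimitives.WriteUInt16BigEndian(header.AsSpan(24, 2), (ushort)source.Port);
        BinaryPrimitives.WriteUInt16BigEndian(header.AsSpan(26, 2), (ushort)destination.Port);
        return header;
    }
}
```

The sender writes these 28 bytes once, immediately after the connection is established and before any TLS or HTTP bytes.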

In HttpClient this would be implemented at the proposed Dialer or Bedrock layer, sending the blob just after the connection was established.

[Investigation] Reducing allocations in HttpClient

We believe there are some gains to be found in reducing allocations in HttpClient, particularly for proxy scenarios where it may be reasonable to bypass header validation and other HttpClient features designed for more general-purpose use cases.

Enable Arcade, syncing and CI/CD

We should install dotnet/arcade into this repo, enable syncing to AzDO and CI/CD.

TODO:

Install Arcade (#29)
Flow new Arcade dependencies into this repo (Blocked waiting for maestro app to be enabled in the microsoft/reverse-proxy repo)
Add repo to code mirroring (see dotnet/versions#567 for reference).
Set up Azure DevOps YAML: https://github.com/dotnet/arcade/blob/master/Documentation/AzureDevOps/AzureDevOpsOnboarding.md (#31)
Enable public AzDo pipeline (#31)
Enable internal AzDo pipeline

Proxy supports static config files, with hot reload for handling changes

Config based proxies are common and we'll need to support at least basic proxy scenarios from config. Here are some initial considerations:

Define routes based on host and/or path
A restart should not be needed to pick up config changes
You should be able to augment a route with code/middleware. Kestrel does something similar using named endpoints.
List multiple back-ends per route for load balancing
Configure a load balancing strategy

Add 3pn notice

We use third-party packages (for now) so we should add a NOTICE file before going public.

Things we use:

https://www.nuget.org/packages/Superpower

Adopt .NET 5.0 builds

We should plug in to the dependency flow and start adopting 5.0 nightly builds.

Routes can be fully dynamic and discovered on a per request basis for hyper scale hosting

The current proxy routing design assumes all route data is available up front and can be loaded atomically into the route table. The EndpointDataSource routing model assumes this as well.

For more dynamically-configured services, we probably don't want to fetch the entire route table from the db, but instead have a pull-and-cache model for each hostname.

How can we lazy-load route, backend, and endpoint data on a per request basis?

Assume we're working with a discrete key like Host, not a complicated best-match scenario like Path, and little or no wildcard support (maybe wildcard subdomains but only for specific depths?).
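One possible shape for such a per-host pull-and-cache lookup is sketched below, assuming an exact-match key and a caller-supplied loader (for example, a database query). `HostRouteCache` and its members are hypothetical names. Expiry and wildcard matching are deliberately left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical sketch: loads route data for a host on first use and caches it.
public sealed class HostRouteCache<TRoute>
{
    private readonly ConcurrentDictionary<string, Lazy<Task<TRoute>>> _routes =
        new(StringComparer.OrdinalIgnoreCase);
    private readonly Func<string, Task<TRoute>> _loadRoute;

    public HostRouteCache(Func<string, Task<TRoute>> loadRoute) => _loadRoute = loadRoute;

    // Lazy<T> ensures concurrent first requests for a host share a single load.
    public Task<TRoute> GetAsync(string host) =>
        _routes.GetOrAdd(host, h => new Lazy<Task<TRoute>>(() => _loadRoute(h))).Value;

    // Drops the cached entry so the next request reloads it.
    public void Invalidate(string host) => _routes.TryRemove(host, out _);
}
```

A failed load stays cached as a faulted task until `Invalidate` is called, so a real implementation would also need eviction on failure and some form of expiry.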

@davidni is this a scenario you'd evaluated?

@rynowak can such a system work within routing, co-exist side by side routing, or would it completely replace routing?

Collect test logs on AzDo

Follow up to #2

In #35, there was a failing test, but no server-side logs. https://dev.azure.com/dnceng/public/_build/results?buildId=584078&view=logs&j=37a405a8-c7d5-566e-a69b-63119597d487

Proxy can front multiple sites and route based on SNI/Host

A proxy will often want to route based on the host. It's possible today but it's clunky.
dotnet/aspnetcore#19354

Prioritization in Http2 requests

HTTP/2 has the concept of prioritizing streams (requests) within a connection. How do we want to think about this with respect to the proxy front end when it has an HTTP/2 or HTTP/3 connection? As we terminate the end-user request and create new requests, potentially distributed across multiple servers, any prioritization signals will be lost. As we queue requests for processing, or pack responses into the response stream, do we need to prioritize which responses get streamed first?

https://blog.cloudflare.com/adopting-a-new-approach-to-http-prioritization/

Use HttpCompletionOption.ResponseHeadersRead in HttpClient.SendAsync

Does setting HttpCompletionOption.ResponseHeadersRead in HttpClient.SendAsync improve performance?

reverse-proxy/src/ReverseProxy.Core/Service/Proxy/HttpProxy.cs

Line 158 in bc92983

    
           var upstreamResponse = await httpClient.SendAsync(upstreamRequest, shortCancellation);

Remove FluentAssertions

This testing style and library are inconsistent with our other projects.

Config-based request and response header transformation

In Island Gateway we implemented the notion of Transformations, which takes inspiration from Traefik's middlewares (though we're not calling them middlewares for obvious reasons). See e.g. Traefik's StripPrefix.

A user can specify e.g. Transformations=StripPrefix('/api'), AddPrefix('/v2'), which results in the listed processing steps being applied sequentially at runtime.

The ability to specify simple transformations like these through config is a requirement for us.

Example of a route using transformations:

Rule            = Host('example.com') && Path('/myservice/{**catchall}')
Transformations = StripPrefix('/myservice')

(this is a feature we already implemented in Island Gateway, filing here as requested to track)
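The sequential application of transformations described above can be sketched as a list of path functions folded over the incoming path. This is not the Island Gateway implementation. `PathTransformations` and its members are hypothetical names, and parsing of the config syntax is omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of sequential path transformations.
public static class PathTransformations
{
    // Removes the prefix only when it matches on a path-segment boundary.
    public static Func<string, string> StripPrefix(string prefix) => path =>
    {
        var matches = path.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
            && (path.Length == prefix.Length || path[prefix.Length] == '/');
        if (!matches)
        {
            return path;
        }

        var rest = path.Substring(prefix.Length);
        return rest.Length == 0 ? "/" : rest;
    };

    public static Func<string, string> AddPrefix(string prefix) => path => prefix + path;

    // Applies each step to the output of the previous one, in order.
    public static string Apply(string path, IEnumerable<Func<string, string>> steps) =>
        steps.Aggregate(path, (current, step) => step(current));
}
```

With the steps `StripPrefix("/api"), AddPrefix("/v2")`, the path `/api/users` becomes `/users` and then `/v2/users`.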

[Tracking] Kestrel config asks

Here are things we expect to need expanded on for Kestrel's existing config system:

Endpoint config reload: dotnet/aspnetcore#19376
Backpressure: dotnet/aspnetcore#13295
- Planned for 6.0
[EPIC] HTTPS and Certificate Handling in Kestrel: dotnet/aspnetcore#21512
- SNI config: dotnet/aspnetcore#15144 (Working)
  - Tracked individually by #86
  - In 5.0.0-preview8 milestone
- Client cert mode: dotnet/aspnetcore#18660
- PEM certs: dotnet/aspnetcore#4706
  - PR open by @javiercn: dotnet/aspnetcore#23584

Proxy has cloud scale performance, and is benchmarked in the lab

Ongoing effort to analyze perf of the proxy.

Add micro benchmarks #223
Add more benchmark scenarios #222
Poor performance and errors when proxying https #355
Profile Memory #221
Do not flow the Connection header? #439

Enable IL Linker and Single-File in the Proxy

We should enable the IL Linker and Single-File in the Proxy "app" (whatever that ends up being) and make sure we are tracking file size. There's a desire to have a low disk footprint, especially since the Proxy is a more static component than an arbitrary app.

Get YARP running in the Benchmark lab

We need to get YARP running in the benchmark lab (https://msit.powerbi.com/view?r=eyJrIjoiYTZjMTk3YjEtMzQ3Yi00NTI5LTg5ZDItNmUyMGRlOTkwMGRlIiwidCI6IjcyZjk4OGJmLTg2ZjEtNDFhZi05MWFiLTJkN2NkMDExZGI0NyIsImMiOjV9).

Custom request/response modifications via code transforms

Ability to customize headers (beyond x-forward or affinity) from code extensibility, for both the request to the back end and the response to the client.

Load balancer config extension

Enable custom metadata in backend endpoint definition so a custom load balancer could do A/B switching etc.

Note: This task is about ensuring the plumbing is available to support this scenario, but does not involve creating a custom load balancer

[Tracking] SNI handling improvements

Asynchronous Certificate selection callback
Allow customizing of crypto parameters based on SNI

Write a good README

We need a good README.

[Tracking] HttpClient Dialers

Motivation: Select a specific NIC for certain outbound requests
dotnet/runtime#28721

Proxy supports dynamic configuration pulling from other sources

There are 3 ways I can see routes being defined for a proxy:
a) In config via a config file, changeable at runtime - this is what most proxies do today
b) Statically defined in code - setting up static endpoints similar to the sample startup.cs
c) Dynamically created by code. This seems to be what most of the Azure partners need. At startup, and periodically the code will need to query a backend to fetch both the Endpoints to listen to, and where each of those should be routed.
[added from discussion below]
d) When there are too many routes to load them all, we need to be able to query a database per request to load and cache routes. From our research, more Azure partners do this than (c).

This last bucket is where we need to shine and differentiate ourselves from the competition.
It will need to account for:

Listening on new IP Addresses / ports
SNI and being able to dynamically lookup a cert based on the client hello
route based on the hierarchy of the URI path & query string

For some apps, they may need to do a DB lookup to discover the cert/route based on the URL being passed - so will need to have a cache and fallback mechanism.

Performance would dictate you probably need optimized structures like a trie, that I suspect are behind the existing routing tables. Either we need APIs to be able to dynamically edit the tables, and/or create and replace the tables, or be able to use the same concepts from custom app code for the proxy.

Proxy supports multiple algorithms for load balancing across destinations

Complete the set of common load balancing schemes

Round Robin #72
Random #73
Least loaded of 2 random #75

~~First healthy endpoint~~ (already implemented)
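For the "least loaded of 2 random" scheme, a minimal sketch is to sample two destinations at random and keep the one with fewer in-flight requests. The `Destination` type and its `ActiveRequests` counter are hypothetical, and the counter is assumed to be incremented and decremented by the proxying code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical destination type; ActiveRequests is a field so it can be updated via Interlocked.
public sealed class Destination
{
    public Destination(string name) => Name = name;
    public string Name { get; }
    public int ActiveRequests;
}

public static class PowerOfTwoChoices
{
    public static Destination Pick(IReadOnlyList<Destination> destinations, Random random)
    {
        if (destinations.Count == 0)
        {
            throw new InvalidOperationException("No destinations available.");
        }

        // Sample twice (with replacement) and keep the less loaded of the two.
        var first = destinations[random.Next(destinations.Count)];
        var second = destinations[random.Next(destinations.Count)];
        return Volatile.Read(ref first.ActiveRequests) <= Volatile.Read(ref second.ActiveRequests)
            ? first
            : second;
    }
}
```

Compared with scanning every destination for the global minimum, this reads only two counters per request, so it scales well with the number of destinations while still avoiding the most heavily loaded ones most of the time.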

Proxy can be used in a hosted datacenter environment with limitations such as SNAT

Customers behind certain firewalls or load balancers, such as Azure's SNAT, have a limit on the number of connections they can make per IP/port endpoint.

HttpClient has a MaxConnectionsPerServer setting, but the limit is not actually applied per-server, but rather per-pool, with each pool being partitioned by a number of connection properties:

Kind, describing the type of connection (HTTP, HTTPS, Proxied HTTPS, etc.)
Host, the hostname in the request URI.
Port, the port in the request URI.
SslHostName, the SNI value associated with the connection (might be hostname from URI, might be Host header)
ProxyUri, the proxy that is being used to transport the connection.
Identity, the user the connection was authenticated over.

It is unreasonable to apply a meaningful SNAT limit on top of HttpClient because the client will pool connections and does not expose connection lifetime events to the user.

Active Health Checks of destinations

Create a simple out-of-band check (not part of proxied requests) to confirm the health status of back ends.

a) Polling back ends to see if they are alive
b) Eliminating dead entries from load balancing

[Tracking] Early filtering SSL

Be able to filter connections at SSL connection time to prevent DoS attacks

Review host matching validation

The RouteValidator uses a strict regex to check the host matching pattern. I have my doubts that this level of validation is even needed. The source is developer input, not external input, and it's only used in the route matcher (HostAttribute). We should be careful about second guessing what the HostMatcherPolicy can handle. For example the regex doesn't allow IDNs (Unicode hosts).

Preferably we'd use a validation helper provided directly by the framework so they'd be in sync.

Import IslandGateway code

Import the code from the IslandGateway project provided by our internal partners.

Simple circuit breaking (passive health-check)

Mark a back end node unhealthy when a request fails.

There will typically be a bunch of policy about how many failures are required to trip the circuit and when to retry the back end to see if it has recovered. This item is to ensure we have the plumbing to be able to do circuit breaking, but not necessarily to create an all-encompassing circuit breaking module.
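As one example of such a policy, the sketch below trips after a configurable number of consecutive failures and allows traffic again once a retry interval has passed. `PassiveHealthTracker` and its parameters are hypothetical, and time is passed in explicitly so the behavior is easy to test.

```csharp
using System;

// Hypothetical per-backend tracker: consecutive-failure threshold plus a retry interval.
public sealed class PassiveHealthTracker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _retryAfter;
    private readonly object _gate = new object();
    private int _consecutiveFailures;
    private DateTimeOffset _openedAt;

    public PassiveHealthTracker(int failureThreshold, TimeSpan retryAfter)
    {
        _failureThreshold = failureThreshold;
        _retryAfter = retryAfter;
    }

    // Available while below the threshold, or once the retry interval has elapsed.
    public bool IsAvailable(DateTimeOffset now)
    {
        lock (_gate)
        {
            return _consecutiveFailures < _failureThreshold || now - _openedAt >= _retryAfter;
        }
    }

    public void ReportSuccess()
    {
        lock (_gate)
        {
            _consecutiveFailures = 0;
        }
    }

    public void ReportFailure(DateTimeOffset now)
    {
        lock (_gate)
        {
            _consecutiveFailures++;
            if (_consecutiveFailures >= _failureThreshold)
            {
                _openedAt = now; // (re)start the retry interval
            }
        }
    }
}
```

After the retry interval, a single further failure closes the backend off again for another interval, while a success resets the count. A stricter design would admit only one probe request at that point.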

Add CategoryName To IOperationLogger

What should we add or change to make your life better?

Create and use IOperationLogger<TCategoryName> instead of IOperationLogger and make the implementations use ILogger<TCategoryName> instead of ILogger.

Why is this important to you?

It's easier to follow logs.

Implement inbound PPv2

https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt

In Kestrel this would be implemented as a connection middleware. We helped an internal team build this middleware already and they're using it with an existing Azure network service that sends that data.

I'll file a separate issue for outbound support as it would need to integrate with HttpClient.

Feature: lifecycle management for zero downtime distributed hosting

When running at scale, gateway instances must report their health state indicating whether an instance is willing or not to receive new traffic. See e.g. Azure Load Balancer health probes, which would call the Gateway periodically.

It makes sense for the Reverse Proxy core to support this since this requires careful coordination of startup / teardown activities, and it also influences runtime gateway behavior. E.g. during graceful shutdown, Gateway must remain serving new requests, while reporting its upcoming teardown. It must also advise clients to close existing connections by sending a Connection: close response header when applicable during graceful shutdown, while delaying teardown until (a) all requests are drained; and (b) Load Balancer has detected the instance is not eligible for new traffic.

We have implemented this in Island Gateway and I can share more details if desired.

Style conversion

In general, we'd like this to adhere to the same style "guidelines" as .NET Core (which aren't formally documented).

We should also clean up things like tests to avoid using patterns we generally don't use (like FluentAssertions).

Finally we should include an .editorconfig that can enforce our style and ensure contributors can just open the solution and start working without worrying about style.

How to break up the proxy monolith

Today routing hands the request off to the ProxyInvoker. This orchestrator is a bit monolithic and has limited pluggability. It is given a list of healthy endpoints from the health service, invokes a load balancer service to resolve an endpoint, and then invokes the proxy logic.

In design meetings we've proposed the concept of a pipeline to allow developers to insert customizations into this process. However, there's a specific service contract between each of these components, not a generic contract between all of them, and it's not well suited to making this a generic pipeline.

Options:
A) Try to refactor this to a more generic object model? (e.g. middleware)
B) Insert targeted extensibility at specific points? (e.g. the authentication handler events model)

Additionally, is it possible to do this customization per route? Or is there a central pipeline that can react to endpoint specific metadata?

Proxy supports Session Affinity to route subsequent requests to the same host

We should support session affinity as a native feature in the proxy. It's effectively a load balancer directive. Some attribute of the incoming request is used as a key to identify which backend to route to.

Affinity logic needs to run:

Before load balancing, to check the affinity "stamp" and select a single destination (i.e. one of the servers within a backend) (if possible)
After load balancing, to stamp the selected destination into a cookie, if enabled.

It also may need to be able to re-trigger load balancing (in the failure modes below).

Supported "modes" (please do suggest others!):

Cookie
- A request with no incoming "stamp" is assigned to a destination by normal load balancing policy
- The return response is stamped with a Cookie containing the destination ID, cryptographically signed by a key trusted by all proxy instances (DataProtection!)
- On future requests, we validate the stamp and select the specific destination for routing
- If that destination is no longer in the active set of destinations, we "fail" (see failure modes)
5-tuple (Source IP, Source Port, Dest IP, Dest Port, Protocol)
- Hash the 5-tuple values and use it to identify a destination.
- If that destination is no longer in the active set of destinations, we "fail" (see failure modes)
Header value
- Hash some header value and use it to identify a destination
- Useful for auth-token based affinity (used in Azure SignalR, for example)
~~Crazy idea: magic SignalR mode?~~ We should track this separately, just wanted to see if anyone had C O O L thoughts here.
- It could be cool to have native support for session affinity for SignalR connections.
- There's a connection ID in the initial /negotiate response, which is also present in the query string of all future requests, which we could use for affinity.
- This could be very useful for applications which don't generally need affinity, but also use SignalR (which requires affinity).
- Could possibly be generalized

Failure modes

If the session affinity system identifies a target destination which is no longer routable, a configurable failure response should be used
Possible responses
- Re-affinitize to one of the other destinations (i.e. re-route and stamp the new backend)
- Return a 503 (since the target destination is no longer active)
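To illustrate the header-value mode together with the two failure responses, here is a sketch that hashes a key to one of the configured destinations and applies a failure policy when that destination is no longer active. All names (`HeaderAffinity`, `AffinityFailurePolicy`) are hypothetical. Using SHA-256 for a stable hash and re-hashing over the active set on failure are assumptions of this example, not a design decision from the discussion.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public enum AffinityFailurePolicy { Redistribute, Return503 }

// Hypothetical sketch of hash-based affinity with a configurable failure response.
public static class HeaderAffinity
{
    // Returns the destination to use, or null when the caller should respond with 503.
    public static string? Resolve(
        string key,
        IReadOnlyList<string> configured,
        IReadOnlyList<string> active,
        AffinityFailurePolicy policy)
    {
        if (configured.Count == 0)
        {
            return null;
        }

        var target = configured[Bucket(key, configured.Count)];
        if (active.Contains(target))
        {
            return target;
        }

        // The affinitized destination is gone: apply the configured failure mode.
        if (policy == AffinityFailurePolicy.Return503 || active.Count == 0)
        {
            return null;
        }

        return active[Bucket(key, active.Count)];
    }

    // Stable across processes and machines, unlike string.GetHashCode().
    private static int Bucket(string key, int count)
    {
        using var sha = SHA256.Create();
        var hash = sha.ComputeHash(Encoding.UTF8.GetBytes(key));
        return (int)(BinaryPrimitives.ReadUInt32BigEndian(hash) % (uint)count);
    }
}
```

A stable hash matters here because every proxy instance has to map the same key to the same destination. Plain modulo hashing remaps many keys when the configured list changes, which a consistent-hashing scheme would reduce.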

These are mostly just my rando thoughts. I'm interested in @davidni 's thoughts, as well as the team's (@Tratcher @halter73)

Rename everything

We need to rename the IslandGateway projects to the new name. ~~Similarly we need to rename the repo.~~ We're going to hold off on renaming the repo.

Logging review

We should do a pass over the logging statements and fix up a few things:

Ensure every event has a Name. I don't feel a need for a numeric ID, but I'm fine either way. Names are easier to keep unique than IDs.
Use LoggerMessage.Define where reasonable.
