Coder Social home page Coder Social logo

alphagov / router Goto Github PK

View Code? Open in Web Editor NEW
213.0 78.0 23.0 16.74 MB

HTTP router in front of GOV.UK to proxy to backend servers on a single domain.

Home Page: https://docs.publishing.service.gov.uk/repos/router.html

License: MIT License

Makefile 0.60% Go 97.55% Dockerfile 0.72% Shell 1.12%
govuk container

router's People

Contributors

agadufrat avatar alangabbianelli avatar alext avatar bilbof avatar boffbowsh avatar bradwright avatar brucebolt avatar cbaines avatar chao-xian avatar dcarley avatar dependabot[bot] avatar djsd123 avatar gpeng avatar hannako avatar karlbaker02 avatar kevindew avatar kushalp avatar mattbostock avatar nickstenning avatar nsabri1 avatar richardtowers avatar samsimpson1 avatar sengi avatar surminus avatar tetrino avatar theseanything avatar thomasleese avatar tijmenb avatar timblair avatar tlwr avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

router's Issues

triemux: (*Mux).ServeHTTP could be sped up by hoisting out and compiling shouldRedirToLowercasePath's regular expression

Noticed while reading through this code that we've got this pattern

router/triemux/mux.go

Lines 68 to 70 in 919496a

func shouldRedirToLowercasePath(path string) (match bool) {
match, _ = regexp.MatchString(`^\/[A-Z]+[A-Z\W\d]+$`, path)
return
in which every invocation of (*Mux).ServeHTTP goes through that code hence a hot path

router/triemux/mux.go

Lines 49 to 52 in 919496a

if shouldRedirToLowercasePath(r.URL.Path) {
mux.downcaser.ServeHTTP(w, r)
return
}

Inefficiency

That code above in every single invocation calls regexp.MatchString which firstly has to freshly compile the *regexp.Regexp object then discard it after a mere match, at many calls such as for what this code is used (for the country's digital services), this inefficiency consumes a whole lot of CPU and RAM; and you can verify with this simple benchmark

package triemux

import "testing"

var sink any = nil

var paths = []string{
	"/GOVERNMENT/GUIDANCE",
	"/GoVeRnMeNt/gUiDaNcE",
	"",
	"1234",
	"abc1234",
	"abc1234DE",
	"                                                    ",
	"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa  aaaaaaaaaa",
}

func BenchmarkShouldRedirectToLowercasePath(b *testing.B) {
	b.ReportAllocs()

	for i := 0; i < b.N; i++ {
		for _, path := range paths {
			sink = shouldRedirToLowercasePath(path)
		}
	}

	if sink == nil {
		b.Fatal("Benchmark did not run!")
	}

	sink = nil
}

CPU profile

Screenshot 2024-04-03 at 6 22 34 AM

Memory profile

Screenshot 2024-04-03 at 6 23 47 AM

Remedy

That regular expression can be hoisted out and compiled once then reused each time with the (*regexp.Regexp).MatchString method per

diff --git a/triemux/mux.go b/triemux/mux.go
index 57e09d1..b410b51 100644
--- a/triemux/mux.go
+++ b/triemux/mux.go
@@ -59,6 +59,8 @@ func (mux *Mux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
 	handler.ServeHTTP(w, r)
 }
 
+var reShouldRedirect = regexp.MustCompile(`^\/[A-Z]+[A-Z\W\d]+$`)
+
 // shouldRedirToLowercasePath takes a URL path string (such as "/government/guidance")
 // and returns:
 //   - true, if path is in all caps; for example:
@@ -66,8 +68,7 @@ func (mux *Mux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
 //   - false, otherwise; for example:
 //     "/GoVeRnMeNt/gUiDaNcE" -> false (should forward "/GoVeRnMeNt/gUiDaNcE" as-is)
 func shouldRedirToLowercasePath(path string) (match bool) {
-	match, _ = regexp.MatchString(`^\/[A-Z]+[A-Z\W\d]+$`, path)
-	return
+	return reShouldRedirect.MatchString(path)
 }
 
 // lookup finds a URL path in the Mux and returns the corresponding handler.

Improvement results

The benchmark shows a drastic performance hit in all dimensions

$ benchstat before.txt after.txt 
name                             old time/op    new time/op    delta
ShouldRedirectToLowercasePath-8    29.8µs ± 1%     0.8µs ±24%   -97.15%  (p=0.000 n=8+10)

name                             old alloc/op   new alloc/op   delta
ShouldRedirectToLowercasePath-8    33.7kB ± 0%     0.0kB       -100.00%  (p=0.000 n=9+10)

name                             old allocs/op  new allocs/op  delta
ShouldRedirectToLowercasePath-8       360 ± 0%         0       -100.00%  (p=0.000 n=10+10)

and now all the time is spent properly matching the string and not in trying to compile and then discard the regular expression object.

CPU profile

Screenshot 2024-04-03 at 6 29 51 AM

Memory profile

When the benchmark runs, that path no longer consumes any RAM
Screenshot 2024-04-03 at 6 31 19 AM

Bug: incoming URLs are decoded before being looked up

Problem example: https://www.gov.uk/vehicle%20tax should match against the route defined here but doesn't.

If a request URL has a space, encoded as %20, it won't be matched a route. This is because the router is decoding the request URL before matching against the encoded list here.

Routes in the database should be in their encoded form.

One option is to switch the call to something which returns the encoded version, but this will probably require an upgrade to Go 1.5. @alext has started doing some thinking about this.

CodeQL check keeps breaking with "configuration not found"

Something keeps breaking the CodeQL pre-merge check on this repo.

I think this is the third time it's happened now. Last couple of times we fixed it by disabling and re-enabling CodeQL in the repo settings.

Not sure if it's just GitHub being flaky or some other config management removing the CodeQL config for Go.

1 configuration not found

Warning: Code scanning cannot determine the alerts introduced by this pull request, because 1 configuration present on refs/heads/main was not found:
Default setup

❓ /language:go

For example on this run.

Use sync/atomic for atomicity

Hi!

I believe you need to use sync/atomic to implement some of the operations here. Since you are reading/writing this variable concurrently, you should be using sync.{Load,Store}Pointer to ensure atomicity, this will avoid torn reads.

Unfortunately, this is somewhat ugly; you have to use unsafe.Pointer to use the sync/atomic API. I recommend writing an atomicSetMap() and atomicGetMap() functions to contain the ugliness.

@dvyukov can probably tell me if I'm being too paranoid.

Requests being routed to nonexistent backend `whitehall-frontend`.

We've been logging about 1100 of these per day since whitehall-frontend was decommissioned:

Representative stack trace:
*net.OpError: dial tcp: lookup whitehall-frontend.production.govuk-internal.digital on 172.20.0.10:53: no such host
  File "github.com/alphagov/router/logger/sentry.go", line 74, in NotifySentry
  File "github.com/alphagov/router/handlers/backend_handler.go", line 164, in (*backendTransport).RoundTrip
  File "net/http/httputil/reverseproxy.go", line 473, in (*ReverseProxy).ServeHTTP
  File "github.com/alphagov/router/triemux/mux.go", line 50, in (*Mux).ServeHTTP
  File "github.com/alphagov/router/lib/router.go", line 150, in (*Router).ServeHTTP
  File "net/http/server.go", line 2938, in serverHandler.ServeHTTP
  File "net/http/server.go", line 2009, in (*conn).serve

Sentry issue for cross-referencing (internal-only, sorry)

Suspect there's some dangling bit of config somewhere, at least a dead route in the Router database but there may well be other cleanup upstream that was missed.

Development is not possible using govuk-docker

The main branch of govuk-docker is now M1-compatible, and part of that is removing the mongo-2.6 container and pointing router to mongo-3.6. This will work for general use, but the router-3.6 container in govuk-docker is not set up to use a replica set, which docker relies on to detect updates to the data. This will causes failures in the following situations:

  • developers actively working on router will not be able to run the test suite
  • developers using the stack in govuk-docker who publish new content and expect it to appear in their local dev environment through router (rather than directly accessed) will not see that new content.

This issue is related in part to this one: #210

Developers needing a workaround should try this govuk-docker branch which sets mongo-3.6 to use the replica set:
https://github.com/alphagov/govuk-docker/tree/m1-compatibility-keith

Alternatively, work in this router branch will render this replica-set requirement moot when complete:
https://github.com/alphagov/router/tree/change-stream

(This issue repeated in govuk-docker since it's not 100% clear where it should be, since the fix is in both places: alphagov/govuk-docker#591)

All logging goes to error log

There's an environment variable which is set to /var/log/router/errors.json.log which is what every log.Println uses, which includes stuff that isn't errors. This is confusing.

We should use standard out and standard error instead.

Integration tests fail (in GOV.UK Docker)

This regression was introduced in: 678ef63

It looks like the new code requires the Mongo DB to be a replica set. The test failures don't give any insight into the cause of the error, probably because the router is being run as a separate binary, and doesn't seem to be logging its errors anywhere.

Example failure:

Redirection
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:10
  prefix redirects
  /go/src/github.com/alphagov/router/integration_tests/redirect_test.go:71
    should not preserve the query string when redirecting if specified [It]
    /go/src/github.com/alphagov/router/integration_tests/redirect_test.go:106

    Expected
        <string>: 
    to equal
        <string>: /baz

    /go/src/github.com/alphagov/router/integration_tests/redirect_test.go:108

List of more failures:


[Fail] Redirection prefix redirects [It] should not preserve the query string when redirecting if specified 
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:108

[Fail] Redirection prefix redirects [It] should contain cache headers of 30 mins 
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:113

[Fail] Redirection prefix redirects [It] should handle path-preserving redirects with special characters 
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:129

[Fail] Redirection external redirects exact redirect [It] should redirect to the external URL 
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:146

[Fail] Redirection external redirects exact redirect [It] should not preserve the query string by default 

I get a more informative error if I start the app manually:

router-app_1                                | 2021/05/14 08:39:53 router: listening for refresh on :3055
router-app_1                                | 2021/05/14 08:39:53 router: starting self-update process, polling for route changes every 2s
router-app_1                                | 2021/05/14 08:39:55 router: couldn't get replica set status from MongoDB, skipping update (error: not running with --replSet)

I tried to fix this manually in GOV.UK Docker, and while the app started without errors, the integration tests are still failing for some unknown reason, due to the logs being mysteriously absent.

I think this is equally an issue for anywhere trying to develop this repo without GOV.UK Docker, since the README doesn't provide any specialist instructions about installing or configuring MongoDB.

I'd say this issue is resolved when:

  • We can see error logs from the router when we run integration tests. It may be these are being output somewhere; we could document where they're going, or change the file path so it's more obvious.

  • I can run integration tests in GOV.UK Docker and they pass, with the reasonable exception of the performance tests.

Integration tests fail to start up on GitHub Actions.

mongo.sh no longer works flakes on GitHub Actions, preventing the integration tests from running.

Waiting up to 10 s for for successful rs.initiate() done
Waiting up to 10 s for for healthy rs.status()...........gave up
make: *** [Makefile:34: start_mongo] Error 1

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.