alphagov / router Goto Github PK
View Code? Open in Web Editor NEWHTTP router in front of GOV.UK to proxy to backend servers on a single domain.
Home Page: https://docs.publishing.service.gov.uk/repos/router.html
License: MIT License
HTTP router in front of GOV.UK to proxy to backend servers on a single domain.
Home Page: https://docs.publishing.service.gov.uk/repos/router.html
License: MIT License
Noticed while reading through this code that we've got this pattern
Lines 68 to 70 in 919496a
Lines 49 to 52 in 919496a
That code above in every single invocation calls regexp.MatchString which firstly has to freshly compile the *regexp.Regexp object then discard it after a mere match, at many calls such as for what this code is used (for the country's digital services), this inefficiency consumes a whole lot of CPU and RAM; and you can verify with this simple benchmark
package triemux
import "testing"
var sink any = nil
var paths = []string{
"/GOVERNMENT/GUIDANCE",
"/GoVeRnMeNt/gUiDaNcE",
"",
"1234",
"abc1234",
"abc1234DE",
" ",
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaa",
}
func BenchmarkShouldRedirectToLowercasePath(b *testing.B) {
b.ReportAllocs()
for i := 0; i < b.N; i++ {
for _, path := range paths {
sink = shouldRedirToLowercasePath(path)
}
}
if sink == nil {
b.Fatal("Benchmark did not run!")
}
sink = nil
}
That regular expression can be hoisted out and compiled once then reused each time with the (*regexp.Regexp).MatchString method per
diff --git a/triemux/mux.go b/triemux/mux.go
index 57e09d1..b410b51 100644
--- a/triemux/mux.go
+++ b/triemux/mux.go
@@ -59,6 +59,8 @@ func (mux *Mux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
handler.ServeHTTP(w, r)
}
+var reShouldRedirect = regexp.MustCompile(`^\/[A-Z]+[A-Z\W\d]+$`)
+
// shouldRedirToLowercasePath takes a URL path string (such as "/government/guidance")
// and returns:
// - true, if path is in all caps; for example:
@@ -66,8 +68,7 @@ func (mux *Mux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
// - false, otherwise; for example:
// "/GoVeRnMeNt/gUiDaNcE" -> false (should forward "/GoVeRnMeNt/gUiDaNcE" as-is)
func shouldRedirToLowercasePath(path string) (match bool) {
- match, _ = regexp.MatchString(`^\/[A-Z]+[A-Z\W\d]+$`, path)
- return
+ return reShouldRedirect.MatchString(path)
}
// lookup finds a URL path in the Mux and returns the corresponding handler.
The benchmark shows a drastic performance hit in all dimensions
$ benchstat before.txt after.txt
name old time/op new time/op delta
ShouldRedirectToLowercasePath-8 29.8µs ± 1% 0.8µs ±24% -97.15% (p=0.000 n=8+10)
name old alloc/op new alloc/op delta
ShouldRedirectToLowercasePath-8 33.7kB ± 0% 0.0kB -100.00% (p=0.000 n=9+10)
name old allocs/op new allocs/op delta
ShouldRedirectToLowercasePath-8 360 ± 0% 0 -100.00% (p=0.000 n=10+10)
and now all the time is spent properly matching the string and not in trying to compile and then discard the regular expression object.
When the benchmark runs, that path no longer consumes any RAM
Problem example: https://www.gov.uk/vehicle%20tax
should match against the route defined here but doesn't.
If a request URL has a space, encoded as %20
, it won't be matched a route. This is because the router is decoding the request URL before matching against the encoded list here
.
Routes in the database should be in their encoded form.
One option is to switch the call to something which returns the encoded version, but this will probably require an upgrade to Go 1.5. @alext has started doing some thinking about this.
Something keeps breaking the CodeQL pre-merge check on this repo.
I think this is the third time it's happened now. Last couple of times we fixed it by disabling and re-enabling CodeQL in the repo settings.
Not sure if it's just GitHub being flaky or some other config management removing the CodeQL config for Go.
1 configuration not found
Warning: Code scanning cannot determine the alerts introduced by this pull request, because 1 configuration present on refs/heads/main was not found:
Default setup❓ /language:go
For example on this run.
Hi!
I believe you need to use sync/atomic to implement some of the operations here. Since you are reading/writing this variable concurrently, you should be using sync.{Load,Store}Pointer to ensure atomicity, this will avoid torn reads.
Unfortunately, this is somewhat ugly; you have to use unsafe.Pointer to use the sync/atomic API. I recommend writing an atomicSetMap() and atomicGetMap() functions to contain the ugliness.
@dvyukov can probably tell me if I'm being too paranoid.
We've been logging about 1100 of these per day since whitehall-frontend
was decommissioned:
*net.OpError: dial tcp: lookup whitehall-frontend.production.govuk-internal.digital on 172.20.0.10:53: no such host
File "github.com/alphagov/router/logger/sentry.go", line 74, in NotifySentry
File "github.com/alphagov/router/handlers/backend_handler.go", line 164, in (*backendTransport).RoundTrip
File "net/http/httputil/reverseproxy.go", line 473, in (*ReverseProxy).ServeHTTP
File "github.com/alphagov/router/triemux/mux.go", line 50, in (*Mux).ServeHTTP
File "github.com/alphagov/router/lib/router.go", line 150, in (*Router).ServeHTTP
File "net/http/server.go", line 2938, in serverHandler.ServeHTTP
File "net/http/server.go", line 2009, in (*conn).serve
Sentry issue for cross-referencing (internal-only, sorry)
Suspect there's some dangling bit of config somewhere, at least a dead route in the Router database but there may well be other cleanup upstream that was missed.
Haven't been able to repro this locally yet, but it seems to be roughly 50/50 when running on GitHub. Example run (cancelled after 4h)
The main branch of govuk-docker is now M1-compatible, and part of that is removing the mongo-2.6 container and pointing router to mongo-3.6. This will work for general use, but the router-3.6 container in govuk-docker is not set up to use a replica set, which docker relies on to detect updates to the data. This will causes failures in the following situations:
This issue is related in part to this one: #210
Developers needing a workaround should try this govuk-docker branch which sets mongo-3.6 to use the replica set:
https://github.com/alphagov/govuk-docker/tree/m1-compatibility-keith
Alternatively, work in this router branch will render this replica-set requirement moot when complete:
https://github.com/alphagov/router/tree/change-stream
(This issue repeated in govuk-docker since it's not 100% clear where it should be, since the fix is in both places: alphagov/govuk-docker#591)
There's an environment variable which is set to /var/log/router/errors.json.log
which is what every log.Println
uses, which includes stuff that isn't errors. This is confusing.
We should use standard out and standard error instead.
This regression was introduced in: 678ef63
It looks like the new code requires the Mongo DB to be a replica set. The test failures don't give any insight into the cause of the error, probably because the router is being run as a separate binary, and doesn't seem to be logging its errors anywhere.
Example failure:
Redirection
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:10
prefix redirects
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:71
should not preserve the query string when redirecting if specified [It]
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:106
Expected
<string>:
to equal
<string>: /baz
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:108
List of more failures:
[Fail] Redirection prefix redirects [It] should not preserve the query string when redirecting if specified
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:108
[Fail] Redirection prefix redirects [It] should contain cache headers of 30 mins
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:113
[Fail] Redirection prefix redirects [It] should handle path-preserving redirects with special characters
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:129
[Fail] Redirection external redirects exact redirect [It] should redirect to the external URL
/go/src/github.com/alphagov/router/integration_tests/redirect_test.go:146
[Fail] Redirection external redirects exact redirect [It] should not preserve the query string by default
I get a more informative error if I start the app manually:
router-app_1 | 2021/05/14 08:39:53 router: listening for refresh on :3055
router-app_1 | 2021/05/14 08:39:53 router: starting self-update process, polling for route changes every 2s
router-app_1 | 2021/05/14 08:39:55 router: couldn't get replica set status from MongoDB, skipping update (error: not running with --replSet)
I tried to fix this manually in GOV.UK Docker, and while the app started without errors, the integration tests are still failing for some unknown reason, due to the logs being mysteriously absent.
I think this is equally an issue for anywhere trying to develop this repo without GOV.UK Docker, since the README doesn't provide any specialist instructions about installing or configuring MongoDB.
I'd say this issue is resolved when:
We can see error logs from the router when we run integration tests. It may be these are being output somewhere; we could document where they're going, or change the file path so it's more obvious.
I can run integration tests in GOV.UK Docker and they pass, with the reasonable exception of the performance tests.
mongo.sh no longer works flakes on GitHub Actions, preventing the integration tests from running.
Waiting up to 10 s for for successful rs.initiate() done
Waiting up to 10 s for for healthy rs.status()...........gave up
make: *** [Makefile:34: start_mongo] Error 1
When searching for my local library catalogue (https://www.gov.uk/search-library-catalogue/reading) the webpage it takes you to is http://reading-hip.epixtech.co.uk/ipac20/ipac.jsp?profile= which can't be reached.
epixtech.co.uk seems to be abandoned and not used anymore. Last update according to whois was back in 2015, no DNS data.
This seems to be the most relevant repository to report this issue to.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.