Comments (8)
A t2.medium sounds a bit optimistic; it's unclear whether it's too small for Headscale or for the test clients.
The error mentioned would mean one or more of:
- The node has gone away and it's not taking the update
- The node is reconnecting and the update is being sent to the "closed" version
- The node did not accept the message fast enough
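The three cases above can be illustrated with a small sketch. This is not Headscale's code (Headscale is written in Go); it is a hypothetical Python model that assumes each connected node has a bounded channel and that the sender gives up after a deadline. The names `Notifier`, `connect`, `send_update`, and the timeout value are invented for illustration.

```python
import queue


class Notifier:
    """Hypothetical model of a per-node update sender (not Headscale's API)."""

    def __init__(self, timeout=0.05):
        self.timeout = timeout  # seconds to wait for a node to accept an update
        self.channels = {}      # node id -> bounded queue for the current session

    def connect(self, node_id):
        # A reconnect replaces the old channel. Anything still holding the
        # old queue is now writing to a "closed" session.
        self.channels[node_id] = queue.Queue(maxsize=1)
        return self.channels[node_id]

    def disconnect(self, node_id):
        self.channels.pop(node_id, None)

    def send_update(self, node_id, update):
        ch = self.channels.get(node_id)
        if ch is None:
            return "not sent: node gone"
        try:
            # Blocks until the node has room, or gives up after the deadline.
            ch.put(update, timeout=self.timeout)
        except queue.Full:
            return "not sent: node too slow"
        return "sent"


notifier = Notifier()
old_session = notifier.connect("node-a")
first = notifier.send_update("node-a", "map-1")   # accepted
second = notifier.send_update("node-a", "map-2")  # queue never drained
new_session = notifier.connect("node-a")          # reconnect: old queue is stale
notifier.disconnect("node-a")
third = notifier.send_update("node-a", "map-3")   # node has gone away
```

In this model a client that is starved of CPU fails to drain its queue in time, and a server that is starved of CPU fails to service all queues in time; both show up as the same "not sent" result, which is why the error alone does not say which side is undersized.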
The problem here might be either that the Headscale machine does not have enough resources to maintain all of the connections, or that the VMs running hundreds of clients do not have enough resources to run them all.
The machine used in #1656 is significantly larger; it's probably a bit overspecced with the new alpha.
Have you tried the same with 0.22.3 (latest stable)? It is much less efficient, so it might struggle even more on a t2.medium.
from headscale.
Another important question is whether you are running SQLite or Postgres. If SQLite, try enabling WAL or switching to Postgres. It sounds like it could be a concurrency issue.
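For reference, WAL (write-ahead logging) is a SQLite journal mode that lets readers proceed while a write is in progress, and it is switched on with a single pragma. The sketch below uses Python's standard `sqlite3` module against a hypothetical database path; how Headscale itself exposes this setting is not covered in this thread, so treat the helper name and the path as assumptions.

```python
import sqlite3


def enable_wal(db_path):
    """Switch a SQLite database file to WAL and return the mode now in effect."""
    conn = sqlite3.connect(db_path)
    try:
        # The pragma returns the journal mode that is actually active,
        # so the caller can confirm the switch succeeded.
        mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    finally:
        conn.close()
    return mode
```

Calling `enable_wal("/path/to/db.sqlite")` (hypothetical path) should return `"wal"` for an on-disk database. The WAL setting is stored in the database file, so it persists across later connections.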
from headscale.
I'm experiencing the same issue on Postgres, using alpha 12 in a network of ~30 nodes, with a handful of ephemeral nodes coming and going through the day. I've seen both regular users on laptops and machines in the cloud connect to Headscale successfully but then fail to reach any other node in the network. Headscale outputs the same errors as stated at the beginning of the issue, though after digging through the new map session logic I'm unsure whether the error and the issue are related. If I were to guess, something is hanging in
Line 271 in 8571513
I had the problem with a laptop connecting to a remote machine, so I ran tailscale down && tailscale up on the remote machine, and that fixed the problem. I'm betting there is an issue with connection recovery in the notifier, either to the node or to the database. I'll dig through logs later in the evening.
from headscale.
did you verify that there was a problem with the connections between nodes, or are you saying that you do not expect any errors?
from headscale.
did you verify that there was a problem with the connections between nodes, or are you saying that you do not expect any errors?
I verified that there are two issues in the latest version:
(1) When 600 users join a single Headscale server, the error "ERR update not sent, context cancelled..." occurs in Headscale.
(2) Some of the joined 600 users are in an offline status when checked with headscale node list.
There are no issues with connections between users who are in an online status.
from headscale.
I am currently using SQLite (without the WAL option). I will rerun the same tests on a higher-performance instance using Postgres.
from headscale.
Please try with WAL first.
from headscale.
WAL on by default for SQLite is coming in #1985.
I will close this issue as it is more of a performance/scaling thing than a bug. We have a couple of hidden tuning options, which together with WAL might be good content for a "performance" or "scaling" guide in the future.
from headscale.