This ends up being two related bugs, both caused by the fact that we remove the pod IP from the stunnerd config as soon as the pod enters the terminating state, instead of waiting until it has actually terminated. This causes the following problems:
- Clients can no longer refresh existing permissions to terminating pods. This usually happens 1-5 mins after pod shutdown starts.
- Sending packets on existing connections to terminating pods immediately fails with a "peer port administratively prohibited" error.
The takeaway is that fixing this on the stunnerd side would be more difficult than expected.
from stunner.
Further investigation: it turns out the problem is that we're still using the old Endpoints API for backend pod discovery, which does not take terminating pods into account, in contrast to the modern EndpointSlice API, which does.
By default, Kubernetes removes all pod IPs from the Endpoints object that belong to "Terminating" backend pods. For the above example, after graceful shutdown starts we get an empty Endpoints resource:
apiVersion: v1
kind: Endpoints
metadata:
  name: media-plane
  namespace: default
  labels:
    app: media-plane
Since we use the endpoint IPs in this object for the permission request handlers and for port-range filtering, we immediately break all existing connections to "Terminating" pods.
The EndpointSlice API, however, returns all the pod IPs, including the Terminating ones:
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: media-plane-qzbrt
  namespace: default
  labels:
    app: media-plane
    kubernetes.io/service-name: media-plane
addressType: IPv4
ports:
- name: ""
  port: 9001
  protocol: UDP
endpoints:
- addresses:
  - 10.244.0.3
  conditions:
    ready: false
    serving: true
    terminating: true
  nodeName: stunner
Observe the endpoint IP 10.244.0.3 with terminating: true: that's our terminating backend pod.
So the solution is to migrate the gateway operator from the Endpoints API to the EndpointSlice API and then add the pod IPs of terminating pods to the list of permitted endpoints.
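To make the selection rule concrete, here is a minimal Python sketch of how the permitted peer IPs could be derived from an EndpointSlice. This is not the operator's actual code: the function name permitted_peer_ips and the include_terminating flag are hypothetical, and the slice is represented as a plain dict. It assumes the EndpointSlice API convention that an unset ready condition counts as ready and an unset terminating condition counts as not terminating.

```python
def permitted_peer_ips(endpoint_slice, include_terminating=True):
    """Collect the endpoint addresses that should stay on the allow list.

    Hypothetical helper: an endpoint is kept if it is ready, or if it is
    terminating and include_terminating is set.
    """
    ips = []
    for ep in endpoint_slice.get("endpoints") or []:
        cond = ep.get("conditions") or {}
        # An unset "ready" is interpreted as ready, an unset "terminating"
        # as not terminating.
        ready = cond.get("ready")
        ready = True if ready is None else ready
        terminating = bool(cond.get("terminating"))
        if ready or (include_terminating and terminating):
            ips.extend(ep.get("addresses") or [])
    return ips


# The single-endpoint EndpointSlice shown above, reduced to the relevant fields.
terminating_slice = {
    "addressType": "IPv4",
    "endpoints": [
        {
            "addresses": ["10.244.0.3"],
            "conditions": {"ready": False, "serving": True, "terminating": True},
        }
    ],
}

# Ready-only selection drops the terminating pod, like the Endpoints API did.
print(permitted_peer_ips(terminating_slice, include_terminating=False))  # []
# Keeping terminating endpoints leaves 10.244.0.3 on the allow list.
print(permitted_peer_ips(terminating_slice))  # ['10.244.0.3']
```

A stricter variant could additionally require serving: true for terminating endpoints; whether that is desirable here is an open design choice, not something settled above.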
Addendum: with two "ready" pods and one "terminating" pod:
media-plane-55658cb4f5-hdw6c 1/1 Running 0 10.244.0.14
media-plane-55658cb4f5-pjp9c 1/1 Terminating 0 10.244.0.12
media-plane-55658cb4f5-vjvnz 1/1 Running 0 10.244.0.13
We get this EndpointSlice:
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: media-plane-qzbrt
  namespace: default
  labels:
    app: media-plane
    kubernetes.io/service-name: media-plane
ports:
- name: ""
  port: 9001
  protocol: UDP
addressType: IPv4
endpoints:
- addresses:
  - 10.244.0.12
  conditions:
    ready: false
    serving: true
    terminating: true
  nodeName: stunner
  targetRef:
    kind: Pod
    name: media-plane-55658cb4f5-pjp9c
    namespace: default
    uid: 277d19b4-f5ab-4616-8024-4deebca8f7e9
- addresses:
  - 10.244.0.13
  conditions:
    ready: true
    serving: true
    terminating: false
  nodeName: stunner
  targetRef:
    kind: Pod
    name: media-plane-55658cb4f5-vjvnz
    namespace: default
    uid: 965ae9b0-13c8-4145-a850-533b857876af
- addresses:
  - 10.244.0.14
  conditions:
    ready: true
    serving: true
    terminating: false
  nodeName: stunner
  targetRef:
    kind: Pod
    name: media-plane-55658cb4f5-hdw6c
    namespace: default
    uid: 02a3e3e2-b83c-442c-afe4-4a7ad647bf19
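As a cross-check of this EndpointSlice against the pod listing, the following Python sketch maps each endpoint back to its pod via targetRef and labels its state. The helper name endpoint_states and the state labels are made up for illustration, and the slice is trimmed to the fields the helper reads.

```python
def endpoint_states(endpoint_slice):
    """Return (pod name, address, state) tuples for every endpoint address."""
    rows = []
    for ep in endpoint_slice.get("endpoints") or []:
        cond = ep.get("conditions") or {}
        if cond.get("terminating"):
            state = "terminating"
        elif cond.get("ready") is False:
            state = "not-ready"
        else:
            state = "ready"
        pod = (ep.get("targetRef") or {}).get("name", "<unknown>")
        for addr in ep.get("addresses") or []:
            rows.append((pod, addr, state))
    return rows


# The three-endpoint EndpointSlice from the addendum, trimmed to the used fields.
addendum_slice = {
    "endpoints": [
        {
            "addresses": ["10.244.0.12"],
            "conditions": {"ready": False, "serving": True, "terminating": True},
            "targetRef": {"kind": "Pod", "name": "media-plane-55658cb4f5-pjp9c"},
        },
        {
            "addresses": ["10.244.0.13"],
            "conditions": {"ready": True, "serving": True, "terminating": False},
            "targetRef": {"kind": "Pod", "name": "media-plane-55658cb4f5-vjvnz"},
        },
        {
            "addresses": ["10.244.0.14"],
            "conditions": {"ready": True, "serving": True, "terminating": False},
            "targetRef": {"kind": "Pod", "name": "media-plane-55658cb4f5-hdw6c"},
        },
    ]
}

for pod, addr, state in endpoint_states(addendum_slice):
    print(pod, addr, state)
```

The terminating pod media-plane-55658cb4f5-pjp9c is still listed with its IP 10.244.0.12, so that IP remains available for the permitted-endpoint list.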
I've tested this and the issue can no longer be reproduced with the new STUNner dev version that uses the EndpointSlice controller.
- Fire up the UDP greeter example again but set terminationGracePeriodSeconds: 300 in the media-plane deployment.
- Create a turncat tunnel and create a TURN allocation:
    export IPERF_ADDR=$(kubectl get pod -l app=media-plane -o jsonpath="{.items[0].status.podIP}")
    turncat --log=all:TRACE - 'k8s://stunner/udp-gateway:udp-listener' udp://$IPERF_ADDR:9001
    Hi
    Greetings from STUNner!
    ...
- Scale the media-plane deployment down to 0 pods:
    kubectl scale deployment media-plane --replicas=0
  This will trigger the UDP greeter pod to enter the Terminating state:
    kubectl get pods
    NAME                           READY   STATUS        RESTARTS   AGE
    media-plane-55658cb4f5-d7h2l   1/1     Terminating   0          91s
- And the turncat tunnel stays open:
    ...
    Hi again after terminate
    Greetings from STUNner!
    And the connection remains open, isn't it?
    Greetings from STUNner!
    ...