I'm looking into this
from atomic-system-containers.
@jasonbrooks Aligning installation and configuration with other projects is a good idea.
I will close the issue because with the updated containers it very likely is a layer 8 problem on my side. Thank you for looking into this!
Thanks for the report @neuhalje! It looks like this is due to the latest version not being available in Fedora, as `/run` should be mounted in from the system: https://github.com/projectatomic/atomic-system-containers/blob/master/kubernetes-proxy/config.json.template#L324-L334
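For context, such a bind mount is declared in the container's OCI `config.json`; a mount entry like the one the template pins looks roughly like this (illustrative excerpt, paraphrased rather than copied from the linked file):

```json
{
  "type": "bind",
  "source": "/run",
  "destination": "/run",
  "options": ["rbind", "rw"]
}
```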
@jasonbrooks can you push through an update?
Related: https://pagure.io/releng/issue/7217
Update
I upgraded the system & containers:

- I cannot access it from any other system (no reply to `SYN` packets).
- I can access the service from my (single) node.
- The message has changed (slightly) but the service still logs an error (`Failed to start in resource-only container "/kube-proxy": mkdir /sys/fs/cgroup/cpuset/kube-proxy: read-only file system`).
Status
Installed Versions
```
sudo atomic images list | grep proxy
> registry.fedoraproject.org/f27/kubernetes-proxy latest 4660f3d3b9a3 2018-01-13 10:13 262.53 MB ostree
```
Log
```
journalctl -xe -u kube-proxy.service
...
-- Unit kube-proxy.service has finished starting up.
--
-- The start-up result is done.
Jan 13 10:11:54 node-1.[redacted] runc[772]: 2018-01-13 10:11:54.524258 I | proto: duplicate proto type registered: google.protobuf.Any
Jan 13 10:11:54 node-1.[redacted] runc[772]: 2018-01-13 10:11:54.537896 I | proto: duplicate proto type registered: google.protobuf.Duration
Jan 13 10:11:54 node-1.[redacted] runc[772]: 2018-01-13 10:11:54.538171 I | proto: duplicate proto type registered: google.protobuf.Timestamp
Jan 13 10:11:54 node-1.[redacted] runc[772]: W0113 10:11:54.934872 1 server.go:190] WARNING: all flags other than --config, --write-config-to, and --cleanup-iptables are deprecated. Please begin using a config file ASAP.
Jan 13 10:11:55 node-1.[redacted] runc[772]: I0113 10:11:55.133819 1 server.go:478] Using iptables Proxier.
Jan 13 10:11:55 node-1.[redacted] runc[772]: W0113 10:11:55.155968 1 proxier.go:488] clusterCIDR not specified, unable to distinguish between internal an
Jan 13 10:11:55 node-1.[redacted] runc[772]: I0113 10:11:55.156343 1 server.go:513] Tearing down userspace rules.
Jan 13 10:11:55 node-1.[redacted] runc[772]: W0113 10:11:55.475478 1 server.go:628] Failed to start in resource-only container "/kube-proxy": mkdir /sys/fs/cgroup/cpuset/kube-proxy: read-only file system
Jan 13 10:11:55 node-1.[redacted] runc[772]: I0113 10:11:55.476775 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
Jan 13 10:11:55 node-1.[redacted] runc[772]: I0113 10:11:55.476989 1 conntrack.go:52] Setting nf_conntrack_max to 131072
Jan 13 10:11:55 node-1.[redacted] runc[772]: I0113 10:11:55.477159 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to
Jan 13 10:11:55 node-1.[redacted] runc[772]: I0113 10:11:55.477307 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3
Jan 13 10:11:55 node-1.[redacted] runc[772]: I0113 10:11:55.478839 1 config.go:202] Starting service config controller
Jan 13 10:11:55 node-1.[redacted] runc[772]: I0113 10:11:55.478880 1 config.go:102] Starting endpoints config controller
Jan 13 10:11:55 node-1.[redacted] runc[772]: I0113 10:11:55.524651 1 controller_utils.go:994] Waiting for caches to sync for service config controller
```
Nodes
```
kubectl get nodes
NAME                STATUS    ROLES     AGE       VERSION
node-1.[redacted]   Ready     <none>    33d       v1.7.3
```

`node-1` has the IP address 172.20.61.51.
Services
The service is running:
```
kubectl get service
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
kubernetes   ClusterIP   10.254.0.1     <none>        443/TCP        33d
my-nginx     NodePort    10.254.17.99   <none>        80:30849/TCP   3h
```

```
kubectl describe service my-nginx
Name:                     my-nginx
Namespace:                default
Labels:                   <none>
Annotations:              <none>
Selector:                 app=nginx
Type:                     NodePort
IP:                       10.254.17.99
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  30849/TCP
Endpoints:                172.17.0.2:80,172.17.0.3:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
```

```
kubectl get pods
NAME                                READY     STATUS    RESTARTS   AGE
nginx-deployment2-540558622-9zxwt   1/1       Running   1          3h
nginx-deployment2-540558622-jzjv0   1/1       Running   1          3h
```
Analysis
Log
Compared to the old output, the first message (`Failed to start in resource-only container "/kube-proxy": mkdir /sys/fs/cgroup/cpuset/kube-proxy: read-only file system`) is still logged, but `Failed to execute iptables-restore: failed to open iptables lock /run/xtables.lock: open /run/xtables.lock: read-only file system` is no longer logged.
Access service from the node works
On the node the service can be accessed (`curl http://172.20.61.51:30849` succeeds).
Access service from other systems does not work
When I access the service from my laptop, it cannot be reached (`curl http://172.20.61.51:30849` hangs). `tcpdump` shows that my host gets no reply to the initial `SYN` packet:
```
# on the host node-1
sudo tcpdump -nn port 30849
...
13:55:15.300884 IP 172.20.10.50.54187 > 172.20.61.51.30849: Flags [S], seq 1814234020, win 65535, options [mss 1460,nop,wscale 5,nop,nop,TS val 1060856956 ecr 0,sackOK,eol], length 0
13:55:16.303943 IP 172.20.10.50.54187 > 172.20.61.51.30849: Flags [S], seq 1814234020, win 65535, options [mss 1460,nop,wscale 5,nop,nop,TS val 1060857956 ecr 0,sackOK,eol], length 0
...
```
Firewall
`iptables` has rules for the service:

```
# on the host node-1
sudo iptables -n -L -t nat
....
Chain KUBE-NODEPORTS (1 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- 0.0.0.0/0 0.0.0.0/0 /* default/my-nginx: */ tcp dpt:32474
KUBE-SVC-BEPXDJBUHFCSYIC3 tcp -- 0.0.0.0/0 0.0.0.0/0 /* default/my-nginx: */ tcp dpt:32474
...
Chain KUBE-SEP-BLX3X6UTIG6UGCA2 (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 172.17.0.5 0.0.0.0/0 /* default/my-nginx: */
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 /* default/my-nginx: */ tcp to:172.17.0.5:80
...
Chain KUBE-SERVICES (2 references)
target prot opt source destination
KUBE-MARK-MASQ tcp -- !172.17.0.0/16 10.254.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:443
KUBE-SVC-NPX46M4PTMTKRN6Y tcp -- 0.0.0.0/0 10.254.0.1 /* default/kubernetes:https cluster IP */ tcp dpt:443
KUBE-MARK-MASQ tcp -- !172.17.0.0/16 10.254.168.195 /* default/my-nginx: cluster IP */ tcp dpt:80
KUBE-SVC-BEPXDJBUHFCSYIC3 tcp -- 0.0.0.0/0 10.254.168.195 /* default/my-nginx: cluster IP */ tcp dpt:80
KUBE-MARK-MASQ tcp -- !172.17.0.0/16 10.254.210.142 /* ingress-nginx/default-http-backend: cluster IP */ tcp dpt:80
KUBE-SVC-J4PGGZ6AUXZWNA2B tcp -- 0.0.0.0/0 10.254.210.142 /* ingress-nginx/default-http-backend: cluster IP */ tcp dpt:80
KUBE-NODEPORTS all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL

Chain KUBE-SVC-BEPXDJBUHFCSYIC3 (2 references)
target prot opt source destination
KUBE-SEP-BLX3X6UTIG6UGCA2 all -- 0.0.0.0/0 0.0.0.0/0 /* default/my-nginx: */ statistic mode random probability 0.50000000000
KUBE-SEP-J5WBW7HEOGAHN6ZG all -- 0.0.0.0/0 0.0.0.0/0 /* default/my-nginx: */
...
# Outbound
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- 172.17.0.0/16 0.0.0.0/0
KUBE-POSTROUTING all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes postrouting rules */
...
Chain KUBE-POSTROUTING (1 references)
target prot opt source destination
MASQUERADE all -- 0.0.0.0/0 0.0.0.0/0 /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
```
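The KUBE-SVC chain above spreads traffic across the two endpoints with the iptables `statistic mode random` match: the first KUBE-SEP rule matches with probability 0.5, and anything that falls through hits the second, unconditional rule. A small, hypothetical Python simulation (not part of kube-proxy, just to illustrate the scheme for n endpoints) shows this yields a uniform split:

```python
import random

def pick_endpoint(endpoints):
    """Mimic a KUBE-SVC chain: rule i matches with probability 1/(n - i),
    so every endpoint ends up equally likely overall."""
    n = len(endpoints)
    for i, ep in enumerate(endpoints):
        if random.random() < 1.0 / (n - i):
            return ep
    return endpoints[-1]  # last rule has probability 1, so this is a safety net

random.seed(0)
counts = {"172.17.0.2:80": 0, "172.17.0.3:80": 0}
for _ in range(100_000):
    counts[pick_endpoint(list(counts))] += 1
print(counts)  # roughly a 50/50 split
```

This matches the dump above: with two endpoints the first rule carries `probability 0.50000000000` and the second rule has none, i.e. it always matches the remaining traffic.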
@ashcrow Is this a bug or a setup problem on my side?
@jasonbrooks ^^
@neuhalje It might be a setup problem on your side. I'm testing this on a three node cluster with system containers installed, and my nodeport is exposed on each of my nodes, and I'm able to curl the nginx server.
I am getting the `/sys/fs/cgroup/cpuset/kube-proxy: read-only file system` error as well. The system containers for the OpenShift Origin node (https://github.com/openshift/origin/blob/release-3.7/images/node/system-container/config.json.template), which cover the kubelet and the proxy components, bind `/sys` rw. We could take that approach, or we could change our ro bind of `/sys/fs/cgroup` to rw.
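If the second option were taken, the ro-to-rw change would be a one-word tweak to the corresponding mount entry in the kubernetes-proxy `config.json.template` (illustrative sketch, not a tested patch against the actual template):

```json
{
  "type": "bind",
  "source": "/sys/fs/cgroup",
  "destination": "/sys/fs/cgroup",
  "options": ["rbind", "rw"]
}
```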
A wider issue is that we need to update / refine our suggested kubernetes setup process. I've always used https://github.com/kubernetes/contrib/tree/master/ansible, but those scripts have been deprecated for a different ansible-based approach that doesn't use these system containers at all.
I think it might make sense to try and work out upstream kube master and node roles that work with https://github.com/openshift/openshift-ansible.
I've hit this same issue.
Able to connect to the tutor-proxy nodePort locally, but not remotely. I'm running the latest available version of the kube-proxy system container from `registry.fedoraproject.org/f27/kubernetes-proxy`.
kube-proxy output:
```
Feb 13 18:40:45 ip-10-107-20-177.us-west-2.compute.internal systemd[1]: Started kubernetes-proxy.
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: 2018-02-13 18:40:46.089456 I | proto: duplicate proto type registered: google.protobuf.Any
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: 2018-02-13 18:40:46.089550 I | proto: duplicate proto type registered: google.protobuf.Duration
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: 2018-02-13 18:40:46.089570 I | proto: duplicate proto type registered: google.protobuf.Timestamp
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: W0213 18:40:46.133750 1 server.go:190] WARNING: all flags other than --config, --write-config-to, and --cleanup-iptables are deprecated. Please begin using a config file ASAP.
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.140159 1 server.go:478] Using iptables Proxier.
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: W0213 18:40:46.145704 1 proxier.go:488] clusterCIDR not specified, unable to distinguish between internal and external traffic
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.146164 1 server.go:513] Tearing down userspace rules.
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: W0213 18:40:46.156672 1 server.go:628] Failed to start in resource-only container "/kube-proxy": mkdir /sys/fs/cgroup/cpuset/kube-proxy: read-only file system
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.157028 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.157116 1 conntrack.go:52] Setting nf_conntrack_max to 131072
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.157264 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.157307 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.157646 1 config.go:202] Starting service config controller
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.157662 1 controller_utils.go:994] Waiting for caches to sync for service config controller
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.157701 1 config.go:102] Starting endpoints config controller
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.157708 1 controller_utils.go:994] Waiting for caches to sync for endpoints config controller
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.257846 1 controller_utils.go:1001] Caches are synced for endpoints config controller
Feb 13 18:40:46 ip-10-107-20-177.us-west-2.compute.internal runc[6281]: I0213 18:40:46.257870 1 controller_utils.go:1001] Caches are synced for service config controller
```
Reopening. @jasonbrooks can you reproduce?
It has been stated that this issue will be resolved with 2d50826, but I doubt that the above fix applies to the kubernetes-proxy system container; it looks like it only applies to the kubelet container.
@deuscapturus Right, I'm going to test adding a similar fix in the kube-proxy container
@deuscapturus So, I tested the change, and it got rid of the error, but I'm able to access my nodeport from a separate system with or without the change.
I can try to reproduce what you're seeing. Do you have a test manifest or something I can try?
My problem is somewhere in iptables. I'm able to connect to my service externally on the nodePort when I change kube-proxy to `--proxy-mode=userspace`.
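For anyone wanting to try the same workaround: on Fedora's kubernetes packaging the proxy flags conventionally live in a sysconfig-style file read by the unit, commonly `/etc/kubernetes/proxy` (path assumed here; adjust for your setup), e.g.:

```
KUBE_PROXY_ARGS="--proxy-mode=userspace"
```

then restart the kube-proxy service.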
@jasonbrooks as your test suggests, the ro filesystem error/warning is an entirely different issue. Would you prefer a new issue, or to change the title on this one?
@deuscapturus we can keep this issue. I'm curious: if you install and run the proxy from the rpm, will you still have this issue? The following command will do it. I'm including a download of the particular package because the current latest kube in f27 is 1.9.1, but a system container w/ that version hasn't been released yet.

```
atomic uninstall kube-proxy && curl -O https://kojipkgs.fedoraproject.org//packages/kubernetes/1.7.3/1.fc27/x86_64/kubernetes-node-1.7.3-1.fc27.x86_64.rpm && rpm-ostree install kubernetes-node-1.7.3-1.fc27.x86_64.rpm -r
```