After Rebooting all my Control Plane Nodes at the same time: ovn-central "details":"inconsistent data","error":"ovsdb error" (kube-ovn, closed)

Smithx10 commented on August 18, 2024
After Rebooting all my Control Plane Nodes at the same time: ovn-central details":"inconsistent data","error":"ovsdb error"


Comments (12)

Smithx10 commented on August 18, 2024

It appears this is only happening on one pod, which is on headnode-01:

[ use1 ] root@headnode-01:~$ kubectl ko nb status
faf1
Name: OVN_Northbound
Cluster ID: 277c (277c6d46-33b1-42b7-83be-4457951d8c54)
Server ID: faf1 (faf1ae0a-1103-48b4-993e-b06aab97f168)
Address: ssl:[172.16.0.3]:6643
Status: cluster member
Role: leader
Term: 20
Leader: self
Vote: self

Last Election started 44557 ms ago, reason: timeout
Last Election won: 44555 ms ago
Election timer: 5000
Log: [1802, 3219]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: (->ebb1) ->48ff <-48ff
Disconnections: 3
Servers:
    ebb1 (ebb1 at ssl:[172.16.0.1]:6643) next_index=3219 match_index=3218 last msg 13984 ms ago
    48ff (48ff at ssl:[172.16.0.2]:6643) next_index=3219 match_index=3218 last msg 650 ms ago
    faf1 (faf1 at ssl:[172.16.0.3]:6643) (self) next_index=3216 match_index=3218
status: ok


[ use1 ] root@headnode-01:~$ kubectl ko sb status
9e2c
Name: OVN_Southbound
Cluster ID: 7da9 (7da9f927-2c1d-4160-94c2-5be40d54cb81)
Server ID: 9e2c (9e2cd89b-5a30-4ae4-b3b8-7dfbd659ca53)
Address: ssl:[172.16.0.3]:6644
Status: cluster member
Role: leader
Term: 20
Leader: self
Vote: self

Last Election started 133221 ms ago, reason: timeout
Last Election won: 133218 ms ago
Election timer: 5000
Log: [1869, 3584]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: (->6919) ->81a3 <-81a3
Disconnections: 5
Servers:
    6919 (6919 at ssl:[172.16.0.1]:6644) next_index=1918 match_index=3583 last msg 21212 ms ago
    81a3 (81a3 at ssl:[172.16.0.2]:6644) next_index=3584 match_index=3583 last msg 1600 ms ago
    9e2c (9e2c at ssl:[172.16.0.3]:6644) (self) next_index=3580 match_index=3583
status: ok


│ kube-system     ovn-central-868c6dc8c7-f9sr5      ●      0/1       CrashLoopBackOff                 3       0       0          0          0          0          0 172.16.0.1      headnode-01      116s       │
│ kube-system     ovn-central-868c6dc8c7-jfkzp      ●      1/1       Running                          0      46      70         15          1         35          1 172.16.0.2      headnode-02      2m6s       │
│ kube-system     ovn-central-868c6dc8c7-kl45v      ●      1/1       Running                          0      88      74         29          2         37          1 172.16.0.3      headnode-03      2m2s       │
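A minimal sketch for digging into the crash loop, assuming the pod name from the listing above; the --previous flag pulls logs from the container instance that just crashed:

# Inspect the crash-looping ovn-central pod on headnode-01:
kubectl -n kube-system logs ovn-central-868c6dc8c7-f9sr5
# The previous container's logs usually hold the actual error:
kubectl -n kube-system logs ovn-central-868c6dc8c7-f9sr5 --previous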


bobz965 commented on August 18, 2024

Please see the doc: https://kubeovn.github.io/docs/v1.13.x/ops/recover-db/
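For context, the kick step from that doc looks roughly like the following (a sketch, not verbatim from the doc; it assumes the /var/run/ovn socket paths used by recent ovn-central images, and uses the member IDs from the status output above: headnode-01 is ebb1 in the NB cluster and 6919 in the SB cluster):

# Run inside a healthy ovn-central pod (headnode-02 or headnode-03).
# Kick the inconsistent headnode-01 member out of each raft cluster:
ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/kick OVN_Northbound ebb1
ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/kick OVN_Southbound 6919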


Smithx10 commented on August 18, 2024

@bobz965 Great! Kicking it from the cluster worked perfectly. Thank you :)


Smithx10 commented on August 18, 2024

After kicking headnode-01 out of the cluster:

nbctl show is empty... I think the database is gone?

[ use1 ] root@headnode-02:~$ k ko nbctl show
[ use1 ] root@headnode-02:~$


Smithx10 commented on August 18, 2024

Alright, something might be strange with the plugin.

It seems k ko nbctl isn't returning results, but running the ovn-nbctl show command inside one of the ovn-central pods works as expected:
root@headnode-03:/kube-ovn# ovn-nbctl show
switch 55df0ec4-af42-4035-9252-6c06b7b19a9a (storage)
port localnet.storage
type: localnet
addresses: ["unknown"]
port t2.default.storage.default.ovn
addresses: ["00:00:00:EE:7E:6A 172.16.17.105"]
switch a4cfaf08-da84-4735-9b3e-08f5cbf78644 (ovn-default)
port virt-operator-66b7f94b6d-tf9h9.kubevirt
addresses: ["00:00:00:08:2E:01 172.16.128.15"]
port linstor-csi-node-z2p7t.piraeus-datastore
addresses: ["00:00:00:D8:D9:EA 172.16.128.54"]
port ha-controller-7q7hf.piraeus-datastore
addresses: ["00:00:00:DA:4D:6D 172.16.128.132"]
port kube-ovn-pinger-sgdg6.kube-system
addresses: ["00:00:00:40:31:FC 172.16.129.45"]
port rke2-coredns-rke2-coredns-5b8c65d87f-nl9f2.kube-system
addresses: ["00:00:00:2D:47:1D 172.16.128.17"]
port kube-ovn-pinger-h6g65.kube-system
addresses: ["00:00:00:74:BA:A1 172.16.128.144"]
port ha-controller-rmp9q.piraeus-datastore
addresses: ["00:00:00:FB:74:57 172.16.128.159"]
port rke2-ingress-nginx-controller-8cvgx.kube-system
addresses: ["00:00:00:AA:DA:D3 172.16.128.129"]
port rke2-coredns-rke2-coredns-5b8c65d87f-g8kwq.kube-system
addresses: ["00:00:00:23:47:3D 172.16.128.18"]
port rke2-ingress-nginx-controller-h7z4m.kube-system
addresses: ["00:00:00:2C:C6:E2 172.16.128.98"]
port virt-handler-7hw2b.kubevirt
addresses: ["00:00:00:45:17:13 172.16.128.120"]
port ha-controller-zhspc.piraeus-datastore
addresses: ["00:00:00:AB:2E:61 172.16.128.122"]
port redisoperator-fb5478dbb-nz8bg.harbor-operator-ns
addresses: ["00:00:00:63:BC:B1 172.16.128.97"]
port virt-api-5686f97bc-vfzhb.kubevirt
addresses: ["00:00:00:8E:19:A1 172.16.128.2"]
port rke2-ingress-nginx-controller-5vgt8.kube-system
addresses: ["00:00:00:38:E8:40 172.16.128.125"]
port piraeus-operator-controller-manager-65c7fbbb5b-2ghg8.piraeus-datastore
addresses: ["00:00:00:FF:94:06 172.16.128.6"]
port ha-controller-445kw.piraeus-datastore
addresses: ["00:00:00:F4:C2:9C 172.16.128.143"]
port rke2-coredns-rke2-coredns-5b8c65d87f-xlwsz.kube-system
addresses: ["00:00:00:B4:D8:6F 172.16.128.13"]
port kube-ovn-pinger-5hdvw.kube-system
addresses: ["00:00:00:99:E9:27 172.16.129.43"]
port virt-controller-678998f868-8c4bg.kubevirt
addresses: ["00:00:00:33:69:26 172.16.128.16"]
port harbor-operator-7b48c67445-xvh8d.harbor-operator-ns
addresses: ["00:00:00:91:12:7E 172.16.128.14"]
port linstor-csi-node-n8tsq.piraeus-datastore
addresses: ["00:00:00:00:4D:85 172.16.128.50"]
port virt-handler-55gpf.kubevirt
addresses: ["00:00:00:81:EE:65 172.16.128.127"]
port ovn-default-ovn-cluster
type: router
router-port: ovn-cluster-ovn-default
port ha-controller-t76f4.piraeus-datastore
addresses: ["00:00:00:93:68:28 172.16.128.124"]
port virt-handler-55nl4.kubevirt
addresses: ["00:00:00:2B:7D:5A 172.16.129.36"]
port rke2-coredns-rke2-coredns-5b8c65d87f-lm744.kube-system
addresses: ["00:00:00:7C:5F:89 172.16.128.12"]
port kube-ovn-pinger-d6hbn.kube-system
addresses: ["00:00:00:C6:36:8C 172.16.128.48"]
port postgres-operator-95754cbfd-q48qg.harbor-operator-ns
addresses: ["00:00:00:8F:A2:13 172.16.128.203"]
port linstor-csi-node-rj4rq.piraeus-datastore
addresses: ["00:00:00:9E:D5:D9 172.16.128.32"]
port virt-handler-pmn4j.kubevirt
addresses: ["00:00:00:4E:12:DA 172.16.128.165"]
port linstor-csi-node-x6kt2.piraeus-datastore
addresses: ["00:00:00:84:08:D6 172.16.128.87"]
port ha-controller-7hwxc.piraeus-datastore
addresses: ["00:00:00:9F:62:58 172.16.129.48"]
port kube-ovn-pinger-49zch.kube-system
addresses: ["00:00:00:3B:AD:C6 172.16.129.35"]
port linstor-csi-node-4sv9g.piraeus-datastore
addresses: ["00:00:00:7A:0C:0A 172.16.128.61"]
port cert-manager-85cfbd86f5-gzb5q.cert-manager
addresses: ["00:00:00:58:AB:67 172.16.128.7"]
port linstor-csi-controller-557f665789-9xpfq.piraeus-datastore
addresses: ["00:00:00:01:69:C2 172.16.128.11"]
port kube-ovn-pinger-jcttq.kube-system
addresses: ["00:00:00:83:A1:8C 172.16.128.93"]
port virt-handler-lmzcj.kubevirt
addresses: ["00:00:00:63:E1:D3 172.16.129.32"]
port ha-controller-rkjld.piraeus-datastore
addresses: ["00:00:00:C5:31:F5 172.16.129.54"]
port kube-ovn-pinger-4q448.kube-system
addresses: ["00:00:00:32:9E:FF 172.16.129.33"]
port openebs-zfs-localpv-controller-0.openebs
addresses: ["00:00:00:44:ED:99 172.16.128.113"]
port rke2-coredns-rke2-coredns-autoscaler-945fbd459-f898l.kube-system
addresses: ["00:00:00:E7:63:1D 172.16.128.5"]
port ha-controller-zq7xm.piraeus-datastore
addresses: ["00:00:00:3D:3B:0E 172.16.128.168"]
port linstor-csi-node-snwlf.piraeus-datastore
addresses: ["00:00:00:96:D9:AC 172.16.128.91"]
port rke2-ingress-nginx-controller-m82ff.kube-system
addresses: ["00:00:00:5A:00:94 172.16.128.138"]
port kube-ovn-pinger-xgc27.kube-system
addresses: ["00:00:00:EA:1C:48 172.16.128.30"]
port my-nginx-684dd4dcd4-25t4z.default
addresses: ["00:00:00:D1:B6:39 172.16.128.188"]
port linstor-controller-7df855f57-94hd6.piraeus-datastore
addresses: ["00:00:00:BD:F7:D6 172.16.128.214"]
port rke2-coredns-rke2-coredns-5b8c65d87f-gb6b7.kube-system
addresses: ["00:00:00:B1:96:DD 172.16.128.102"]
port rke2-coredns-rke2-coredns-5b8c65d87f-ht7xw.kube-system
addresses: ["00:00:00:AF:2C:1A 172.16.128.103"]
port rke2-ingress-nginx-controller-tvq9r.kube-system
addresses: ["00:00:00:56:33:54 172.16.128.156"]
port rke2-coredns-rke2-coredns-5b8c65d87f-ck8cx.kube-system
addresses: ["00:00:00:1C:D3:DD 172.16.128.105"]
port virt-handler-4vv4x.kubevirt
addresses: ["00:00:00:5D:30:0C 172.16.129.57"]
port kube-ovn-pinger-dft8j.kube-system
addresses: ["00:00:00:E6:F7:33 172.16.129.44"]
port linstor-csi-node-58p86.piraeus-datastore
addresses: ["00:00:00:F3:66:53 172.16.128.85"]
port t3.default
addresses: ["00:00:00:10:DB:23 172.16.128.106"]
port cert-manager-webhook-847d7676c9-fs9ld.cert-manager
addresses: ["00:00:00:1C:85:CA 172.16.128.212"]
port kube-ovn-pinger-wlkf7.kube-system
addresses: ["00:00:00:7E:C2:4E 172.16.128.28"]
port cdi-apiserver-78d5585c5d-wpv24.cdi
addresses: ["00:00:00:54:3C:B6 172.16.128.209"]
port kube-ovn-pinger-hvxvq.kube-system
addresses: ["00:00:00:5F:53:3C 172.16.128.44"]
port linstor-csi-node-w8xnt.piraeus-datastore
addresses: ["00:00:00:A9:D5:62 172.16.128.76"]
port virt-handler-hk6dn.kubevirt
addresses: ["00:00:00:2C:12:AF 172.16.128.151"]
port kube-ovn-pinger-mdz4d.kube-system
addresses: ["00:00:00:F7:27:00 172.16.128.46"]
port virt-handler-959xv.kubevirt
addresses: ["00:00:00:37:DF:63 172.16.128.162"]
port linstor-csi-node-j5wfv.piraeus-datastore
addresses: ["00:00:00:80:3D:E5 172.16.128.86"]
port linstor-csi-node-dgzsz.piraeus-datastore
addresses: ["00:00:00:CF:18:C2 172.16.128.100"]
port linstor-csi-node-mgj7t.piraeus-datastore
addresses: ["00:00:00:CD:B4:65 172.16.128.35"]
port virt-handler-2hx96.kubevirt
addresses: ["00:00:00:45:5F:DF 172.16.128.152"]
port virt-handler-dx9qx.kubevirt
addresses: ["00:00:00:CA:D0:A0 172.16.128.142"]
port virt-handler-cmql9.kubevirt
addresses: ["00:00:00:26:77:9D 172.16.128.135"]
port virt-handler-6dfqn.kubevirt
addresses: ["00:00:00:43:4B:AC 172.16.128.158"]
port kube-ovn-pinger-wg9hh.kube-system
addresses: ["00:00:00:86:79:4C 172.16.128.145"]
port kube-ovn-pinger-cvgzs.kube-system
addresses: ["00:00:00:9B:E3:6F 172.16.128.47"]
port t4.default
addresses: ["00:00:00:34:07:AE 172.16.128.107"]
port rke2-snapshot-validation-webhook-54c5989b65-nzp5l.kube-system
addresses: ["00:00:00:9F:C5:0A 172.16.128.211"]
port virt-handler-nhnwd.kubevirt
addresses: ["00:00:00:99:75:FF 172.16.128.141"]
port ha-controller-rbnm6.piraeus-datastore
addresses: ["00:00:00:B5:6D:10 172.16.128.147"]
port my-nginx-684dd4dcd4-p9fvk.default
addresses: ["00:00:00:9F:C3:89 172.16.128.31"]
port piraeus-operator-gencert-7c5d64d5fc-ddmz8.piraeus-datastore
addresses: ["00:00:00:82:9E:49 172.16.128.99"]
port rke2-coredns-rke2-coredns-5b8c65d87f-vg6g6.kube-system
addresses: ["00:00:00:22:AA:BE 172.16.128.26"]
port ha-controller-pg6vk.piraeus-datastore
addresses: ["00:00:00:D7:5F:88 172.16.128.128"]
port virt-api-5686f97bc-564nw.kubevirt
addresses: ["00:00:00:EA:F9:E2 172.16.128.204"]
port rke2-ingress-nginx-controller-sd4ql.kube-system
addresses: ["00:00:00:A9:E0:3C 172.16.128.133"]
port rke2-ingress-nginx-controller-722zg.kube-system
addresses: ["00:00:00:14:CF:D3 172.16.128.121"]
port linstor-csi-node-2hnwh.piraeus-datastore
addresses: ["00:00:00:CB:18:17 172.16.128.43"]
port ha-controller-2r2ww.piraeus-datastore
addresses: ["00:00:00:08:E9:56 172.16.128.153"]
port cert-manager-cainjector-c7d4dbdd9-bv6ln.cert-manager
addresses: ["00:00:00:0D:1C:19 172.16.128.207"]
port rke2-ingress-nginx-controller-v5bcd.kube-system
addresses: ["00:00:00:B6:C5:B2 172.16.128.166"]
port rke2-snapshot-controller-59cc9cd8f4-4sfxv.kube-system
addresses: ["00:00:00:A3:1B:1B 172.16.128.213"]
port cdi-deployment-74b786dcc6-twr5m.cdi
addresses: ["00:00:00:C3:3D:EC 172.16.128.8"]
port ha-controller-xsbh5.piraeus-datastore
addresses: ["00:00:00:04:58:8A 172.16.128.163"]
port virt-handler-lvclp.kubevirt
addresses: ["00:00:00:A8:09:4B 172.16.129.51"]
port kube-ovn-pinger-tggf5.kube-system
addresses: ["00:00:00:44:CC:01 172.16.128.146"]
port cdi-operator-75d5789946-mgr5h.cdi
addresses: ["00:00:00:30:08:69 172.16.128.3"]
port virt-handler-v5qfj.kubevirt
addresses: ["00:00:00:95:E7:07 172.16.128.131"]
port linstor-csi-node-c7z97.piraeus-datastore
addresses: ["00:00:00:82:1A:37 172.16.128.62"]
port rke2-ingress-nginx-controller-47c4f.kube-system
addresses: ["00:00:00:DF:5A:01 172.16.128.148"]
port kube-ovn-pinger-t2vfk.kube-system
addresses: ["00:00:00:E4:42:09 172.16.128.45"]
port virt-operator-66b7f94b6d-mptpm.kubevirt
addresses: ["00:00:00:4A:AC:DE 172.16.128.9"]
port ha-controller-4fhcc.piraeus-datastore
addresses: ["00:00:00:6B:A9:1D 172.16.128.104"]
port rke2-ingress-nginx-controller-ftvnb.kube-system
addresses: ["00:00:00:4D:95:3F 172.16.128.154"]
port virt-handler-48pvp.kubevirt
addresses: ["00:00:00:80:C2:C5 172.16.129.34"]
port virt-handler-z6ddd.kubevirt
addresses: ["00:00:00:B3:8E:87 172.16.129.47"]
port rke2-metrics-server-544c8c66fc-s4cpr.kube-system
addresses: ["00:00:00:3C:26:A5 172.16.128.4"]
port kube-ovn-pinger-rqx5w.kube-system
addresses: ["00:00:00:EA:8F:67 172.16.129.31"]
port virt-controller-678998f868-jw4wp.kubevirt
addresses: ["00:00:00:B8:CF:6E 172.16.128.206"]
port virt-handler-45zv9.kubevirt
addresses: ["00:00:00:46:3B:8E 172.16.128.167"]
port kube-ovn-pinger-npk7p.kube-system
addresses: ["00:00:00:DC:A2:06 172.16.128.96"]
port ha-controller-8zvg6.piraeus-datastore
addresses: ["00:00:00:4E:81:D7 172.16.128.155"]
port rke2-ingress-nginx-controller-24r7b.kube-system
addresses: ["00:00:00:62:3A:F7 172.16.129.50"]
port rke2-ingress-nginx-controller-7f7dp.kube-system
addresses: ["00:00:00:43:73:F2 172.16.129.46"]
port minio-operator-855cd887f4-hfqwf.harbor-operator-ns
addresses: ["00:00:00:00:5D:B4 172.16.128.201"]
port linstor-csi-node-92r8k.piraeus-datastore
addresses: ["00:00:00:72:D9:4B 172.16.128.42"]
port rke2-ingress-nginx-controller-lbr8p.kube-system
addresses: ["00:00:00:50:E9:95 172.16.128.137"]
port linstor-csi-node-zfdx5.piraeus-datastore
addresses: ["00:00:00:F2:E8:43 172.16.128.81"]
port rke2-ingress-nginx-controller-2ml82.kube-system
addresses: ["00:00:00:50:18:89 172.16.129.55"]
port rke2-ingress-nginx-controller-8q8vk.kube-system
addresses: ["00:00:00:CA:E0:20 172.16.128.160"]
port rke2-ingress-nginx-controller-hw7hp.kube-system
addresses: ["00:00:00:E9:3A:0A 172.16.128.170"]
port ha-controller-p6l9t.piraeus-datastore
addresses: ["00:00:00:4C:2B:AA 172.16.128.136"]
port cdi-uploadproxy-5c4d65444d-nczdp.cdi
addresses: ["00:00:00:23:5D:5A 172.16.128.208"]
port linstor-csi-node-lpxcv.piraeus-datastore
addresses: ["00:00:00:E2:68:65 172.16.128.84"]
port virt-handler-rcdng.kubevirt
addresses: ["00:00:00:56:06:CB 172.16.128.119"]
port ha-controller-js7wn.piraeus-datastore
addresses: ["00:00:00:C6:79:7A 172.16.129.52"]
port kube-ovn-pinger-kqmnn.kube-system
addresses: ["00:00:00:48:91:40 172.16.128.95"]
port minio-operator-855cd887f4-z6xqr.harbor-operator-ns
addresses: ["00:00:00:4A:79:48 172.16.128.40"]
switch 22cb15be-d2fe-4a3f-a9d5-2a0552d86da7 (join)
port node-nsc-08
addresses: ["00:00:00:B3:43:B2 100.64.0.5"]
port node-nsc-04
addresses: ["00:00:00:63:AC:EB 100.64.0.7"]
port node-spinning-02
addresses: ["00:00:00:39:EA:28 100.64.0.16"]
port node-nvme-02
addresses: ["00:00:00:E2:53:21 100.64.0.2"]
port node-nvme-01
addresses: ["00:00:00:16:B4:94 100.64.0.3"]
port node-headnode-02
addresses: ["00:00:00:19:C3:4B 100.64.0.22"]
port node-nsc-07
addresses: ["00:00:00:A4:B5:DE 100.64.0.8"]
port node-nsc-03
addresses: ["00:00:00:EF:D0:E9 100.64.0.13"]
port join-ovn-cluster
type: router
router-port: ovn-cluster-join
port node-headnode-03
addresses: ["00:00:00:34:EF:6B 100.64.0.21"]
port node-nvme-03
addresses: ["00:00:00:EB:D6:F6 100.64.0.4"]
port node-headnode-01
addresses: ["00:00:00:63:89:BF 100.64.0.23"]
port node-spinning-03
addresses: ["00:00:00:D4:4C:5A 100.64.0.15"]
port node-spinning-01
addresses: ["00:00:00:1F:CC:1B 100.64.0.17"]
port node-nsc-05
addresses: ["00:00:00:32:AE:F8 100.64.0.10"]
port node-nsc-01
addresses: ["00:00:00:B6:61:59 100.64.0.12"]
port node-nsc-10
addresses: ["00:00:00:4A:73:0A 100.64.0.6"]
port node-nsc-06
addresses: ["00:00:00:05:C9:1A 100.64.0.14"]
port node-nsc-09
addresses: ["00:00:00:2C:2F:7D 100.64.0.9"]
port node-nsc-02
addresses: ["00:00:00:BE:FA:EE 100.64.0.11"]
switch 88a16dca-c621-4801-b754-8393b7e78e16 (external2080)
port t2.default
addresses: ["00:00:00:AE:C3:21 10.91.237.1"]
port localnet.external2080
type: localnet
tag: 2080
addresses: ["unknown"]
switch 9a9d40e6-934b-487a-bbd7-98de0fa3cfce (external)
port localnet.external
type: localnet
tag: 1998
addresses: ["unknown"]
port t1.default
addresses: ["00:00:00:65:0A:B1 10.91.64.3"]
port external-ovn-cluster
type: router
router-port: ovn-cluster-external
router 79a8ca90-715b-47b0-bb21-3c43ed2be277 (ovn-cluster)
port ovn-cluster-external
mac: "00:00:00:9E:F1:47"
networks: ["10.91.64.1/19"]
gateway chassis: [66585c3d-7730-4548-8339-74334208a934 fc35cde7-fdca-4280-aa68-31da9aa25654 4e94822c-a871-4aef-8adb-37558537534a af148592-4f16-432e-972b-a9a934502a41 e98adbf1-4fb8-41fa-b5d9-3975f18cc9a2 d6af920a-a9d0-4a32-92b8-4be203b28f4f bea6ae2e-6c61-4506-a2af-725f6164a31c 594ac44a-e6b6-430c-8032-aeaf67cc7eaa a0226e57-92f2-48e3-943e-3cfb78c9388d 21167e15-e730-49e5-8298-5beccf59120a 0b64effb-6818-4cd0-9a84-224f5fb6f8e5 0e11f83a-85bf-4e53-b088-251b8dbf55fb 3a697fd6-fa3b-4403-b781-ab23c844b0fd 42645368-650c-4ad6-91a9-87b58631ad14 2871172c-4d32-4c33-b9c9-2fa27137428b 0d6e24cd-7ab9-4e8a-8625-f25f692404ec e0d7b65f-28d1-4c73-a96e-0fe443940110 57eacb42-4981-4a1b-a3ae-95565e393726 624f5878-23b6-40ac-8593-2064be0fc543]
port ovn-cluster-join
mac: "00:00:00:F0:B4:C8"
networks: ["100.64.0.1/16"]
port ovn-cluster-ovn-default
mac: "00:00:00:63:7A:29"
networks: ["172.16.128.1/17"]
nat 310923ad-0955-4070-86d5-89cddd246f4b
external ip: "10.91.64.220"
logical ip: "172.16.128.106"
type: "dnat_and_snat"


Smithx10 commented on August 18, 2024

After coming back, I went to test EIP. I tried deleting an EIP and a FIP... not found:

E0416 14:08:14.390556       1 ovn-nb-nat.go:302] not found logical router ovn-cluster nat 'type dnat_and_snat external ip 10.91.64.5 logical ip 172.16.128.134'
E0416 14:08:14.390587       1 ovn-nb-nat.go:214] not found logical router ovn-cluster nat 'type dnat_and_snat external ip 10.91.64.5 logical ip 172.16.128.134'
E0416 14:08:14.390600       1 ovn_fip.go:464] failed to delete fip eip-static, not found logical router ovn-cluster nat 'type dnat_and_snat external ip 10.91.64.5 logical ip 172.16.128.134'
E0416 14:08:14.390645       1 ovn_fip.go:172] error syncing 'eip-static': not found logical router ovn-cluster nat 'type dnat_and_snat external ip 10.91.64.5 logical ip 172.16.128.134', requeuing
I0416 14:08:25.356517       1 ovn_eip.go:324] handle del ovn eip eip-static
E0416 14:08:25.356582       1 ovn_eip.go:637] ovn eip 'eip-static' is still in use, finalizer will not be removed
E0416 14:08:25.356596       1 ovn_eip.go:348] failed to handle remove ovn eip finalizer , ovn eip 'eip-static' is still in use, finalizer will not be removed
E0416 14:08:25.356635       1 ovn_eip.go:204] error syncing 'eip-static': ovn eip 'eip-static' is still in use, finalizer will not be removed, requeuing


Smithx10 commented on August 18, 2024

Restarting all 3 of the ovn-central pods allowed the kubectl ko command to function as expected. I assume it was attempting to execute this command on an old leader?
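For the record, a bounce like that can be done by label (a sketch; kube-ovn's ovn-central pods carry the app=ovn-central label):

# Delete all three ovn-central pods; the Deployment recreates them
# and a fresh leader election runs on the new set:
kubectl -n kube-system delete pod -l app=ovn-central
# Watch the replacements come back up:
kubectl -n kube-system get pod -l app=ovn-central -w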


Smithx10 commented on August 18, 2024

After bouncing them, the kube-ovn-controller logs are flooded with:

E0416 14:15:10.352430       1 pod.go:433] error syncing 'kubevirt/virt-handler-nhnwd': generate operations for creating logical switch port virt-handler-nhnwd.kubevirt: get logical switch ovn-default when generate mutate operations: not found logical switch "ovn-default", requeuing
I0416 14:15:10.352608       1 event.go:364] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kubevirt", Name:"virt-handler-nhnwd", UID:"690eb317-6d9b-4bd1-affb-64eb906d1f1c", APIVersion:"v1", ResourceVersion:"29244282", FieldPath:""}): type: 'Warning' reason: 'CreateOVNPortFailed' generate operations for creating logical switch port virt-handler-nhnwd.kubevirt: get logical switch ovn-default when generate mutate operations: not found logical switch "ovn-default"
I0416 14:15:10.451737       1 pod.go:578] handle add/update pod harbor-operator-ns/postgres-operator-95754cbfd-q48qg
I0416 14:15:10.451832       1 pod.go:635] sync pod harbor-operator-ns/postgres-operator-95754cbfd-q48qg allocated
I0416 14:15:10.451855       1 ipam.go:72] allocating static ip 172.16.128.203 from subnet ovn-default
I0416 14:15:10.451876       1 ipam.go:108] allocate v4 172.16.128.203, mac 00:00:00:8F:A2:13 for harbor-operator-ns/postgres-operator-95754cbfd-q48qg from subnet ovn-default
E0416 14:15:10.452595       1 ovn-nb-logical_switch.go:379] not found logical switch "ovn-default"
E0416 14:15:10.452614       1 ovn-nb-logical_switch_port.go:730] get logical switch ovn-default when generate mutate operations: not found logical switch "ovn-default"
E0416 14:15:10.452624       1 ovn-nb-logical_switch_port.go:111] get logical switch ovn-default when generate mutate operations: not found logical switch "ovn-default"
E0416 14:15:10.452654       1 pod.go:737] generate operations for creating logical switch port postgres-operator-95754cbfd-q48qg.harbor-operator-ns: get logical switch ovn-default when generate mutate operations: not found logical switch "ovn-default"
E0416 14:15:10.452666       1 pod.go:617] generate operations for creating logical switch port postgres-operator-95754cbfd-q48qg.harbor-operator-ns: get logical switch ovn-default when generate mutate operations: not found logical switch "ovn-default"
E0416 14:15:10.452686       1 pod.go:433] error syncing 'harbor-operator-ns/postgres-operator-95754cbfd-q48qg': generate operations for creating logical switch port postgres-operator-95754cbfd-q48qg.harbor-operator-ns: get logical switch ovn-default when generate mutate operations: not found logical switch "ovn-default", requeuing
I0416 14:15:10.452708       1 event.go:364] Event(v1.ObjectReference{Kind:"Pod", Namespace:"harbor-operator-ns", Name:"postgres-operator-95754cbfd-q48qg", UID:"5284fc0b-0064-42d1-bee3-fb5c64c2c11f", APIVersion:"v1", ResourceVersion:"29244285", FieldPath:""}): type: 'Warning' reason: 'CreateOVNPortFailed' generate operations for creating logical switch port postgres-operator-95754cbfd-q48qg.harbor-operator-ns: get logical switch ovn-default when generate mutate operations: not found logical switch "ovn-default"


Smithx10 commented on August 18, 2024

Bouncing the kube-ovn-controllers, I ran into:

I0416 14:19:31.176288       1 init.go:451] take 0.01 seconds to initialize IPAM
I0416 14:19:31.901405       1 vpc.go:721] vpc ovn-cluster add static route: &{Policy:policyDst CIDR:0.0.0.0/0 NextHopIP:100.64.0.1 ECMPMode: BfdID: RouteTable:}
I0416 14:19:31.902370       1 ovn-nb-logical_router_route.go:103] logical router ovn-cluster del static routes: []
I0416 14:19:31.902471       1 init.go:619] start to sync subnets
E0416 14:19:31.903057       1 subnet.go:2223] ipam subnet external2080 has no ip in using, but some ip cr left: ip 1, vip 0, iptable eip 0, ovn eip 0
E0416 14:19:31.903080       1 init.go:636] failed to calculate subnet external2080 used ip: ipam subnet external2080 has no ip in using, but some ip cr left: ip 1, vip 0, iptable eip 0, ovn eip 0
E0416 14:19:31.903139       1 klog.go:10] "failed to sync crd subnets" err="ipam subnet external2080 has no ip in using, but some ip cr left: ip 1, vip 0, iptable eip 0, ovn eip 0"
Stream closed EOF for kube-system/kube-ovn-controller-6d4cc9b96b-5s6pl (kube-ovn-controller)

[ use1 ] root@headnode-01:~/yamls/eip$ k get subnet
NAME           PROVIDER              VPC           PROTOCOL   CIDR              PRIVATE   NAT     DEFAULT   GATEWAYTYPE   V4USED   V4AVAILABLE   V6USED   V6AVAILABLE   EXCLUDEIPS                       U2OINTERCONNECTIONIP
external       ovn                   ovn-cluster   IPv4       10.91.64.0/19     false             false     distributed   3        8186          0        0             ["10.91.95.254"]
external2080   ovn                   ovn-cluster   IPv4       10.91.237.0/24    false     false   false     distributed   1        252           0        0             ["10.91.237.254"]
join           ovn                   ovn-cluster   IPv4       100.64.0.0/16     false     false   false     distributed   19       65514         0        0             ["100.64.0.1"]
ovn-default    ovn                   ovn-cluster   IPv4       172.16.128.0/17   false     true    true      distributed   176      32589         0        0             ["172.16.128.1"]
storage        storage.default.ovn   ovn-cluster   IPv4       172.16.16.0/21    false     false   false     distributed   1        1791          0        0             ["172.16.16.1..172.16.16.254"]

[ use1 ] root@headnode-01:~/yamls/eip$ k get ip  | grep external
t1.default                                                               10.91.64.3              00:00:00:65:0A:B1   nsc-06        external
t2.default                                                               10.91.237.1             00:00:00:AE:C3:21   nsc-05        external2080
I0416 14:41:01.712318       1 init.go:619] start to sync subnets
E0416 14:41:01.712830       1 subnet.go:2223] ipam subnet storage has no ip in using, but some ip cr left: ip 1, vip 0, iptable eip 0, ovn eip 0
E0416 14:41:01.712872       1 init.go:636] failed to calculate subnet storage used ip: ipam subnet storage has no ip in using, but some ip cr left: ip 1, vip 0, iptable eip 0, ovn eip 0
E0416 14:41:01.712933       1 klog.go:10] "failed to sync crd subnets" err="ipam subnet storage has no ip in using, but some ip cr left: ip 1, vip 0, iptable eip 0, ovn eip 0"


Smithx10 commented on August 18, 2024

I was able to get the controller past init by kubectl-deleting all the pods and IP CRs on those subnets, which required me to patch the finalizers to null.
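The cleanup amounts to something like this (a sketch; t1.default stands in for each leftover IP CR from the k get ip listing above):

# Clear the finalizers so the leftover IP CR can actually be deleted:
kubectl patch ip t1.default --type=merge -p '{"metadata":{"finalizers":null}}'
kubectl delete ip t1.default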

These aren't in production, but I imagine that subnet init function may have a bug in it.


bobz965 commented on August 18, 2024

After kicking headnode-01 out of the cluster:

nbctl show is empty... I think the database is gone?

[ use1 ] root@headnode-02:~$ k ko nbctl show
[ use1 ] root@headnode-02:~$

I think you should kick the bad one, then clean its NB DB and SB DB data, and add it back.
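Concretely, something like the following on headnode-01 once it has been kicked (a sketch; /etc/origin/ovn is the host path kube-ovn typically mounts for the OVN DB files; verify against the ovn-central pod's volume mounts before deleting anything):

# On headnode-01, remove the stale raft database files:
rm -f /etc/origin/ovn/ovnnb_db.db /etc/origin/ovn/ovnsb_db.db
# Then recreate the local pod; on startup it joins the existing
# cluster and pulls a fresh snapshot from the leader:
kubectl -n kube-system delete pod ovn-central-868c6dc8c7-f9sr5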


oilbeater commented on August 18, 2024

Should be fixed by #3928.

