Comments (7)
According to the containerd docs at https://github.com/containerd/containerd/blob/release/1.7/docs/hosts.md, all the host fields are valid at the root level:
For each registry host namespace directory in your registry config_path you may include a hosts.toml configuration file. The following root level toml fields apply to the registry host namespace:
This is what k3s generates:
root@systemd-node-1:/# cat /var/lib/rancher/k3s/agent/etc/containerd/certs.d/172-17-0-7.sslip.io/hosts.toml
# File generated by k3s. DO NOT EDIT.
server = "https://172-17-0-7.sslip.io/v2"
capabilities = ["pull", "resolve", "push"]
ca = ["/usr/local/share/ca-certificates/registry.crt"]
However, containerd fails to load that:
time="2024-04-01T22:11:02.070675417Z" level=error msg="failed to decode hosts.toml" error="invalid `host` tree"
Apparently it goes looking for at least one host section; if it can't find one it fails to use the hosts.toml file entirely, despite the presence of valid config at the root level.
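To see whether a particular hosts.toml would run into this, one rough check is to look for a [host] or [host."..."] table header in the file. The following is a minimal sketch, not part of k3s or containerd: has_host_table is a name invented here, and it does a plain text match rather than parsing TOML.

```shell
# Return 0 if the hosts.toml at $1 declares a [host] table or any
# [host."..."] sub-table, non-zero otherwise.
# Text match only: a header-like line inside a multi-line string would fool it.
has_host_table() {
  grep -Eq '^[[:space:]]*\[host(\]|\.)' "$1"
}

# Usage:
#   has_host_table /var/lib/rancher/k3s/agent/etc/containerd/certs.d/172-17-0-7.sslip.io/hosts.toml \
#     || echo "no host table: containerd may reject this file"
```

A non-zero exit status only means no host table header was found; whether containerd actually rejects the file still has to be confirmed from its log.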
As a workaround, we can generate an empty host section; the following works properly:
root@systemd-node-1:/# cat /var/lib/rancher/k3s/agent/etc/containerd/certs.d/172-17-0-7.sslip.io/hosts.toml
# File generated by k3s. DO NOT EDIT.
server = "https://172-17-0-7.sslip.io/v2"
capabilities = ["pull", "resolve", "push"]
ca = ["/usr/local/share/ca-certificates/registry.crt"]
[host]
I can address this in the next release. In the meantime, if you do not currently specify a port in your registry namespace, you should be able to work around the issue with something like this in your registries.yaml:
mirrors:
  172-17-0-7.sslip.io:
    endpoint:
      - https://172-17-0-7.sslip.io:443
configs:
  "172-17-0-7.sslip.io:443":
    tls:
      ca_file: /usr/local/share/ca-certificates/registry.crt
Note the use of a port in the endpoint, which forces k3s to generate a host entry in the hosts.toml.
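For anyone applying this to a different registry, the same fragment can be produced from a registry name and CA path. This is just a sketch under the assumption stated above (the registry namespace has no port today); print_port_workaround and its arguments are hypothetical names, and the function only prints to stdout so the result can be reviewed before it goes into registries.yaml.

```shell
# Print a registries.yaml fragment that mirrors a registry to itself with an
# explicit port, following the workaround described above.
# $1 = registry namespace (no port), $2 = CA file path, $3 = port (default 443)
print_port_workaround() {
  registry="$1"; ca_file="$2"; port="${3:-443}"
  cat <<EOF
mirrors:
  ${registry}:
    endpoint:
      - https://${registry}:${port}
configs:
  "${registry}:${port}":
    tls:
      ca_file: ${ca_file}
EOF
}

# Usage (review the output before writing it anywhere):
#   print_port_workaround 172-17-0-7.sslip.io /usr/local/share/ca-certificates/registry.crt
```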
from k3s.
Can you confirm that you are not using a custom containerd config template? Can you provide the output of
find /var/lib/rancher/k3s/agent/etc/containerd/ -type f -print -exec cat {} \;
along with containerd.log showing the failed pull?
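If it helps, the requested dump can be wrapped so it is easy to redirect into a file. This is only a convenience sketch: dump_config_files is a made-up name, and the optional directory argument exists so the function can be tried on a scratch directory first. The generated config may contain registry credentials, so review the output before posting it.

```shell
# Print every file under a directory, each preceded by its path, using the
# same find invocation requested above.
# $1 = directory (defaults to the k3s containerd config directory)
dump_config_files() {
  dir="${1:-/var/lib/rancher/k3s/agent/etc/containerd/}"
  find "$dir" -type f -print -exec cat {} \;
}

# Usage:
#   dump_config_files > containerd-config.txt
```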
I have not touched the template at all. I also inspected the containerd toml and compared everything that seemed relevant to a backup from an earlier version and everything was identical.
I do not have the containerd log anymore. Are you unable to reproduce this behavior in 1.29.3+k3s1? 🤔 If absolutely necessary, I can destroy my cluster and rebuild it from scratch, but that should be a last resort.
EDIT: the cluster is up and running on 1.29.2+k3s1 and serving live traffic, so testing this on the same hardware would be disruptive for me. I can try on another machine, but so can anyone :) It would be nice to see whether anyone else can reproduce this.
- Opened an upstream issue: containerd/containerd#10027
- And pull request: containerd/containerd#10028
Thank you very much for going through the work to reproduce this, @brandond!
Using 172-17-0-7.sslip.io as an example registry, the two possible workarounds are:
- If your registry namespace does not currently include a port, configure a mirror endpoint with a port:
mirrors:
  172-17-0-7.sslip.io:
    endpoint:
      - https://172-17-0-7.sslip.io:443
configs:
  "172-17-0-7.sslip.io:443":
    tls:
      ca_file: /usr/local/share/ca-certificates/registry.crt
- Manually drop the CA certificate into the registry namespace's configuration directory, and make it immutable so that k3s does not remove it when restarting:
mkdir -p /var/lib/rancher/k3s/agent/etc/containerd/certs.d/172-17-0-7.sslip.io/
cp /usr/local/share/ca-certificates/registry.crt /var/lib/rancher/k3s/agent/etc/containerd/certs.d/172-17-0-7.sslip.io/ca.crt
chattr +i /var/lib/rancher/k3s/agent/etc/containerd/certs.d/172-17-0-7.sslip.io/ca.crt
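The second workaround can be wrapped in a small helper for reuse across registries. This is a sketch only: install_registry_ca and its optional third argument are names invented for illustration. It copies the certificate and prints the destination path, and leaves the chattr step to the caller, since that step needs root and a filesystem that supports the immutable attribute.

```shell
# Copy a CA certificate into a registry namespace directory as ca.crt and
# print the destination path.
# $1 = registry namespace, $2 = source CA file,
# $3 = certs.d directory (defaults to the k3s path used above)
install_registry_ca() {
  registry="$1"; ca_src="$2"
  certs_dir="${3:-/var/lib/rancher/k3s/agent/etc/containerd/certs.d}"
  dest="$certs_dir/$registry/ca.crt"
  mkdir -p "$certs_dir/$registry" || return 1
  cp "$ca_src" "$dest" || return 1
  printf '%s\n' "$dest"
}

# Usage (as root):
#   dest="$(install_registry_ca 172-17-0-7.sslip.io /usr/local/share/ca-certificates/registry.crt)" \
#     && chattr +i "$dest"
```

Passing a scratch directory as the third argument makes it possible to try the copy without touching the live k3s configuration.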
Validated on master branch with version v1.29.4-rc1+k3s1
Environment Details
Infrastructure
- Cloud
- Hosted
Node(s) CPU architecture, OS, and Version:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.2 LTS"
$ uname -m
x86_64
Cluster Configuration:
HA: 3 server/ 1 agent
Config.yaml:
token: xxxx
cluster-init: true
write-kubeconfig-mode: "0644"
node-external-ip: 1.1.1.1
node-label:
- k3s-upgrade=server
registries.yaml:
$ sudo cat /etc/rancher/k3s/registries.yaml
mirrors:
  pvt-registry.com:
    endpoint:
      - pvt-registry.com
  docker.io:
    endpoint:
      - pvt-registry.com
  k8s.gcr.io:
    endpoint:
      - pvt-registry.com
configs:
  pvt-registry.com:
    auth:
      username: xxxx
      password: xxxx
    tls:
      ca_file: /home/user/ca.pem
test-image.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: pvt-reg-test
  labels:
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/audit: privileged
    pod-security.kubernetes.io/warn: privileged
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pvt-reg-test
  namespace: pvt-reg-test
spec:
  selector:
    matchLabels:
      k8s-app: nginx-app-clusterip
  replicas: 2
  template:
    metadata:
      labels:
        k8s-app: nginx-app-clusterip
    spec:
      containers:
        - name: nginx
          image: pvt-registry.com/nginx:latest
          ports:
            - containerPort: 8080
Testing Steps
- Copy config.yaml and registries.yaml
$ sudo mkdir -p /etc/rancher/k3s
$ sudo cp config.yaml /etc/rancher/k3s
$ sudo cp registries.yaml /etc/rancher/k3s
- Install k3s
curl -sfL https://get.k3s.io | sudo INSTALL_K3S_VERSION='v1.29.4-rc1+k3s1' sh -s - server
- Verify Cluster Status:
kubectl get nodes -o wide
kubectl get pods -A
- Push an image onto the private registry and try to deploy a pod with said image.
The image should get pulled and pod should come up without any tls certificate errors.
$ kubectl apply -f test-image.yaml
$ kubectl get pods -n pvt-reg-test
$ kubectl describe pod/pvt-reg-test-abcd -n pvt-reg-test
- Check the hosts.toml files for a host section
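The last step can be done in one pass over every registry namespace instead of opening each file by hand. The following is a rough sketch, with report_host_tables as a hypothetical name; it relies on a plain text match for a [host] or [host."..."] header rather than real TOML parsing.

```shell
# For each hosts.toml one level below $1, print "ok" or "missing" plus the
# path, depending on whether the file declares a [host] or [host."..."] table.
# $1 = certs.d directory (defaults to the k3s path used in this thread)
report_host_tables() {
  certs_dir="${1:-/var/lib/rancher/k3s/agent/etc/containerd/certs.d}"
  for f in "$certs_dir"/*/hosts.toml; do
    [ -e "$f" ] || continue   # the glob matched nothing
    if grep -Eq '^[[:space:]]*\[host(\]|\.)' "$f"; then
      printf 'ok      %s\n' "$f"
    else
      printf 'missing %s\n' "$f"
    fi
  done
}

# Usage (as root):
#   report_host_tables
```

Going by the validation output in this comment, every file generated by a fixed build would be expected to report ok.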
Replication Results:
- k3s version used for replication:
$ k3s -v
k3s version v1.29.3+k3s1 (8aecc26b)
go version go1.21.8
$ kubectl get pods -A
kube-system coredns-6799fbcd5-p7pkw 1/1 Running 0 4m38s
kube-system helm-install-traefik-9v8gb 0/1 Completed 1 4m38s
kube-system helm-install-traefik-crd-5n2cw 0/1 Completed 0 4m38s
kube-system local-path-provisioner-6c86858495-gps56 1/1 Running 0 4m38s
kube-system metrics-server-54fd9b65b-mtzk5 1/1 Running 0 4m38s
kube-system svclb-traefik-44e43501-4kkng 2/2 Running 0 3m26s
kube-system svclb-traefik-44e43501-hd2qx 2/2 Running 0 4m16s
kube-system svclb-traefik-44e43501-rx2pt 2/2 Running 0 2m37s
kube-system svclb-traefik-44e43501-smtfd 2/2 Running 0 4m16s
kube-system traefik-f4564c4f4-2t2l8 1/1 Running 0 4m17s
pvt-reg-test pvt-reg-test-64bc967f8b-6j8jk 0/1 ImagePullBackOff 0 28s
pvt-reg-test pvt-reg-test-64bc967f8b-sgxg9 0/1 ErrImagePull 0 28s
Pod Events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m39s default-scheduler Successfully assigned pvt-reg-test/pvt-reg-test-64bc967f8b-sgxg9 to ip-172-31-16-132
Normal Pulling 6m6s (x4 over 7m38s) kubelet Pulling image "pvt-registry.com/nginx:latest"
Warning Failed 6m6s (x4 over 7m38s) kubelet Failed to pull image "pvt-registry.com/nginx:latest": failed to pull and unpack image "pvt-registry.com/nginx:latest": failed to resolve reference "pvt-registry.com/nginx:latest": failed to do request: Head "https://pvt-registry.com/v2/nginx/manifests/latest": tls: failed to verify certificate: x509: certificate signed by unknown authority
Warning Failed 6m6s (x4 over 7m38s) kubelet Error: ErrImagePull
Warning Failed 5m54s (x6 over 7m38s) kubelet Error: ImagePullBackOff
Normal BackOff 2m27s (x21 over 7m38s) kubelet Back-off pulling image "pvt-registry.com/nginx:latest"
Validation Results:
- k3s version used for validation:
$ k3s -v
k3s version v1.29.4-rc1+k3s1 (d973fadb)
go version go1.21.9
$ kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6799fbcd5-ccwrw 1/1 Running 0 4m42s
kube-system helm-install-traefik-667w4 0/1 Completed 1 4m43s
kube-system helm-install-traefik-crd-2nq47 0/1 Completed 0 4m43s
kube-system local-path-provisioner-6c86858495-dvwzt 1/1 Running 0 4m42s
kube-system metrics-server-54fd9b65b-nkzds 1/1 Running 0 4m42s
kube-system svclb-traefik-045f5f22-9cdff 2/2 Running 0 4m27s
kube-system svclb-traefik-045f5f22-dnvkt 2/2 Running 0 4m27s
kube-system svclb-traefik-045f5f22-jwx2j 2/2 Running 0 3m27s
kube-system svclb-traefik-045f5f22-rmx7m 2/2 Running 0 2m37s
kube-system traefik-7d5f6474df-26pw8 1/1 Running 0 4m27s
pvt-reg-test pvt-reg-test-66cb57586c-7ckvp 1/1 Running 0 28s
pvt-reg-test pvt-reg-test-66cb57586c-f88jb 1/1 Running 0 28s
Check the hosts.toml files for a host section:
$ sudo cat /var/lib/rancher/k3s/agent/etc/containerd/certs.d/pvt-registry.com/hosts.toml
# File generated by k3s. DO NOT EDIT.
server = "https://pvt-registry.com/v2"
capabilities = ["pull", "resolve", "push"]
ca = ["/home/ubuntu/ca.pem"]
[host]
$ sudo cat /var/lib/rancher/k3s/agent/etc/containerd/certs.d/docker.io/hosts.toml
# File generated by k3s. DO NOT EDIT.
server = "https://registry-1.docker.io/v2"
capabilities = ["pull", "resolve", "push"]
[host]
[host."https://pvt-registry.com/v2"]
capabilities = ["pull", "resolve"]
ca = ["/home/ubuntu/ca.pem"]
$ sudo cat /var/lib/rancher/k3s/agent/etc/containerd/certs.d/k8s.gcr.io/hosts.toml
# File generated by k3s. DO NOT EDIT.
server = "https://k8s.gcr.io/v2"
capabilities = ["pull", "resolve", "push"]
[host]
[host."https://pvt-registry.com/v2"]
capabilities = ["pull", "resolve"]
ca = ["/home/ubuntu/ca.pem"]