siderolabs / talos
Talos Linux is a modern Linux distribution built for Kubernetes.
Home Page: https://www.talos.dev
License: Mozilla Public License 2.0
A new feature in kubeadm v1.11.0-beta.1 requires cut: https://github.com/kubernetes/kubernetes/blame/e85b81bbee098a7ec75cc894a8785867ec586798/pkg/util/ipvs/kernelcheck_linux.go#L64
[ERROR RequiredIPVSKernelModulesAvailable]: error getting installed ipvs required kernel modules: executable file not found in $PATH()
We should become "Certified Kubernetes" by following the CNCF guide.
An automated process to create AMIs is needed for releases.
The CRI-O logs contain many entries like the following:
time="2018-04-03 04:07:18.909401344Z" level=info msg="Received container exit code: -1, message: exec failed: container_linux.go:348: starting container process caused "open /proc/self/fd: no such file or directory"
Instead of generating /etc/os-release at build time, generate it at runtime. We set a version variable at build time.
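Something like the following could work: a rough sketch, assuming a package-level Version variable injected via -ldflags (the variable name, file layout, and os-release fields below are illustrative, not the project's actual code).

```go
package main

import (
	"fmt"
	"os"
)

// Version is injected at build time, e.g.:
//   go build -ldflags "-X main.Version=v0.1.0"
var Version = "unknown"

// writeOSRelease renders /etc/os-release at runtime instead of baking it
// into the rootfs at build time.
func writeOSRelease() error {
	contents := fmt.Sprintf(`NAME="Talos"
ID=talos
VERSION_ID=%s
PRETTY_NAME="Talos (%s)"
`, Version, Version)

	return os.WriteFile("/etc/os-release", []byte(contents), 0o644)
}

func main() {
	if err := writeOSRelease(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```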
The network config causes pods to come up with an IP from the 10.88.0.0/16 CIDR block. This breaks DNS resolution in pods.
I haven't been able to find much documentation around it, but it seems we should not create a network configuration and leave it up to the network plugin to handle that.
Seeing the following in the CRI-O logs:
time="2018-03-22 04:26:51.229159744Z" level=warning msg="hooks path: "/usr/share/containers/oci/hooks.d" does not exist"
time="2018-03-22 04:26:51.229230416Z" level=warning msg="hooks path: "/etc/containers/oci/hooks.d" does not exist"
time="2018-03-22 04:26:51.229386000Z" level=error msg="error updating cni config: Missing CNI default network"
time="2018-03-22 04:26:51.229418768Z" level=error msg="Missing CNI default network"
ERROR: logging before flag.Parse: W0322 04:26:51.312486 1291 hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
time="2018-03-22 04:26:51.330119568Z" level=error msg="watcher.Add("/usr/share/containers/oci/hooks.d") failed: no such file or directory"
We should address all of these.
We should provide info from /proc/<pid>/stat (and perhaps other places in procfs) that would give users detailed info on the running processes.
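A rough sketch of what reading /proc/<pid>/stat could look like; the struct and the handful of fields selected here are illustrative, and field positions follow proc(5).

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// ProcessStat holds a few of the fields from /proc/<pid>/stat.
type ProcessStat struct {
	PID   string
	Comm  string
	State string
	// Additional fields could be added as needed.
}

func readStat(pid int) (*ProcessStat, error) {
	b, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		return nil, err
	}
	s := string(b)

	// The comm field is wrapped in parentheses and may contain spaces,
	// so split around it rather than naively on whitespace.
	open := strings.Index(s, "(")
	closeIdx := strings.LastIndex(s, ")")
	if open < 0 || closeIdx < open {
		return nil, fmt.Errorf("unexpected stat format")
	}
	rest := strings.Fields(s[closeIdx+1:])
	if len(rest) == 0 {
		return nil, fmt.Errorf("unexpected stat format")
	}

	return &ProcessStat{
		PID:   strings.TrimSpace(s[:open]),
		Comm:  s[open+1 : closeIdx],
		State: rest[0],
	}, nil
}

func main() {
	st, err := readStat(1)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", st)
}
```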
The user data fields should be verified. Clear and early errors in the initramfs relating to the user data would provide a better development and user experience.
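A sketch of the kind of early validation that could run in the initramfs; the struct and required fields here are hypothetical stand-ins for whatever the real user data definition requires.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// UserData is a hypothetical subset of the user data; the real struct and
// required fields would come from the project's user data definition.
type UserData struct {
	Token       string
	Nameservers []string
}

// Validate returns a single error describing everything that is missing, so
// the user sees all problems at once instead of one per boot attempt.
func (u *UserData) Validate() error {
	var errs []error
	if u.Token == "" {
		errs = append(errs, errors.New("token is required"))
	}
	if len(u.Nameservers) == 0 {
		errs = append(errs, errors.New("at least one nameserver is required"))
	}
	return errors.Join(errs...)
}

func main() {
	ud := &UserData{}
	if err := ud.Validate(); err != nil {
		fmt.Fprintf(os.Stderr, "invalid user data: %v\n", err)
		os.Exit(1)
	}
}
```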
Currently we use text/template to render MasterConfiguration and NodeConfiguration YAML files that are then fed into kubeadm. This is suboptimal since we are duplicating work done in kubeadm. We should instead support specifying the files in their raw format:
kubernetes:
kubeadm: |
kind: NodeConfiguration
apiVersion: kubeadm.k8s.io/v1alpha1
token: abcd.1234567898765432
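A sketch of the consuming side under this proposal: the kubeadm configuration stays an opaque string in the user data and is written out untouched for kubeadm to read. The field names mirror the example above; the file paths are illustrative.

```go
package main

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

// Kubernetes is a hypothetical user data section; the kubeadm configuration
// is kept as an opaque string rather than re-modeled with text/template.
type Kubernetes struct {
	Kubeadm string `yaml:"kubeadm"`
}

type UserData struct {
	Kubernetes Kubernetes `yaml:"kubernetes"`
}

func main() {
	raw, err := os.ReadFile("/run/userdata.yaml") // path is illustrative
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	var ud UserData
	if err := yaml.Unmarshal(raw, &ud); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Hand the file to kubeadm untouched, e.g. `kubeadm init --config /etc/kubeadm.yaml`.
	if err := os.WriteFile("/etc/kubeadm.yaml", []byte(ud.Kubernetes.Kubeadm), 0o600); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```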
The following warning is showing up in the kubelet logs. We should install conntrack.
W0322 04:14:51.391333 1339 hostport_manager.go:68] The binary conntrack is not installed, this can
cause failures in network connection cleanup.
Instead of a RESTful API, I would prefer we use gRPC for a number of reasons. Using protocol buffers, we can generate the client and server API that can be used by init and a CLI.
The init logic needs to be robust and as simple as possible. Currently, the gRPC service and init are one and the same. We should pull the gRPC server code out into a dedicated service. This would ensure that if something goes bad within the gRPC service, we don't take out the whole node with it in the case of a bug that could trigger a kernel panic. I propose we call it osd to complement osctl.
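A minimal sketch of what a standalone osd process could look like: a gRPC server on a local socket, supervised by init rather than running inside it. The socket path and the service registered below are assumptions; the real API would come from the project's protobuf definitions.

```go
package main

import (
	"log"
	"net"
	"os"

	"google.golang.org/grpc"
	"google.golang.org/grpc/health"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

func main() {
	const socket = "/run/osd.sock" // illustrative path

	_ = os.Remove(socket)
	lis, err := net.Listen("unix", socket)
	if err != nil {
		log.Fatalf("listen: %v", err)
	}

	srv := grpc.NewServer()
	// A real osd would register services generated from the project's .proto
	// files here; the standard health service stands in for them in this sketch.
	healthpb.RegisterHealthServer(srv, health.NewServer())

	// If this process crashes, only osd dies; init stays up and can restart it.
	if err := srv.Serve(lis); err != nil {
		log.Fatalf("serve: %v", err)
	}
}
```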
See buildkit.
Implement upgrades.
Creating images is relatively simple now due to #114, and it opens up the idea of encouraging users to build images with sensitive files baked in. The user data file is one example. Right now, the cluster depends on an external source of truth that varies with the platform the cluster is running on.
For example, in AWS, the user data is sourced from http://169.254.169.254/latest/user-data. The problem with this is that any pod in the cluster can reach this endpoint, which is a huge security risk considering what is in it.
On bare metal, there is the added burden of maintaining an HTTP server for the purpose of serving the user data. In this example, you could secure the endpoint, but you still have to get the credentials onto the nodes somehow.
This proposal leverages the existing RoT features we have by further extending the responsibility of such a node. Nodes that serve as roots of trust are trusted by their very definition, and therefore make perfect candidates for hosting a user data service.
The workflow would be something like the following:

1. Generate the RoT credentials.
2. Build the images with the RoT credentials baked in.
3. Master nodes use the RoT credentials to securely serve the user data files stored on them.
4. Worker nodes use the RoT credentials to make a request to a master node for the user data.

This method simplifies things operationally and improves security in the process. It does move some of the complexity into the image building requirement, but the security and day-to-day operational value it brings outweighs the complexity it adds.
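A sketch of the worker side of step 4, assuming the RoT credentials are an x509 client certificate/key pair and that a master node exposes a hypothetical HTTPS userdata endpoint; the paths and URL are illustrative.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io"
	"net/http"
	"os"
)

// fetchUserData requests the node's user data from a master node, presenting
// the RoT client certificate over mutual TLS.
func fetchUserData(masterURL, certFile, keyFile, caFile string) ([]byte, error) {
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, err
	}
	caPEM, err := os.ReadFile(caFile)
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caPEM)

	client := &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{
				Certificates: []tls.Certificate{cert},
				RootCAs:      pool,
			},
		},
	}

	resp, err := client.Get(masterURL + "/userdata")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	data, err := fetchUserData("https://master-0:50001", "/var/rot/client.crt", "/var/rot/client.key", "/var/rot/ca.crt")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	os.Stdout.Write(data)
}
```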
The MasterConfiguration should have kubernetesVersion set to the same version in the build.
The way in which we are handling stdout and stderr in the process manager is blocking the process itself. The problem is that io.Pipe blocks until something reads and writes from it. This means that processes are blocked until a user accesses the process logs via the API.
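One way around the blocking behavior, sketched below: attach the process's stdout/stderr to a per-process log file instead of an io.Pipe, and have the API read from that file on demand, so the process never stalls waiting for a log reader. The file paths are illustrative.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// run starts a process with stdout/stderr attached to a log file rather than
// an io.Pipe, so the process never blocks when no one is reading its logs.
func run(logPath string, name string, args ...string) error {
	logFile, err := os.OpenFile(logPath, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		return err
	}
	defer logFile.Close()

	cmd := exec.Command(name, args...)
	cmd.Stdout = logFile
	cmd.Stderr = logFile

	return cmd.Run()
}

func main() {
	// The API can later stream /var/log/echo.log to the user at its own pace.
	if err := run("/var/log/echo.log", "echo", "hello"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```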
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
test 1/1 Running 0 3m
$ kubectl exec -it test -- sh
error: Internal error occurred: error executing command in container: open /dev/ptmx: no such file or directory
Since #59 was merged, the following shows up in the kubelet logs:
E0502 13:37:38.488869 1386 remote_runtime.go:209] StartContainer "c9d4b9a2cf132d74f3d3d5ca3b36763f88fab8e7dcaf8d37bfbb84ba6cc7371e" from runtime service failed: rpc error: code = Unknown desc = failed to start container "c9d4b9a2cf132d74f3d3d5ca3b36763f88fab8e7dcaf8d37bfbb84ba6cc7371e": Error response from daemon: invalid header field value "oci runtime error: container_linux.go:247: starting container process caused \"process_linux.go:258: applying cgroup configuration for process caused \\\"failed to set memory.kmem.limit_in_bytes, because either tasks have already joined this cgroup or it has children\\\"\"\n"
This breaks kubeadm init.
[ 20.240345] cgroup: kubelet (1209) created nested cgroup for controller "memory" which has incomplete hierarchy support. Nested cgroups may change behavior in the future.
[ 20.242831] cgroup: "memory" requires setting use_hierarchy to 1 on the root
The tasks outlined in the official HA guide can be automated. I think Option 2 is the best route to go, at least to start with. We can always change this later, or even add support for both. So that leaves us with having to:

1. Distribute the PKI directory from master0, generated by kubeadm init, to the other master nodes.
2. Load balance the API servers across the master nodes.
3. Add master nodes to the load balancer.
4. Update kube-proxy.
For 1, we can add an RPC call. We can use either a push or pull model. In the pull model, master nodes, excluding master0, would request the PKI directory upon initialization. This would require us to verify that requesters of the PKI directory are authorized to do so. The alternative is to push the PKI directory to other master nodes when they come up. This would require that a well-known set of machines participate as master nodes. I think the push model is better, as we can tie it into the notion of a RoT we have established already.
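If we go with the push model, the sending side might look something like the sketch below: pack the kubeadm-generated PKI directory into a tar archive that can be shipped over whatever RPC we add. The transport itself is left out; /etc/kubernetes/pki is kubeadm's default location.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// packPKI walks the PKI directory and returns it as a tar archive that could
// be pushed to the other master nodes over an RPC.
func packPKI(dir string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)

	err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err
		}
		rel, err := filepath.Rel(dir, path)
		if err != nil {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		hdr.Name = rel
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
	if err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	archive, err := packPKI("/etc/kubernetes/pki")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("packed %d bytes of PKI material\n", len(archive))
}
```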
For 2, we can deploy an ingress controller that runs only on the master nodes. This model offers a generic solution that can handle cloud and bare-metal deployments. The only issue then becomes the management of the DNS record pointing to the master nodes. How should we add/remove healthy/unhealthy nodes to/from the list?
For 3, this ties in to 2. Whatever we decide in 2, we would need to be sure to add the node to the load balancer.
For 4, we can use the admin.conf on master0 to update kube-proxy.
Instead of discovering the platform, we should require explicit flags from a user.
The ability to debug kernel-level issues is important. We should provide the contents of /proc/kmsg via the API.
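A sketch of how the contents could be surfaced: /proc/kmsg is a stream that blocks until new messages arrive, so a reader goroutine copies lines out as they appear. How the lines are then exposed over the API is left out.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// streamKmsg reads kernel messages from /proc/kmsg and sends each line on
// the returned channel. Reading /proc/kmsg requires root and blocks until
// new messages are written, which suits a long-lived streaming API.
func streamKmsg() (<-chan string, error) {
	f, err := os.Open("/proc/kmsg")
	if err != nil {
		return nil, err
	}

	out := make(chan string)
	go func() {
		defer f.Close()
		defer close(out)
		scanner := bufio.NewScanner(f)
		for scanner.Scan() {
			out <- scanner.Text()
		}
	}()
	return out, nil
}

func main() {
	lines, err := streamKmsg()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for line := range lines {
		fmt.Println(line)
	}
}
```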
kubectl port-forward etcd-master-0 2379:2379 -n kube-system
Forwarding from 127.0.0.1:2379 -> 2379
curl localhost:2379/v2
curl: (52) Empty reply from server
kubectl port-forward etcd-master-0 2379:2379 -n kube-system
Forwarding from 127.0.0.1:2379 -> 2379
Handling connection for 2379
E0402 21:26:31.026140 78292 portforward.go:331] an error occurred forwarding 2379 -> 2379: error forwarding port 2379 to pod 67354788607ba8b9caad4f69c7a4256581b46655281b6fd182ec0fd21ef0591c, uid : exit status 1: 2018/04/03 04:26:29 socat[11867] E getaddrinfo("localhost", "NULL", {1,2,1,6}, {}): Name or service not known
A CLI would be useful for things like:
and other features as this project grows. It should probably be gRPC based after #19 is implemented.
Using either the standard library or something like logrus, we should provide more robust logging.
Since #64 was merged, we now enforce security without the possibility of opting out. CSRs need to be automated for nodes as they come up. One way to achieve this would be to define a CustomResourceDefinition (or more than one) along with a controller that would automate the CSR workflow.
There are a couple of ways we can implement this:
Option one has security risks and does not align with our goals.
EDIT: Option one was chosen, and option two will be implemented in time.
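For reference, a rough sketch of the approval step such a controller would perform with a recent client-go; everything around it (identifying which CSRs belong to bootstrapping nodes, the CRD/controller plumbing) is elided, and the reason/message strings are illustrative.

```go
package main

import (
	"context"
	"log"

	certsv1 "k8s.io/api/certificates/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// approveCSR appends an Approved condition and submits it via the approval
// subresource. Deciding *whether* a CSR should be approved is the hard part
// and is intentionally left out of this sketch.
func approveCSR(ctx context.Context, client kubernetes.Interface, csr *certsv1.CertificateSigningRequest) error {
	csr.Status.Conditions = append(csr.Status.Conditions, certsv1.CertificateSigningRequestCondition{
		Type:    certsv1.CertificateApproved,
		Status:  corev1.ConditionTrue,
		Reason:  "AutoApproved",
		Message: "approved by the node bootstrap controller",
	})
	_, err := client.CertificatesV1().CertificateSigningRequests().UpdateApproval(ctx, csr.Name, csr, metav1.UpdateOptions{})
	return err
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	ctx := context.Background()
	csrs, err := client.CertificatesV1().CertificateSigningRequests().List(ctx, metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	// A real controller would watch CSRs and verify the requester before approving.
	for i := range csrs.Items {
		if err := approveCSR(ctx, client, &csrs.Items[i]); err != nil {
			log.Printf("approve %s: %v", csrs.Items[i].Name, err)
		}
	}
}
```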
In the build we should docker save the control plane images, and perhaps the essential addons, and then docker load them in init.
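A sketch of the two halves driven from Go with os/exec; the image names and tarball path are illustrative.

```go
package main

import (
	"log"
	"os/exec"
)

// saveImages runs at build time: export the control plane images to a tarball
// that gets baked into the image.
func saveImages(tarball string, images ...string) error {
	args := append([]string{"save", "-o", tarball}, images...)
	return exec.Command("docker", args...).Run()
}

// loadImages runs in init at boot: import the tarball so the kubelet never
// has to pull the control plane images over the network.
func loadImages(tarball string) error {
	return exec.Command("docker", "load", "-i", tarball).Run()
}

func main() {
	if err := saveImages("/tmp/control-plane.tar", "k8s.gcr.io/kube-apiserver:v1.11.0"); err != nil {
		log.Fatal(err)
	}
	if err := loadImages("/tmp/control-plane.tar"); err != nil {
		log.Fatal(err)
	}
}
```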
I0705 03:19:00.863140 1170 kernel_validator.go:96] Validating kernel config
[WARNING Service-Kubelet]: no supported init system detected, skipping checking for services
[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh] or no builtin kernel ipvs support: map[ip_vs_rr:{} ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{} ip_vs:{}]
you can solve this problem with following methods:
1. Run 'modprobe -- ' to load missing kernel modules;
2. Provide the missing builtin kernel ipvs support
When the kubelet or docker daemon dies, we should restart it.
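A sketch of a simple supervision loop for this; the restart delay and the command lines are illustrative.

```go
package main

import (
	"log"
	"os/exec"
	"time"
)

// supervise runs a command and restarts it whenever it exits, with a small
// delay to avoid a tight crash loop.
func supervise(name string, args ...string) {
	for {
		cmd := exec.Command(name, args...)
		if err := cmd.Run(); err != nil {
			log.Printf("%s exited: %v", name, err)
		} else {
			log.Printf("%s exited cleanly", name)
		}
		time.Sleep(5 * time.Second)
	}
}

func main() {
	go supervise("dockerd")
	supervise("kubelet", "--config", "/etc/kubelet.yaml")
}
```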
Users will need a way to get the logs of the processes running on the host.
Right now we require that a user specify nameservers in the user data. We should discover the DHCP-assigned nameservers and add them in addition to what is in the user data.
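A sketch of the merge itself, assuming the DHCP-assigned nameservers have already been obtained (for example, parsed out of the DHCP lease) and only need to be combined with the user data entries without duplicates.

```go
package main

import "fmt"

// mergeNameservers combines the user data nameservers with the DHCP-assigned
// ones, preserving order and dropping duplicates. User data entries keep
// priority; DHCP entries are appended after them.
func mergeNameservers(fromUserData, fromDHCP []string) []string {
	seen := make(map[string]bool)
	var merged []string
	for _, list := range [][]string{fromUserData, fromDHCP} {
		for _, ns := range list {
			if !seen[ns] {
				seen[ns] = true
				merged = append(merged, ns)
			}
		}
	}
	return merged
}

func main() {
	fmt.Println(mergeNameservers([]string{"1.1.1.1", "8.8.8.8"}, []string{"192.168.1.1", "8.8.8.8"}))
}
```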