
machine-config-operator's Introduction

machine-config-operator

OpenShift 4 is an operator-focused platform, and the Machine Config operator extends that to the operating system itself, managing updates and configuration changes to essentially everything between the kernel and kubelet.

To repeat for emphasis, this operator manages updates to systemd, cri-o/kubelet, kernel, NetworkManager, etc. It also offers a new MachineConfig CRD that can write configuration files onto the host.

The approach here is a "fusion" of code from the original CoreOS Tectonic, some components of Red Hat Enterprise Linux Atomic Host, and some fundamentally new design.

The MCO (for short) interacts closely with both the installer and Red Hat CoreOS. See also the machine-api-operator, which handles provisioning of new machines: once the machine-api-operator provisions a machine (with a "pristine" base Red Hat CoreOS), the MCO takes care of configuring it.

One way to view the MCO is to treat the operating system itself as "just another Kubernetes component" that you can inspect and manage with oc.

The MCO uses CoreOS Ignition as a configuration format. Operating system updates use rpm-ostree, with ostree updates encapsulated inside a container image. More information in OSUpgrades.md.
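For example, you can inspect the booted rpm-ostree deployment (including the container image it came from) directly through the cluster. This is just an illustrative invocation against a placeholder node name, not part of the MCO itself:

oc debug node/<node-name> -- chroot /host rpm-ostree status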

As of release 4.12, you can try out OCP CoreOS Layering, which lets you use the more familiar "Containerfile" (Dockerfile) syntax to apply configuration to your pools.

Sub-components and design

This one git repository generates 4 components in a cluster; the machine-config-operator pod manages the remaining 3 sub-components, each of which has its own design doc in this repository.

Interacting with the MCO

Because the MCO is a cluster-level operator, you can inspect its status just like any other operator that is part of the release image. If it reports success, that means the operating system is up to date and configured.

oc describe clusteroperator/machine-config

One level down from the operator CRD, the machineconfigpool objects track updates to a group of nodes. You will often want to run a command like this:

oc describe machineconfigpool

Particularly note the Updated and Updating columns.
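A plain get shows those columns for every pool at once; adding -w is a convenient (if informal) way to watch an update roll through:

oc get machineconfigpool -w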

Applying configuration changes to the cluster

The MCO has "high level" knobs for some components of the cluster state, such as SSH keys and kubelet configuration. However, there is a large number of other things one may want to configure on a system. For example, offline environments may want to specify an internal NTP pool; another example is static network configuration. By providing a MachineConfig object containing an Ignition configuration, you can provide systemd units and lay down arbitrary files in writable locations (i.e. /etc and /var).
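As a minimal sketch (the name, role label, file path, and contents below are illustrative, not recommended values), a MachineConfig that lays down a file in /etc on worker nodes can be applied like this:

cat <<EOF | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-example-file
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - path: /etc/example.conf
          mode: 0644
          contents:
            source: data:,example%20setting%3Dtrue%0A
EOF

The machineconfiguration.openshift.io/role label determines which pool picks up the config, and the file contents are given as a URL-encoded data URI.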

See the OCP product documentation for more information.

What to look at after creating a MachineConfig

Once you create a MachineConfig fragment like the above, the controller will generate a new "rendered" version that will be used as a target. For more information, see MachineConfig.

In particular, you should look at oc describe machineconfigpool and oc describe clusteroperator/machine-config as noted above.
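To see exactly which rendered config a pool is targeting (so you can then inspect it with oc get -o yaml), the pool's spec records it; for example, assuming the worker pool:

oc get machineconfigpool/worker -o jsonpath='{.spec.configuration.name}{"\n"}'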

More information about OS updates

The model implemented by the MCO is that the cluster controls the operating system. OS updates are just another entry in the release image. For more information, see OSUpgrades.md.
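For example, to see which machine-os-content (OS) image a given release payload pins, you can ask the release image directly; the pull spec below is a placeholder:

oc adm release info <release-pullspec> --image-for=machine-os-content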

Developing the MCO

See HACKING.md.

Frequently Asked Questions

See FAQ.md.

Security Response

If you've found a security issue that you'd like to disclose confidentially, please contact Red Hat's Product Security team. Details are at https://access.redhat.com/security/team/contact


machine-config-operator's Issues

MCD: consider removing annotation equality checking from node validation logic

The current validation logic in the MCD first checks whether the annotations on the node are equal. If they are, it assumes that the node has a valid configuration.

This assumption does make some amount of sense: we are generally careful to validate the node state before marking the update as done, so if everything is as expected and nothing ever changes, the node is going to be in a valid state.

However, I think we should always validate the machine state in full every time we call the validation function. It's not a particularly expensive check, and it lets us easily validate our assumption that the machine state stays constant, and fix things if the state changes for some reason. Removing the check also insulates us a little from transitory issues with the annotation values; those obviously shouldn't happen, but this way we rely on them less and anchor ourselves in reality instead. There may be other good reasons I'm not coming up with, too.

thoughts?

/cc @jlebon @ashcrow

MCD hot looping, 100% cpu

top - 16:10:54 up  1:26,  1 user,  load average: 1.68, 1.17, 0.59
Tasks: 109 total,   1 running, 108 sleeping,   0 stopped,   0 zombie
%Cpu(s): 43.3 us,  0.5 sy,  0.0 ni, 56.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.2 st
KiB Mem :  2046940 total,   663244 free,   213220 used,  1170476 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  1621404 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                       
 3933 root      20   0   35820  18092   8292 S 115.6  0.9   8:23.25 machine-config-                                                                                                               
 2832 root      20   0 1059904  98464  45736 S   0.7  4.8   1:02.02 hyperkube                                                                                                                     
 2974 root      20   0  828680  54108  19456 S   0.3  2.6   0:41.27 crio                                                                                                                          
 3734 root      20   0  265368  22824  11416 S   0.3  1.1   0:07.11 flanneld     

Using libvirt. Some time (it varies) after a worker comes up via the Machine API actuator, the MCD starts hot looping.

I SIGABRTed the process to get the stack trace:

# crictl logs 64d4b837eed28
I0919 14:43:16.878108    6667 start.go:42] Version: 0.0.0-182-g83741924
I0919 14:43:16.903225    6667 start.go:78] chrooting into rootMount /rootfs
I0919 14:43:16.903248    6667 start.go:84] moving to / inside the chroot
I0919 14:43:16.903258    6667 daemon.go:80] Starting MachineConfigDaemon
SIGABRT: abort
PC=0x428870 m=0 sigcode=0

goroutine 1 [running]:
runtime.deferreturn(0xc420366870)
	/usr/local/go/src/runtime/panic.go:316 fp=0xc4204097a0 sp=0xc420409798 pc=0x428870
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/watch.(*StreamWatcher).Stop(0xc420366870)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/watch/streamwatcher.go:78 +0x6a fp=0xc4204097c8 sp=0xc4204097a0 pc=0x7c0d1a
github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/watch.Until(0x0, 0x1277e20, 0xc420366870, 0xc4204099e0, 0x1, 0x1, 0x0, 0x126ff40, 0xc4200a1420)
	/go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/watch/until.go:68 +0x304 fp=0xc4204098f8 sp=0xc4204097c8 pc=0x7c14d4
github.com/openshift/machine-config-operator/pkg/daemon.waitUntilUpdate(0x1296440, 0xc42029ab70, 0xc42003e02a, 0xc, 0x1, 0x1)
	/go/src/github.com/openshift/machine-config-operator/pkg/daemon/node.go:57 +0x493 fp=0xc420409ab0 sp=0xc4204098f8 pc=0xf34883
github.com/openshift/machine-config-operator/pkg/daemon.(*Daemon).process(0xc4200bc140, 0x1203258, 0xc4204bdb58)
	/go/src/github.com/openshift/machine-config-operator/pkg/daemon/daemon.go:102 +0x129 fp=0xc420409b08 sp=0xc420409ab0 pc=0xf32929
github.com/openshift/machine-config-operator/pkg/daemon.(*Daemon).Run(0xc4200bc140, 0xc42009e360, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/pkg/daemon/daemon.go:83 +0xd3 fp=0xc420409b78 sp=0xc420409b08 pc=0xf326e3
main.runStartCmd(0x19f9440, 0x1a21480, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/cmd/machine-config-daemon/start.go:91 +0x553 fp=0xc420409ce0 sp=0xc420409b78 pc=0xf3f193
github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra.(*Command).execute(0x19f9440, 0x1a21480, 0x0, 0x0, 0x19f9440, 0x1a21480)
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra/command.go:766 +0x2c1 fp=0xc420409dd0 sp=0xc420409ce0 pc=0x57ec21
github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x19f91e0, 0x0, 0xc42010bf78, 0x403dfc)
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra/command.go:852 +0x30a fp=0xc420409f10 sp=0xc420409dd0 pc=0x57f7da
github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra.(*Command).Execute(0x19f91e0, 0xc42010bf60, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra/command.go:800 +0x2b fp=0xc420409f40 sp=0xc420409f10 pc=0x57f4ab
main.main()
	/go/src/github.com/openshift/machine-config-operator/cmd/machine-config-daemon/main.go:27 +0x31 fp=0xc420409f88 sp=0xc420409f40 pc=0xf3ea21
runtime.main()
	/usr/local/go/src/runtime/proc.go:198 +0x212 fp=0xc420409fe0 sp=0xc420409f88 pc=0x42ad72
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2361 +0x1 fp=0xc420409fe8 sp=0xc420409fe0 pc=0x455051

goroutine 17 [chan receive]:
github.com/openshift/machine-config-operator/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x1a032a0)
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/golang/glog/glog.go:879 +0x8b
created by github.com/openshift/machine-config-operator/vendor/github.com/golang/glog.init.0
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/golang/glog/glog.go:410 +0x203

goroutine 18 [IO wait, 99 minutes]:
internal/poll.runtime_pollWait(0x7f32a0b2cea0, 0x72, 0xc42003c6b0)
	/usr/local/go/src/runtime/netpoll.go:173 +0x57
internal/poll.(*pollDesc).wait(0xc42038bb18, 0x72, 0xc4203b6600, 0x1000, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0x9b
internal/poll.(*pollDesc).waitRead(0xc42038bb18, 0xc42003c600, 0x10, 0x10)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3d
internal/poll.(*FD).ReadMsg(0xc42038bb00, 0xc42003c6b0, 0x10, 0x10, 0xc4203b6620, 0x1000, 0x1000, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/internal/poll/fd_unix.go:231 +0x1f1
net.(*netFD).readMsg(0xc42038bb00, 0xc42003c6b0, 0x10, 0x10, 0xc4203b6620, 0x1000, 0x1000, 0xfea400, 0xc4200725d0, 0xc42003c5c8, ...)
	/usr/local/go/src/net/fd_unix.go:214 +0x90
net.(*UnixConn).readMsg(0xc4200c2760, 0xc42003c6b0, 0x10, 0x10, 0xc4203b6620, 0x1000, 0x1000, 0xc42003c5c8, 0x1291880, 0xc42003c500, ...)
	/usr/local/go/src/net/unixsock_posix.go:115 +0x91
net.(*UnixConn).ReadMsgUnix(0xc4200c2760, 0xc42003c6b0, 0x10, 0x10, 0xc4203b6620, 0x1000, 0x1000, 0xc4203bc5b0, 0x7, 0xc42003c5ce, ...)
	/usr/local/go/src/net/unixsock.go:137 +0xaa
github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus.(*oobReader).Read(0xc4203b6600, 0xc42003c6b0, 0x10, 0x10, 0x1020, 0xc4203b6600, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus/transport_unix.go:21 +0x8f
io.ReadAtLeast(0x1270080, 0xc4203b6600, 0xc42003c6b0, 0x10, 0x10, 0x10, 0x10a3780, 0xc420040101, 0xc4203b6600)
	/usr/local/go/src/io/io.go:309 +0x86
io.ReadFull(0x1270080, 0xc4203b6600, 0xc42003c6b0, 0x10, 0x10, 0xc420040180, 0x21, 0x21)
	/usr/local/go/src/io/io.go:327 +0x58
github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus.(*unixTransport).ReadMessage(0xc42039c180, 0xc4200ae540, 0x1, 0x1)
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus/transport_unix.go:85 +0x113
github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus.(*Conn).inWorker(0xc4200f7000)
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus/conn.go:285 +0x4b
created by github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus.(*Conn).Auth
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus/auth.go:118 +0x6c9

goroutine 19 [chan receive, 99 minutes]:
github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus.(*Conn).outWorker(0xc4200f7000)
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus/conn.go:427 +0x63
created by github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus.(*Conn).Auth
	/go/src/github.com/openshift/machine-config-operator/vendor/github.com/godbus/dbus/auth.go:119 +0x6ee

goroutine 35 [IO wait, 2 minutes]:
internal/poll.runtime_pollWait(0x7f32a0b2cdd0, 0x72, 0xc420407858)
	/usr/local/go/src/runtime/netpoll.go:173 +0x57
internal/poll.(*pollDesc).wait(0xc4203ba418, 0x72, 0xffffffffffffff00, 0x12722c0, 0x19a9318)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0x9b
internal/poll.(*pollDesc).waitRead(0xc4203ba418, 0xc4204a0000, 0x4000, 0x4000)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3d
internal/poll.(*FD).Read(0xc4203ba400, 0xc4204a0000, 0x4000, 0x4000, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:157 +0x17d
net.(*netFD).Read(0xc4203ba400, 0xc4204a0000, 0x4000, 0x4000, 0x0, 0x8, 0x3ffb)
	/usr/local/go/src/net/fd_unix.go:202 +0x4f
net.(*conn).Read(0xc42000c068, 0xc4204a0000, 0x4000, 0x4000, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/net.go:176 +0x6a
crypto/tls.(*block).readFromUntil(0xc4203f0000, 0x7f32a0b70068, 0xc42000c068, 0x5, 0xc42000c068, 0x0)
	/usr/local/go/src/crypto/tls/conn.go:493 +0x96
crypto/tls.(*Conn).readRecord(0xc4203dc000, 0x1204b17, 0xc4203dc120, 0xaa)
	/usr/local/go/src/crypto/tls/conn.go:595 +0xe0
crypto/tls.(*Conn).Read(0xc4203dc000, 0xc420443000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
	/usr/local/go/src/crypto/tls/conn.go:1156 +0x100
bufio.(*Reader).Read(0xc4200ca300, 0xc42041e2d8, 0x9, 0x9, 0xc420407c50, 0x468333, 0xc420407c60)
	/usr/local/go/src/bufio/bufio.go:216 +0x238
io.ReadAtLeast(0x126fd00, 0xc4200ca300, 0xc42041e2d8, 0x9, 0x9, 0x9, 0xc4200a0401, 0xc420407d00, 0x7b51fb)
	/usr/local/go/src/io/io.go:309 +0x86
io.ReadFull(0x126fd00, 0xc4200ca300, 0xc42041e2d8, 0x9, 0x9, 0xc4200a0050, 0xc420407d30, 0x7b510d)
	/usr/local/go/src/io/io.go:327 +0x58
github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2.readFrameHeader(0xc42041e2d8, 0x9, 0x9, 0x126fd00, 0xc4200ca300, 0x0, 0x0, 0xc4204de000, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2/frame.go:237 +0x7b
github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc42041e2a0, 0xc4204de000, 0x0, 0x0, 0x0)
	/go/src/github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2/frame.go:492 +0xa4
github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2.(*clientConnReadLoop).run(0xc420407fb8, 0x1203d08, 0xc4200d8fb8)
	/go/src/github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2/transport.go:1616 +0x8e
github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2.(*ClientConn).readLoop(0xc4204201c0)
	/go/src/github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2/transport.go:1544 +0x68
created by github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2.(*Transport).newClientConn
	/go/src/github.com/openshift/machine-config-operator/vendor/golang.org/x/net/http2/transport.go:619 +0x684

rax    0x0
rbx    0x0
rcx    0x1
rdx    0xc420366870
rdi    0xc420042280
rsi    0x3
rbp    0xc4204097b8
rsp    0xc420409798
r8     0xc4200421d0
r9     0x17
r10    0xc420042070
r11    0x1
r12    0x0
r13    0x20
r14    0xdd
r15    0x100
rip    0x428870
rflags 0x202
cs     0x33
fs     0x0
gs     0x0

@rphillips you see this?

MCD: Run-once first install start systemd units

For MCD run-once, we need to start systemd units on RHEL.

Our POC implementation in ansible was to start any service labeled as 'enabled'.

It's best to 'restart' as opposed to just 'start', as users may need the installer to be somewhat idempotent when installing clusters on BYO hosts (e.g. after a network blip or a misconfiguration).
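A rough bash sketch of that idea (illustrative only, not the MCD's actual implementation; no filtering or error handling):

systemctl list-unit-files --state=enabled --type=service --no-legend | awk '{print $1}' | xargs -r systemctl restart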

machine-config-server should not listen in the local port range

The machine-config-operator seems to listen on port 49500 (with hostNetwork: true). This is in the default ip_local_port_range, which means it can collide with active tcp sessions:

[root@test1-master-0 core]# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768    60999

It should serve on a port lower than 32768.

For example, I managed to collide with a persistent connection from the apiserver to etcd:

[root@test1-master-0 core]# nc -l -t -p 49500
Ncat: bind to 0.0.0.0:49500: Address already in use. QUITTING.
[root@test1-master-0 core]# ss -np | grep 49500
tcp    ESTAB      0      0      192.168.126.11:49500              192.168.126.11:2379                users:(("hypershift",pid=10044,fd=60))
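Until the port moves, one possible mitigation (an assumption on my part, not something the MCO does today) is to reserve the port so the kernel never hands it out as an ephemeral source port:

sysctl -w net.ipv4.ip_local_reserved_ports=49500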

MCD should write logs to the journal

Container logs are temporary; journal logs are forever (well, longer at least...).

This should help with debugging issues with the node rebooting and somehow not having access to the previous container logs.
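As an illustrative sketch from a shell (the MCD itself is Go, so this only shows the mechanism), anything piped through systemd-cat lands in the journal under a chosen identifier:

echo "starting update to rendered config <hash>" | systemd-cat -t machine-config-daemon -p info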

MCC taking a while to create MachineConfigs

Split out from #199 (comment):

It took quite a while for the MCC to create the MachineConfigs:

$ kubectl get machineconfigs
No resources found.

A few minutes later:

$ kubectl get machineconfigs
NAME        AGE
00-master   1m
00-worker   1m

I'm sure 7f7366afead827ab5aa47ce67d5903c9 will show up if I wait a bit longer.

The MCC logs show the following:

I1129 23:24:23.120333       1 start.go:44] Version: 3.11.0-278-g628ad5d1-dirty
I1129 23:24:23.319527       1 leaderelection.go:185] attempting to acquire leader lease  openshift-machine-config-operator/machine-config-controller...
E1129 23:26:19.915022       1 event.go:259] Could not construct reference to: '&v1.ConfigMap{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"machine-config-controller", GenerateName:"", Namespace:"openshift-machine-config-operator", SelfLink:"/api/v1/namespaces/openshift-machine-config-operator/configmaps/machine-config-controller", UID:"913656b5-f42d-11e8-9834-12202f72768c", ResourceVersion:"7816", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63679130518, loc:(*time.Location)(0x1c60aa0)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string{"control-plane.alpha.kubernetes.io/leader":"{\"holderIdentity\":\"machine-config-controller-5f644d794f-l55l5_e7aa2bc2-f42d-11e8-a035-0a580a800025\",\"leaseDurationSeconds\":90,\"acquireTime\":\"2018-11-29T23:26:19Z\",\"renewTime\":\"2018-11-29T23:26:19Z\",\"leaderTransitions\":1}"}, OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Data:map[string]string(nil), BinaryData:map[string][]uint8(nil)}'
due to: 'no kind is registered for the type v1.ConfigMap'. Will not report event: 'Normal' 'LeaderElection' 'machine-config-controller-5f644d794f-l55l5_e7aa2bc2-f42d-11e8-a035-0a580a800025 became leader'
I1129 23:26:19.922331       1 leaderelection.go:194] successfully acquired lease openshift-machine-config-operator/machine-config-controller
I1129 23:26:20.128953       1 template_controller.go:111] Starting MachineConfigController-TemplateController
I1129 23:26:20.221709       1 render_controller.go:103] Starting MachineConfigController-RenderController
I1129 23:26:20.323361       1 node_controller.go:117] Starting MachineConfigController-NodeController
I1129 23:26:21.319588       1 render.go:91] ignoring non-directory path ".gitkeep"
E1129 23:29:05.801651       1 node_controller.go:345] Empty CurrentMachineConfig
E1129 23:29:05.804675       1 node_controller.go:345] Empty CurrentMachineConfig

Add an "Unreconcilable" label distinct from degraded

Building on #234
First, we should never "reboot loop".

The two other outcomes today are (config updated successfully | degraded)

Now, what I want to argue is that we shouldn't go degraded if we can't reconcile a new config. Going degraded is going to be expensive in some cases (particularly for masters). An admin will want to know that e.g. the machineconfig they created is going to reprovision their entire cluster.

Let's introduce a distinct "unreconcilable" label. If for example the admin messes up a machineconfig, they should totally have the option to just revert it and not degrade their machines.

IOW, degraded is only for the case where the node changed out from under us in an unexpected way, or some sort of fatal error (rpm-ostree isn't working, whatever).

Now, it may be that the admin wants to eat that reprovisioning; the MAO should also support reprovisioning unreconcilable systems too. But let's distinguish those two cases.

Related: #234

MCD "regularly" restarts

Watches can time out/expire. Likely this should be handled more gracefully.

$ kubectl logs -p -n openshift-machine-config-operator   po/machine-config-daemon-4gfps
I0915 19:13:38.565897       1 start.go:42] Version: 0.0.0-162-g16dbf47c-dirty
I0915 19:13:38.619370       1 start.go:78] chrooting into rootMount /rootfs
I0915 19:13:38.619413       1 start.go:84] moving to / inside the chroot
I0915 19:13:38.619425       1 daemon.go:80] Starting MachineConfigDaemon
E0915 20:55:53.481964       1 daemon.go:85] Marking degraded due to: Failed to watch for update request: watch closed before Until timeout
I0915 20:55:53.581095       1 daemon.go:86] Shutting down MachineConfigDaemon

MCD CrashLoopBackOff skopeo inspect docker://://dummy ... invalid reference format

MCD on worker is in CrashLoopBackOff

$ oc get machineconfig
NAME                               AGE
00-master                          14m
00-worker                          14m
3c6b6d8063551a7c8a4561755ea5bae5   14m
f0db02dc9fb258ad37bed110741198c1   14m

$ oc get pod -o wide
NAME                                         READY     STATUS             RESTARTS   AGE       IP               NODE                 NOMINATED NODE
machine-config-controller-57c78dc856-xkr6w   1/1       Running            0          15m       10.2.0.20        dev-master-0         <none>
machine-config-daemon-cgmdx                  0/1       CrashLoopBackOff   7          13m       192.168.126.51   dev-worker-0-qplk8   <none>
machine-config-daemon-dm8kj                  1/1       Running            0          14m       192.168.126.11   dev-master-0         <none>
machine-config-operator-76c4d5db69-xxt42     1/1       Running            0          18m       10.2.0.9         dev-master-0         <none>
machine-config-server-m565j                  1/1       Running            0          14m       192.168.126.11   dev-master-0         <none>

$ oc logs machine-config-daemon-dm8kj -n openshift-machine-config-operator
I1031 15:49:02.518478    7375 start.go:42] Version: 3.11.0-195-g214b930d-dirty
I1031 15:49:02.549275    7375 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I1031 15:49:02.632582    7375 daemon.go:95] Booted osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:cf407efbac50e1499c39a8fe4af46cb423aa87cb36264a25cff6c294bee342eb (47.36)
I1031 15:49:02.632690    7375 start.go:90] Calling chroot("/rootfs")
I1031 15:49:02.632738    7375 start.go:100] Starting MachineConfigDaemon
E1031 15:49:02.655460    7375 daemon.go:163] Marking degraded due to: machineconfigs.machineconfiguration.openshift.io "f0db02dc9fb258ad37bed110741198c1" not found

$ oc logs machine-config-daemon-cgmdx -n openshift-machine-config-operator
I1031 15:57:03.421947    5277 start.go:42] Version: 3.11.0-195-g214b930d-dirty
I1031 15:57:03.435575    5277 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I1031 15:57:03.468033    5277 daemon.go:95] Booted osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:cf407efbac50e1499c39a8fe4af46cb423aa87cb36264a25cff6c294bee342eb (47.36)
I1031 15:57:03.468098    5277 start.go:90] Calling chroot("/rootfs")
I1031 15:57:03.468110    5277 start.go:100] Starting MachineConfigDaemon
I1031 15:57:03.481394    5277 update.go:81] Checking if configs are reconcilable
I1031 15:57:03.496623    5277 update.go:31] Updating node with new config
I1031 15:57:03.496635    5277 update.go:81] Checking if configs are reconcilable
I1031 15:57:03.496643    5277 update.go:179] Updating files
I1031 15:57:03.496652    5277 update.go:367] Writing file "/etc/containers/registries.conf"
I1031 15:57:03.500388    5277 update.go:367] Writing file "/etc/hosts"
I1031 15:57:03.507316    5277 update.go:367] Writing file "/etc/sysconfig/crio-network"
I1031 15:57:03.513328    5277 update.go:367] Writing file "/etc/kubernetes/kubelet.conf"
I1031 15:57:03.518240    5277 update.go:367] Writing file "/etc/docker/certs.d/docker-registry.default.svc:5000/ca.crt"
I1031 15:57:03.522217    5277 update.go:367] Writing file "/etc/kubernetes/ca.crt"
I1031 15:57:03.527447    5277 update.go:301] Writing systemd unit "kubelet.service"
I1031 15:57:03.527667    5277 update.go:339] Enabling systemd unit "kubelet.service"
I1031 15:57:03.527694    5277 update.go:248] /etc/systemd/system/multi-user.target.wants/kubelet.service already exists. Not making a new symlink
I1031 15:57:03.527702    5277 update.go:200] Deleting stale data
I1031 15:57:03.527714    5277 update.go:460] Updating OS to ://dummy
I1031 15:57:03.527721    5277 run.go:13] Running: /bin/pivot ://dummy
pivot version 0.0.1
I1031 15:57:03.530508    5310 run.go:27] Running: rpm-ostree status --json
I1031 15:57:03.545927    5310 root.go:79] Previous pivot: registry.svc.ci.openshift.org/rhcos/maipo@sha256:cf407efbac50e1499c39a8fe4af46cb423aa87cb36264a25cff6c294bee342eb
I1031 15:57:03.545946    5310 run.go:27] Running: skopeo inspect docker://://dummy
time="2018-10-31T15:57:03Z" level=fatal msg="invalid reference format" 
F1031 15:57:03.560788    5310 run.go:33] skopeo: exit status 1
F1031 15:57:03.561170    5277 start.go:105] error checking initial state of node: exit status 255

Multiple machine-config server restarts after 'http: TLS handshake error from 10.0.29.128:17205: EOF'

In a recent CI run, I saw:

Dec 14 05:53:23.635: INFO: Pod status openshift-machine-config-operator/machine-config-server-c9dr5:
{
  "phase": "Running",
  "conditions": [
    {
      "type": "Initialized",
      "status": "True",
      "lastProbeTime": null,
      "lastTransitionTime": "2018-12-14T05:41:23Z"
    },
    {
      "type": "Ready",
      "status": "True",
      "lastProbeTime": null,
      "lastTransitionTime": "2018-12-14T05:47:34Z"
    },
    {
      "type": "ContainersReady",
      "status": "True",
      "lastProbeTime": null,
      "lastTransitionTime": null
    },
    {
      "type": "PodScheduled",
      "status": "True",
      "lastProbeTime": null,
      "lastTransitionTime": "2018-12-14T05:41:23Z"
    }
  ],
  "message": "container machine-config-server has restarted more than 5 times",
  "hostIP": "10.0.2.151",
  "podIP": "10.0.2.151",
  "startTime": "2018-12-14T05:41:23Z",
  "containerStatuses": [
    {
      "name": "machine-config-server",
      "state": {
        "running": {
          "startedAt": "2018-12-14T05:47:33Z"
        }
      },
      "lastState": {
        "terminated": {
          "exitCode": 1,
          "reason": "Error",
          "startedAt": "2018-12-14T05:44:50Z",
          "finishedAt": "2018-12-14T05:44:50Z",
          "containerID": "cri-o://35e3004e72b35a273ab4b0e2e75e082f0840464c55a13f5716d3b796be241e8a"
        }
      },
      "ready": true,
      "restartCount": 6,
      "image": "registry.svc.ci.openshift.org/ci-op-4xwzpczq/stable@sha256:7f2cd078c139f2ed319d16d68e7a5d05f9c60012fd4eeafddc66b1d24a78abf8",
      "imageID": "registry.svc.ci.openshift.org/ci-op-4xwzpczq/stable@sha256:7f2cd078c139f2ed319d16d68e7a5d05f9c60012fd4eeafddc66b1d24a78abf8",
      "containerID": "cri-o://c60a6df780ea4d8d9679309a9037c057002a96f7db1fb62772e2a0b5bb00eaa3"
    }
  ],
  "qosClass": "BestEffort"
}
Dec 14 05:53:23.639: INFO: Running AfterSuite actions on all node
Dec 14 05:53:23.639: INFO: Running AfterSuite actions on node 1
fail [github.com/openshift/origin/test/extended/operators/cluster.go:109]: Expected
    <[]string | len:1, cap:1>: [
        "Pod openshift-machine-config-operator/machine-config-server-c9dr5 is not healthy: container machine-config-server has restarted more than 5 times",
    ]
to be empty
...
Dec 14 05:51:09.642 W ns=openshift-monitoring pod=prometheus-adapter-bdc5f58cb-5l4jt MountVolume.SetUp failed for volume "prometheus-adapter-tls" : secrets "prometheus-adapter-tls" not found
Dec 14 05:51:16.557 E kube-apiserver Kube API started failing: Get https://ci-op-4xwzpczq-1d3f3-api.origin-ci-int-aws.dev.rhcloud.com:6443/api/v1/namespaces/kube-system?timeout=3s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Dec 14 05:51:16.557 I openshift-apiserver OpenShift API started failing: Get https://ci-op-4xwzpczq-1d3f3-api.origin-ci-int-aws.dev.rhcloud.com:6443/apis/image.openshift.io/v1/namespaces/openshift-apiserver/imagestreams/missing?timeout=3s: context deadline exceeded
Dec 14 05:51:18.547 E kube-apiserver Kube API is not responding to GET requests
Dec 14 05:51:18.547 E openshift-apiserver OpenShift API is not responding to GET requests
Dec 14 05:51:20.645 I openshift-apiserver OpenShift API started responding to GET requests
Dec 14 05:51:20.742 I kube-apiserver Kube API started responding to GET requests
...
failed: (2m18s) 2018-12-14T05:53:23 "[Feature:Platform][Suite:openshift/smoke-4] Managed cluster should have no crashlooping pods in core namespaces over two minutes [Suite:openshift/conformance/parallel]"

From the logs for one of those server pods:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/905/pull-ci-openshift-installer-master-e2e-aws/2291/artifacts/e2e-aws/pods/openshift-machine-config-operator_machine-config-server-77dc7_machine-config-server.log.gz | zcat
I1214 05:41:50.478893       1 start.go:37] Version: 3.11.0-352-g0cfc4183-dirty
I1214 05:41:50.480250       1 api.go:54] launching server
I1214 05:41:50.480380       1 api.go:54] launching server
2018/12/14 05:41:51 http: TLS handshake error from 10.0.29.128:17205: EOF
2018/12/14 05:41:52 http: TLS handshake error from 10.0.0.231:28579: EOF
2018/12/14 05:41:52 http: TLS handshake error from 10.0.72.138:31458: EOF
...
2018/12/14 06:10:01 http: TLS handshake error from 10.0.72.138:38099: EOF
2018/12/14 06:10:02 http: TLS handshake error from 10.0.29.128:59541: EOF
2018/12/14 06:10:02 http: TLS handshake error from 10.0.45.28:9790: EOF

This is possibly related to #199, which also had TLS handshake errors (although in that case they were bad-certificate errors). Are these errors from something attempting to connect to the MCS but immediately hanging up? Who would do that? Is there information about the restart reason somewhere I can dig up?

Also, only one of the three machine-config-server containers seems to have had a restart issue:

$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/905/pull-ci-openshift-installer-master-e2e-aws/2291/artifacts/e2e-aws/pods.json | jq '.items[] | .status.containerStatuses[] | select(.restartCount > 0) | {name, restartCount}'
{
  "name": "operator",
  "restartCount": 1
}
{
  "name": "operator",
  "restartCount": 1
}
{
  "name": "csi-operator",
  "restartCount": 1
}
{
  "name": "machine-config-server",
  "restartCount": 6
}
{
  "name": "prometheus",
  "restartCount": 1
}
{
  "name": "prometheus",
  "restartCount": 1
}
$ curl -s https://storage.googleapis.com/origin-ci-test/pr-logs/pull/openshift_installer/905/pull-ci-openshift-installer-master-e2e-aws/2291/artifacts/e2e-aws/pods.json | jq '.items[] | .status.containerStatuses[] | select(.name == "machine-config-server") | {name, restartCount}'
{
  "name": "machine-config-server",
  "restartCount": 0
}
{
  "name": "machine-config-server",
  "restartCount": 6
}
{
  "name": "machine-config-server",
  "restartCount": 0
}

add more to test suite

It should be pretty easy to put together a quick test suite, in e.g. bash, that given a running cluster tests at least a nondestructive scenario: create a MC, wait for it to roll out, run a quick daemonset that verifies the files exist, then delete it and verify the nodes are back as they were.

We may be able to enhance non-destructive testing by creating a separate machineset too and applying more destructive MCs just to those nodes (e.g. making them degraded, etc.).
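A minimal sketch of the nondestructive flow described above, assuming a running cluster and a hypothetical manifest test-mc.yaml that writes /etc/mco-test.conf to workers; it checks each node with oc debug instead of the daemonset suggested above, and a real suite would wait on the rendered config name rather than these somewhat racy pool conditions:

# Apply the test MachineConfig and wait for the worker pool to pick it up and settle.
oc apply -f test-mc.yaml
oc wait --for=condition=Updating machineconfigpool/worker --timeout=10m
oc wait --for=condition=Updated machineconfigpool/worker --timeout=30m

# Verify the file landed on every worker node.
for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
  oc debug "$node" -- chroot /host test -f /etc/mco-test.conf || echo "missing on $node"
done

# Roll back and wait for the pool to settle again.
oc delete -f test-mc.yaml
oc wait --for=condition=Updated machineconfigpool/worker --timeout=30m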

How to add tests to MC CI

Right now 3 unit test steps are being executed via CI:

  • operator: Clone the correct source code into an image and tag it as src
  • operator: Find the input image root and tag it into the pipeline
  • operator: Run the tests for unit in a pod and wait for success or failure

The MCD code base is slowly adding more and more tests, but they don't seem to be picked up in CI. How do we tell CI to execute our additional tests?

/cc @abhinavdahiya @sdemos @kikisdeliveryservice @jlebon

/usr/bin/machine-config-daemon: no such file or directory

Logs from a recent master:

[core@wking-master-0 ~]$ journalctl
...
Nov 14 22:48:26 wking-master-0 hyperkube[802]: I1114 22:48:26.759484     802 reconciler.go:207] operationExecutor.VerifyControllerAttachedVolume started for volume "machine-config-daemon-token-m6sb8" (UniqueName
: "kubernetes.io/secret/660e568b-e85f-11e8-8f39-2282d965a3a6-machine-config-daemon-token-m6sb8") pod "machine-config-daemon-c8jnt" (UID: "660e568b-e85f-11e8-8f39-2282d965a3a6")
Nov 14 22:48:26 wking-master-0 systemd[1]: Started Kubernetes transient mount for /var/lib/kubelet/pods/660e568b-e85f-11e8-8f39-2282d965a3a6/volumes/kubernetes.io~secret/machine-config-daemon-token-m6sb8.
Nov 14 22:48:26 wking-master-0 systemd[1]: Starting Kubernetes transient mount for /var/lib/kubelet/pods/660e568b-e85f-11e8-8f39-2282d965a3a6/volumes/kubernetes.io~secret/machine-config-daemon-token-m6sb8.
Nov 14 22:48:27 wking-master-0 systemd[1]: Started crio-conmon-89cb5a3db42bbbef7217b5a16c0af0dde19ef85f1f679d38442cd458355043f5.scope.
Nov 14 22:48:27 wking-master-0 systemd[1]: Starting crio-conmon-89cb5a3db42bbbef7217b5a16c0af0dde19ef85f1f679d38442cd458355043f5.scope.
Nov 14 22:48:27 wking-master-0 systemd[1]: Scope libcontainer-7068-systemd-test-default-dependencies.scope has no PIDs. Refusing.
Nov 14 22:48:27 wking-master-0 systemd[1]: Scope libcontainer-7068-systemd-test-default-dependencies.scope has no PIDs. Refusing.
Nov 14 22:48:27 wking-master-0 systemd[1]: Created slice libcontainer_7068_systemd_test_default.slice.
--
Nov 14 22:48:33 wking-master-0 systemd[1]: Stopping libcontainer_7123_systemd_test_default.slice.
Nov 14 22:48:33 wking-master-0 systemd[1]: Started libcontainer container d7762af5c0b21aed261a775866671947a26ab5a0d26721805f25959500639c38.
Nov 14 22:48:33 wking-master-0 systemd[1]: Starting libcontainer container d7762af5c0b21aed261a775866671947a26ab5a0d26721805f25959500639c38.
Nov 14 22:48:33 wking-master-0 systemd[1]: Stopped libcontainer container d7762af5c0b21aed261a775866671947a26ab5a0d26721805f25959500639c38.
Nov 14 22:48:33 wking-master-0 systemd[1]: Stopping libcontainer container d7762af5c0b21aed261a775866671947a26ab5a0d26721805f25959500639c38.
Nov 14 22:48:33 wking-master-0 crio[831]: time="2018-11-14 22:48:33.759095707Z" level=error msg="Container creation error: container_linux.go:336: starting container process caused "exec: \"/usr/bin/machine-conf
ig-daemon\": stat /usr/bin/machine-config-daemon: no such file or directory"
Nov 14 22:48:33 wking-master-0 crio[831]: "
Nov 14 22:48:33 wking-master-0 hyperkube[802]: E1114 22:48:33.849383     802 remote_runtime.go:187] CreateContainer in sandbox "89cb5a3db42bbbef7217b5a16c0af0dde19ef85f1f679d38442cd458355043f5" from runtime serv
ice failed: rpc error: code = Unknown desc = container create failed: container_linux.go:336: starting container process caused "exec: \"/usr/bin/machine-config-daemon\": stat /usr/bin/machine-config-daemon: no 
such file or directory"
Nov 14 22:48:33 wking-master-0 hyperkube[802]: E1114 22:48:33.850090     802 kuberuntime_manager.go:737] container start failed: CreateContainerError: container create failed: container_linux.go:336: starting co
ntainer process caused "exec: \"/usr/bin/machine-config-daemon\": stat /usr/bin/machine-config-daemon: no such file or directory"
Nov 14 22:48:33 wking-master-0 hyperkube[802]: E1114 22:48:33.850142     802 pod_workers.go:186] Error syncing pod 660e568b-e85f-11e8-8f39-2282d965a3a6 ("machine-config-daemon-c8jnt_openshift-machine-config-operator(660e568b-e85f-11e8-8f39-2282d965a3a6)"), skipping: failed to "StartContainer" for "machine-config-daemon" with CreateContainerError: "container create failed: container_linux.go:336: starting container process caused \"exec: \\\"/usr/bin/machine-config-daemon\\\": stat /usr/bin/machine-config-daemon: no such file or directory\"\n"
Nov 14 22:48:34 wking-master-0 hyperkube[802]: I1114 22:48:34.667320     802 kuberuntime_manager.go:517] Container {Name:machine-config-daemon Image:registry.svc.ci.openshift.org/openshift/origin-v4.0-20181114203912@sha256:a058885f3294ff33962322389e6ca2e0cc8dfd916206503c837c5c2e1bbf68a5 Command:[] Args:[start] WorkingDir: Ports:[] EnvFrom:[] Env:[{Name:NODE_NAME Value: ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:spec.nodeName,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,}}] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:rootfs ReadOnly:false MountPath:/rootfs SubPath: MountPropagation:<nil>} {Name:var-run-dbus ReadOnly:false MountPath:/var/run/dbus SubPath: MountPropagation:<nil>} {Name:run-systemd ReadOnly:false MountPath:/run/systemd SubPath: MountPropagation:<nil>} {Name:etc-ssl-certs ReadOnly:true MountPath:/etc/ssl/certs SubPath: MountPropagation:<nil>} {Name:etc-mcd ReadOnly:true MountPath:/etc/machine-config-daemon SubPath: MountPropagation:<nil>} {Name:machine-config-daemon-token-m6sb8 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,RunAsGroup:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
...

Commit for that image:

$ oc --config=${INSTALL_DIR}/auth/kubeconfig adm release info registry.svc.ci.openshift.org/openshift/origin-release:v4.0-20181114203912 --commits | grep machine-config
  machine-config-controller                     https://github.com/openshift/machine-config-operator                       c39289b736b6ca0ce8376bbc26c245c22dc2cdce
  machine-config-daemon                         https://github.com/openshift/machine-config-operator                       c39289b736b6ca0ce8376bbc26c245c22dc2cdce
  machine-config-operator                       https://github.com/openshift/machine-config-operator                       c39289b736b6ca0ce8376bbc26c245c22dc2cdce
  machine-config-server                         https://github.com/openshift/machine-config-operator                       c39289b736b6ca0ce8376bbc26c245c22dc2cdce

Maybe this was what #174 was fixing, @abhinavdahiya?

Add a Degraded annotation

Right now one of my nodes went degraded; apparently the MCD restarted, so I don't have those logs, and more worryingly I can't find anything in the journal.

I'm thinking we want new node annotations:

machineconfiguration.openshift.io/state=Degraded
machineconfiguration.openshift.io/degraded="Unable to reconcile ..."
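Assuming the annotation names proposed above, an admin could then spot a degraded node with something like this (illustrative; the node name is a placeholder):

oc get node <node-name> -o jsonpath='{.metadata.annotations.machineconfiguration\.openshift\.io/state}{"\n"}'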

Rebooting after: "Updating machineconfig from {hash} to {same-hash}"

Poking around on master-0 during this run:

[core@ip-10-0-7-211 ~]$ sudo crictl ps -a
CONTAINER ID        IMAGE                                                                                                                          CREATED             STATE               NAME                          ATTEMPT
6413013f1b203       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:299840e1ef37549af1c6bb9b45ed1f4eb48ca51b0384772709ce88e3d9d60bfc    9 seconds ago       Running             operator                      2
434fa05a795cb       f6df05d6a36426dde85b9c778b435a7aaa12543d69d603f629b8f5273356ec7b                                                               13 seconds ago      Running             cluster-dns-operator          1
9c3e0b96b8325       f06c190859935d127c2efee77beb689fbacb53ec93b88547c25392fc970289f7                                                               14 seconds ago      Running             operator                      1
ebc194bc9b966       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:6667ac4aecae183dfd4e6ae4277dd86ca977e0a3b9feefee653043105503c6d6    18 seconds ago      Exited              tuned                         1
011545eb60059       05503aa686767edf45b70172c8975c8b9743bb6a6c1c182c181eb36cd610f6fc                                                               18 seconds ago      Running             machine-config-server         1
d8aa7ead7c3dc       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:581af93fda80257651d621dade879e438f846a5bf39040dd0259006fc3b73820    18 seconds ago      Running             machine-approver-controller   1
413a0cca0629b       bb84dbdafdfa61c29ca989c652d63debe03a043c2099f6ad7097ac28865bd406                                                               20 seconds ago      Running             cluster-network-operator      1
c0ace414b0ce2       25dbbded706585310ae3ebc743bcc59659f826fff1bac24be4a56b83d37e3cc2                                                               25 seconds ago      Running             machine-config-daemon         1
a2e1ef4a0c97b       03de8f11d9e07ee2b23be6d48dc849b9a5e24e4ab4c3ab758bdcd583b3b8fbd9                                                               27 seconds ago      Running             sdn-controller                1
c08642e38da06       registry.svc.ci.openshift.org/ci-op-1mpypn4i/release@sha256:77dd81dbdb38c941fc288f551f39ddef1de251384cbfb8f6755ff7f072ab9a13   27 seconds ago      Running             cluster-version-operator      1
8f80023565506       1d2ec4ba1e697f9c0eb69c154888e6f09007a3d2aad4c34bb7868cec86b8f8f8                                                               28 seconds ago      Running             sdn                           1
fb31831e0665c       1d2ec4ba1e697f9c0eb69c154888e6f09007a3d2aad4c34bb7868cec86b8f8f8                                                               28 seconds ago      Running             openvswitch                   1
89155ae218212       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:f82a3c247a4c59538a3d40ad1a2257383420440e15c4675b2e11ad620601bf98    30 seconds ago      Running             openshift-kube-apiserver      1
edafa7d07d214       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:4d0106d7428828c87ed905728742fbc11bd8b30d0c87165359699d0a475e2315    30 seconds ago      Running             kube-controller-manager       1
abab48f8aaf0e       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:4d0106d7428828c87ed905728742fbc11bd8b30d0c87165359699d0a475e2315    30 seconds ago      Running             scheduler                     1
b735c768b8baf       94bc3af972c98ce73f99d70bd72144caa8b63e541ccc9d844960b7f0ca77d7c4                                                               38 seconds ago      Running             etcd-member                   1
b79135e346749       b02de22ff740f0bfa7e5dde5aa1a8169051375a5f0c69c28fafefc9408f72b06                                                               39 seconds ago      Exited              certs                         0
04b4489ae42e3       04a052dbf6cb5ac2afa57eb41c37a2964ee16c7ee62986900aceb38f369c8411                                                               39 seconds ago      Exited              discovery                     1
1e38cb58d5063       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:14b67d7a5d1ec05dd45e60663b6e4e0c460cf7f429397dd3a3ec2d1997e52096    2 minutes ago       Exited              machine-config-daemon         0
e8715191161af       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:08fbd1a5a90b59572a3b097389fc17f7ae9b9b1ef7e1f3d19103e280fc656518    2 minutes ago       Exited              console                       0
9c19639ae072d       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:9ecf75c8384f5ec5431b4a3267466833c1caa212a183ce8b3e7f42bc3f7e1dcc    3 minutes ago       Exited              machine-config-server         0
439c340ab23ce       03de8f11d9e07ee2b23be6d48dc849b9a5e24e4ab4c3ab758bdcd583b3b8fbd9                                                               3 minutes ago       Exited              controller-manager            0
a12b3df1bae1f       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:f82a3c247a4c59538a3d40ad1a2257383420440e15c4675b2e11ad620601bf98    4 minutes ago       Exited              openshift-kube-apiserver      0
74d8d492a43db       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:f82a3c247a4c59538a3d40ad1a2257383420440e15c4675b2e11ad620601bf98    4 minutes ago       Exited              openshift-apiserver           0
98c47bd0e9246       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:299840e1ef37549af1c6bb9b45ed1f4eb48ca51b0384772709ce88e3d9d60bfc    4 minutes ago       Exited              installer                     0
6c15d73ec3e14       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:c8c110b8733d0d352ddc5fe35ba9eeac913b7609c2c9c778586f2bb74f281681    4 minutes ago       Exited              registry                      0
7796346af6b01       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:581af93fda80257651d621dade879e438f846a5bf39040dd0259006fc3b73820    5 minutes ago       Exited              machine-approver-controller   0
afb16e1bdfea8       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:40fa0e40cf625f314586d47a4199d5de985b7f980d7c4b650ab7f2c0f74f41b2    5 minutes ago       Exited              registry-ca-hostmapper        0
8aff3b343e0dc       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:299840e1ef37549af1c6bb9b45ed1f4eb48ca51b0384772709ce88e3d9d60bfc    6 minutes ago       Exited              operator                      1
158f097ba79cb       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:4d0106d7428828c87ed905728742fbc11bd8b30d0c87165359699d0a475e2315    7 minutes ago       Exited              kube-controller-manager       0
128c03f9fa9a8       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:38d43ca65fce090c19b092b1c0962781c146f9fc65f3228eb96f5aad684c9119    7 minutes ago       Exited              installer                     0
bf6ac2a6570f2       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:4d0106d7428828c87ed905728742fbc11bd8b30d0c87165359699d0a475e2315    8 minutes ago       Exited              scheduler                     0
b43c90d9cebb6       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:b532d9351b803b5a03cf6777f12d725b4973851957497ea3e2b37313aadd6750    8 minutes ago       Exited              operator                      0
4f333ad437f34       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:a494ee2152687973d1987598cf82f7606b2c4fbb48c7f09f2e897cb417ab88f1    8 minutes ago       Exited              installer                     0
28eed681186eb       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:299840e1ef37549af1c6bb9b45ed1f4eb48ca51b0384772709ce88e3d9d60bfc    8 minutes ago       Exited              installer                     0
c0d2a89890a29       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:ce86e514320b680f39735323288cfd19caee5a9480b086b4b275454aef94136e    8 minutes ago       Exited              dns-node-resolver             0
d244a6189ea7b       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:e4936a702d7d466a64a6a9359f35c7ad528bba7c35fe5c582a90e46f9051d8b8    8 minutes ago       Exited              dns                           0
7e12941ef62cd       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:3287ba30af508652bda691490ad7dbd97febf4e90b72e23228f87c9038fc387e    9 minutes ago       Exited              operator                      0
9f0bcda47b3bc       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:f538d65377e0891e70555009f4778c0b1782360ea0f4309adec905cad752593d    9 minutes ago       Exited              cluster-dns-operator          0
dc6d7fba5de79       registry.svc.ci.openshift.org/ci-op-1mpypn4i/release@sha256:77dd81dbdb38c941fc288f551f39ddef1de251384cbfb8f6755ff7f072ab9a13   9 minutes ago       Exited              cluster-version-operator      0
b46b53bc15b68       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:0f51e8c6713cf23fac9b4b61d3e10e453936c139ee9a58171090b5ffe7cd37ae    9 minutes ago       Exited              openvswitch                   0
98d49e2e52366       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:0f51e8c6713cf23fac9b4b61d3e10e453936c139ee9a58171090b5ffe7cd37ae    9 minutes ago       Exited              sdn                           0
62fe7aeacb776       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:f82a3c247a4c59538a3d40ad1a2257383420440e15c4675b2e11ad620601bf98    10 minutes ago      Exited              sdn-controller                0
e2026a15792b4       registry.svc.ci.openshift.org/ci-op-1mpypn4i/stable@sha256:a8aa3e53cbaeae806210878f0c7b499b636a963b2a52f4d1eea6db3dfa2fdc98    11 minutes ago      Exited              cluster-network-operator      0
751acc223850b       quay.io/coreos/etcd@sha256:688e6c102955fe927c34db97e6352d0e0962554735b2db5f2f66f3f94cfe8fd1                                    13 minutes ago      Exited              etcd-member                   0
[core@ip-10-0-7-211 ~]$ sudo crictl logs 1e38cb58d5063
I1208 07:04:15.643512   12912 start.go:51] Version: 3.11.0-321-g9d379bd8-dirty
I1208 07:04:15.644717   12912 start.go:88] starting node writer
I1208 07:04:15.654184   12912 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I1208 07:04:15.780961   12912 daemon.go:125] Booted osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:ede3888e50016d61a720af2fe3f80e67e86bd819e16516ac36538456d46e0d77 (47.198)
I1208 07:04:18.656693   12912 start.go:139] Calling chroot("/rootfs")
I1208 07:04:18.727383   12912 daemon.go:673] While getting MachineConfig ea8bbb8e8e084123f670a4bf90258ac8, got: machineconfigs.machineconfiguration.openshift.io "ea8bbb8e8e084123f670a4bf90258ac8" not found. Retrying...
I1208 07:04:28.807287   12912 update.go:95] Checking if configs are reconcilable
I1208 07:04:28.807317   12912 daemon.go:572] No target osImageURL provided
E1208 07:04:28.807901   12912 daemon.go:653] content mismatch for file: "/etc/hosts"; expected: # IPv4 and IPv6 localhost aliases
127.0.0.1 localhost
::1   localhost

# Internal registry hack
10.3.0.25 docker-registry.default.svc
; received: # IPv4 and IPv6 localhost aliases
127.0.0.1 localhost
::1   localhost

# Internal registry hack
10.3.0.25 docker-registry.default.svc
172.30.198.180 image-registry.openshift-image-registry.svc image-registry.openshift-image-registry.svc.cluster.local # openshift-generated-node-resolver
I1208 07:04:28.881287   12912 update.go:37] Updating machineconfig from ea8bbb8e8e084123f670a4bf90258ac8 to ea8bbb8e8e084123f670a4bf90258ac8
I1208 07:04:28.881311   12912 update.go:95] Checking if configs are reconcilable
I1208 07:04:28.881325   12912 update.go:199] Updating files
I1208 07:04:28.881336   12912 update.go:387] Writing file "/etc/containers/registries.conf"
I1208 07:04:28.882785   12912 update.go:387] Writing file "/etc/hosts"
I1208 07:04:28.884044   12912 update.go:387] Writing file "/etc/kubernetes/manifests/etcd-member.yaml"
I1208 07:04:28.885615   12912 update.go:387] Writing file "/etc/sysconfig/crio-network"
I1208 07:04:28.886943   12912 update.go:387] Writing file "/etc/kubernetes/static-pod-resources/etcd-member/ca.crt"
I1208 07:04:28.888766   12912 update.go:387] Writing file "/etc/kubernetes/static-pod-resources/etcd-member/root-ca.crt"
I1208 07:04:28.890648   12912 update.go:387] Writing file "/etc/kubernetes/kubelet.conf"
I1208 07:04:28.892163   12912 update.go:387] Writing file "/var/lib/kubelet/config.json"
I1208 07:04:28.894227   12912 update.go:387] Writing file "/etc/docker/certs.d/docker-registry.default.svc:5000/ca.crt"
I1208 07:04:28.896036   12912 update.go:387] Writing file "/etc/kubernetes/ca.crt"
I1208 07:04:28.897806   12912 update.go:387] Writing file "/etc/sysctl.d/forward.conf"
I1208 07:04:28.899180   12912 update.go:321] Writing systemd unit "kubelet.service"
I1208 07:04:28.899367   12912 update.go:359] Enabling systemd unit "kubelet.service"
I1208 07:04:28.899513   12912 update.go:268] /etc/systemd/system/multi-user.target.wants/kubelet.service already exists. Not making a new symlink
I1208 07:04:28.899530   12912 update.go:220] Deleting stale data
I1208 07:04:28.899550   12912 update.go:482] No target osImageURL provided
E1208 07:04:28.899591   12912 event.go:259] Could not construct reference to: '&v1.Node{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"ip-10-0-7-211.ec2.internal", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, Spec:v1.NodeSpec{PodCIDR:"", ProviderID:"", Unschedulable:false, Taints:[]v1.Taint(nil), ConfigSource:(*v1.NodeConfigSource)(nil), DoNotUse_ExternalID:""}, Status:v1.NodeStatus{Capacity:v1.ResourceList(nil), Allocatable:v1.ResourceList(nil), Phase:"", Conditions:[]v1.NodeCondition(nil), Addresses:[]v1.NodeAddress(nil), DaemonEndpoints:v1.NodeDaemonEndpoints{KubeletEndpoint:v1.DaemonEndpoint{Port:0}}, NodeInfo:v1.NodeSystemInfo{MachineID:"", SystemUUID:"", BootID:"", KernelVersion:"", OSImage:"", ContainerRuntimeVersion:"", KubeletVersion:"", KubeProxyVersion:"", OperatingSystem:"", Architecture:""}, Images:[]v1.ContainerImage(nil), VolumesInUse:[]v1.UniqueVolumeName(nil), VolumesAttached:[]v1.AttachedVolume(nil), Config:(*v1.NodeConfigStatus)(nil)}}' due to: 'selfLink was empty, can't make reference'. Will not report event: 'Normal' 'Reboot' 'Node will reboot into config ea8bbb8e8e084123f670a4bf90258ac8'
I1208 07:04:28.899753   12912 update.go:497] machine-config-daemon initiating reboot: Node will reboot into config ea8bbb8e8e084123f670a4bf90258ac8

It's possible that the underlying issue is due to whatever took down the initial etcd-member pod (751acc223850b, I'm still looking into this), but I don't understand why the MCD keeps going after:

Updating machineconfig from ea8bbb8e8e084123f670a4bf90258ac8 to ea8bbb8e8e084123f670a4bf90258ac8

Is the issue the selfLink was empty, can't make reference error? It seems surprising to trigger a reboot because we failed to submit an event. There is some previous discussion of reboot triggers in #199, but I don't understand either that issue or this one clearly enough to attempt to tie them together ;). @cgwalters' comment about identical configs does seem like this issue, though.
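
On the identical-configs point, here is a minimal sketch of the kind of guard that could avoid rebooting when nothing changed. This is a hypothetical helper with plain string config names, not the daemon's actual update path; glog is assumed as elsewhere in the daemon:

// maybeUpdate skips the update path entirely when the current and desired
// rendered config names already match, so no reboot gets scheduled.
func maybeUpdate(currentConfig, desiredConfig string) (needsReboot bool) {
	if currentConfig == desiredConfig {
		glog.Infof("Config %s already applied; skipping update and reboot", currentConfig)
		return false
	}
	glog.Infof("Updating machineconfig from %s to %s", currentConfig, desiredConfig)
	// ... write files and systemd units here ...
	return true
}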

MCD: Log level clean up

In the logs we are using a combination of the default verbosity and V(2). However, most, if not all, of our current logging seems like it would be beneficial under normal circumstances.

What brings this up

Looking at the log output of #79 brought this to mind.

Proposal

  • Use default V for general information useful to admins
  • Use V(2) for more fine grained items
  • Use V(3) for debug level logging

Example

Daemon.enableUnit could be updated like:

// enableUnit enables a systemd unit via symlink
func (dn *Daemon) enableUnit(unit ignv2_2types.Unit) error {
	// The link location
	wantsPath := filepath.Join(wantsPathSystemd, unit.Name)
	// sanity check that we don't return an error when the link already exists
	if _, err := os.Stat(wantsPath); err == nil {
		glog.Infof("%s already exists. Not making a new symlink", wantsPath)
		return nil
	}
	// The originating file to link
	servicePath := filepath.Join(pathSystemd, unit.Name)
	err := os.Symlink(servicePath, wantsPath)
	if err == nil {
		glog.Infof("Enabled %s", unit.Name)
		glog.V(3).Infof("Symlinked %s to %s", servicePath, wantsPath)
	} else {
		glog.Warningf("Unable to enable unit %s: %s", unit.Name, err)
	}
	return err
}

/cc @sdemos @jlebon @kikisdeliveryservice

Design for osImageURL updates - integration with CVO/release payload

I wanted to elaborate here on the current status of this. We have a PR in #363 which will finally close the loop and inject machine-os-content from the release payload all the way into the MachineConfig objects, which will result in the MCD updating.

The final architecture will be:

A new kernel erratum turns into an RPM, which is converted into an ostree commit and then an oscontainer. There is a bit more information on the build system side here. The oscontainer makes it into a new release payload published on quay.io.

At some point the release payload is pulled down by the CVO; it includes an osimageurl ConfigMap that references that container (the same thing as the machine-os-content ImageStream). The CVO updates the ConfigMap, which you can see via

oc -n openshift-machine-config-operator get configmap/machine-config-osimageurl

The operator notices the change to the ConfigMap and updates the "controllerconfig", an internal CRD that is used as the primary input to the MCC. See oc get -o yaml controllerconfig.

The "template" sub-controller of the MCC then updates machineconfigs/00-master and machineconfigs/00-worker.

The "render" sub-controller of the MCC generates new "rendered" MCs that look like machineconfigs/master-<hash> and machineconfigs/worker-<hash> and updates the MachineConfigPools to target them. For more information on this, see the MCC docs.

On each node the MCD will get the new osimageurl, and if it's different from what's booted, it will pull down the container, rebase to it, and reboot. This is the same flow as any other config change.
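
A rough sketch of that last step, with the decision reduced to a string compare. The helper name is made up, the reboot is shown as a plain systemctl call for brevity (the daemon actually goes through logind), and the real code coordinates draining and pivot/rpm-ostree, so treat this purely as illustration:

// updateOSIfNeeded rebases and reboots only when the desired OS image differs
// from the one currently booted.
func updateOSIfNeeded(bootedOSImageURL, desiredOSImageURL string) error {
	// No target, or already booted into the target: nothing to do.
	if desiredOSImageURL == "" || bootedOSImageURL == desiredOSImageURL {
		glog.Info("OS image already up to date; no OS change needed")
		return nil
	}
	glog.Infof("Updating OS from %q to %q", bootedOSImageURL, desiredOSImageURL)
	// Rebase to the ostree content carried by the oscontainer.
	if out, err := exec.Command("rpm-ostree", "rebase", desiredOSImageURL).CombinedOutput(); err != nil {
		return fmt.Errorf("rpm-ostree rebase failed: %v: %s", err, string(out))
	}
	// A reboot is required to boot into the new deployment.
	return exec.Command("systemctl", "reboot").Run()
}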

MCD: How to verify OS version

A todo is listed in the MCD spec. Adding some info here to see if this is helpful ....

rpm-ostree provides a status and -b shows the currently booted version:

$ rpm-ostree status -b
State: idle; auto updates disabled
Deployments:
โ— ostree://fedora-workstation:fedora/28/x86_64/workstation
                   Version: 28.20180722.0 (2018-07-23 00:57:57)
                BaseCommit: a9c5d45fcdbc2ff2ae241871ea725a0493c73942218df924255fd9e5cf0296e8
              GPGSignature: Valid signature by 128CF232A9371991C8A65695E08E7E629DB62FB1

One could use standard tools to verify the new version if need be via the MCD:

$ rpm-ostree status -b | grep "BaseCommit" | sed -e "s|^.*BaseCommit: ||"
a9c5d45fcdbc2ff2ae241871ea725a0493c73942218df924255fd9e5cf0296e8
$ rpm-ostree status -b | grep "Version" | sed -e "s|^.*Version: ||" | sed -e "s| (.*||"
28.20180722.0

Information can be retrieved through D-Bus as well. As an example, see the ostree CLUO and its status.
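
For the MCD itself, parsing the JSON output is probably more robust than grep/sed. A small self-contained sketch; the JSON field names (booted, version, checksum) are my reading of rpm-ostree status --json, so double-check them:

package main

import (
	"encoding/json"
	"fmt"
	"os/exec"
)

// rpmOstreeStatus holds just the fields we need from `rpm-ostree status --json`.
// Note: with layered packages the base commit lives in "base-checksum" instead.
type rpmOstreeStatus struct {
	Deployments []struct {
		Booted   bool   `json:"booted"`
		Version  string `json:"version"`
		Checksum string `json:"checksum"`
	} `json:"deployments"`
}

// bootedVersion returns the version and commit of the currently booted deployment.
func bootedVersion() (version, checksum string, err error) {
	out, err := exec.Command("rpm-ostree", "status", "--json").Output()
	if err != nil {
		return "", "", err
	}
	var status rpmOstreeStatus
	if err := json.Unmarshal(out, &status); err != nil {
		return "", "", err
	}
	for _, d := range status.Deployments {
		if d.Booted {
			return d.Version, d.Checksum, nil
		}
	}
	return "", "", fmt.Errorf("no booted deployment found")
}

func main() {
	v, c, err := bootedVersion()
	if err != nil {
		panic(err)
	}
	fmt.Printf("Version: %s\nCommit: %s\n", v, c)
}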

/cc @jlebon @cgwalters @sdemos

BYOH: MCD crashes "An unsupported OS is being used: centos"

MCD fails on a CentOS cluster deployed on GCP

# oc logs -f machine-config-daemon-frq6h -n openshift-machine-config-operator
I1109 14:28:25.824759   14757 start.go:46] Version: 3.11.0-222-g393cc444-dirty
F1109 14:28:25.826242   14757 start.go:50] Error found when checking operating system: An unsupported OS is being used: centos:

I'm not sure it should necessarily fail when it encounters an unsupported OS. In any case, RHEL and CentOS should be whitelisted so that the BYOH team can proceed.
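
Something like the following whitelist would be enough to unblock this, assuming the daemon keeps a simple OS check at startup; the OS ID strings here are illustrative, not the repo's actual constants:

// validateOperatingSystem accepts RHCOS plus the BYOH operating systems
// instead of failing hard on anything that isn't RHCOS.
func validateOperatingSystem(operatingSystem string) error {
	switch operatingSystem {
	case "rhcos", "rhel", "centos":
		return nil
	default:
		return fmt.Errorf("unsupported OS: %s", operatingSystem)
	}
}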

MCD: TestReconcilable bug

Since I'm updating the file update_test.go, I noticed an unintended bug in the TestReconcilable logic.
If an earlier test case fails (the "ignition" case in the output below), all of the test cases after it also fail, resulting in confusing output:

Running tool: /usr/bin/go test -timeout 30s github.com/openshift/machine-config-operator/pkg/daemon -run ^TestReconcilable$

--- FAIL: TestReconcilable (0.00s)
	/home/k/go/src/github.com/openshift/machine-config-operator/pkg/daemon/update_test.go:74: Expected the same ignition values would cause reconcilable. Received irreconcilable.
	/home/k/go/src/github.com/openshift/machine-config-operator/pkg/daemon/update_test.go:74: Expected the same networkd values would cause reconcilable. Received irreconcilable.
	/home/k/go/src/github.com/openshift/machine-config-operator/pkg/daemon/update_test.go:74: Expected the same disk values would cause reconcilable. Received irreconcilable.
	/home/k/go/src/github.com/openshift/machine-config-operator/pkg/daemon/update_test.go:74: Expected the same filesystem values would cause reconcilable. Received irreconcilable.
	/home/k/go/src/github.com/openshift/machine-config-operator/pkg/daemon/update_test.go:74: Expected the same raid values would cause reconcilable. Received irreconcilable.
FAIL
FAIL	github.com/openshift/machine-config-operator/pkg/daemon	0.009s
Error: Tests failed.

To reproduce:
for the error case above, I commented out line 126 in update_test.go.

Expected Behavior
I should only be seeing a failure relating to ignition, all of the other tests should still pass.

Cause
I suspect that this happens because, instead of each case being broken out into separate funcs, the changes to the oldconfigs/newconfigs compound as they move through the function. So the function fails for networkd not because those networkd configs are wrong, but because reconcilable is already false due to the prior bad ignition config.

I'm not sure it makes sense to keep all of these in one test, both because of the above and because resetting each prior config option adds a lot of lines and can make it hard to read.

I'm working in the file and with the function already, so I'm happy to make the changes, and keep it consistent with my other tests, but wanted to get thoughts on the best approach.
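
One way to keep this a single test without the compounding problem is table-driven subtests where every case builds its own configs. A sketch; newTestConfigs, the mutate funcs, and reconcilable are hypothetical stand-ins for whatever the real test helpers end up being:

func TestReconcilable(t *testing.T) {
	cases := []struct {
		name         string
		mutate       func(oldConfig, newConfig *mcfgv1.MachineConfig)
		reconcilable bool
	}{
		{"ignition version changed", mutateIgnitionVersion, false},
		{"networkd changed", mutateNetworkd, false},
		// ...
	}
	for _, c := range cases {
		c := c
		t.Run(c.name, func(t *testing.T) {
			// Fresh configs per case, so one failing case cannot poison the rest.
			oldConfig, newConfig := newTestConfigs()
			c.mutate(oldConfig, newConfig)
			if got := reconcilable(oldConfig, newConfig); got != c.reconcilable {
				t.Errorf("expected reconcilable=%v, got %v", c.reconcilable, got)
			}
		})
	}
}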

cc: @ashcrow

Machine Config Operator is root in cluster

MCO manages ClusterRoles and ClusterRoleBindings so that MachineConfig{Controller, Server, Daemon} can access and manage:

  1. CRDs
  2. MachineConfigs and Pools (cluster scoped)
  3. Node annotation updates

Should MCO be given root access to cluster?

/cc @aaronlevy @eparis

MCD doesn't always fully apply new configs

The top-level logic of the daemon currently goes something like this -

  • daemon starts up
  • check if the machine is in the desired state, as defined by the desiredConfig annotation. this call returns true if currentConfig == desiredConfig [1], and if it doesn't, it checks to make sure all the files, systemd units, and the operating system are what we want them to be.
    • if it's not, make sure the config is reconcilable, update, and reboot
    • if it is, set the update as done and make the machine schedulable
  • wait for the desiredConfig annotation to change
    • when it does, jump back to the "checking machine state" step and continue

this logic is the source of the bug. if the desired machine config contains only irreconcilable differences between it and the current config, the update is never attempted, because the validation logic only validates that all the reconcilable changes are the same. it should be setting the node as degraded. it's also a time-bomb; if someone later makes a valid, reconcilable change, we attempt to apply the changes and only then does everything get rejected and the node set to degraded.

there are three ways I can see to fix this issue -

  1. check that a machine is reconcilable every time the desired annotation changes, before validating machine state. this would also happen on startup, because it's entirely possible that while we weren't running the desiredConfig annotation was updated.
  2. update the validation logic so that it checks everything defined by ignition regardless of whether we can reconcile it or not.
  3. only validate the node state either when a machine starts up or when we apply an update that we decided doesn't need a reboot (that doesn't actually ever happen right now but the logic is written with that possibility in mind for the future). the node validation logic acts more explicitly as a gate for marking the update as complete. we always attempt to apply config changes when told to, whether or not they seem to represent the same machine state we are already in.

option 1 lets us reject configs we can't reconcile (fixing this problem). it also seems like it may reduce the number of reboots, if, say, we are told to update to a config that we can reconcile, but it turns out that we don't actually need to make any changes to the machine (although, off the top of my head, I can't think of any examples of when this would happen that aren't bugs in the validation logic). the problem with this approach is that it relies much more heavily on our validation logic being totally complete and correct, since if it thinks that a machine is in the correct state, it never goes through the update application logic.

option 2 seems like it would be a lot of work, and it seems like unnecessary work, making the assumption that we are the only people who can modify a node under normal operation (plus, even if the node is modified, arguably it's better to just leave that modification be than wipe everything away and start over, since presumably there is a reason for the change).

option 3 is essentially in line with the logic I've always had in mind for the node-agent, now mcd. we rely on the mcc to only tell us to update to a new configuration if there is actually a difference (which it does right now) for minimizing our reboots. if there are subtle side-effects of the update application that we don't account for when validating the machine state, they still happen. this is the option I'm leaning towards.
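
a rough sketch of option 3, with hypothetical method names (reconcilable, applyUpdate, validateMachineState, completeUpdate, setDegraded are placeholders, not the daemon's current API): apply the desired config whenever it changes, and use machine-state validation only as the gate for marking the update complete.

func (dn *Daemon) onDesiredConfigChange(currentConfig, desiredConfig *mcfgv1.MachineConfig) error {
	// Irreconcilable changes are rejected up front instead of silently ignored.
	if err := dn.reconcilable(currentConfig, desiredConfig); err != nil {
		return dn.setDegraded(err)
	}
	// Always attempt to apply when told to, even if the configs look identical.
	if err := dn.applyUpdate(currentConfig, desiredConfig); err != nil {
		return dn.setDegraded(err)
	}
	// After the (possible) reboot, validation decides whether the update can be
	// marked done and the node made schedulable again.
	if !dn.validateMachineState(desiredConfig) {
		return dn.setDegraded(fmt.Errorf("node does not match config %s after update", desiredConfig.GetName()))
	}
	return dn.completeUpdate(desiredConfig)
}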

[1]: I don't think that this should be in the validation logic in general, but I'm going to be opening a separate bug for the issue of removing it since it's not really related to this bug.

/cc @ashcrow @jlebon @kikisdeliveryservice

Parameter 'nodeInformer' in daemon.new() is defined but not used

In pkg/daemon/daemon.go, the parameter 'nodeInformer' of function New() is defined but not used. It should be removed.

func New(
	rootMount string,
	nodeName string,
	operatingSystem string,
	nodeUpdaterClient NodeUpdaterClient,
	fileSystemClient FileSystemClient,
	onceFrom string,
	fromIgnition bool,
	nodeInformer coreinformersv1.NodeInformer,
	kubeletHealthzEnabled bool,
	kubeletHealthzEndpoint string,
	nodeWriter *NodeWriter,
) (*Daemon, error) {
	loginClient, err := login1.New()
	if err != nil {
		return nil, fmt.Errorf("Error establishing connection to logind dbus: %v", err)
	}

	osImageURL := ""
	// Only pull the osImageURL from OSTree when we are on RHCOS
	if operatingSystem == MachineConfigDaemonOSRHCOS {
		osImageURL, osVersion, err := nodeUpdaterClient.GetBootedOSImageURL(rootMount)
		if err != nil {
			return nil, fmt.Errorf("Error reading osImageURL from rpm-ostree: %v", err)
		}
		glog.Infof("Booted osImageURL: %s (%s)", osImageURL, osVersion)
	}
	dn := &Daemon{
		name:                   nodeName,
		OperatingSystem:        operatingSystem,
		NodeUpdaterClient:      nodeUpdaterClient,
		loginClient:            loginClient,
		rootMount:              rootMount,
		fileSystemClient:       fileSystemClient,
		bootedOSImageURL:       osImageURL,
		onceFrom:               onceFrom,
		fromIgnition:           fromIgnition,
		kubeletHealthzEnabled:  kubeletHealthzEnabled,
		kubeletHealthzEndpoint: kubeletHealthzEndpoint,
		nodeWriter:             nodeWriter,
	}

	return dn, nil
}

Add a basic HACKING.md

It'd be nice if there was a HACKING.md with instructions on how to build and test this code in a real cluster. I know a lot of this stuff is still 🚧 under construction 🚧, though such a doc, even if primitive/high-level, would help other contributors start hacking early on.

MCD System Bind Mounts

I was reminded by @lucab this morning that the MCD will be running as a container. Since this container will need to modify system files and execute system-level binaries, we will need to mount volumes into the container (https://github.com/openshift/machine-config-operator/blob/master/Dockerfile.machine-config-daemon).

Is there currently an expected mount point, or set of mount points, we should reference within the container when modifying system content? Or should we define them ourselves just for the MCD?

For an example of what brings this question up see #17
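
If nothing is mandated, one possible convention would be a single host mount (the logs elsewhere in this repo already use /rootfs) that the daemon prefixes onto every host path it touches. The constant and helper below are just an illustration of that idea, not an existing API:

const rootMount = "/rootfs"

// hostPath maps a path on the host to where it appears inside the MCD container.
func hostPath(p string) string {
	return filepath.Join(rootMount, p)
}

// e.g. writing a host file from inside the container:
//   ioutil.WriteFile(hostPath("/etc/sysctl.d/forward.conf"), data, 0644)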

/cc @cgwalters @lucab @sdemos

MCD rebooting early after set up

From @crawford:

I1129 21:37:29.566704    8105 start.go:51] Version: 3.11.0-278-g628ad5d1-dirty
I1129 21:37:29.567579    8105 start.go:78] starting node writer
I1129 21:37:29.572698    8105 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I1129 21:37:29.971743    8105 daemon.go:115] Booted osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:9c7f465ed8a84ed9d47356258ede44b7aec8b11b4e89defaf3ecee3cef33f0a0 (47.153)
I1129 21:37:30.009173    8105 start.go:127] Calling chroot("/rootfs")
E1129 21:37:30.029005    8105 daemon.go:303] Marking degraded due to: machineconfigs.machineconfiguration.openshift.io "0788cf371d56ac505e5b0e78392ccb14" not found
I1129 21:37:30.044992    8105 start.go:146] Starting MachineConfigDaemon
I1129 21:37:30.045012    8105 daemon.go:187] Enabling Kubelet Healthz Monitor
W1129 21:37:30.804003    8105 reflector.go:341] github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/informers/factory.go:130: watch of *v1.Node ended with: very short watch: github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/informers/factory.go:130: Unexpected watch close - watch lasted less than a second and no items received
E1129 21:37:31.805751    8105 reflector.go:205] github.com/openshift/machine-config-operator/vendor/k8s.io/client-go/informers/factory.go:130: Failed to list *v1.Node: Get https://172.30.0.1:443/api/v1/nodes?limit=500&resourceVersion=0: dial tcp 172.30.0.1:443: connect: connection refused
...
I1129 21:38:09.151806    8105 update.go:489] Rebooting
I1129 21:40:52.851825    5989 start.go:51] Version: 3.11.0-278-g628ad5d1-dirty
I1129 21:40:52.856779    5989 start.go:78] starting node writer
I1129 21:40:52.870226    5989 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I1129 21:40:53.263929    5989 daemon.go:115] Booted osImageURL: registry.svc.ci.openshift.org/rhcos/maipo@sha256:9c7f465ed8a84ed9d47356258ede44b7aec8b11b4e89defaf3ecee3cef33f0a0 (47.153)
I1129 21:40:57.364959    5989 start.go:127] Calling chroot("/rootfs")
I1129 21:40:57.387466    5989 daemon.go:293] Node is degraded; going to sleep

MCD: Failure after reboot when applying new machine config

Even with 9a53b9b, I'm still seeing errors upon applying a new machine config requiring a restart:

# kubectl logs -p -n openshift-machine-config-operator machine-config-daemon-b2hvq
I1017 19:42:06.534306   15382 start.go:42] Version: 3.11.0-158-g8f6563ae
I1017 19:42:06.565619   15382 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
I1017 19:42:06.565788   15382 daemon.go:81] Booted osImageURL:  ()
I1017 19:42:06.565806   15382 start.go:85] Calling chroot("/rootfs")
I1017 19:42:06.565835   15382 daemon.go:101] Starting MachineConfigDaemon
I1017 20:16:58.980305   15382 update.go:26] Updating node with new config
...
I1017 20:16:59.009178   15382 update.go:455] Updating OS to ://dummy
I1017 20:16:59.009225   15382 run.go:13] Running: /bin/pivot ://dummy
pivot version 0.0.1
I1017 20:16:59.022150   30919 run.go:27] Running: rpm-ostree status --json
I1017 20:16:59.144617   30919 run.go:27] Running: skopeo inspect docker://://dummy
time="2018-10-17T20:16:59Z" level=fatal msg="invalid reference format" 
F1017 20:16:59.201422   30919 run.go:33] skopeo: exit status 1
E1017 20:16:59.202069   15382 daemon.go:121] Marking degraded due to: exit status 255
I1017 20:16:59.223021   15382 daemon.go:122] Shutting down MachineConfigDaemon

YAML used (the example one):

# test.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-file
spec:
  config:
    storage:
      files:
      - contents:
          source: data:,hello%20world%0A
          verification: {}
        filesystem: root
        mode: 420
        path: /home/core/test

RHCOS image:

rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
* ostree://rhcos:openshift/3.10/x86_64/os
                   Version: 4.0.6912 (2018-10-17 01:00:41)
                    Commit: 8999e2f9635ed074a7b52c62da87b49a0d6be40b07ac16c07024e5b580819739

Installer: latest

Further with:
#136

I see:

# oc logs -n openshift-machine-config-operator machine-config-daemon-72mts --config=auth/kubeconfig
I1017 20:30:02.745536    8360 start.go:42] Version: 3.11.0-167-ge3efe448
I1017 20:30:02.774378    8360 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
F1017 20:30:02.774553    8360 start.go:82] failed to initialize daemon: Error reading osImageURL from rpm-ostree: exec: "chroot": executable file not found in $PATH

Although if I ssh in, I get

# which chroot
/sbin/chroot

MCD: pods are crashlooping

@sallyom reported that all the mcd pods are crashlooping.

installer(master)$ oc logs machine-config-daemon-5d862 -n openshift-machine-config-operator
I1015 15:56:56.802774    5350 start.go:42] Version: 3.11.0-152-g0fe65acd-dirty
I1015 15:56:56.828854    5350 run.go:22] Running captured: chroot /rootfs rpm-ostree status --json
W1015 15:56:56.884945    5350 rpm-ostree.go:88] Working around "://dummy" OS image URL until installer ➰ pivots
I1015 15:56:56.885256    5350 daemon.go:81] Booted osImageURL: ://dummy (4.0.6844)
I1015 15:56:56.885341    5350 start.go:85] Calling chroot("/rootfs")
I1015 15:56:56.885424    5350 daemon.go:101] Starting MachineConfigDaemon
I1015 15:56:56.893192    5350 daemon.go:110] Shutting down MachineConfigDaemon
F1015 15:56:56.893217    5350 start.go:99] failed to run: Node is degraded; exiting loudly...

/cc @sdemos

consider making MCD a component of the host

I'm opening this issue to see if it makes sense to have the MCD be a component of the host operating system.

there are some benefits we would get from this. for one, we are currently discussing the need to have the daemon chroot/nsenter to make sure it can properly make changes to the host. it seems to me like it's a little silly to run a program in a container just to manually nsenter back into the host machine. since the MCD is required to manipulate the host machine in fundamental ways, it needs a lot of permissions and behavior like this to make sure it can, and even then I'm sure we will make mistakes and something won't work right.

there are some downsides as well -

if I understand it correctly, the update flow for openshift is - 1. cvo gets updated -> 2. cvo updates operators -> 3. operators update things under their control -> 4. those things reconcile the state they are responsible for

moving the daemon to the machine would mean that the daemon update moves from step 3 to step 4, but the controller stays in step 3, which means those things are liable to get out of sync, and perhaps forget how to communicate with each other if we are not careful. it seems like it would make us not able (or make it a lot harder) to make incompatible changes to the communication protocol, since the new controller has to update the old daemon. it also kind of subverts the intention of the operator/controller/daemon model we are using, since we move the responsibility of updating the daemon away from the operator to the version of the underlying os that we are using.

also, I don't know how much the daemon is going to get updated when the os is not getting updated, so I'm not sure if this would also increase the burden on rhcos to release more os updates than we normally would.

I guess the question I'm asking in this kind of rambling issue is: if we have to bend over backwards to put the daemon in a container, what are the benefits, and do they justify the unusual implementation and possible future pain?

/cc: @ashcrow since we had the initial discussion that precipitated this
/cc: @crawford @abhinavdahiya for thoughts (and maybe corrections)

Add 10m Timeout for Node Informer

Per point 2 in #199, let's add a 10-minute timeout when attempting to connect to the master for our informer. If we can't connect after 10 minutes, we should assume something is wrong and go into degraded mode.
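
A minimal sketch of that timeout using client-go's cache-sync helper; how the failure is then surfaced as degraded is left to the caller, and the function name is made up:

// waitForNodeInformer gives the node informer 10 minutes to sync before
// reporting an error that the caller can use to mark the node degraded.
func waitForNodeInformer(nodeInformer coreinformersv1.NodeInformer) error {
	stopCh := make(chan struct{})
	timer := time.AfterFunc(10*time.Minute, func() { close(stopCh) })
	defer timer.Stop()
	if !cache.WaitForCacheSync(stopCh, nodeInformer.Informer().HasSynced) {
		return fmt.Errorf("timed out waiting for node informer to sync after 10 minutes")
	}
	return nil
}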

MCD needs more perms to reboot the node

I0914 21:07:35.208898    6711 update.go:44] Update completed. Draining the node.
E0914 21:07:35.215929    6711 daemon.go:85] Marking degraded due to: pods is forbidden: User "system:serviceaccount:openshift-machine-config-operator:machine-config-daemon" cannot list pods at the cluster scope: no RBAC policy matched

the MCD shouldn't mark a node degraded because of its own bugs

There are several bugs right now (#66, #68 for example) which, because they trigger errors, set the node into a degraded state. It seems to me like we should consider breaking up "this error means there is something wrong with the node that we can't fix" and "this error means there is something wrong with us or our configuration" wherever possible, so that we aren't throwing around the degraded label when it might not actually apply.

Like, for instance, I think it wouldn't be great if we had the node-reprovision-on-degraded-state setup, and then we hit #66 and discover that our nodes are getting reprovisioned every 20 minutes because of a bug in the daemon.

We should be able to do this by having an additional return value plumbed through, where we say whether this is an error that indicates a degraded state or not.
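
One shape that could take is a dedicated error type instead of an extra return value; a sketch (these names don't exist in the repo today): errors that genuinely mean the node is broken get wrapped, everything else (our own bugs, transient API failures) does not mark the node degraded.

// degradedError marks errors that should put the node into the degraded state.
type degradedError struct{ err error }

func (e degradedError) Error() string { return e.err.Error() }

// markDegraded wraps an error that should set the node degraded.
func markDegraded(err error) error { return degradedError{err} }

// isDegraded reports whether an error should put the node into the degraded state.
func isDegraded(err error) bool {
	_, ok := err.(degradedError)
	return ok
}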

Thoughts?

Resize machines bug

I initially instantiated a libvirt cluster with the default 2048 MB of RAM and 1 vCPU. I then edited the machineset via:

oc edit machineset --namespace openshift-cluster-api

and changed the vcpu to 4, the memory to 4096, and the replica count to 3. I expected to see all the workers get replaced with the higher requirements, but the original worker stayed at the default 2048 MB of RAM. The two new workers did get created correctly with the memory and vcpu requirements.

Regression from chroot; service account cert left behind

$ oc logs machine-config-daemon-4jf24
I0905 20:35:05.283801       1 start.go:42] Version: 0.0.0-106-ge7b9b06d
I0905 20:35:05.283899       1 client_builder.go:49] Using in-cluster kube client config
I0905 20:35:05.284150       1 start.go:66] chrooting into rootMount /rootfs
I0905 20:35:05.284164       1 start.go:72] moving to / inside the chroot
panic: open /var/run/secrets/kubernetes.io/serviceaccount/ca.crt: no such file or directory
goroutine 1 [running]:
github.com/openshift/machine-config-operator/pkg/generated/clientset/versioned/typed/machineconfiguration.openshift.io/v1.NewForConfigOrDie(0xc4200f2e00, 0xc42032cfb0)
        /code/go/src/github.com/openshift/machine-config-operator/pkg/generated/clientset/versioned/typed/machineconfiguration.openshift.io/v1/machineconfiguration.openshift.io_client.go:59 +0x65
github.com/openshift/machine-config-operator/pkg/generated/clientset/versioned.NewForConfigOrDie(0xc4200f2e00, 0x117d329)
        /code/go/src/github.com/openshift/machine-config-operator/pkg/generated/clientset/versioned/clientset.go:69 +0x49
github.com/openshift/machine-config-operator/cmd/common.(*ClientBuilder).MachineConfigClientOrDie(0xc42009a658, 0x117d329, 0x15, 0x0, 0x0)
        /code/go/src/github.com/openshift/machine-config-operator/cmd/common/client_builder.go:21 +0x50
main.runStartCmd(0x19da300, 0xc4202e6960, 0x0, 0x2)
        /code/go/src/github.com/openshift/machine-config-operator/cmd/machine-config-daemon/start.go:80 +0x3de
github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra.(*Command).execute(0x19da300, 0xc4202e68c0, 0x2, 0x2, 0x19da300, 0xc4202e68c0)
        /code/go/src/github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra/command.go:766 +0x2c1
github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x19da0a0, 0x0, 0xc4200e1f78, 0x403dfc)
        /code/go/src/github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra/command.go:852 +0x30a
github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra.(*Command).Execute(0x19da0a0, 0xc4200e1f60, 0x0)
        /code/go/src/github.com/openshift/machine-config-operator/vendor/github.com/spf13/cobra/command.go:800 +0x2b
main.main()
        /code/go/src/github.com/openshift/machine-config-operator/cmd/machine-config-daemon/main.go:27 +0x31

Looks like we need to copy the cert before we chroot.
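
A minimal sketch of what that copy could look like, done before the chroot so client-go finds the in-cluster credentials afterwards. The source paths are the standard in-cluster ones; the destination convention and helper name are assumptions:

const saDir = "/var/run/secrets/kubernetes.io/serviceaccount"

// copyServiceAccount copies the in-cluster credentials into the target root so
// they resolve at the same path once we chroot into rootMount.
func copyServiceAccount(rootMount string) error {
	dest := filepath.Join(rootMount, saDir)
	if err := os.MkdirAll(dest, 0700); err != nil {
		return err
	}
	for _, name := range []string{"ca.crt", "token", "namespace"} {
		data, err := ioutil.ReadFile(filepath.Join(saDir, name))
		if err != nil {
			return err
		}
		if err := ioutil.WriteFile(filepath.Join(dest, name), data, 0600); err != nil {
			return err
		}
	}
	return nil
}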

Though this raises the higher level question of how much will chroot() affect how well we mesh with Kubernetes. (How common is it to do an early chroot() on the main process in Kube containers?)

Assign a priority class to pods

Priority classes docs:
https://docs.openshift.com/container-platform/3.11/admin_guide/scheduling/priority_preemption.html#admin-guide-priority-preemption-priority-class

Example: https://github.com/openshift/cluster-monitoring-operator/search?q=priority&unscoped_q=priority

Notes: The pre-configured system priority classes (system-node-critical and system-cluster-critical) can only be assigned to pods in kube-system or openshift-* namespaces. Most likely, core operators and their pods should be assigned system-cluster-critical. Please do not assign system-node-critical (the highest priority) unless you are really sure about it.

MCC racing with itself?

I just did oc edit machineconfigs/00-worker and changed OSImageURL; no MC was generated, and in the MCC logs I see:

I1220 15:08:45.772524       1 node_controller.go:340] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
I1220 15:08:46.272919       1 node_controller.go:340] Error syncing machineconfigpool worker: Operation cannot be fulfilled on machineconfigpools.machineconfiguration.openshift.io "worker": the object has been modified; please apply your changes to the latest version and try again
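
The usual fix for this class of error is to re-fetch the object and retry the update with client-go's conflict helper; a sketch, assuming the generated typed pools client is in scope (interface name approximate):

// syncWorkerPool retries the pool update on "object has been modified" conflicts.
func syncWorkerPool(poolsClient mcfgclientv1.MachineConfigPoolInterface) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		// Re-fetch so we mutate the latest resourceVersion.
		pool, err := poolsClient.Get("worker", metav1.GetOptions{})
		if err != nil {
			return err
		}
		// ...re-apply the desired change to the freshly fetched object...
		_, err = poolsClient.Update(pool)
		return err
	})
}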

MCO doesn't wipe out added `command` in spec

oc edit ds machine-config-server
# add to spec e.g.:
command: ["/bin/bash"]
args: ["-c", "echo hello"]

The operator correctly restores the args var, but not the command, presumably because there is no command defined in the manifest either.

MCD: consider taking action if validation fails multiple times

a node is validated after being updated but before being marked as completed. this validation normally happens after a reboot. if this validation fails, the update is attempted again, ad infinitum.

there are three reasons I can think of that would cause validation to fail.

  1. a bug in the validation logic.
  2. a bug in the update application logic.
  3. there is something wrong with the machine that is causing it to either revert the changes or never actually be applied.

for situation 3, it seems like the obvious solution is to set the node status to degraded. after all, that's what that status is essentially for (something's wrong with this machine that we can't fix! help!).

however, it's not possible to reliably distinguish between the situations, and setting the node as degraded for situations 1 and 2 is not ideal.

so I guess the question is, how should we handle this situation? do we mark the node as degraded anyway? do we just keep trying to attempt the update forever? should there be some kind of exponential backoff strategy? do we stop trying after a while and bubble up this issue so a cluster administrator will notice the problem, but without setting the node as degraded (which has other semantic meaning)? thoughts?
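
one middle ground, sketched with apimachinery's backoff helper; validateMachineState and the degraded handling are placeholders for whatever the daemon does today, and the backoff numbers are arbitrary:

// validateWithBackoff retries validation a bounded number of times before
// bubbling the failure up (degraded, an event, or both is the open question).
func validateWithBackoff(dn *Daemon, desiredConfig *mcfgv1.MachineConfig) error {
	backoff := wait.Backoff{Duration: 10 * time.Second, Factor: 2, Steps: 5}
	err := wait.ExponentialBackoff(backoff, func() (bool, error) {
		return dn.validateMachineState(desiredConfig), nil
	})
	if err == wait.ErrWaitTimeout {
		return fmt.Errorf("validation failed after repeated attempts for config %s", desiredConfig.GetName())
	}
	return err
}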

MCD: Skeleton code

Reminder that we are waiting on MCD skeleton code which implements the contract with the MCO. We will then pick up and implement the internals (diffs, ostree integration, etc.).
