Although this was already discussed on IRC, I'm opening an issue to track the problem and its progress.
Starting position
Create a Debian Jessie container on a Jessie LXC host with debops:
lxc_containers:
  - name: 'jessie01'
    template_options: '--release jessie'
This will install systemd by default.
Error
When trying to start the container, the following error appears:
# lxc-start -n jessie01
Failed to mount tmpfs at /dev/shm: Operation not permitted
Reason
'cap_sys_admin' is dropped in /var/lib/lxc/jessie01/config, as defined in defaults/main.yml, which prevents systemd from mounting some required file systems:
# List of default POSIX capabilities which should be dropped in all LXC containers
lxc_capabilities_drop: [ 'mknod', 'sys_admin', 'sys_rawio', 'syslog', 'wake_alarm' ]
Known Work-Arounds
- Remove systemd from your Jessie installation. NOTE: lxc.autodev = 1 and lxc.kmesg = 0 must be removed from the container configuration to make this work.
- Don't drop 'cap_sys_admin' in your container. This makes systemd fully work without further configuration. NOTE: This has a huge negative security impact.
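For the second work-around, a minimal sketch of an inventory override could look like the following. It assumes that lxc_capabilities_drop can be overridden like any other role default; the file location mentioned in the comment is hypothetical, and the security warning above applies in full. Whether the role rewrites the configuration of an already created container is not covered here.

```yaml
# Hypothetical override, e.g. in the group_vars or host_vars of the LXC host.
# Same list as the role default from defaults/main.yml, minus 'sys_admin'.
# WARNING: keeping cap_sys_admin in the container has a huge negative security impact.
lxc_capabilities_drop: [ 'mknod', 'sys_rawio', 'syslog', 'wake_alarm' ]
```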
Unsuccessful Work-Around
I also tried keeping 'cap_sys_admin' dropped and letting LXC mount the required file systems itself, without systemd involvement. For this I added:
lxc.mount.entry = tmpfs dev/shm tmpfs nosuid,nodev 0 0
lxc.mount.entry = tmpfs run tmpfs nosuid,relatime 0 0
lxc.mount.entry = tmpfs run/lock tmpfs nosuid,nodev,noexec,relatime 0 0
Unfortunately this fails with the message that /run/lock doesn't exist:
lxc-start: No such file or directory - failed to mount 'tmpfs' on '/usr/lib/x86_64-linux-gnu/lxc/rootfs/run/lock'
Bugs
- Debian #775067: prevents journald from forwarding messages to syslog when 'cap_sys_admin' is dropped. So far this is only fixed in systemd_218-4 in experimental.
Since I could live with the mentioned systemd bug, I'm still trying to find a way to run systemd without 'cap_sys_admin'. The open questions are:
- Is there any configuration twist for LXC which would allow me to create the nested mount path /run/lock before actually mounting it?
- Or is there a configuration option for systemd to not mount a separate file system for /run/lock?
If there are other possible work-arounds or any hints regarding my open questions, please let me know. I'll update this issue once I find out more.