Comments (14)
Or, another idea: like #32033, networkd should read configs from credentials. Then it is necessary neither to add another default .network file nor to customize an existing image.
Hmm, this is already supported? #30827
from systemd.
hmm, which container mgr do you have in mind to make use of this?
i mean, i am fine adding something like this, but nspawn at least wouldn't use this, but what would? we usually want to see some avenue for this to be actually used in something?
If we add this I wonder if it shouldn't be a separate .network file, dunno. so that you can disable it independently? and maybe we want slightly different config for this?
from systemd.
I am working on a container manager that allows running a "fat" container without requiring any privileges that a typical user account would lack. While podman/docker allow running unprivileged application containers (without systemd), running a "fat" container (systemd is pid 1) without the elevated privileges that e.g. lxc/lxd/incus or privileged docker/podman containers rely on is still difficult.
I have a vaguely working PoC demonstrating that unprivileged fat containers do work though it is far from usable at this time. Roughly speaking what you need is unprivileged user namespaces (typically enabled by default these days), a subuid allocation and `newuidmap`/`newgidmap` (commonly installed by default or pulled by flatpak/bubblewrap) and a delegated cgroup (`systemd-run --user --scope` creates it). Plugging these together, the requirements of the systemd container specification can mostly be met and systemd successfully boots a container with few quirks. One of the quirks is that I lack privileges to connect a `veth` interface anywhere and use `slirp4netns` or `pasta` instead (which is very similar to `kvm -net user`). When doing so, `systemd-networkd` ignores my `host0` interface as it is of the wrong kind.
The goal of such containers is not running services. The use of slirp networking definitely impacts performance. It is meant for experimentation and as an alternative to running virtual machines. In particular, it should be possible to directly boot a virtual machine image as a container (by mounting its root filesystem via fuse into the container).
Regardless of whether my experiment or PoC turns out to be useful, I expect that the ability to run unprivileged containers is very useful, and technically speaking there are only four options for networking them:
- Share the host network and do not unshare a network namespace. (Works badly if you run services)
- Use a tun interface and have systemd-networkd configure it. (This issue)
- Use a tun interface that comes pre-configured by the container runtime. (Integrates badly with systemd-resolved)
- Use a veth interface. (See above for the resulting complications)
I hope my reasoning is convincing or you uncover another technical way that I didn't see earlier.
from systemd.
I am not sure about the setup, but why not add a .network file for the tun interface by hand?
If we added another(?) networking mode to nspawn that uses tun, then I think it is OK to add the relevant .network files. Otherwise, we should not add further default .network files unless it is a really, really common use case.
from systemd.
It is technically possible to add a `.network` file, yes. It could be added by the container creation utility, but then the benefit of being able to boot existing VM and container images as containers would go away. It could also be added by the container runtime in `/run/systemd/network`. This technically and practically just works, but still feels like a layer violation. Having this as a separate configuration file also works for me. Generally, I'd like to have stuff just work rather than requiring a customization. Hence I looked into `veth` first, but that does not seem to technically work out.
Having `systemd-nspawn` work without root would be really great. If that happens to work in another way, consider this issue closed as that's the use case I'm asking for.
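For illustration, a runtime-provided file along these lines might look as follows. This is only a sketch: the file name `80-container-host0-tun.network` is hypothetical, the `[Match]` section assumes the slirp4netns/pasta interface is named `host0` and reports kind `tun`, and `DHCP=yes` assumes the user-mode network stack answers DHCP requests.

```ini
# /run/systemd/network/80-container-host0-tun.network
# Hypothetical file name; dropped in by the container runtime before boot.

[Match]
# Assumption: the interface is named host0 and is a tun/tap device.
Name=host0
Kind=tun
Virtualization=container

[Network]
# Assumption: slirp4netns or pasta hands out an address via DHCP.
DHCP=yes
```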
from systemd.
Or, another idea: like #32033, networkd should read configs from credentials. Then it is necessary neither to add another default .network file nor to customize an existing image.
from systemd.
Or, another idea: like #32033, networkd should read configs from credentials. Then it is necessary neither to add another default .network file nor to customize an existing image.
Hmm, this is already supported? #30827
Oh. I forgot. That's really nice.
@helmutg Does it work for you?
from systemd.
Thanks for the suggestions. I haven't turned it into code yet, but I'm positive that it'll work out.
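A sketch of how the credential route might look, assuming the naming scheme documented for systemd-network-generator, where a credential named `network.network.<name>` is written out as `<name>.network`. The credential name and its contents here are illustrative, and how the container manager passes the credential into the container is left open.

```ini
# Contents of a credential named "network.network.80-container-host0-tun"
# (hypothetical name); assumed to end up as 80-container-host0-tun.network.

[Match]
Name=host0
Kind=tun

[Network]
DHCP=yes
```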
from systemd.
Let me note that in systemd 252, the relevant .network file did not match on `Kind=` at all and that happened to just work for my use case. The commit that broke my use case is f139393. If I now set up a .network file and run with an older systemd, I get two conflicting configurations.
from systemd.
I think your use case is valid, and adding what you are asking for makes sense.
I am still not entirely sure whether this should be done within the same .network file as you propose, or with two distinct ones.
from systemd.
I am working on a container manager that allows running a "fat" container without requiring any privileges that a typical user account would lack. While podman/docker allow running unprivileged application containers (without systemd), running a "fat" container (systemd is pid 1) without the elevated privileges that e.g. lxc/lxd/incus or privileged docker/podman containers rely on is still difficult.
Note that nspawn in git allows unprivileged invocation. It only supports running off disk images though.
from systemd.
I am not sure whether it should be a separate `.network` file or not either and prefer to leave that at your discretion.
I am aware (thanks to Luca Boccassi) that nspawn gains unpriv invocation, but I am still unhappy with its terms. It requires a complex syscall filter to be attached and as far as I understand it the disk images require verity (or privileges).
On the other hand, I can mostly run a full systemd-as-pid-1 container without the need for BPF on current systemd already. Admittedly, my vehicle also is an ext4 disk image (driven with fuse2fs), but I could also use an extracted tree and it is mounted RW. The ability to run off my own crafted images is important to me.
from systemd.
I am not sure whether it should be a separate `.network` file or not either and prefer to leave that at your discretion.
I was kinda hoping someone else would argue for one of the two solutions in an informed way, so that I can just agree. ;-)
(I am somewhat leaning towards making this a separate .network thing, because it keeps the door open to have different settings sooner or later. I think veth and tun are sufficiently different so that we eventually might want to do that. In particular as veth generally mimics an Ethernet iface while tun mimics a raw IP iface (i.e. layer2 vs. layer3) and we have various options that only apply to layer3 devices in .network)
from systemd.
anyway, if you want to see this implemented, please prep a PR.
from systemd.