Comments (9)
I cobbled together something to demonstrate the issue. It doesn't cause errors, since I couldn't get the system to boot with /var initially read-only.
$ cat /usr/local/bin/var-overlay-setup
#!/bin/sh -e
mkdir -p /run/var-overlay-upper /run/var-overlay-work
mount -t overlay -o lowerdir=/var,upperdir=/run/var-overlay-upper,workdir=/run/var-overlay-work var-overlay /var
$ cat /etc/systemd/system/var-overlay.service
[Unit]
DefaultDependencies=no
Before=shutdown.target local-fs.target
Conflicts=shutdown.target
RequiresMountsFor=/var
[Service]
Type=oneshot
ExecStart=/usr/local/bin/var-overlay-setup
RemainAfterExit=yes
[Install]
WantedBy=local-fs.target
Here are the journal logs:
Thu 2024-04-18 22:43:24 MDT endless init.scope[1]: Mounted var.mount - /var.
Thu 2024-04-18 22:43:24 MDT endless init.scope[1]: systemd-pstore.service - Platform Persistent Storage Archival was skipped because of an unmet condition check (ConditionDirectoryNotEmpty=/sys/fs/pstore).
Thu 2024-04-18 22:43:24 MDT endless init.scope[1]: Starting var-overlay.service...
Thu 2024-04-18 22:43:24 MDT endless init.scope[1]: Starting systemd-journal-flush.service - Flush Journal to Persistent Storage...
Thu 2024-04-18 22:43:24 MDT endless init.scope[1]: Starting systemd-random-seed.service - Load/Save OS Random Seed...
Thu 2024-04-18 22:43:24 MDT endless systemd-journald.service[328]: Time spent on flushing to /var/log/journal/f6a1c4895ff144eb8e2b5866c4ce1498 is 47.023ms for 913 entries.
Thu 2024-04-18 22:43:24 MDT endless systemd-journald.service[328]: System Journal (/var/log/journal/f6a1c4895ff144eb8e2b5866c4ce1498) is 43.8M, max 45.2M, 1.4M free.
Thu 2024-04-18 22:43:24 MDT endless systemd-journald.service[328]: Received client request to flush runtime journal.
Thu 2024-04-18 22:43:24 MDT endless kernel: overlayfs: "xino" feature enabled using 2 upper inode bits.
Thu 2024-04-18 22:43:24 MDT endless systemd-journald.service[328]: /var/log/journal/f6a1c4895ff144eb8e2b5866c4ce1498/system.journal: Journal file uses a different sequence number ID, rotating.
Thu 2024-04-18 22:43:24 MDT endless systemd-journald.service[328]: Rotating system journal.
Thu 2024-04-18 22:43:24 MDT endless audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=var-overlay comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Thu 2024-04-18 22:43:24 MDT endless audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=systemd-random-seed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Thu 2024-04-18 22:43:24 MDT endless audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=systemd-boot-update comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Thu 2024-04-18 22:43:24 MDT endless audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=plymouth-read-write comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Thu 2024-04-18 22:43:24 MDT endless audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=systemd-binfmt comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Thu 2024-04-18 22:43:24 MDT endless audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=systemd-udev-trigger comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Thu 2024-04-18 22:43:24 MDT endless init.scope[1]: Finished var-overlay.service.
Thu 2024-04-18 22:43:24 MDT endless init.scope[1]: Reached target local-fs.target - Local File Systems.
Notice how systemd-journal-flush.service and systemd-random-seed.service start before the var-overlay.service unit completes. You can see from the output that the journal files are being rotated essentially concurrently with overlayfs being initialized. Are these modifications being recorded in the overlay upper or lower directory? I don't know, but if the units started after local-fs.target, they'd start at basically the same time but after this mount post-processing.
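A quick way to confirm the ordering described above is to ask systemd which units a service is ordered after. The following is a minimal sketch, not something from the thread: `is_ordered_after` is a hypothetical helper name, and it assumes only that `systemctl show --property=After --value` prints the unit's space-separated After= list.

```shell
#!/bin/sh
# Hypothetical helper: succeed if UNIT lists TARGET in its After= property.
# Returns 0 if ordered after, 1 if not, 2 if the query itself failed.
is_ordered_after() {
    unit=$1
    target=$2
    after=$(systemctl show --property=After --value "$unit") || return 2
    # Pad with spaces so that only whole unit names match.
    case " $after " in
        *" $target "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query when systemctl is available on this machine.
if command -v systemctl >/dev/null 2>&1; then
    for unit in systemd-journal-flush.service systemd-random-seed.service; do
        if is_ordered_after "$unit" local-fs.target 2>/dev/null; then
            echo "$unit: ordered after local-fs.target"
        else
            echo "$unit: not ordered after local-fs.target (or query failed)"
        fi
    done
fi
```

The same helper can be pointed at var.mount or var-overlay.service to see which of the two a given early-boot unit actually waits for.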
from systemd.
Note that networkd should handle such a case gracefully, as it touches files under /var only after systemd-networkd-persistent-storage.service is started. Ah, but in this case it's an overlay... Hm.
from systemd.
If the lower directory /var is not a mount point, creating var.mount with Type=overlay and the required options should work, though I have not tested it. The upper directory can be created by setting RuntimeDirectory= (with RuntimeDirectoryPreserve=yes) on the mount unit, or by another service.
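The suggestion above could look roughly like the following. This is an untested sketch, as the comment itself says: the `/run/var-overlay/...` paths and the description are assumptions, and the script only stages the unit in a scratch directory for review rather than installing it under /etc/systemd/system.

```shell
#!/bin/sh
# Stage a hypothetical var.mount for review; nothing is installed or started.
stage_dir=$(mktemp -d)

cat > "$stage_dir/var.mount" <<'EOF'
[Unit]
Description=Overlay on /var (sketch, untested)

[Mount]
What=overlay
Where=/var
Type=overlay
# Assumed paths: upper and work directories live on the /run tmpfs.
Options=lowerdir=/var,upperdir=/run/var-overlay/upper,workdir=/run/var-overlay/work
# As suggested in the comment: let systemd create the directories under /run
# and keep them around when the unit stops.
RuntimeDirectory=var-overlay/upper var-overlay/work
RuntimeDirectoryPreserve=yes
EOF

echo "staged $stage_dir/var.mount"
```

Because the unit is named var.mount, anything that already depends on /var (for example via StateDirectory= or RequiresMountsFor=/var) would be ordered after it without further changes.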
from systemd.
If the underlying /var is also a partition, then how about mounting /var with a service unit and setting Before=var.mount?
from systemd.
Several systemd units such as systemd-rfkill.service that persist data to /var fail if the overlayfs hasn't been mounted since the underlying mount is read only. These units are ordered correctly with respect to the mount they care about, but they all have DefaultDependencies=no and do not have After=local-fs.target. While this is technically correct, I think it would make more sense if they had After=local-fs.target since that's the defined point where local filesystem mounting has completed.
This is actually very carefully done, to minimize dependencies in early boot.
Note that these units are also ordered after systemd-remount-fs.service btw, which is supposed to be the point where / becomes writable if it was originally mounted read-only and is supposed to be writable.
It appears to me you could simply do your overlayfs replacement once before systemd-remount-fs.service and things should mostly work.
from systemd.
Several systemd units such as systemd-rfkill.service that persist data to /var fail if the overlayfs hasn't been mounted since the underlying mount is read only. These units are ordered correctly with respect to the mount they care about, but they all have DefaultDependencies=no and do not have After=local-fs.target. While this is technically correct, I think it would make more sense if they had After=local-fs.target since that's the defined point where local filesystem mounting has completed.
This is actually very carefully done, to minimize dependencies in early boot.
Sure, I figured as much. However, these are the only services I've ever seen that write to real filesystems that aren't ordered after local-fs.target. Every service that doesn't have DefaultDependencies=no already gets that automatically, and every other service I've seen that sets DefaultDependencies=no and writes to a real filesystem orders itself after local-fs.target.
I guess what it comes down to is that my desired semantics are that if your unit writes to real filesystems then it should order itself after local-fs.target. To me that's the purpose of the target - local filesystems are ready to be used. That's the only way to reliably do this type of mount stacking since there can only be one mount unit per path.
Note that these units are also ordered after systemd-remount-fs.service btw, which is supposed to be the point where / becomes writable if it was originally mounted read-only and is supposed to be writable.
It appears to me you could simply do your overlayfs replacement once before systemd-remount-fs.service and things should mostly work.
I don't think I would want to order before systemd-remount-fs.service since systemd-remount-fs would then be applying mount options to the overlayfs mount.
The most reliable thing would be to take over all the mount units that I care about with a generator. @yuwata's suggestion to order before var.mount (for example) would also work similarly. What I don't want to have to do, though, is reimplement the logic that the fstab and other generators have for creating the normal mount units. In a service unit, I could use systemctl show or similar to find the mount parameters that the generators determined. In a generator I'd have to reimplement, though, since generators execute in parallel and I couldn't read the units that another generator came up with.
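As a rough illustration of the `systemctl show` idea, a service could read back the parameters of a generated mount unit like this. It is only a sketch: `get_mount_prop` is a hypothetical helper name, and it assumes the mount unit exposes the What, Type and Options properties.

```shell
#!/bin/sh
# Hypothetical helper: print one property of a mount unit as loaded by PID 1,
# i.e. after the fstab and other generators have run.
get_mount_prop() {
    unit=$1
    prop=$2
    systemctl show --property="$prop" --value "$unit"
}

# Only query when systemctl is available on this machine.
if command -v systemctl >/dev/null 2>&1; then
    for prop in What Type Options; do
        printf '%s=%s\n' "$prop" "$(get_mount_prop var.mount "$prop" 2>/dev/null)"
    done
fi
```

A service ordered Before=var.mount could use these values to perform the underlying mount itself and then stack the overlay on top, without re-parsing /etc/fstab.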
from systemd.
I guess what it comes down to is that my desired semantics are that if your unit writes to real filesystems then it should order itself after local-fs.target. To me that's the purpose of the target - local filesystems are ready to be used. That's the only way to reliably do this type of mount stacking since there can only be one mount unit per path.
Sorry, but I vehemently disagree with this. The thing is that some services such as journald, timesyncd, networkd, coredump are so fundamental that they should not be delayed longer than necessary, and I am sorry, there's really no reason to wait for /home to be mounted to just allow journald to do its thing...
So no, we are certainly not going to move all early boot stuff behind local-fs.target, if we know precisely what it needs. And we do know for those services what they need.
I don't think I would want to order before systemd-remount-fs.service since systemd-remount-fs would then be applying mount options to the overlayfs mount.
Why wouldn't that be fine? Also it only does that if you actually list your rootfs on /etc/fstab. Why would you do that?
Alternatively, just list the five services explicitly with ordering deps on the service that establishes overlayfs for you?
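One way to express that explicit ordering is a drop-in for the service that sets up the overlay. The sketch below is an assumption about how it might be written, not something posted in the thread: the drop-in file name is made up, only units named earlier in this discussion are listed, and the file is staged in a scratch directory rather than installed.

```shell
#!/bin/sh
# Stage a hypothetical drop-in that orders var-overlay.service before the
# early-boot units that write to /var; nothing is installed or reloaded.
stage_dir=$(mktemp -d)
dropin_dir="$stage_dir/var-overlay.service.d"
mkdir -p "$dropin_dir"

cat > "$dropin_dir/10-order.conf" <<'EOF'
[Unit]
# Units taken from this thread; a real list would need to be kept up to date.
Before=systemd-journal-flush.service systemd-random-seed.service systemd-rfkill.service
EOF

echo "staged $dropin_dir/10-order.conf"
```

Before= on one unit is equivalent to After= on the other, so the listed services themselves would not need to be modified.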
from systemd.
So, I checked: services such as systemd-rfkill.service have StateDirectory=, so they actually wait for /var to be mounted.
The problem here is that you implement the overlay mount with a .service unit rather than a .mount unit, so PID 1 does not automatically add dependencies on the overlay mount to the services that use StateDirectory=.
As I said, please mount the overlayfs through .mount. Then everything should work as expected.
from systemd.
So no, we are certainly not going to move all early boot stuff behind local-fs.target, if we know precisely what it needs. And we do know for those services what they need.
Fair enough.
I don't think I would want to order before systemd-remount-fs.service since systemd-remount-fs would then be applying mount options to the overlayfs mount.
Why wouldn't that be fine? Also it only does that if you actually list your rootfs on /etc/fstab. Why would you do that?
In my narrow use case it doesn't. In general, though, I don't know what mount options are in fstab. I don't want to apply a mount option to overlayfs that's only valid for ext4, for instance.
Alternatively, just list the five services explicitly with ordering deps on the service that establishes overlayfs for you?
This is what I originally did, but it proved problematic.
- I can't order [email protected] (or any other templated unit) since that apparently gets interpreted as [email protected]. I looked through the documentation, but it doesn't appear that there's any way to order against all instances of a template unit. Is that a bug or just how it works?
- Maintaining the list of units to order before is bound to go stale. It's also not 5 units. It was 9 units in v254, but that's more than doubled now:
$ git grep -l -e StateDirectory -e CacheDirectory -e LogsDirectory -e var.mount -e 'RequiresMountsFor=.*/var' units | xargs grep -l -e DefaultDependencies=no
units/[email protected]
units/[email protected]
units/systemd-journal-flush.service
units/systemd-networkd-persistent-storage.service
units/systemd-pcrlock-file-system.service.in
units/systemd-pcrlock-firmware-code.service.in
units/systemd-pcrlock-firmware-config.service.in
units/systemd-pcrlock-machine-id.service.in
units/systemd-pcrlock-make-policy.service.in
units/systemd-pcrlock-secureboot-authority.service.in
units/systemd-pcrlock-secureboot-policy.service.in
units/[email protected]
units/systemd-pstore.service.in
units/systemd-rfkill.service.in
units/systemd-rfkill.socket
units/systemd-timesyncd.service.in
units/systemd-tpm2-setup.service.in
units/systemd-update-utmp-runlevel.service.in
units/systemd-update-utmp.service.in
It also sucks to have to go analyze all of those units (and any others on the system) and determine exactly what their filesystem requirements are.
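For the template-unit point above, one possible workaround (not raised in the thread) is to attach the ordering from the other side: a drop-in placed in a template's `.d` directory is read for every instance of that template. The sketch below uses a placeholder template name, `example@.service`, because the real unit name is not recoverable here, and it only stages the file in a scratch directory.

```shell
#!/bin/sh
# Stage a hypothetical template-level drop-in; nothing is installed.
stage_dir=$(mktemp -d)
template='example@.service'   # placeholder template name, an assumption
dropin_dir="$stage_dir/$template.d"
mkdir -p "$dropin_dir"

cat > "$dropin_dir/10-after-var-overlay.conf" <<'EOF'
[Unit]
# Applies to every instance of the template.
After=var-overlay.service
EOF

echo "staged $dropin_dir/10-after-var-overlay.conf"
```

This does not solve the maintenance problem from the second point, since each template and plain unit still has to be identified by hand.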
Anyways, it does seem that having Before=var.mount and handling the /var mount manually works. I wouldn't want to do that in general, but for this narrow use case it's not bad. Feel free to close this issue.
from systemd.