DwarFS

The Deduplicating Warp-speed Advanced Read-only File System.

A fast high compression read-only file system for Linux, Windows and macOS.

Overview

Windows Screen Capture

Linux Screen Capture

DwarFS is a read-only file system with a focus on achieving very high compression ratios in particular for very redundant data.

This probably doesn't sound very exciting, because if it's redundant, it should compress well. However, I found that other read-only, compressed file systems don't do a very good job at making use of this redundancy. See here for a comparison with other compressed file systems.

DwarFS also doesn't compromise on speed: for my use cases, I've found it to be on par with or faster than SquashFS. For my primary use case, DwarFS compression is an order of magnitude better than SquashFS compression, it's 6 times faster to build the file system, it's typically faster to access files, and it uses fewer CPU resources.

To give you an idea of what DwarFS is capable of, here's a quick comparison of DwarFS and SquashFS on a set of video files with a total size of 39 GiB. The twist is that each unique video file has two sibling files with a different set of audio streams (this is an actual use case). So there's redundancy in both the video and audio data, but as the streams are interleaved and identical blocks are typically very far apart, it's challenging to make use of that redundancy for compression. SquashFS essentially fails to compress the source data at all, whereas DwarFS is able to reduce the size by almost a factor of 3, which is close to the theoretical maximum:

$ du -hs dwarfs-video-test
39G     dwarfs-video-test
$ ls -lh dwarfs-video-test.*fs
-rw-r--r-- 1 mhx users 14G Jul  2 13:01 dwarfs-video-test.dwarfs
-rw-r--r-- 1 mhx users 39G Jul 12 09:41 dwarfs-video-test.squashfs

Furthermore, when mounting the SquashFS image and performing a random-read throughput test using fio-3.34, both squashfuse and squashfuse_ll top out at around 230 MiB/s:

$ fio --readonly --rw=randread --name=randread --bs=64k --direct=1 \
      --opendir=mnt --numjobs=4 --ioengine=libaio --iodepth=32 \
      --group_reporting --runtime=60 --time_based
[...]
   READ: bw=230MiB/s (241MB/s), 230MiB/s-230MiB/s (241MB/s-241MB/s), io=13.5GiB (14.5GB), run=60004-60004msec

In comparison, DwarFS manages to sustain random read rates of 20 GiB/s:

  READ: bw=20.2GiB/s (21.7GB/s), 20.2GiB/s-20.2GiB/s (21.7GB/s-21.7GB/s), io=1212GiB (1301GB), run=60001-60001msec
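
To reproduce something like this yourself, mount the image and run the same fio command against the DwarFS mount, e.g. (the mount point name is arbitrary):

$ mkdir mnt
$ dwarfs dwarfs-video-test.dwarfs mnt
$ fio --readonly --rw=randread --name=randread --bs=64k --direct=1 \
      --opendir=mnt --numjobs=4 --ioengine=libaio --iodepth=32 \
      --group_reporting --runtime=60 --time_based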

Distinct features of DwarFS are:

  • Clustering of files by similarity using a similarity hash function. This makes it easier to exploit the redundancy across file boundaries.

  • Segmentation analysis across file system blocks in order to reduce the size of the uncompressed file system. This saves memory when using the compressed file system and thus potentially allows for higher cache hit rates as more data can be kept in the cache.

  • Categorization framework to categorize files or even fragments of files and then process individual categories differently. For example, this allows you to avoid wasting time trying to compress incompressible files, or to compress PCM audio data using FLAC compression (see the example after this list).

  • Highly multi-threaded implementation. Both the file system creation tool as well as the FUSE driver are able to make good use of the many cores of your system.
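
As a minimal example of how categorization is enabled (the input and output names here are made up; per-category tuning options are described in the mkdwarfs manual page):

$ mkdwarfs -i samples -o samples.dwarfs --categorize

With --categorize, mkdwarfs can, for instance, store PCM audio data using FLAC and avoid recompressing data it detects as incompressible, as described above.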

History

I started working on DwarFS in 2013 and my main use case and major motivation was that I had several hundred different versions of Perl that were taking up something around 30 gigabytes of disk space, and I was unwilling to spend more than 10% of my hard drive keeping them around for when I happened to need them.

Up until then, I had been using Cromfs for squeezing them into a manageable size. However, I was getting more and more annoyed by the time it took to build the filesystem image and, to make things worse, more often than not it was crashing after about an hour or so.

I had obviously also looked into SquashFS, but never got anywhere close to the compression rates of Cromfs.

This alone wouldn't have been enough to get me into writing DwarFS, but at around the same time, I was pretty obsessed with the recent developments and features of newer C++ standards and really wanted a C++ hobby project to work on. Also, I had wanted to do something with FUSE for quite some time. Last but not least, I had been thinking about the problem of compressed file systems for a bit and had some ideas that I definitely wanted to try.

The majority of the code was written in 2013, then I did a couple of cleanups, bugfixes and refactors every once in a while, but I never really got it to a state where I would feel happy releasing it. It was too awkward to build with its dependency on Facebook's (quite awesome) folly library and it didn't have any documentation.

Digging out the project again this year, things didn't look as grim as they used to. Folly now builds with CMake and so I just pulled it in as a submodule. Most other dependencies can be satisfied from packages that should be widely available. And I've written some rudimentary docs as well.

Building and Installing

Prebuilt Binaries

Each release has pre-built, statically linked binaries for Linux-x86_64, Linux-aarch64 and Windows-AMD64 available for download. These should run without any dependencies and can be useful especially on older distributions where you can't easily build the tools from source.

Universal Binaries

In addition to the binary tarballs, there's a universal binary available for each architecture. These universal binaries contain all tools (mkdwarfs, dwarfsck, dwarfsextract and the dwarfs FUSE driver) in a single executable. These executables are compressed using upx, so they are much smaller than the individual tools combined. However, it also means the binaries need to be decompressed each time they are run, which can add significant overhead. If that is an issue, you can either stick to the "classic" individual binaries or you can decompress the universal binary, e.g.:

upx -d dwarfs-universal-0.7.0-Linux-aarch64

The universal binaries can be run through symbolic links named after the proper tool, e.g.:

$ ln -s dwarfs-universal-0.7.0-Linux-aarch64 mkdwarfs
$ ./mkdwarfs --help

This also works on Windows if the file system supports symbolic links:

> mklink mkdwarfs.exe dwarfs-universal-0.7.0-Windows-AMD64.exe
> .\mkdwarfs.exe --help

Alternatively, you can select the tool by passing --tool=<name> as the first argument on the command line:

> .\dwarfs-universal-0.7.0-Windows-AMD64.exe --tool=mkdwarfs --help

Note that just like the dwarfs.exe Windows binary, the universal Windows binary depends on the winfsp-x64.dll from the WinFsp project. However, for the universal binary, the DLL is loaded lazily, so you can still use all other tools without the DLL. See the Windows Support section for more details.

Dependencies

DwarFS uses CMake as a build tool.

It uses both Boost and Folly, though the latter is included as a submodule since very few distributions actually offer packages for it. Folly itself has a number of dependencies, so please check here for an up-to-date list.

It also uses Facebook Thrift, in particular the frozen library, for storing metadata in a highly space-efficient, memory-mappable and well defined format. It's also included as a submodule, and we only build the compiler and a very reduced library that contains just enough for DwarFS to work.

Other than that, DwarFS really only depends on FUSE3 and on a set of compression libraries that Folly already depends on (namely lz4, zstd and liblzma).

The dependency on googletest will be automatically resolved if you build with tests.

A good starting point for apt-based systems is probably:

$ apt install \
    gcc \
    g++ \
    clang \
    git \
    ccache \
    ninja-build \
    cmake \
    make \
    bison \
    flex \
    ronn \
    fuse3 \
    pkg-config \
    binutils-dev \
    libacl1-dev \
    libarchive-dev \
    libbenchmark-dev \
    libboost-chrono-dev \
    libboost-context-dev \
    libboost-filesystem-dev \
    libboost-iostreams-dev \
    libboost-program-options-dev \
    libboost-regex-dev \
    libboost-system-dev \
    libboost-thread-dev \
    libbrotli-dev \
    libevent-dev \
    libhowardhinnant-date-dev \
    libjemalloc-dev \
    libdouble-conversion-dev \
    libiberty-dev \
    liblz4-dev \
    liblzma-dev \
    libmagic-dev \
    librange-v3-dev \
    libssl-dev \
    libunwind-dev \
    libdwarf-dev \
    libelf-dev \
    libfmt-dev \
    libfuse3-dev \
    libgoogle-glog-dev \
    libutfcpp-dev \
    libflac++-dev \
    python3-mistletoe

Note that when building with gcc, the optimization level will be set to -O2 instead of the CMake default of -O3 for release builds. At least with versions up to gcc-10, the -O3 build is up to 70% slower than a build with -O2.

Building

Firstly, either clone the repository...

$ git clone --recurse-submodules https://github.com/mhx/dwarfs
$ cd dwarfs

...or unpack the release archive:

$ tar xvf dwarfs-x.y.z.tar.bz2
$ cd dwarfs-x.y.z

Once all dependencies have been installed, you can build DwarFS using:

$ mkdir build
$ cd build
$ cmake .. -DWITH_TESTS=1
$ make -j$(nproc)

You can then run tests with:

$ make test

All binaries use jemalloc as the memory allocator by default, as it typically uses much less system memory than the glibc or tcmalloc allocators. To disable the use of jemalloc, pass -DUSE_JEMALLOC=0 on the cmake command line.
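
For example, to configure a build without jemalloc:

$ cmake .. -DWITH_TESTS=1 -DUSE_JEMALLOC=0
$ make -j$(nproc)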

Installing

Installing is as easy as:

$ sudo make install

Though you don't have to install the tools to play with them.

Static Builds

Attempting to build statically linked binaries is highly discouraged and not officially supported. That being said, here's how to set up an environment where you might be able to build static binaries.

This has been tested with ubuntu-22.04-live-server-amd64.iso. First, install all the packages listed as dependencies above. Also install:

$ apt install ccache ninja libacl1-dev

ccache and ninja are optional, but help with a speedy compile.

Depending on your distribution, you'll need to build and install static versions of some libraries, e.g. libarchive and libmagic for Ubuntu:

$ wget https://github.com/libarchive/libarchive/releases/download/v3.6.2/libarchive-3.6.2.tar.xz
$ tar xf libarchive-3.6.2.tar.xz && cd libarchive-3.6.2
$ ./configure --prefix=/opt/static-libs --without-iconv --without-xml2 --without-expat
$ make && sudo make install
$ wget ftp://ftp.astron.com/pub/file/file-5.44.tar.gz
$ tar xf file-5.44.tar.gz && cd file-5.44
$ ./configure --prefix=/opt/static-libs --enable-static=yes --enable-shared=no
$ make && make install

That's it! Now you can try building static binaries for DwarFS:

$ git clone --recurse-submodules https://github.com/mhx/dwarfs
$ cd dwarfs && mkdir build && cd build
$ cmake .. -GNinja -DWITH_TESTS=1 -DSTATIC_BUILD_DO_NOT_USE=1 \
           -DSTATIC_BUILD_EXTRA_PREFIX=/opt/static-libs
$ ninja
$ ninja test

Usage

Please check out the manual pages for mkdwarfs, dwarfs, dwarfsck and dwarfsextract. You can also access the manual pages using the --man option to each binary, e.g.:

$ mkdwarfs --man

The dwarfs manual page also shows an example for setting up DwarFS with overlayfs in order to create a writable file system mount on top of a read-only DwarFS image.
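
The full example is in the manual page; a rough sketch of such a setup (directory names are placeholders, and the overlay options are the standard Linux overlayfs ones, not anything DwarFS-specific) looks like this:

$ mkdir image-ro overlay-rw overlay-work merged
$ dwarfs image.dwarfs image-ro
$ sudo mount -t overlay overlay -o lowerdir=image-ro,upperdir=overlay-rw,workdir=overlay-work merged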

A description of the DwarFS filesystem format can be found in dwarfs-format.

A high-level overview of the internal operation of mkdwarfs is shown in this sequence diagram.

Windows Support

Support for the Windows operating system is currently experimental. Having worked pretty much exclusively in a Unix world for the past two decades, my experience with Windows development is rather limited and I'd expect there to definitely be bugs and rough edges in the Windows code.

The Windows version of the DwarFS filesystem driver relies on the awesome WinFsp project and its winfsp-x64.dll must be discoverable by the dwarfs.exe driver.

The different tools should behave pretty much the same whether you're using them on Linux or Windows. The file system images can be copied between Linux and Windows and images created on one OS should work fine on the other.

There are a few things worth pointing out, though:

  • DwarFS supports both hardlinks and symlinks on Windows, just as it does on Linux. However, creating hardlinks and symlinks seems to require admin privileges on Windows, so if you want to e.g. extract a DwarFS image that contains links of some sort, you might run into errors if you don't have the right privileges.

  • Due to a problem in WinFsp, symlinks cannot currently point outside of the mounted file system. Furthermore, due to another problem in WinFsp, symlinks with a drive letter will appear with a mangled target path.

  • The DwarFS driver on Windows correctly reports hardlink counts via its API, but currently these counts are not correctly propagated to the Windows file system layer. This is presumably due to a problem in WinFsp.

  • When mounting a DwarFS image on Windows, the mount point must not exist. This is different from Linux, where the mount point must actually exist. Also, it's possible to mount a DwarFS image as a drive letter, e.g.

    dwarfs.exe image.dwarfs Z:

  • Filter rules for mkdwarfs always require Unix path separators, regardless of whether it's running on Windows or Linux.

Building on Windows

Building on Windows is not too complicated thanks to vcpkg. You'll need to install Git, CMake, Ninja, WinFsp and a recent Visual Studio with the MSVC toolchain.

WinFsp is expected to be installed in C:\Program Files (x86)\WinFsp; if it's not, you'll need to set WINFSP_PATH when running CMake via cmake/win.bat.

Now you need to clone vcpkg and dwarfs:

> cd %HOMEPATH%
> mkdir git
> cd git
> git clone https://github.com/Microsoft/vcpkg.git
> git clone https://github.com/mhx/dwarfs

Then, bootstrap vcpkg:

> .\vcpkg\bootstrap-vcpkg.bat

And build DwarFS:

> cd dwarfs
> mkdir build
> cd build
> ..\cmake\win.bat
> ninja

Once that's done, you should be able to run the tests. Set CTEST_PARALLEL_LEVEL according to the number of CPU cores in your machine.

> set CTEST_PARALLEL_LEVEL=10
> ninja test

macOS Support

Support for the macOS operating system is currently experimental.

The macOS version of the DwarFS filesystem driver relies on the awesome macFUSE project.

Building on macOS

Building on macOS involves a few steps, but should be relatively straightforward:

  • Install Homebrew

  • Use Homebrew to install the necessary dependencies:

$ brew install cmake ninja ronn macfuse python3 brotli howard-hinnant-date \
               double-conversion fmt glog libarchive libevent flac openssl \
               pkg-config range-v3 utf8cpp xxhash boost zstd jemalloc
  • When installing macFUSE for the first time, you'll need to explicitly allow the software in System Preferences / Privacy & Security. It's quite likely that you'll have to reboot after this.

  • Clone the DwarFS repository:

$ git clone --recurse-submodules https://github.com/mhx/dwarfs
  • Prepare the build by installing the mistletoe python module in a virtualenv:
$ cd dwarfs
$ python3 -m venv @buildenv
$ source ./@buildenv/bin/activate
$ pip3 install mistletoe
  • Build DwarFS and run its tests:
$ git checkout v0.9.4
$ git submodule update
$ mkdir build && cd build
$ cmake .. -GNinja -DWITH_TESTS=ON
$ ninja
$ export CTEST_PARALLEL_LEVEL=$(sysctl -n hw.logicalcpu)
$ ninja test
  • Install DwarFS:
$ ninja install

That's it!

Use Cases

Astrophotography

Astrophotography can generate huge amounts of raw image data. During a single night, it's not unusual to end up with a few dozen gigabytes of data. With most dedicated astrophotography cameras, this data ends up in the form of FITS images. These are usually uncompressed, don't compress very well with standard compression algorithms, and while there are certain compressed FITS formats, these aren't widely supported.

One of the compression formats (simply called "Rice") compresses reasonably well and is really fast. However, its implementation for compressed FITS has a few drawbacks, the most severe being that compression isn't quite as good as it could be for color sensors and for sensors with less than 16 bits of resolution.

DwarFS supports the ricepp (Rice++) compression, which builds on the basic idea of Rice compression, but makes a few enhancements: it compresses color and low bit depth images significantly better and always searches for the optimum solution during compression instead of relying on a heuristic.

Let's look at an example using 129 images (darks, flats and lights) taken with an ASI1600MM camera. Each image is 32 MiB, so a total of 4 GiB of data. Compressing these with the standard fpack tool takes about 16.6 seconds and yields a total output size of 2.2 GiB:

$ time fpack */*.fit */*/*.fit

user	14.992
system	1.592
total	16.616

$ find . -name '*.fz' -print0 | xargs -0 cat | wc -c
2369943360

However, this leaves you with *.fz files that not every application can actually read.

Using DwarFS, here's what we get:

$ mkdwarfs -i ASI1600 -o asi1600-20.dwarfs -S 20 --categorize
I 08:47:47.459077 scanning "ASI1600"
I 08:47:47.491492 assigning directory and link inodes...
I 08:47:47.491560 waiting for background scanners...
I 08:47:47.675241 scanning CPU time: 1.051s
I 08:47:47.675271 finalizing file inodes...
I 08:47:47.675330 saved 0 B / 3.941 GiB in 0/258 duplicate files
I 08:47:47.675360 assigning device inodes...
I 08:47:47.675371 assigning pipe/socket inodes...
I 08:47:47.675381 building metadata...
I 08:47:47.675393 building blocks...
I 08:47:47.675398 saving names and symlinks...
I 08:47:47.675514 updating name and link indices...
I 08:47:47.675796 waiting for segmenting/blockifying to finish...
I 08:47:50.274285 total ordering CPU time: 616.3us
I 08:47:50.274329 total segmenting CPU time: 1.132s
I 08:47:50.279476 saving chunks...
I 08:47:50.279622 saving directories...
I 08:47:50.279674 saving shared files table...
I 08:47:50.280745 saving names table... [1.047ms]
I 08:47:50.280768 saving symlinks table... [743ns]
I 08:47:50.282031 waiting for compression to finish...
I 08:47:50.823924 compressed 3.941 GiB to 1.201 GiB (ratio=0.304825)
I 08:47:50.824280 compression CPU time: 17.92s
I 08:47:50.824316 filesystem created without errors [3.366s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
5 dirs, 0/0 soft/hard links, 258/258 files, 0 other
original size: 3.941 GiB, hashed: 315.4 KiB (18 files, 0 B/s)
scanned: 3.941 GiB (258 files, 117.1 GiB/s), categorizing: 0 B/s
saved by deduplication: 0 B (0 files), saved by segmenting: 0 B
filesystem: 3.941 GiB in 4037 blocks (4550 chunks, 516/516 fragments, 258 inodes)
compressed filesystem: 4037 blocks/1.201 GiB written

In less than 3.4 seconds, it compresses the data down to 1.2 GiB, almost half the size of the fpack output.

In addition to saving a lot of disk space, this can also be useful when your data is stored on a NAS. Here's a comparison of the same set of data accessed over a 1 Gb/s network connection, first using the uncompressed raw data:

$ find /mnt/ASI1600 -name '*.fit' -print0 | xargs -0 -P4 -n1 cat | dd of=/dev/null status=progress
4229012160 bytes (4.2 GB, 3.9 GiB) copied, 36.0455 s, 117 MB/s

And next using a DwarFS image on the same share:

$ dwarfs /mnt/asi1600-20.dwarfs mnt

$ find mnt -name '*.fit' -print0 | xargs -0 -P4 -n1 cat | dd of=/dev/null status=progress
4229012160 bytes (4.2 GB, 3.9 GiB) copied, 14.3681 s, 294 MB/s

That's roughly 2.5 times faster. You can very likely see similar results with slow external hard drives.

Dealing with Bit Rot

Currently, DwarFS has no built-in ability to add recovery information to a file system image. However, for archival purposes, it's a good idea to have such recovery information in order to be able to repair a damaged image.

This is fortunately relatively straightforward using something like par2cmdline:

$ par2create -n1 asi1600-20.dwarfs

This will create two additional files that you can place alongside the image (or on a different storage), as you'll only need them if DwarFS has detected an issue with the file system image. If there's an issue, you can run

$ par2repair asi1600-20.dwarfs

which will very likely be able to recover the image if less than 5% (that's the default used by par2create) of the image is damaged.
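
par2cmdline also comes with a verify mode, so you can periodically check the image against the recovery data without attempting a repair:

$ par2verify asi1600-20.dwarfs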

Extended Attributes

Preserving Extended Attributes in DwarFS Images

Extended attributes are not currently supported. Any extended attributes stored in the source file system will not currently be preserved when building a DwarFS image using mkdwarfs.

Extended Attributes exposed by the FUSE Driver

That being said, the root inode of a mounted DwarFS image currently exposes one or two extended attributes on Linux:

$ attr -l mnt
Attribute "dwarfs.driver.pid" has a 4 byte value for mnt
Attribute "dwarfs.driver.perfmon" has a 4849 byte value for mnt

The dwarfs.driver.pid attribute simply contains the PID of the DwarFS FUSE driver. The dwarfs.driver.perfmon attribute contains the current results of the performance monitor.
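
Both can be read like any other extended attribute, using the same attr tool shown above (output omitted here):

$ attr -qg dwarfs.driver.pid mnt
$ attr -qg dwarfs.driver.perfmon mnt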

Furthermore, each regular file exposes an attribute dwarfs.inodeinfo with information about the underlying inode:

$ attr -l "05 Disappear.caf"
Attribute "dwarfs.inodeinfo" has a 448 byte value for 05 Disappear.caf

The attribute contains a JSON object with information about the underlying inode:

$ attr -qg dwarfs.inodeinfo "05 Disappear.caf"
{
  "chunks": [
    {
      "block": 2,
      "category": "pcmaudio/metadata",
      "offset": 270976,
      "size": 4096
    },
    {
      "block": 414,
      "category": "pcmaudio/waveform",
      "offset": 37594368,
      "size": 29514492
    },
    {
      "block": 419,
      "category": "pcmaudio/waveform",
      "offset": 0,
      "size": 29385468
    }
  ],
  "gid": 100,
  "mode": 33188,
  "modestring": "----rw-r--r--",
  "uid": 1000
}

This is useful, for example, to check how a particular file is spread across multiple blocks or which categories have been assigned to the file.
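
For example, to list just the categories of the file shown above, you can pipe the attribute through a JSON processor (jq is assumed to be installed here; any JSON tool will do):

$ attr -qg dwarfs.inodeinfo "05 Disappear.caf" | jq -r '.chunks[].category'
pcmaudio/metadata
pcmaudio/waveform
pcmaudio/waveform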

Comparison

The SquashFS, xz, lrzip, zpaq and wimlib tests were all done on an 8 core Intel(R) Xeon(R) E-2286M CPU @ 2.40GHz with 64 GiB of RAM.

The Cromfs tests were done with an older version of DwarFS on a 6 core Intel(R) Xeon(R) CPU D-1528 @ 1.90GHz with 64 GiB of RAM.

The EROFS tests were done using DwarFS v0.9.8 and EROFS v1.7.1 on an Intel(R) Core(TM) i9-13900K with 64 GiB of RAM.

The systems were mostly idle during all of the tests.

With SquashFS

The source directory contained 1139 different Perl installations from 284 distinct releases, a total of 47.65 GiB of data in 1,927,501 files and 330,733 directories. The source directory was freshly unpacked from a tar archive to an XFS partition on a 970 EVO Plus 2TB NVMe drive, so most of its contents were likely cached.

I'm using the same compression type and compression level for SquashFS that is the default setting for DwarFS:

$ time mksquashfs install perl-install.squashfs -comp zstd -Xcompression-level 22
Parallel mksquashfs: Using 16 processors
Creating 4.0 filesystem on perl-install-zstd.squashfs, block size 131072.
[=========================================================/] 2107401/2107401 100%

Exportable Squashfs 4.0 filesystem, zstd compressed, data block size 131072
        compressed data, compressed metadata, compressed fragments,
        compressed xattrs, compressed ids
        duplicates are removed
Filesystem size 4637597.63 Kbytes (4528.90 Mbytes)
        9.29% of uncompressed filesystem size (49922299.04 Kbytes)
Inode table size 19100802 bytes (18653.13 Kbytes)
        26.06% of uncompressed inode table size (73307702 bytes)
Directory table size 19128340 bytes (18680.02 Kbytes)
        46.28% of uncompressed directory table size (41335540 bytes)
Number of duplicate files found 1780387
Number of inodes 2255794
Number of files 1925061
Number of fragments 28713
Number of symbolic links  0
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 330733
Number of ids (unique uids + gids) 2
Number of uids 1
        mhx (1000)
Number of gids 1
        users (100)

real    32m54.713s
user    501m46.382s
sys     0m58.528s

For DwarFS, I'm sticking to the defaults:

$ time mkdwarfs -i install -o perl-install.dwarfs
I 11:33:33.310931 scanning install
I 11:33:39.026712 waiting for background scanners...
I 11:33:50.681305 assigning directory and link inodes...
I 11:33:50.888441 finding duplicate files...
I 11:34:01.120800 saved 28.2 GiB / 47.65 GiB in 1782826/1927501 duplicate files
I 11:34:01.122608 waiting for inode scanners...
I 11:34:12.839065 assigning device inodes...
I 11:34:12.875520 assigning pipe/socket inodes...
I 11:34:12.910431 building metadata...
I 11:34:12.910524 building blocks...
I 11:34:12.910594 saving names and links...
I 11:34:12.910691 bloom filter size: 32 KiB
I 11:34:12.910760 ordering 144675 inodes using nilsimsa similarity...
I 11:34:12.915555 nilsimsa: depth=20000 (1000), limit=255
I 11:34:13.052525 updating name and link indices...
I 11:34:13.276233 pre-sorted index (660176 name, 366179 path lookups) [360.6ms]
I 11:35:44.039375 144675 inodes ordered [91.13s]
I 11:35:44.041427 waiting for segmenting/blockifying to finish...
I 11:37:38.823902 bloom filter reject rate: 96.017% (TPR=0.244%, lookups=4740563665)
I 11:37:38.823963 segmentation matches: good=454708, bad=6819, total=464247
I 11:37:38.824005 segmentation collisions: L1=0.008%, L2=0.000% [2233254 hashes]
I 11:37:38.824038 saving chunks...
I 11:37:38.860939 saving directories...
I 11:37:41.318747 waiting for compression to finish...
I 11:38:56.046809 compressed 47.65 GiB to 430.9 MiB (ratio=0.00883101)
I 11:38:56.304922 filesystem created without errors [323s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
330733 dirs, 0/2440 soft/hard links, 1927501/1927501 files, 0 other
original size: 47.65 GiB, dedupe: 28.2 GiB (1782826 files), segment: 15.19 GiB
filesystem: 4.261 GiB in 273 blocks (319178 chunks, 144675/144675 inodes)
compressed filesystem: 273 blocks/430.9 MiB written [depth: 20000]
█████████████████████████████████████████████████████████████████████████████▏100% |

real    5m23.030s
user    78m7.554s
sys     1m47.968s

So in this comparison, mkdwarfs is more than 6 times faster than mksquashfs, both in terms of CPU time and wall clock time.

$ ll perl-install.*fs
-rw-r--r-- 1 mhx users  447230618 Mar  3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 4748902400 Mar  3 20:10 perl-install.squashfs

In terms of compression ratio, the DwarFS file system is more than 10 times smaller than the SquashFS file system. With DwarFS, the content has been compressed down to less than 0.9% (!) of its original size. This compression ratio only considers the data stored in the individual files, not the actual disk space used. On the original XFS file system, according to du, the source folder uses 52 GiB, so the DwarFS image actually only uses 0.8% of the original space.

Here's another comparison using lzma compression instead of zstd:

$ time mksquashfs install perl-install-lzma.squashfs -comp lzma

real    13m42.825s
user    205m40.851s
sys     3m29.088s
$ time mkdwarfs -i install -o perl-install-lzma.dwarfs -l9

real    3m43.937s
user    49m45.295s
sys     1m44.550s
$ ll perl-install-lzma.*fs
-rw-r--r-- 1 mhx users  315482627 Mar  3 21:23 perl-install-lzma.dwarfs
-rw-r--r-- 1 mhx users 3838406656 Mar  3 20:50 perl-install-lzma.squashfs

It's immediately obvious that the runs are significantly faster and the resulting images are significantly smaller. Still, mkdwarfs is about 4 times faster and produces an image that's 12 times smaller than the SquashFS image. The DwarFS image is only 0.6% of the original file size.

So, why not use lzma instead of zstd by default? The reason is that lzma is about an order of magnitude slower to decompress than zstd. If you're only accessing data on your compressed filesystem occasionally, this might not be a big deal, but if you use it extensively, zstd will result in better performance.
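
If you want to get a rough feel for the decompression overhead of the two images without setting up a full benchmark, you can stream each image through dwarfsextract's archive output (shown further below) and simply discard the data, e.g.:

$ time dwarfsextract -i perl-install.dwarfs -f ustar | wc -c
$ time dwarfsextract -i perl-install-lzma.dwarfs -f ustar | wc -c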

The comparisons above are not completely fair. mksquashfs by default uses a block size of 128KiB, whereas mkdwarfs uses 16MiB blocks by default, or even 64MiB blocks with -l9. When using identical block sizes for both file systems, the difference, quite expectedly, becomes a lot less dramatic:

$ time mksquashfs install perl-install-lzma-1M.squashfs -comp lzma -b 1M

real    15m43.319s
user    139m24.533s
sys     0m45.132s
$ time mkdwarfs -i install -o perl-install-lzma-1M.dwarfs -l9 -S20 -B3

real    4m25.973s
user    52m15.100s
sys     7m41.889s
$ ll perl-install*.*fs
-rw-r--r-- 1 mhx users  935953866 Mar 13 12:12 perl-install-lzma-1M.dwarfs
-rw-r--r-- 1 mhx users 3407474688 Mar  3 21:54 perl-install-lzma-1M.squashfs

Even this is still not entirely fair, as it uses a feature (-B3) that allows DwarFS to reference file chunks from up to two previous filesystem blocks.

But the point is that this is really where SquashFS tops out, as it doesn't support larger block sizes or back-referencing. And as you'll see below, the larger blocks that DwarFS is using by default don't necessarily negatively impact performance.

DwarFS also features an option to recompress an existing file system with a different compression algorithm. This can be useful as it allows relatively fast experimentation with different algorithms and options without requiring a full rebuild of the file system. For example, recompressing the above file system with the best possible compression (-l 9):

$ time mkdwarfs --recompress -i perl-install.dwarfs -o perl-lzma-re.dwarfs -l9
I 20:28:03.246534 filesystem rewritten without errors [148.3s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
filesystem: 4.261 GiB in 273 blocks (0 chunks, 0 inodes)
compressed filesystem: 273/273 blocks/372.7 MiB written
████████████████████████████████████████████████████████████████████▏100% \

real    2m28.279s
user    37m8.825s
sys     0m43.256s
$ ll perl-*.dwarfs
-rw-r--r-- 1 mhx users 447230618 Mar  3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users 390845518 Mar  4 20:28 perl-lzma-re.dwarfs
-rw-r--r-- 1 mhx users 315482627 Mar  3 21:23 perl-install-lzma.dwarfs

Note that while the recompressed filesystem is smaller than the original image, it is still a lot bigger than the filesystem we previously built with -l9. The reason is that the recompressed image still uses the same block size, and the block size cannot be changed by recompressing.

In terms of how fast the file system is when using it, a quick test I've done is to freshly mount the filesystem created above and run each of the 1139 perl executables to print their version.

$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.810 s ±  0.013 s    [User: 1.847 s, System: 0.623 s]
  Range (min … max):    1.788 s …  1.825 s    10 runs

Benchmark #2: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.333 s ±  0.009 s    [User: 1.993 s, System: 0.656 s]
  Range (min … max):    1.321 s …  1.354 s    10 runs

Benchmark #3: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P15 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.181 s ±  0.018 s    [User: 2.086 s, System: 0.712 s]
  Range (min … max):    1.165 s …  1.214 s    10 runs

Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      1.149 s ±  0.015 s    [User: 2.128 s, System: 0.781 s]
  Range (min … max):    1.136 s …  1.186 s    10 runs

These timings are for initial runs on a freshly mounted file system, running 5, 10, 15 and 20 processes in parallel. 1.1 seconds means that it takes only about 1 millisecond per Perl binary.

Following are timings for subsequent runs, both on DwarFS (at mnt) and the original XFS (at install). DwarFS is around 15% slower here:

$ hyperfine -P procs 10 20 -D 10 -w1 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'" "ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     347.0 ms ±   7.2 ms    [User: 1.755 s, System: 0.452 s]
  Range (min … max):   341.3 ms … 365.2 ms    10 runs

Benchmark #2: ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     302.5 ms ±   3.3 ms    [User: 1.656 s, System: 0.377 s]
  Range (min … max):   297.1 ms … 308.7 ms    10 runs

Benchmark #3: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     342.2 ms ±   4.1 ms    [User: 1.766 s, System: 0.451 s]
  Range (min … max):   336.0 ms … 349.7 ms    10 runs

Benchmark #4: ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     302.0 ms ±   3.0 ms    [User: 1.659 s, System: 0.374 s]
  Range (min … max):   297.0 ms … 305.4 ms    10 runs

Summary
  'ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'' ran
    1.00 ± 0.01 times faster than 'ls -1 install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''
    1.13 ± 0.02 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null''
    1.15 ± 0.03 times faster than 'ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null''

Using the lzma-compressed file system, the metrics for initial runs look considerably worse (about an order of magnitude):

$ hyperfine -c "umount mnt" -p "umount mnt; dwarfs perl-install-lzma.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1" -P procs 5 20 -D 5 "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P{procs} sh -c '\$0 -v >/dev/null'"
Benchmark #1: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P5 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):     10.660 s ±  0.057 s    [User: 1.952 s, System: 0.729 s]
  Range (min … max):   10.615 s … 10.811 s    10 runs

Benchmark #2: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P10 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      9.092 s ±  0.021 s    [User: 1.979 s, System: 0.680 s]
  Range (min … max):    9.059 s …  9.126 s    10 runs

Benchmark #3: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P15 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      9.012 s ±  0.188 s    [User: 2.077 s, System: 0.702 s]
  Range (min … max):    8.839 s …  9.277 s    10 runs

Benchmark #4: ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '$0 -v >/dev/null'
  Time (mean ± σ):      9.004 s ±  0.298 s    [User: 2.134 s, System: 0.736 s]
  Range (min … max):    8.611 s …  9.555 s    10 runs

So you might want to consider using zstd instead of lzma if you'd like to optimize for file system performance. It's also the default compression used by mkdwarfs.

Now here's a comparison with the SquashFS filesystem:

$ hyperfine -c 'sudo umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'sudo umount mnt; sudo mount -t squashfs perl-install.squashfs mnt; sleep 1' -n squashfs-zstd "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs-zstd
  Time (mean ± σ):      1.151 s ±  0.015 s    [User: 2.147 s, System: 0.769 s]
  Range (min … max):    1.118 s …  1.174 s    10 runs

Benchmark #2: squashfs-zstd
  Time (mean ± σ):      6.733 s ±  0.007 s    [User: 3.188 s, System: 17.015 s]
  Range (min … max):    6.721 s …  6.743 s    10 runs

Summary
  'dwarfs-zstd' ran
    5.85 ± 0.08 times faster than 'squashfs-zstd'

So, DwarFS is almost six times faster than SquashFS. But what's more, SquashFS also uses significantly more CPU power. However, the numbers shown above for DwarFS obviously don't include the time spent in the dwarfs process, so I repeated the test outside of hyperfine:

$ time dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4 -f

real    0m4.569s
user    0m2.154s
sys     0m1.846s

So, in total, DwarFS was using 5.7 seconds of CPU time, whereas SquashFS was using 20.2 seconds, almost four times as much. Ignore the 'real' time, this is only how long it took me to unmount the file system again after mounting it.

Another real-life test was to build and test a Perl module with 624 different Perl versions in the compressed file system. The module I've used, Tie::Hash::Indexed, has an XS component that requires a C compiler to build. So this really accesses a lot of different stuff in the file system:

  • The perl executables and its shared libraries

  • The Perl modules used for writing the Makefile

  • Perl's C header files used for building the module

  • More Perl modules used for running the tests

I wrote a little script to be able to run multiple builds in parallel:

#!/bin/bash
set -eu
# the path of the perl binary to build with is passed as the only argument
perl=$1
# derive a unique per-build directory name from the perl path components
dir=$(echo "$perl" | cut -d/ --output-delimiter=- -f5,6)
# copy the module sources, then configure, build and test with this perl
rsync -a Tie-Hash-Indexed/ $dir/
cd $dir
$1 Makefile.PL >/dev/null 2>&1
make test >/dev/null 2>&1
cd ..
# clean up and report which perl we just finished
rm -rf $dir
echo $perl

The following command will run up to 16 builds in parallel on the 8 core Xeon CPU, including debug, optimized and threaded versions of all Perl releases between 5.10.0 and 5.33.3, a total of 624 perl installations:

$ time ls -1 /tmp/perl/install/*/perl-5.??.?/bin/perl5* | sort -t / -k 8 | xargs -d $'\n' -P 16 -n 1 ./build.sh

Tests were done with a cleanly mounted file system to make sure the caches were empty. ccache was primed to make sure all compiler runs could be satisfied from the cache. With SquashFS, the timing was:

real    0m52.385s
user    8m10.333s
sys     4m10.056s

And with DwarFS:

real    0m50.469s
user    9m22.597s
sys     1m18.469s

So, frankly, not much of a difference, with DwarFS being just a bit faster. The dwarfs process itself used:

real    0m56.686s
user    0m18.857s
sys     0m21.058s

So again, DwarFS used less raw CPU power overall, but in terms of wallclock time, the difference is really marginal.

With SquashFS & xz

This test uses slightly less pathological input data: the root filesystem of a recent Raspberry Pi OS release. This file system also contains device inodes, so in order to preserve those, we pass --with-devices to mkdwarfs:

$ time sudo mkdwarfs -i raspbian -o raspbian.dwarfs --with-devices
I 21:30:29.812562 scanning raspbian
I 21:30:29.908984 waiting for background scanners...
I 21:30:30.217446 assigning directory and link inodes...
I 21:30:30.221941 finding duplicate files...
I 21:30:30.288099 saved 31.05 MiB / 1007 MiB in 1617/34582 duplicate files
I 21:30:30.288143 waiting for inode scanners...
I 21:30:31.393710 assigning device inodes...
I 21:30:31.394481 assigning pipe/socket inodes...
I 21:30:31.395196 building metadata...
I 21:30:31.395230 building blocks...
I 21:30:31.395291 saving names and links...
I 21:30:31.395374 ordering 32965 inodes using nilsimsa similarity...
I 21:30:31.396254 nilsimsa: depth=20000 (1000), limit=255
I 21:30:31.407967 pre-sorted index (46431 name, 2206 path lookups) [11.66ms]
I 21:30:31.410089 updating name and link indices...
I 21:30:38.178505 32965 inodes ordered [6.783s]
I 21:30:38.179417 waiting for segmenting/blockifying to finish...
I 21:31:06.248304 saving chunks...
I 21:31:06.251998 saving directories...
I 21:31:06.402559 waiting for compression to finish...
I 21:31:16.425563 compressed 1007 MiB to 287 MiB (ratio=0.285036)
I 21:31:16.464772 filesystem created without errors [46.65s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
4435 dirs, 5908/0 soft/hard links, 34582/34582 files, 7 other
original size: 1007 MiB, dedupe: 31.05 MiB (1617 files), segment: 47.23 MiB
filesystem: 928.4 MiB in 59 blocks (38890 chunks, 32965/32965 inodes)
compressed filesystem: 59 blocks/287 MiB written [depth: 20000]
████████████████████████████████████████████████████████████████████▏100% |

real    0m46.711s
user    10m39.038s
sys     0m8.123s

Again, SquashFS uses the same compression options:

$ time sudo mksquashfs raspbian raspbian.squashfs -comp zstd -Xcompression-level 22
Parallel mksquashfs: Using 16 processors
Creating 4.0 filesystem on raspbian.squashfs, block size 131072.
[===============================================================\] 39232/39232 100%

Exportable Squashfs 4.0 filesystem, zstd compressed, data block size 131072
        compressed data, compressed metadata, compressed fragments,
        compressed xattrs, compressed ids
        duplicates are removed
Filesystem size 371934.50 Kbytes (363.22 Mbytes)
        35.98% of uncompressed filesystem size (1033650.60 Kbytes)
Inode table size 399913 bytes (390.54 Kbytes)
        26.53% of uncompressed inode table size (1507581 bytes)
Directory table size 408749 bytes (399.17 Kbytes)
        42.31% of uncompressed directory table size (966174 bytes)
Number of duplicate files found 1618
Number of inodes 44932
Number of files 34582
Number of fragments 3290
Number of symbolic links  5908
Number of device nodes 7
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 4435
Number of ids (unique uids + gids) 18
Number of uids 5
        root (0)
        mhx (1000)
        unknown (103)
        shutdown (6)
        unknown (106)
Number of gids 15
        root (0)
        unknown (109)
        unknown (42)
        unknown (1000)
        users (100)
        unknown (43)
        tty (5)
        unknown (108)
        unknown (111)
        unknown (110)
        unknown (50)
        mail (12)
        nobody (65534)
        adm (4)
        mem (8)

real    0m50.124s
user    9m41.708s
sys     0m1.727s

The difference in speed is almost negligible. SquashFS is just a bit slower here. In terms of compression, the difference also isn't huge:

$ ls -lh raspbian.* *.xz
-rw-r--r-- 1 mhx  users 297M Mar  4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz
-rw-r--r-- 1 root root  287M Mar  4 21:31 raspbian.dwarfs
-rw-r--r-- 1 root root  364M Mar  4 21:33 raspbian.squashfs

Interestingly, xz actually can't compress the whole original image better than DwarFS.

As before, we can try to increase the DwarFS compression level:

$ time sudo mkdwarfs -i raspbian -o raspbian-9.dwarfs --with-devices -l9

real    0m54.161s
user    8m40.109s
sys     0m7.101s

Now that actually gets the DwarFS image size well below that of the xz archive:

$ ls -lh raspbian-9.dwarfs *.xz
-rw-r--r-- 1 root root  244M Mar  4 21:36 raspbian-9.dwarfs
-rw-r--r-- 1 mhx  users 297M Mar  4 21:32 2020-08-20-raspios-buster-armhf-lite.img.xz

Even if you actually build a tarball and compress that (instead of compressing the EXT4 file system itself), xz isn't quite able to match the DwarFS image size:

$ time sudo tar cf - raspbian | xz -9 -vT 0 >raspbian.tar.xz
  100 %     246.9 MiB / 1,037.2 MiB = 0.238    13 MiB/s       1:18

real    1m18.226s
user    6m35.381s
sys     0m2.205s
$ ls -lh raspbian.tar.xz
-rw-r--r-- 1 mhx users 247M Mar  4 21:40 raspbian.tar.xz

DwarFS also comes with the dwarfsextract tool that allows extraction of a filesystem image without the FUSE driver. So here's a comparison of the extraction speed:

$ time sudo tar xf raspbian.tar.xz -C out1

real    0m12.846s
user    0m12.313s
sys     0m1.616s
$ time sudo dwarfsextract -i raspbian-9.dwarfs -o out2

real    0m3.825s
user    0m13.234s
sys     0m1.382s

So, dwarfsextract is more than 3 times faster thanks to using multiple worker threads for decompression. It's writing about 300 MiB/s in this example.

Another nice feature of dwarfsextract is that it allows you to directly output data in an archive format, so you could create a tarball from your image without extracting the files to disk:

$ dwarfsextract -i raspbian-9.dwarfs -f ustar | xz -9 -T0 >raspbian2.tar.xz

This has the interesting side-effect that the resulting tarball will likely be smaller than the one built straight from the directory:

$ ls -lh raspbian*.tar.xz
-rw-r--r-- 1 mhx users 247M Mar  4 21:40 raspbian.tar.xz
-rw-r--r-- 1 mhx users 240M Mar  4 23:52 raspbian2.tar.xz

That's because dwarfsextract writes files in inode-order, and by default inodes are ordered by similarity for the best possible compression.

With lrzip

lrzip is a compression utility targeted especially at compressing large files. From its description, it looks like it does something very similar to DwarFS, i.e. it looks for duplicate segments before passing the de-duplicated data on to an lzma compressor.

When I first read about lrzip, I was pretty certain it would easily beat DwarFS. So let's take a look. lrzip operates on a single file, so it's necessary to first build a tarball:

$ time tar cf perl-install.tar install

real    2m9.568s
user    0m3.757s
sys     0m26.623s

Now we can run lrzip:

$ time lrzip -vL9 -o perl-install.tar.lrzip perl-install.tar
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 16
Detected 67106172928 bytes ram
Compression level 9
Nice Value: 19
Show Progress
Verbose
Output Filename Specified: perl-install.tar.lrzip
Temporary Directory set as: ./
Compression mode is: LZMA. LZO Compressibility testing enabled
Heuristically Computed Compression Window: 426 = 42600MB
File size: 52615639040
Will take 2 passes
Beginning rzip pre-processing phase
Beginning rzip pre-processing phase
perl-install.tar - Compression Ratio: 100.378. Average Compression Speed: 14.536MB/s.
Total time: 00:57:32.47

real    57m32.472s
user    81m44.104s
sys     4m50.221s

That definitely took a while. This is about an order of magnitude slower than mkdwarfs and it barely makes use of the 8 cores.

$ ll -h perl-install.tar.lrzip
-rw-r--r-- 1 mhx users 500M Mar  6 21:16 perl-install.tar.lrzip

This is a surprisingly disappointing result. The archive is 65% larger than a DwarFS image at -l9 that takes less than 4 minutes to build. Also, you can't just access the files in the .lrzip archive without fully unpacking it first.

That being said, it is better than just using xz on the tarball:

$ time xz -T0 -v9 -c perl-install.tar >perl-install.tar.xz
perl-install.tar (1/1)
  100 %      4,317.0 MiB / 49.0 GiB = 0.086    24 MiB/s      34:55

real    34m55.450s
user    543m50.810s
sys     0m26.533s
$ ll perl-install.tar.xz -h
-rw-r--r-- 1 mhx users 4.3G Mar  6 22:59 perl-install.tar.xz

With zpaq

zpaq is a journaling backup utility and archiver. Again, it appears to share some of the ideas in DwarFS, like segmentation analysis, but it also adds some features on top that make it useful for incremental backups. However, it's also not usable as a file system, so data needs to be extracted before it can be used.

Anyway, how does it fare in terms of speed and compression performance?

$ time zpaq a perl-install.zpaq install -m5

After a few million lines of output that (I think) cannot be turned off:

2258234 +added, 0 -removed.

0.000000 + (51161.953159 -> 8932.000297 -> 490.227707) = 490.227707 MB
2828.082 seconds (all OK)

real    47m8.104s
user    714m44.286s
sys     3m6.751s

So, it's an order of magnitude slower than mkdwarfs and uses 14 times as much CPU resources as mkdwarfs -l9. The resulting archive is pretty close in size to the default configuration DwarFS image, but it's more than 50% bigger than the image produced by mkdwarfs -l9.

$ ll perl-install*.*
-rw-r--r-- 1 mhx users 490227707 Mar  7 01:38 perl-install.zpaq
-rw-r--r-- 1 mhx users 315482627 Mar  3 21:23 perl-install-l9.dwarfs
-rw-r--r-- 1 mhx users 447230618 Mar  3 20:28 perl-install.dwarfs

What's really surprising is how slow it is to extract the zpaq archive again:

$ time zpaq x perl-install.zpaq
2798.097 seconds (all OK)

real    46m38.117s
user    711m18.734s
sys     3m47.876s

That's 700 times slower than extracting the DwarFS image.

With zpaqfranz

zpaqfranz is a derivative of zpaq. Much to my delight, it doesn't generate millions of lines of output. It claims to be multi-threaded and de-duplicating, so definitely worth taking a look. Like zpaq, it supports incremental backups.

We'll use a different input to compare zpaqfranz and DwarFS: The source code of 670 different releases of the "wine" emulator. That's 73 gigabytes of data in total, spread across slightly more than 3 million files. It's obviously highly redundant and should thus be a good data set to compare the tools. For reference, a .tar.xz of the directory is still 7 GiB in size and a SquashFS image of the data gets down to around 1.6 GiB. An "optimized" .tar.xz, where the input files were ordered by similarity, compresses down to 399 MiB, almost 20 times better than without ordering.
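
In case you're wondering how to produce such a similarity-ordered tarball: one way (not necessarily how the number above was obtained) is to go through an intermediate DwarFS image and use dwarfsextract's archive output, just like in the Raspberry Pi OS example above:

$ mkdwarfs -i winesrc -o winesrc-tmp.dwarfs
$ dwarfsextract -i winesrc-tmp.dwarfs -f ustar | xz -9 -T0 >winesrc-ordered.tar.xz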

Now it's time to try zpaqfranz. The input data is stored on a fast SSD and a large fraction of it is already in the file system cache from previous runs, so disk I/O is not a bottleneck.

$ time ./zpaqfranz a winesrc.zpaq winesrc
zpaqfranz v58.8k-JIT-L(2023-08-05)
Creating winesrc.zpaq at offset 0 + 0
Add 2024-01-11 07:25:22 3.117.413     69.632.090.852 (  64.85 GB) 16T (362.904 dirs)
3.480.317 +added, 0 -removed.

0 + (69.632.090.852 -> 8.347.553.798 -> 617.600.892) = 617.600.892 @ 58.38 MB/s

1137.441 seconds (000:18:57) (all OK)

real    18m58.632s
user    11m51.052s
sys     1m3.389s

That is considerably faster than the original zpaq, and uses about 60 times less CPU resources. The output file is 589 MiB, so slightly larger than both the "optimized" .tar.xz and the zpaq output.

How does mkdwarfs do?

$ time mkdwarfs -i winesrc -o winesrc.dwarfs -l9
[...]
I 07:55:20.546636 compressed 64.85 GiB to 93.2 MiB (ratio=0.00140344)
I 07:55:20.826699 compression CPU time: 6.726m
I 07:55:20.827338 filesystem created without errors [2.283m]
[...]

real    2m17.100s
user    9m53.633s
sys     2m29.236s

It uses pretty much the same amount of CPU resources, but finishes more than 8 times faster. The DwarFS output file is more than 6 times smaller.

You can actually squeeze a bit more redundancy out of the original data by tweaking the similarity ordering and switching from lzma to brotli compression, albeit at a somewhat slower compression speed:

mkdwarfs -i winesrc -o winesrc.dwarfs -l9 -C brotli:quality=11:lgwin=26 --order=nilsimsa:max-cluster-size=200k
[...]
I 08:21:01.138075 compressed 64.85 GiB to 73.52 MiB (ratio=0.00110716)
I 08:21:01.485737 compression CPU time: 36.58m
I 08:21:01.486313 filesystem created without errors [5.501m]
[...]
real    5m30.178s
user    40m59.193s
sys     2m36.234s

That's almost a 1000x reduction in size.

Let's also look at decompression speed:

$ time zpaqfranz x winesrc.zpaq
zpaqfranz v58.8k-JIT-L(2023-08-05)
/home/mhx/winesrc.zpaq:
1 versions, 3.480.317 files, 617.600.892 bytes (588.99 MB)
Extract 69.632.090.852 bytes (64.85 GB) in 3.117.413 files (362.904 folders) / 16 T
        99.18% 00:00:00  (  64.32 GB)=>(  64.85 GB)  548.83 MB/sec

125.636 seconds (000:02:05) (all OK)

real    2m6.968s
user    1m36.177s
sys     1m10.980s
$ time dwarfsextract -i winesrc.dwarfs

real    1m49.182s
user    0m34.667s
sys     1m28.733s

Decompression time is pretty much in the same ballpark, with just slightly shorter times for the DwarFS image.

With wimlib

wimlib is a really interesting project that is a lot more mature than DwarFS. While DwarFS at its core has a library component that could potentially be ported to other operating systems, wimlib already is available on many platforms. It also seems to have quite a rich set of features, so it's definitely worth taking a look at.

I first tried wimcapture on the perl dataset:

$ time wimcapture --unix-data --solid --solid-chunk-size=16M install perl-install.wim
Scanning "install"
47 GiB scanned (1927501 files, 330733 directories)
Using LZMS compression with 16 threads
Archiving file data: 19 GiB of 19 GiB (100%) done

real    15m23.310s
user    174m29.274s
sys     0m42.921s
$ ll perl-install.*
-rw-r--r-- 1 mhx users  447230618 Mar  3 20:28 perl-install.dwarfs
-rw-r--r-- 1 mhx users  315482627 Mar  3 21:23 perl-install-l9.dwarfs
-rw-r--r-- 1 mhx users 4748902400 Mar  3 20:10 perl-install.squashfs
-rw-r--r-- 1 mhx users 1016981520 Mar  6 21:12 perl-install.wim

So, wimlib is definitely much better than SquashFS, in terms of both compression ratio and speed. DwarFS is however about 3 times faster to create the file system and the DwarFS file system is less than half the size. When switching to LZMA compression, the DwarFS file system is more than 3 times smaller (wimlib uses LZMS compression by default).

What's a bit surprising is that mounting a wim file takes quite a bit of time:

$ time wimmount perl-install.wim mnt
[WARNING] Mounting a WIM file containing solid-compressed data; file access may be slow.

real    0m2.038s
user    0m1.764s
sys     0m0.242s

Mounting the DwarFS image takes almost no time in comparison:

$ time git/github/dwarfs/build-clang-11/dwarfs perl-install-default.dwarfs mnt
I 00:23:39.238182 dwarfs (v0.4.0, fuse version 35)

real    0m0.003s
user    0m0.003s
sys     0m0.000s

That's just because the driver forks into the background by default and initializes the file system there. However, even when running it in the foreground, initializing the file system takes only about 60 milliseconds:

$ dwarfs perl-install.dwarfs mnt -f
I 00:25:03.186005 dwarfs (v0.4.0, fuse version 35)
I 00:25:03.248061 file system initialized [60.95ms]

If you actually build the DwarFS file system with uncompressed metadata, mounting is basically instantaneous:

$ dwarfs perl-install-meta.dwarfs mnt -f
I 00:27:52.667026 dwarfs (v0.4.0, fuse version 35)
I 00:27:52.671066 file system initialized [2.879ms]

I've tried running the benchmark where all 1139 perl executables print their version with the wimlib image, but after about 10 minutes, it still hadn't finished the first run (with the DwarFS image, one run took slightly more than 2 seconds). I then tried the following instead:

$ ls -1 /tmp/perl/install/*/*/bin/perl5* | xargs -d $'\n' -n1 -P1 sh -c 'time $0 -v >/dev/null' 2>&1 | grep ^real
real    0m0.802s
real    0m0.652s
real    0m1.677s
real    0m1.973s
real    0m1.435s
real    0m1.879s
real    0m2.003s
real    0m1.695s
real    0m2.343s
real    0m1.899s
real    0m1.809s
real    0m1.790s
real    0m2.115s

Judging from that, it would have probably taken about half an hour for a single run, which makes at least the --solid wim image pretty much unusable for actually working with the file system.

The --solid option was suggested to me because it resembles the way that DwarFS actually organizes data internally. However, judging by the warning when mounting a solid image, it's probably not ideal when using the image as a mounted file system. So I tried again without --solid:

$ time wimcapture --unix-data install perl-install-nonsolid.wim
Scanning "install"
47 GiB scanned (1927501 files, 330733 directories)
Using LZX compression with 16 threads
Archiving file data: 19 GiB of 19 GiB (100%) done

real    8m39.034s
user    64m58.575s
sys     0m32.003s

This is still more than 3 minutes slower than mkdwarfs. However, it yields an image that's almost 10 times the size of the DwarFS image and comparable in size to the SquashFS image:

$ ll perl-install-nonsolid.wim -h
-rw-r--r-- 1 mhx users 4.6G Mar  6 23:24 perl-install-nonsolid.wim

This still takes surprisingly long to mount:

$ time wimmount perl-install-nonsolid.wim mnt

real    0m1.603s
user    0m1.327s
sys     0m0.275s

However, it's really usable as a file system, even though it's about 4-5 times slower than the DwarFS image:

$ hyperfine -c 'umount mnt' -p 'umount mnt; dwarfs perl-install.dwarfs mnt -o cachesize=1g -o workers=4; sleep 1' -n dwarfs "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'" -p 'umount mnt; wimmount perl-install-nonsolid.wim mnt; sleep 1' -n wimlib "ls -1 mnt/*/*/bin/perl5* | xargs -d $'\n' -n1 -P20 sh -c '\$0 -v >/dev/null'"
Benchmark #1: dwarfs
  Time (mean ± σ):      1.149 s ±  0.019 s    [User: 2.147 s, System: 0.739 s]
  Range (min … max):    1.122 s …  1.187 s    10 runs

Benchmark #2: wimlib
  Time (mean ± σ):      7.542 s ±  0.069 s    [User: 2.787 s, System: 0.694 s]
  Range (min … max):    7.490 s …  7.732 s    10 runs

Summary
  'dwarfs' ran
    6.56 ± 0.12 times faster than 'wimlib'

With Cromfs

I used Cromfs in the past for compressed file systems and remember that it did a pretty good job in terms of compression ratio. But it was never fast. However, I didn't quite remember just how slow it was until I tried to set up a test.

Here's a run on the Perl dataset, with the block size set to 16 MiB to match the default of DwarFS, and with additional options suggested to speed up compression:

$ time mkcromfs -f 16777216 -qq -e -r100000 install perl-install.cromfs
Writing perl-install.cromfs...
mkcromfs: Automatically enabling --24bitblocknums because it seems possible for this filesystem.
Root pseudo file is 108 bytes
Inotab spans 0x7f3a18259000..0x7f3a1bfffb9c
Root inode spans 0x7f3a205d2948..0x7f3a205d294c
Beginning task for Files and directories: Finding identical blocks
2163608 reuse opportunities found. 561362 unique blocks. Block table will be 79.4% smaller than without the index search.
Beginning task for Files and directories: Blockifying
Blockifying:  0.04% (140017/2724970) idx(siz=80423,del=0) rawin(20.97 MB)rawout(20.97 MB)diff(1956 bytes)
Termination signalled, cleaning up temporaries

real    29m9.634s
user    201m37.816s
sys     2m15.005s

So, it processed 21 MiB out of 48 GiB in half an hour, using almost twice the CPU resources that DwarFS needed for the whole file system. At this point I decided it's likely not worth waiting (presumably) another month (!) for mkcromfs to finish. I double-checked that I didn't accidentally build a debugging version; mkcromfs was definitely built with -O3.

I then tried once more with a smaller version of the Perl dataset. This only has 20 versions (instead of 1139) of Perl, and obviously a lot less redundancy:

$ time mkcromfs -f 16777216 -qq -e -r100000 install-small perl-install.cromfs
Writing perl-install.cromfs...
mkcromfs: Automatically enabling --16bitblocknums because it seems possible for this filesystem.
Root pseudo file is 108 bytes
Inotab spans 0x7f00e0774000..0x7f00e08410a8
Root inode spans 0x7f00b40048f8..0x7f00b40048fc
Beginning task for Files and directories: Finding identical blocks
25362 reuse opportunities found. 9815 unique blocks. Block table will be 72.1% smaller than without the index search.
Beginning task for Files and directories: Blockifying
Compressing raw rootdir inode (28 bytes)z=982370,del=2) rawin(641.56 MB)rawout(252.72 MB)diff(388.84 MB)
 compressed into 35 bytes
INOTAB pseudo file is 839.85 kB
Inotab inode spans 0x7f00bc036ed8..0x7f00bc036ef4
Beginning task for INOTAB: Finding identical blocks
0 reuse opportunities found. 13 unique blocks. Block table will be 0.0% smaller than without the index search.
Beginning task for INOTAB: Blockifying
mkcromfs: Automatically enabling --packedblocks because it is possible for this filesystem.
Compressing raw inotab inode (52 bytes)
 compressed into 58 bytes
Compressing 9828 block records (4 bytes each, total 39312 bytes)
 compressed into 15890 bytes
Compressing and writing 16 fblocks...

16 fblocks were written: 35.31 MB = 13.90 % of 254.01 MB
Filesystem size: 35.33 MB = 5.50 % of original 642.22 MB
End

real    27m38.833s
user    277m36.208s
sys     11m36.945s

And repeating the same task with mkdwarfs:

$ time mkdwarfs -i install-small -o perl-install-small.dwarfs
21:13:38.131724 scanning install-small
21:13:38.320139 waiting for background scanners...
21:13:38.727024 assigning directory and link inodes...
21:13:38.731807 finding duplicate files...
21:13:38.832524 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files
21:13:38.832598 waiting for inode scanners...
21:13:39.619963 assigning device inodes...
21:13:39.620855 assigning pipe/socket inodes...
21:13:39.621356 building metadata...
21:13:39.621453 building blocks...
21:13:39.621472 saving names and links...
21:13:39.621655 ordering 3559 inodes using nilsimsa similarity...
21:13:39.622031 nilsimsa: depth=20000, limit=255
21:13:39.629206 updating name and link indices...
21:13:39.630142 pre-sorted index (3360 name, 2127 path lookups) [8.014ms]
21:13:39.752051 3559 inodes ordered [130.3ms]
21:13:39.752101 waiting for segmenting/blockifying to finish...
21:13:53.250951 saving chunks...
21:13:53.251581 saving directories...
21:13:53.303862 waiting for compression to finish...
21:14:11.073273 compressed 611.8 MiB to 24.01 MiB (ratio=0.0392411)
21:14:11.091099 filesystem created without errors [32.96s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
3334 dirs, 0/0 soft/hard links, 26401/26401 files, 0 other
original size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 121.5 MiB
filesystem: 222.5 MiB in 14 blocks (7177 chunks, 3559/3559 inodes)
compressed filesystem: 14 blocks/24.01 MiB written
██████████████████████████████████████████████████████████████████████▏100% \

real    0m33.007s
user    3m43.324s
sys     0m4.015s

So, mkdwarfs is about 50 times faster than mkcromfs and uses 75 times less CPU resources. At the same time, the DwarFS file system is 30% smaller:

$ ls -l perl-install-small.*fs
-rw-r--r-- 1 mhx users 35328512 Dec  8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs

I noticed that the blockifying step that took ages for the full dataset with mkcromfs ran substantially faster (in terms of MiB/second) on the smaller dataset, which makes me wonder if there's some quadratic complexity behaviour that's slowing down mkcromfs.

In order to be completely fair, I also ran mkdwarfs with -l 9 to enable LZMA compression (which is what mkcromfs uses by default):

$ time mkdwarfs -i install-small -o perl-install-small-l9.dwarfs -l 9
21:16:21.874975 scanning install-small
21:16:22.092201 waiting for background scanners...
21:16:22.489470 assigning directory and link inodes...
21:16:22.495216 finding duplicate files...
21:16:22.611221 saved 267.8 MiB / 611.8 MiB in 22842/26401 duplicate files
21:16:22.611314 waiting for inode scanners...
21:16:23.394332 assigning device inodes...
21:16:23.395184 assigning pipe/socket inodes...
21:16:23.395616 building metadata...
21:16:23.395676 building blocks...
21:16:23.395685 saving names and links...
21:16:23.395830 ordering 3559 inodes using nilsimsa similarity...
21:16:23.396097 nilsimsa: depth=50000, limit=255
21:16:23.401042 updating name and link indices...
21:16:23.403127 pre-sorted index (3360 name, 2127 path lookups) [6.936ms]
21:16:23.524914 3559 inodes ordered [129ms]
21:16:23.525006 waiting for segmenting/blockifying to finish...
21:16:33.865023 saving chunks...
21:16:33.865883 saving directories...
21:16:33.900140 waiting for compression to finish...
21:17:10.505779 compressed 611.8 MiB to 17.44 MiB (ratio=0.0284969)
21:17:10.526171 filesystem created without errors [48.65s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
3334 dirs, 0/0 soft/hard links, 26401/26401 files, 0 other
original size: 611.8 MiB, dedupe: 267.8 MiB (22842 files), segment: 122.2 MiB
filesystem: 221.8 MiB in 4 blocks (7304 chunks, 3559/3559 inodes)
compressed filesystem: 4 blocks/17.44 MiB written
██████████████████████████████████████████████████████████████████████▏100% /

real    0m48.683s
user    2m24.905s
sys     0m3.292s
$ ls -l perl-install-small*.*fs
-rw-r--r-- 1 mhx users 18282075 Dec 10 21:17 perl-install-small-l9.dwarfs
-rw-r--r-- 1 mhx users 35328512 Dec  8 14:25 perl-install-small.cromfs
-rw-r--r-- 1 mhx users 25175016 Dec 10 21:14 perl-install-small.dwarfs

It takes about 15 seconds longer to build the DwarFS file system with LZMA compression (this is still 35 times faster than Cromfs), but reduces the size even further to make it almost half the size of the Cromfs file system.

I would have added some benchmarks with the Cromfs FUSE driver, but sadly it crashed right upon trying to list the directory after mounting.

With EROFS

EROFS is a read-only compressed file system that has been added to the Linux kernel recently. Its goals are different from those of DwarFS, though. It is designed to be lightweight (which DwarFS is definitely not) and to run on constrained hardware like embedded devices or smartphones. It is not designed to provide maximum compression. It currently supports LZ4 and LZMA compression.

Running it on the full Perl dataset using options given in the README for "well-compressed images":

$ time mkfs.erofs -C1048576 -Eztailpacking,fragments,all-fragments,dedupe -zlzma,9 perl-install-lzma9.erofs perl-install
mkfs.erofs 1.7.1-gd93a18c9
<W> erofs: It may take a longer time since MicroLZMA is still single-threaded for now.
Build completed.
------
Filesystem UUID: 538ce164-5f9d-4a6a-9808-5915f17ced30
Filesystem total blocks: 599854 (of 4096-byte blocks)
Filesystem total inodes: 2255795
Filesystem total metadata blocks: 74253
Filesystem total deduplicated bytes (of source files): 29625028195

user	2:35:08.03
system	1:12.65
total	2:39:25.35

$ ll -h perl-install-lzma9.erofs
-rw-r--r-- 1 mhx mhx 2.3G Apr 15 16:23 perl-install-lzma9.erofs

That's definitely slower than SquashFS, but also significantly smaller.

For a fair comparison, let's use the same 1 MiB block size with DwarFS, but also tweak the options for best compression:

$ time mkdwarfs -i perl-install -o perl-install-1M.dwarfs -l9 -S20 -B64 --order=nilsimsa:max-cluster-size=150000
[...]
330733 dirs, 0/2440 soft/hard links, 1927501/1927501 files, 0 other
original size: 47.49 GiB, hashed: 43.47 GiB (1920025 files, 1.451 GiB/s)
scanned: 19.45 GiB (144675 files, 159.3 MiB/s), categorizing: 0 B/s
saved by deduplication: 28.03 GiB (1780386 files), saved by segmenting: 15.4 GiB
filesystem: 4.053 GiB in 4151 blocks (937069 chunks, 144674/144674 fragments, 144675 inodes)
compressed filesystem: 4151 blocks/806.2 MiB written
[...]
user	24:27.47
system	4:20.74
total	3:26.79

That's significantly smaller and, perhaps more importantly, 46 times faster than mkfs.erofs.

Actually using the file system images, here's how DwarFS performs:

$ dwarfs perl-install-1M.dwarfs mnt -oworkers=8
$ find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
50392172594 bytes (50 GB, 47 GiB) copied, 19 s, 2.7 GB/s
0+1662649 records in
0+1662649 records out
51161953159 bytes (51 GB, 48 GiB) copied, 19.4813 s, 2.6 GB/s

Reading every single file from 16 parallel processes took less than 20 seconds. The FUSE driver consumed 143 seconds of CPU time.

Here's the same for EROFS:

$ erofsfuse perl-install-lzma9.erofs mnt
$ find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
2594306810 bytes (2.6 GB, 2.4 GiB) copied, 300 s, 8.6 MB/s^C
0+133296 records in
0+133296 records out
2595212832 bytes (2.6 GB, 2.4 GiB) copied, 300.336 s, 8.6 MB/s

Note that I've stopped this after 5 minutes. The DwarFS FUSE driver delivered about 300 times faster throughput compared to EROFS. The EROFS FUSE driver consumed 50 minutes (!) of CPU time for only about 5% of the data, i.e. more than 400 times the CPU time consumed by the DwarFS FUSE driver.

I've tried two more EROFS configurations on the same set of data. The first one uses more or less just the defaults:

$ time mkfs.erofs -zlz4hc,12 perl-install-lz4hc.erofs perl-install
mkfs.erofs 1.7.1-gd93a18c9
Build completed.
------
Filesystem UUID: b75142ed-6cf3-46a4-84f3-12693f7759a0
Filesystem total blocks: 5847130 (of 4096-byte blocks)
Filesystem total inodes: 2255794
Filesystem total metadata blocks: 419699
Filesystem total deduplicated bytes (of source files): 0

user	3:38:23.36
system	1:10.84
total	3:41:37.33

The second one additionally enables the -Ededupe option:

$ time mkfs.erofs -zlz4hc,12 -Ededupe perl-install-lz4hc-dedupe.erofs perl-install
mkfs.erofs 1.7.1-gd93a18c9
Build completed.
------
Filesystem UUID: 0ccf581e-ad3b-4d08-8b10-5b7e15f8e3cd
Filesystem total blocks: 1510091 (of 4096-byte blocks)
Filesystem total inodes: 2255794
Filesystem total metadata blocks: 435599
Filesystem total deduplicated bytes (of source files): 19220717568

user	4:19:57.61
system	1:21.62
total	4:23:55.85

I don't know why these are even slower than the first, seemingly more complex, set of options. As was to be expected, the resulting images were significantly bigger:

$ ll -h perl-install*.erofs
-rw-r--r-- 1 mhx mhx 5.8G Apr 16 02:46 perl-install-lz4hc-dedupe.erofs
-rw-r--r-- 1 mhx mhx  23G Apr 15 22:34 perl-install-lz4hc.erofs
-rw-r--r-- 1 mhx mhx 2.3G Apr 15 16:23 perl-install-lzma9.erofs

The good news is that these perform much better and even outperform DwarFS, albeit by a small margin:

$ erofsfuse perl-install-lz4hc.erofs mnt
$ find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
49920168315 bytes (50 GB, 46 GiB) copied, 16 s, 3.1 GB/s
0+1493031 records in
0+1493031 records out
51161953159 bytes (51 GB, 48 GiB) copied, 16.4329 s, 3.1 GB/s

The deduplicated version is even a tiny bit faster:

$ erofsfuse perl-install-lz4hc-dedupe.erofs mnt
find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
50808037121 bytes (51 GB, 47 GiB) copied, 16 s, 3.2 GB/s
0+1499949 records in
0+1499949 records out
51161953159 bytes (51 GB, 48 GiB) copied, 16.1184 s, 3.2 GB/s

The EROFS kernel driver wasn't any faster than the FUSE driver.

The FUSE driver used about 27 seconds of CPU time in both cases, substantially less than before and 5 times less than DwarFS.

DwarFS can get close to the throughput of EROFS by using zstd instead of lzma compression:
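
The exact invocation used to build the zstd image isn't shown here; a hedged guess, reusing the options from the LZMA run above and only swapping the compression algorithm (the zstd level is an assumption), would look something like this:

$ mkdwarfs -i perl-install -o perl-install-1M-zstd.dwarfs -S20 -B64 -C zstd:level=22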

$ dwarfs perl-install-1M-zstd.dwarfs mnt -oworkers=8
find mnt -type f -print0 | xargs -0 -P16 -n64 cat | dd of=/dev/null bs=1M status=progress
49224202357 bytes (49 GB, 46 GiB) copied, 16 s, 3.1 GB/s
0+1529018 records in
0+1529018 records out
51161953159 bytes (51 GB, 48 GiB) copied, 16.6716 s, 3.1 GB/s

With fuse-archive

I came across fuse-archive while looking for FUSE drivers to mount archives and it seems to be the most versatile of the alternatives (and the one that actually compiles out of the box).

An interesting test case straight from fuse-archive's README is in the Performance section: an archive with a single huge file full of zeroes. Let's make the example a bit more extreme and use a 1 GiB file instead of just 256 MiB:

$ mkdir zerotest
$ truncate --size=1G zerotest/zeroes

Now, we build several different archives and a DwarFS image:

$ time mkdwarfs -i zerotest -o zerotest.dwarfs -W16 --log-level=warn --progress=none

real    0m7.604s
user    0m7.521s
sys     0m0.083s

$ time zip -9 zerotest.zip zerotest/zeroes
  adding: zerotest/zeroes (deflated 100%)

real    0m4.923s
user    0m4.840s
sys     0m0.080s

$ time 7z a -bb0 -bd zerotest.7z zerotest/zeroes

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,16 CPUs Intel(R) Xeon(R) E-2286M  CPU @ 2.40GHz (906ED),ASM,AES-NI)

Scanning the drive:
1 file, 1073741824 bytes (1024 MiB)

Creating archive: zerotest.7z

Items to compress: 1

Files read from disk: 1
Archive size: 157819 bytes (155 KiB)
Everything is Ok

real    0m5.535s
user    0m48.281s
sys     0m1.116s

$ time tar --zstd -cf zerotest.tar.zstd zerotest/zeroes

real    0m0.449s
user    0m0.510s
sys     0m0.610s

It turns out that tar --zstd easily wins the compression speed test. Looking at the file sizes, however, did blow my mind just a bit:

$ ll zerotest.* --sort=size
-rw-r--r-- 1 mhx users 1042231 Jul  1 15:24 zerotest.zip
-rw-r--r-- 1 mhx users  157819 Jul  1 15:26 zerotest.7z
-rw-r--r-- 1 mhx users   33762 Jul  1 15:28 zerotest.tar.zstd
-rw-r--r-- 1 mhx users     848 Jul  1 15:23 zerotest.dwarfs

I definitely didn't expect the DwarFS image to be that small. Dropping the section index would actually save another 100 bytes. So, if you want to archive lots of zeroes, DwarFS is your friend.

Anyway, let's look at how fast and efficiently the zeroes can be read from the different archives. First, the zip archive:

$ time dd if=mnt/zerotest/zeroes of=/dev/null status=progress
1020117504 bytes (1.0 GB, 973 MiB) copied, 2 s, 510 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.10309 s, 511 MB/s

real    0m2.104s
user    0m0.264s
sys     0m0.486s

CPU time used by the FUSE driver was 1.8 seconds and mount time was in the milliseconds.

Now, the 7z archive:

 $ time dd if=mnt/zerotest/zeroes of=/dev/null status=progress
594759168 bytes (595 MB, 567 MiB) copied, 1 s, 595 MB/s
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.76904 s, 607 MB/s

real    0m1.772s
user    0m0.229s
sys     0m0.572s

CPU time used by the FUSE driver was 2.9 seconds and mount time was just over 1.0 seconds.

Now, the .tar.zstd archive:

$ time dd if=mnt/zerotest/zeroes of=/dev/null status=progress
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.799409 s, 1.3 GB/s

real    0m0.801s
user    0m0.262s
sys     0m0.537s

CPU time used by the FUSE driver was 0.53 seconds and mount time was 0.13 seconds.

Last but not least, let's look at DwarFS:

$ time dd if=mnt/zeroes of=/dev/null status=progress
2097152+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.753 s, 1.4 GB/s

real    0m0.757s
user    0m0.220s
sys     0m0.534s

CPU time used by the FUSE driver was 0.17 seconds and mount time was less than a millisecond.

If we increase the block size for the dd command, we can get even higher throughput. For fuse-archive with the .tar.zstd:

$ time dd if=mnt/zerotest/zeroes of=/dev/null status=progress bs=16384
65536+0 records in
65536+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.318682 s, 3.4 GB/s

real    0m0.323s
user    0m0.005s
sys     0m0.154s

And for DwarFS:

$ time dd if=mnt/zeroes of=/dev/null status=progress bs=16384
65536+0 records in
65536+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.172226 s, 6.2 GB/s

real    0m0.176s
user    0m0.020s
sys     0m0.141s

This is all nice, but what about a more real-life use case? Let's take the 1.82.0 boost release archives:

$ ll --sort=size boost_1_82_0.*
-rw-r--r-- 1 mhx users 208188085 Apr 10 14:25 boost_1_82_0.zip
-rw-r--r-- 1 mhx users 142580547 Apr 10 14:23 boost_1_82_0.tar.gz
-rw-r--r-- 1 mhx users 121325129 Apr 10 14:23 boost_1_82_0.tar.bz2
-rw-r--r-- 1 mhx users 105901369 Jun 28 12:47 boost_1_82_0.dwarfs
-rw-r--r-- 1 mhx users 103710551 Apr 10 14:25 boost_1_82_0.7z

Here are the timings for mounting each archive, then using tar to build another archive from the mount point and simply counting the number of bytes in that archive, e.g.:

$ time tar cf - mnt | wc -c
803614720

real    0m4.602s
user    0m0.156s
sys     0m1.123s

Here are the results in terms of wallclock time and FUSE driver CPU time:

Archive    Mount Time   tar Wallclock Time   FUSE Driver CPU Time
.zip       0.458s       5.073s               4.418s
.tar.gz    1.391s       3.483s               3.943s
.tar.bz2   15.663s      17.942s              32.040s
.7z        0.321s       32.554s              31.625s
.dwarfs    0.013s       2.974s               1.984s

DwarFS easily wins all categories while still compressing the data almost as well as 7z.

What about accessing files more randomly?

$ find mnt -type f -print0 | xargs -0 -P32 -n32 cat | dd of=/dev/null status=progress

It turns out that fuse-archive grinds to a halt in this case, so I had to run the test on a subset (the boost subdirectory) of the data. The .tar.bz2 and .7z archives were so slow to read that I stopped them after a few minutes.

Archive    Throughput   Wallclock Time   FUSE Driver CPU Time
.zip       1.8 MB/s     83.245s          83.669s
.tar.gz    1.2 MB/s     121.377s         122.711s
.tar.bz2   0.2 MB/s     -                -
.7z        0.3 MB/s     -                -
.dwarfs    598.0 MB/s   0.249s           1.099s

Performance Monitoring

Both the FUSE driver and dwarfsextract have support for simple performance monitoring by default. You can build binaries without this feature (-DENABLE_PERFMON=OFF), but the impact should be negligible even when performance monitoring is enabled at run time.
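
For example, a build without the performance monitor would be configured along these lines (standard CMake usage; the build directory layout is just an illustration):

$ cmake .. -DENABLE_PERFMON=OFF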

To enable the performance monitor, you pass a list of components for which you want to collect latency metrics, e.g.:

$ dwarfs test.dwarfs mnt -f -operfmon=fuse

When the driver exits, you will see output like this:

[fuse.op_read]
      samples: 45145
      overall: 3.214s
  avg latency: 71.2us
  p50 latency: 131.1us
  p90 latency: 131.1us
  p99 latency: 262.1us

[fuse.op_readdir]
      samples: 2
      overall: 51.31ms
  avg latency: 25.65ms
  p50 latency: 32.77us
  p90 latency: 67.11ms
  p99 latency: 67.11ms

[fuse.op_lookup]
      samples: 16
      overall: 19.98ms
  avg latency: 1.249ms
  p50 latency: 2.097ms
  p90 latency: 4.194ms
  p99 latency: 4.194ms

[fuse.op_init]
      samples: 1
      overall: 199.4us
  avg latency: 199.4us
  p50 latency: 262.1us
  p90 latency: 262.1us
  p99 latency: 262.1us

[fuse.op_open]
      samples: 16
      overall: 122.2us
  avg latency: 7.641us
  p50 latency: 4.096us
  p90 latency: 32.77us
  p99 latency: 32.77us

[fuse.op_getattr]
      samples: 1
      overall: 5.786us
  avg latency: 5.786us
  p50 latency: 8.192us
  p90 latency: 8.192us
  p99 latency: 8.192us

The metrics should be self-explanatory. However, note that the percentile metrics are logarithmically quantized in order to use as few resources as possible. As a result, you will only see values that look an awful lot like powers of two.

Currently, the supported components are fuse for the FUSE operations, filesystem_v2 for the DwarFS file system component and inode_reader_v2 for the component that handles all read() system calls.

The FUSE driver also exposes the performance monitor metrics via an extended attribute.
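
For instance, the metrics can be read with standard xattr tools while the file system is mounted. The attribute name below is only an assumption for illustration; check the dwarfs manual page for the actual name:

$ getfattr -n user.dwarfs.driver.perfmon --only-values mnt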

Other Obscure Features

Setting Worker Thread CPU Affinity

This only works on Linux and usually only makes sense if you have CPUs with different types of cores (e.g. "performance" vs "efficiency" cores) and are really trying to squeeze the last ounce of speed out of DwarFS.

By setting the environment variable DWARFS_WORKER_GROUP_AFFINITY, you can set the CPU affinity of different worker thread groups, e.g.:

export DWARFS_WORKER_GROUP_AFFINITY=blockify=3:compress=6,7

This will set the affinity of the blockify worker group to CPU 3 and the affinity of the compress worker group to CPUs 6 and 7.

You can use this feature for all tools that use one or more worker thread groups. For example, the FUSE driver dwarfs and dwarfsextract use a worker group called blkcache, on which the block cache (i.e. block decompression and lookup) runs. mkdwarfs uses a whole array of different worker groups, namely compress for compression, scanner for scanning, ordering for input ordering, and blockify for segmenting. blockify is what you would typically want to run on your "performance" cores.
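
For example, to pin mkdwarfs segmenting to (hypothetical) performance cores 0-3 while compression runs on cores 4-7, something along these lines should work:

$ DWARFS_WORKER_GROUP_AFFINITY=blockify=0,1,2,3:compress=4,5,6,7 mkdwarfs -i input -o output.dwarfs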


dwarfs's People

Contributors

concatime, kspalaiologos, maxirmx, mhx, mrwitek, rarogcmex, txkxgit


dwarfs's Issues

FUSE graceful exit on initialization error

I've tried to use -o mlock=must and (as it should) it failed due to the per-user limits.

However, dwarfs (both the v2 and v3 FUSE drivers) aborted with an exception and failed to properly unmount the file system.

I think the FUSE driver should first execute all initialization steps and, only if they all succeed, try to mount the file system.

`cmake` won't configure on Debian

After cloning the repository and following the steps, I get the following error:

 0 [16:04] ~/workspace % git clone --recurse-submodules https://github.com/mhx/dwarfs
[...]
 0 [16:04] ~/workspace % cd dwarfs
 0 [16:04] ~/workspace/dwarfs@main % mkdir build
 0 [16:04] ~/workspace/dwarfs@main % cd build
 0 [16:04] workspace/dwarfs@main/build % cmake .. -DWITH_TESTS=1
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Setting build type to 'Release'
CMake Error at cmake/version.cmake:35 (message):
  missing version files
Call Stack (most recent call first):
  CMakeLists.txt:61 (include)


-- Configuring incomplete, errors occurred!
See also "/home/palaiologos/workspace/dwarfs/build/CMakeFiles/CMakeOutput.log".
 1 [16:04] workspace/dwarfs@main/build %

Attaching my CMakeOutput.log. I'm using the following to build dwarfs:

 129 [16:08] workspace/dwarfs@main/build % cmake --version
cmake version 3.18.4

CMake suite maintained and supported by Kitware (kitware.com/cmake).
 0 [16:08] workspace/dwarfs@main/build % git --version
git version 2.32.0.rc0
 0 [16:08] workspace/dwarfs@main/build % gcc --version
gcc (Debian 10.2.1-6) 10.2.1 20210110
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 0 [16:08] workspace/dwarfs@main/build % clang --version
Debian clang version 11.0.1-2
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
 0 [16:08] workspace/dwarfs@main/build %

Add new memory management options for FUSE driver

Granted, you might be right. Next time I try DwarFS, I'll issue a sysctl -q vm.drop_caches=3, which, if I'm not mistaken, should drop the kernel file-system caches.


(In what follows I refer to the dwarfs image as just image, and to the uncompressed files exposed through the mount point as files.)

However, on the same topic, wouldn't it be useful to have the following complementary options:

  • whether to let the kernel cache the files (not the image), like all normal file-systems do; (I think this is the default;)
  • whether the dwarfs daemon accesses the image without using the kernel cache (either via O_DIRECT or by using madvise with MADV_DONTNEED in case of mmap access after a block was used);

At the moment I think that both the files and the image are eventually cached by the kernel, thus increasing the memory pressure of the system.

However by using the two proposed options, one could fine tune the CPU / memory usage to fit one's particular use-case:

  • disable the kernel cache for the files, but enable the kernel cache for the image; this trades some CPU for memory savings; (useful for example when the application reading the files already has its own caches;)
  • (my proposed default) enable the kernel cache for files, but disable the kernel cache for the image, one saves on memory for the image but trades some CPU (less than in the previous case); (I think this would be the closest thing to how a normal file-system works, only actual files are cached, but not the block device data;)
  • disable the kernel cache for both files and image, one would heavily trade CPU for minimal memory usage; (this would be useful for example when one needs only a single pass of the stored files;)
  • (the current default?) enable the kernel cache for both files and image, one would trade memory for minimal CPU usage;

Originally posted by @cipriancraciun in #9 (comment)

Error: variable ‘folly::symbolizer::Symbolizer symbolizer’ has initializer but incomplete type

I'm trying to create an ebuild for dwarfs version 0.3.0, and here is the error:
/usr/bin/x86_64-pc-linux-gnu-g++ -DDWARFS_HAVE_LIBLZ4 -DDWARFS_HAVE_LIBLZMA -DDWARFS_HAVE_LIBZSTD -DDWARFS_STATIC_BUILD=OFF -DDWARFS_USE_JEMALLOC -DDWARFS_VERSION="" -DFMT_LOCALE -DFMT_SHARED -DGFLAGS_IS_A_DLL=0 -Ddwarfs_EXPORTS -Iinclude -I/usr/include/libiberty -isystem folly -isystem thrift -isystem fbthrift -isystem zstd/lib -isystem xxHash -isystem . -march=skylake -mtune=skylake -O2 -pipe -mmmx -msse -msse2 -msse3 -mssse3 -mcx16 -msahf -maes -mpclmul -mpopcnt -mabm -mfma -mbmi -msgx -mbmi2 -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mrtm -mhle -mrdrnd -mf16c -mfsgsbase -mrdseed -mprfchw -madx -mfxsr -mxsave -mxsaveopt -mclflushopt -mxsavec -mxsaves --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=6144 -fPIC -Wall -Wextra -pedantic -pthread -std=c++17 -MD -MT CMakeFiles/dwarfs.dir/src/dwarfs/logger.cpp.o -MF CMakeFiles/dwarfs.dir/src/dwarfs/logger.cpp.o.d -o CMakeFiles/dwarfs.dir/src/dwarfs/logger.cpp.o -c src/dwarfs/logger.cpp
src/dwarfs/logger.cpp: In member function ‘virtual void dwarfs::stream_logger::write(dwarfs::logger::level_type, const string&, const char*, int)’:
src/dwarfs/logger.cpp:102:51: error: variable ‘folly::symbolizer::Symbolizer symbolizer’ has initializer but incomplete type
102 | Symbolizer symbolizer(LocationInfoMode::FULL);
| ^

Image contents are not accessible when mounting on 0.5.2 and up

As the title says, mounting any dwarfs image results in a seemingly successful mount, with all the file system contents visible; however, no files within are accessible. Notably, this issue does not occur with the -f flag enabled, something I noticed when trying to get debug output. Also note that my distribution uses fuse version 29. This happens with both the prebuilt dwarfs2 binary on the releases page and my own local builds.

mkdwarfs 0.5.0 crashes at creating images

Unlike the previous time, now it crashes with any compression level.

*** Aborted at 1617652815 (Unix time, try 'date -d @1617652815') ***
*** Signal 4 (SIGILL) (0x4fd686) received by PID 9296 (pthread TID 0x2c0e1c0) (linux TID 9296) (code: illegal operand), stack trace: ***
Illegal instruction (core dumped)

mkdwarfs-0.5.0-coredump.zip

This is with the static 0.5.0 binary from the releases page. A manually compiled, dynamically linked build works fine.

Add rubygen-ronn dependency

to the README and somehow integrate the man/Makefile into cmake.
I had to do it manually, and only then was I able to do a sudo make install.

Sharing the test archive

I noticed that, during the comparison with zpaq, the "placebo" compression mode (-m5) was used while, in reality, the default one (-m1) is almost always used.
Could you please share the file you used for testing so that I can do some analysis?
Thanks

Enhance `mkdwarfs` to support specifying a list of files to include (similar to `cpio`)

A very nice feature of cpio (actually its only mode of operation) and of tar (via --files-from) is the option of specifying a list of files to include (instead of recursing through the root folder).

Such a feature would allow one to easily exclude certain files from the source, without having to resort to rsync for example to build a temporary tree.

This could work in conjunction with -i as follows: any file within the list is treated as relative to the -i folder, regardless of whether it starts with /, ./ or a plain path. Also, warn if one tries to traverse outside the -i folder. For example, given that -i source is used:

  • whatever is actually source/whatever;
  • ./whatever is the same as above;
  • /whatever is the same as above;
  • ../whatever would issue an error as it tries to escape the source;
  • a/b/../../c is actually source/c, although it could issue a warning;
  • /some-folder (given it is a folder) would not be recursed, but only itself is created within the resulting image; (it is assumed that one would add other files afterwards);

Also, it would be nice to have an option to accept a zero-terminated file list instead of a newline-terminated one.


The above could be quite simple to implement, however an even more useful option would be something like this:

  • in the Linux kernel there is a small tool, gen_init_cpio.c (https://github.com/torvalds/linux/blob/master/usr/gen_init_cpio.c#L452), which takes a file describing how a cpio archive (to be used for the initramfs) should be created (see the source code at the hinted line for the file syntax); thus, in addition to the previous feature of file lists, such a "file-system" descriptor would allow one to create (without root credentials on one's machine) a file system with any layout;
  • as an extension to the above, perhaps JSON would be a better choice; :)

dwarfsextract aborts instead of skipping when corrupt files are encountered

Extracting 2b2tca.dwarfs...
E 03:24:38.703663 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714209 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714279 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714363 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714432 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714515 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714588 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714693 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714805 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.714884 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715012 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715088 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715217 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715300 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715390 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715468 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715540 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715607 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715676 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715740 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715820 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715897 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.715975 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716048 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716123 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716199 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716266 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716341 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716408 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716484 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716555 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716654 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716719 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716794 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716876 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.716950 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717018 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717081 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717174 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717247 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717354 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717426 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717502 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717570 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717641 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717710 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717773 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717842 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717909 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.717983 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718189 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718259 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718332 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718401 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718487 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718557 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718631 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718690 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718769 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718831 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718902 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.718977 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
E 03:24:38.719046 dwarfs::runtime_error: LZMA: decompression failed (data is corrupt)
I 03:24:41.164877 blocks created: 1298
I 03:24:41.164926 blocks evicted: 1290
I 03:24:41.164955 request sets merged: 9576
I 03:24:41.164983 total requests: 150010
I 03:24:41.165008 active hits (fast): 22247
I 03:24:41.165033 active hits (slow): 122699
I 03:24:41.165063 cache hits (fast): 3184
I 03:24:41.165090 cache hits (slow): 582
I 03:24:41.165117 total bytes decompressed: 87106428928
I 03:24:41.180369 average block decompression: 100.0%
I 03:24:41.180425 fast hit rate: 16.953%
I 03:24:41.180464 slow hit rate: 82.182%
I 03:24:41.180502 miss rate: 0.865%
dwarfs::runtime_error: extraction aborted

Mounts just fine with dwarfs, and almost every file reads just fine; however, as soon as dwarfsextract encounters invalid data, it seems to completely bail out, unlike dwarfs. I would use dwarfs to extract instead, but the performance seems to be orders of magnitude slower.

I think the expected behavior ought to be printing a warning and skipping the file instead.

PS: dwarfsextract doesn't seem to give realtime progress updates like mkdwarfs does either :(

Add comparison with `erofs` (present Linux mainline since 5.x)

Apparently the erofs read-only file-system has made it into the mainline stable Linux kernel (since 5.x):

https://www.kernel.org/doc/html/latest/filesystems/erofs.html

Therefore it would be nice to compare dwarfs against erofs, especially since its purpose seems similar (i.e. performance).

From my own limited experimentation with erofs it seems to be twice as fast compared to squashfs.

For example, compressing dwarfs' own source and build folders (with a block size of 4 KiB and using LZ4 or LZ4HC where possible) yields the following:

# mkdwarfs -i . -o /tmp/dwfs.img -l 9 -S 12 -C lz4hc --no-owner --no-time
197M    /tmp/dwfs.img

# mkdwarfs -i . -o /tmp/dwfs.img-d --no-owner --no-time
114M    /tmp/dwfs.img-d

# mkfs.erofs -z lz4,9 -x -1 -T 0 -E force-inode-extended -- /tmp/erofs.img .
216M    /tmp/erofs.img

# mksquashfs . /tmp/sqfs.img -b 4K -reproducible -mkfs-time 0 -all-time 0 -no-exports -no-xattrs -all-root -progress -comp lz4 -Xhc -noappend
196M    /tmp/sqfs.img

Mounting them and reading yields:

  • erofs ~900 MiB/s, no difference on repeats;
  • squashfs ~400 MiB/s, no difference on repeats;
  • dwarfs (4KB) ~250 MiB/s, and on repeat ~900 MiB/s;
  • dwarfs (all defaults) ~300 MiB/s, and on repeat ~900 MiB/s;

Thus, and this is a guess, I would say that erofs is as fast as dwarfs, at least on repeat reads (although all images were stored on tmpfs).

systemd recipe?

I'm struggling to come up with a proper systemd recipe; I'm a systemd noob.
When starting it as a systemd service dependent on local-fs, my mount is not readable by the local user.

systemctl enable dwarfs-mount
systemctl start dwarfs-mount
$ ls -al
drwxr-xr-x.  5 rurban 69632 Nov 30 13:48 perl
d??????????  ? ?          ?            ? perl.s

When starting it locally, it works fine.

sudo cat /etc/systemd/user/dwarfs-mount.service
[Unit]
Description=Local DwarfFS Mounts
Documentation=man:dwarfs(1) https://github.com/mhx/dwarfs
DefaultDependencies=no
#ConditionKernelCommandLine=
OnFailure=emergency.target
Conflicts=umount.target
# Run after core mounts
After=-.mount var.mount
After=systemd-remount-fs.service
# But we run *before* most other core bootup services that need write access to /etc and /var
#Before=local-fs.target umount.target
#Before=systemd-random-seed.service plymouth-read-write.service systemd-journal-flush.service
#Before=systemd-tmpfiles-setup.service

[Service]
Type=oneshot
User=rurban
Group=root
RemainAfterExit=yes
ExecStart=/usr/local/bin/dwarfs /usr/src/perl/perl.dwarfs /usr/src/perl.s
StandardInput=null
StandardOutput=journal
StandardError=journal+console

[Install]
WantedBy=local-fs.target

Or maybe should I just add it to my fstab?

EDIT: Beware: with a syntax error in such a system unit, depending on local-fs, you can lock yourself out and need to boot from USB; emergency mode (i.e. rescue.target) will not work.
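
For reference, a hedged sketch of the fstab alternative mentioned above, relying on the standard mount.fuse helper and assuming the dwarfs binary is in root's PATH (apart from the standard FUSE options shown, nothing here is taken from the DwarFS documentation):

/usr/src/perl/perl.dwarfs  /usr/src/perl.s  fuse.dwarfs  ro,allow_other  0  0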

SIGBUS happened again - twice!

⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
scanning: /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2tca/backups/backup-2b2tca-10june2019/main_map_nether/DIM-1/region/r.11.-2.mca
678293 dirs, 297774/10 soft/hard links, 46525/5413471 files, 0 other
original size: 91.28 GiB, dedupe: 24.74 GiB (15061 files), segment: 0 B
filesystem: 0 B in 0 blocks (0 chunks, 31454/5398400 inodes)
compressed filesystem: 0 blocks/0 B written
▏                                                                                                                                                                                      ▏  0% /
*** Aborted at 1623187919 (Unix time, try 'date -d @1623187919') ***
*** Signal 7 (SIGBUS) (0x7f3559c9b000) received by PID 5233 (pthread TID 0x7f35853eb700) (linux TID 5254) (code: nonexistent physical address), stack trace: ***
Bus error (core dumped)

The same issue as described in #45 happened again, with the pre-compiled 0.5.5 release binaries. Twice, actually: the first time was on a completely different system, but the second time I was able to get a core dump. Even better, I actually think I know what's causing it.

When I ran mksquashfs instead of mkdwarfs on the exact same data, this happened:

nabla@satella /media/veracrypt3/dwarfs $ doas mksquashfs /media/veracrypt3/dwarfs/mount/ /media/veracrypt2/LiterallyEverything-08-Jun-2021.sqsh -comp zstd -Xcompression-level 22 -b 1M
Parallel mksquashfs: Using 12 processors
Creating 4.0 filesystem on /media/veracrypt2/LiterallyEverything-08-Jun-2021.sqsh, block size 1048576.
[|                                                                                                                                                                     ]   33715/7846602   0%
Read failed because Input/output error

Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.-105.2.mca, creating empty file
[\                                                                                                                                                                     ]   34894/7846602   0%
Read failed because Input/output error
[|                                                                                                                                                                     ]   34894/7846602   0%
Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.-13.5.mca, creating empty file
[/                                                                                                                                                                     ]   35075/7846602   0%
Read failed because Input/output error

Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.-1332.157.mcr, creating empty file
[/                                                                                                                                                                     ]   35771/7846602   0%
Read failed because Input/output error

Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.-18.-17.mcr, creating empty file
[/                                                                                                                                                                     ]   36003/7846602   0%
Read failed because Input/output error
[-                                                                                                                                                                     ]   36003/7846602   0%
Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.-19.6.mcr, creating empty file
[-                                                                                                                                                                     ]   47362/7846602   0%
Read failed because Input/output error

Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.15.7.mca, creating empty file
[/                                                                                                                                                                     ]   47552/7846602   0%
Read failed because Input/output error
[-                                                                                                                                                                     ]   47553/7846602   0%
Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.16.-30.mcr, creating empty file
[=/                                                                                                                                                                    ]   48764/7846602   0%
Read failed because Input/output error

Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.19531.31249.mca, creating empty file
[=/                                                                                                                                                                    ]   54736/7846602   0%
Read failed because Input/output error

Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map/region/r.45.-34.mcr, creating empty file
[=/                                                                                                                                                                    ]   63050/7846602   0%
Read failed because Input/output error

Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map_nether/DIM-1/region/r.-14.-9.mcr, creating empty file
[=|                                                                                                                                                                    ]   63522/7846602   0%
Read failed because Input/output error
[=/                                                                                                                                                                    ]   63522/7846602   0%
Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map_nether/DIM-1/region/r.-16.-8.mca, creating empty file
[=|                                                                                                                                                                    ]   63758/7846602   0%
Read failed because Input/output error
[=/                                                                                                                                                                    ]   63760/7846602   0%
Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map_nether/DIM-1/region/r.-17.5.mcr, creating empty file
[=-                                                                                                                                                                    ]   66762/7846602   0%
Read failed because Input/output error
[=\                                                                                                                                                                    ]   66762/7846602   0%
Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2t.ca world and plugins/main map_nether/DIM-1/region/r.-33.21.mcr, creating empty file
[=\                                                                                                                                                                    ]   88907/7846602   1%
Read failed because Input/output error

Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2tca/backups/backup-2b2tca-10june2019/main_map/region/r.-17.-6.mca, creating empty file
[=\                                                                                                                                                                    ]   91183/7846602   1%
Read failed because Input/output error
[=|                                                                                                                                                                    ]   91183/7846602   1%
Failed to read file /media/veracrypt3/dwarfs/mount//2b2tca.dwarfs/2b2tca/backups/backup-2b2tca-10june2019/main_map/region/r.-3.31.mca, creating empty file
[==-                                                                                                                                                                   ]  115004/7846602   1%

So, here's my theory (and I'm bad at these theories, so please take it with a grain of salt): mksquashfs probably just reads these files with regular old open() and read() calls, so whenever it encounters an I/O error, it can just skip the file and create an empty one as if nothing happened. But mkdwarfs, as @mhx mentioned in the previous issue about this, makes extensive use of mmap, so perhaps every time an I/O error occurs, the region of memory represented by the file it's trying to read becomes inaccessible, and a SIGBUS is triggered instead?

Perhaps this SIGBUS could be caught, and behavior similar to mksquashfs could be preserved whereby the file is simply skipped and replaced with an empty one, or maybe we could try re-reading the file several times before giving up and moving on instead?

Also of note - these I/O errors are coming from a mounted DwarFS filesystem. When I got the SIGBUS error in #45, I was trying to read from a bunch of SquashFS filesystems, not a bunch of DwarFS filesystems, so this could be an issue with the physical disk (Although I still think DwarFS should definitely be robust enough to skip these errors rather than completely bailing out, as I uh... do actually need to recover these files)

Even more bizarrely, despite both SquashFS and DwarFS failing consistently when trying to read roughly the same files, when I re-mounted the 2b2tca.dwarfs filesystem in question, I was able to read all of its content without any I/O errors at all. Truly baffling.

Anyway, I have replied to the email I sent @mhx of the core dump for the previous issue with the new core dump (which is actually much smaller this time), so hopefully it's possible to figure out where this is specifically happening.

`mkdwarfs` refuses to start if the source is a symlink to a folder

As the title says, if the given source argument is a symlink to a folder, mkdwarfs fails with an error stating it wants a folder.

As a workaround one could call mkdwarfs -i .../folder/ -o .../output.

I would suggest allowing (perhaps with a warning) a symlink to a folder as the source argument.

Question about read-only vs read-write

Hi,

DwarFS looks absolutely brilliant. But are there any plans to make it read-write, or is the plan to keep it as a read-only file system?

I would love to try it out for checking out several large (and similar, but different) git repositories, and then building them. We have several 300GB repositories at work, but most developers only have a 1T disk, so it quickly fills up.

Would you recommend an overlay filesystem on top of DwarFS for this use case, for now?

Thanks!
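
A minimal sketch of the overlay approach asked about above, assuming a kernel with overlayfs support; all paths are hypothetical:

$ mkdir -p /mnt/repo-ro /mnt/repo-rw /tmp/ovl/upper /tmp/ovl/work
$ dwarfs repo.dwarfs /mnt/repo-ro
$ sudo mount -t overlay overlay -o lowerdir=/mnt/repo-ro,upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work /mnt/repo-rw

Writes then go to the upper directory, while reads fall through to the read-only DwarFS mount.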

Static build fails for dwarfsextract

The static build fails for dwarfsextract due to unresolved references from libarchive.a.
I guess this happens because libarchive from the focal binary package is statically linked against more dependencies than are specified in static_link.sh (as per https://launchpad.net/ubuntu/focal/+source/libarchive).

So the dwarfs static build requires either a custom libarchive.a that matches the supported formats, or a larger set of dependencies.

[430/431] Linking CXX executable dwarfsextract
FAILED: dwarfsextract
: && /bin/bash /mnt/d/Projects/5.Projects/tebako/deps/src/_dwarfs/cmake/static_link.sh dwarfsextract CMakeFiles/dwarfsextract.dir/src/dwarfsextract.cpp.o && :
...
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_xar.o): in function `compression_code_bzip2':
(.text+0xc5d): undefined reference to `BZ2_bzCompress'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_xar.o): in function `xar_options':
(.text+0x11b1): undefined reference to `lzma_cputhreads'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_xar.o): in function `compression_end_bzip2':
(.text+0x17ec): undefined reference to `BZ2_bzCompressEnd'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_xar.o): in function `xar_compression_init_encoder':
(.text+0x20fa): undefined reference to `BZ2_bzCompressInit'
/usr/bin/ld: (.text+0x2261): undefined reference to `lzma_stream_encoder_mt'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_cryptor.o): in function `aes_ctr_encrypt_counter':
(.text+0x52): undefined reference to `nettle_aes_set_encrypt_key'
/usr/bin/ld: (.text+0x6d): undefined reference to `nettle_aes_encrypt'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_cryptor.o): in function `pbkdf2_sha1':
(.text+0x2a1): undefined reference to `nettle_pbkdf2_hmac_sha1'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha512final':
(.text+0x11): undefined reference to `nettle_sha512_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha384update':
(.text+0x32): undefined reference to `nettle_sha512_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha512init':
(.text+0x49): undefined reference to `nettle_sha512_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha384final':
(.text+0x71): undefined reference to `nettle_sha384_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha384init':
(.text+0x89): undefined reference to `nettle_sha384_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha256final':
(.text+0xb1): undefined reference to `nettle_sha256_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha256update':
(.text+0xd2): undefined reference to `nettle_sha256_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha256init':
(.text+0xe9): undefined reference to `nettle_sha256_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha1final':
(.text+0x111): undefined reference to `nettle_sha1_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha1update':
(.text+0x132): undefined reference to `nettle_sha1_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha1init':
(.text+0x149): undefined reference to `nettle_sha1_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_ripemd160final':
(.text+0x171): undefined reference to `nettle_ripemd160_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_ripemd160update':
(.text+0x192): undefined reference to `nettle_ripemd160_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_ripemd160init':
(.text+0x1a9): undefined reference to `nettle_ripemd160_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_md5final':
(.text+0x1d1): undefined reference to `nettle_md5_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_md5update':
(.text+0x1f2): undefined reference to `nettle_md5_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_md5init':
(.text+0x209): undefined reference to `nettle_md5_init'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_digest.o): in function `__archive_nettle_sha512update':
(.text+0x232): undefined reference to `nettle_sha512_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_hmac.o): in function `__hmac_sha1_init':
(.text+0x92): undefined reference to `nettle_hmac_sha1_set_key'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_hmac.o): in function `__hmac_sha1_final':
(.text+0x4d): undefined reference to `nettle_hmac_sha1_digest'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_hmac.o): in function `__hmac_sha1_update':
(.text+0x6e): undefined reference to `nettle_hmac_sha1_update'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_7zip.o): in function `compression_code_bzip2':
(.text+0x32d): undefined reference to `BZ2_bzCompress'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_7zip.o): in function `compression_end_bzip2':
(.text+0xc0c): undefined reference to `BZ2_bzCompressEnd'
/usr/bin/ld: /usr/lib/x86_64-linux-gnu/libarchive.a(archive_write_set_format_7zip.o): in function `_7z_compression_init_encoder':
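
One way to enumerate the extra libraries a static libarchive.a expects is via its pkg-config file, assuming the libarchive-dev package ships one (the exact output depends on how the focal package was configured; this is a diagnostic sketch, not a confirmed fix):

# Private/static dependencies of libarchive, as recorded in its .pc file:
pkg-config --static --libs libarchive
# Judging by the undefined symbols above, the list should at least include
# -lbz2, -llzma and -lnettle; those flags would then need to end up on the
# link line produced by static_link.sh.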

arch does not match x86_64 — it seems like the architecture isn't always detected correctly

There is a bug fix that disables the SSE2/AVX2 compile flags for the LtHash SIMD code when the arch does not match x86_64:

-- arch does not match x86_64, skipping setting SSE2/AVX2 compile flags for LtHash SIMD code

In certain configurations the autodetection does not work as expected. :)
It is disabled on an Intel Skylake system (Gentoo):

uname -a
Linux RCEngine 5.11.0-pf6-RarogCmex #1 SMP PREEMPT Fri Apr 2 10:42:06 +05 2021 x86_64 Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz GenuineIntel GNU/Linux

cat /proc/cpuinfo

cpuinfo.txt
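
A few quick checks for what the build environment actually reports as the target architecture (the skipped SIMD flags above suggest the value CMake derives, presumably from CMAKE_SYSTEM_PROCESSOR, did not come back as x86_64; this is a diagnostic sketch, not a fix):

uname -m                         # kernel-reported machine type
gcc -dumpmachine                 # compiler target triple
cmake --system-information 2>/dev/null | grep CMAKE_SYSTEM_PROCESSOR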

FUSE daemon `-o cachesize` issue

So I've created quite a largish image of about 800 MiB, which uncompressed comes to around 2.5 GiB.

I tried starting dwarfs with -o cachesize=128m, ran find /tmp/dwarfs -type f -exec md5sum {} +, and after it was done the FUSE daemon process still retained ~800 MiB to ~1 GiB of RAM. (This is not virtual memory, but the RES column in htop, which reports the memory actually resident. When dwarfs starts, it reports ~150 MiB.)

Now, given that there are no longer any open files, and that I've already passed through all the files, there shouldn't be any uncompressed blocks (and thus no decompression state) lingering around.

Even using an uncompressed image (i.e. -l 0) of ~400 MiB uncompressed size results in ~400 MiB of RAM usage.
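
A way to reproduce the measurement described above, assuming a single dwarfs FUSE process and the image mounted at /tmp/dwarfs:

pid=$(pidof dwarfs)
grep VmRSS /proc/$pid/status                      # resident memory right after mounting
find /tmp/dwarfs -type f -exec md5sum {} + > /dev/null
grep VmRSS /proc/$pid/status                      # resident memory after reading everything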

Clear compile-time jemalloc definition

Good day!
I'm a maintainer in the (semi-official) Gentoo GURU project, and I've ported dwarfs to Gentoo.
The dependency list says dwarfs requires libjemalloc-dev, but it built successfully without it. Is jemalloc really needed?
Could you add a clear compile-time CMake option (something like the current WITH_LUA) to control whether jemalloc is actually used?
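
Two quick ways to check whether a given build actually uses jemalloc (a diagnostic sketch; the binary name is just an example):

# Dynamic linking against jemalloc shows up directly:
ldd ./mkdwarfs | grep -i jemalloc
# jemalloc (even when statically linked) honours MALLOC_CONF; a statistics
# dump on exit is a strong hint that it is in use:
MALLOC_CONF=stats_print:true ./mkdwarfs --help > /dev/null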

Comparison with wimlib

DwarFS seems pretty nice. File access within the archive is quick. The main feature seems to be deduplication. Judging by the resulting file sizes, I'm guessing this is whole-file deduplication rather than block-based?

Downside: creating a DwarFS image seems slow, compared to both wimlib's wimcapture and SquashFS.

Testing with a copy of every released Wine version, extracted by doing for tag in $(git tag); do git archive --prefix=$tag/ $tag | tar -xC /mnt/wine; done (requires, naturally, the Wine git repository):

$ time mkdwarfs -i wine -o wine.dwarfs
10:04:50.260867 scanning wine
10:04:59.692484 waiting for background scanners...
14:10:05.911222 assigning directory and link inodes...
14:10:06.312963 finding duplicate files...
14:10:11.475558 saved 53.69 GiB / 64.85 GiB in 2907224/3117413 duplicate files
14:10:11.475645 ordering 210189 inodes by similarity...
14:10:11.642889 210189 inodes ordered [167.2ms]
14:10:11.642926 assigning file inodes...
14:10:11.644702 building metadata...
14:10:11.644753 building blocks...
14:10:11.644802 saving names and links...
14:10:12.103973 updating name and link indices...
14:43:56.247557 waiting for block compression to finish...
14:43:56.247884 saving chunks...
14:43:56.275000 saving directories...
14:43:58.693785 waiting for compression to finish...
14:43:58.813425 compressed 64.85 GiB to 183.4 MiB (ratio=0.00276233)
14:43:59.328251 filesystem created without errors [1.675e+04s]
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
waiting for block compression to finish
scanned/found: 362904/362904 dirs, 0/0 links, 3117413/3117413 files
original size: 64.85 GiB, dedupe: 53.69 GiB (2907224 files), segment: 6.309 GiB
filesystem: 4.847 GiB in 311 blocks (1460981 chunks, 210189/210189 inodes)
compressed filesystem: 311 blocks/183.4 MiB written
█████████████████████████████████████████████████████████████████████████▏100% /

real	279m9.270s
user	26m17.945s
sys	3m53.332s
$ time mksquashfs wine wine.squashfs -comp zstd
Parallel mksquashfs: Using 12 processors
Creating 4.0 filesystem on wine.squashfs, block size 131072.
[=================================================================================|] 3284743/3284743 100%

Exportable Squashfs 4.0 filesystem, zstd compressed, data block size 131072
	compressed data, compressed metadata, compressed fragments,
	compressed xattrs, compressed ids
	duplicates are removed
Filesystem size 2074564.87 Kbytes (2025.94 Mbytes)
	3.04% of uncompressed filesystem size (68204545.10 Kbytes)
Inode table size 31817047 bytes (31071.33 Kbytes)
	28.29% of uncompressed inode table size (112449867 bytes)
Directory table size 28385936 bytes (27720.64 Kbytes)
	41.57% of uncompressed directory table size (68284423 bytes)
Number of duplicate files found 2907225
Number of inodes 3480317
Number of files 3117413
Number of fragments 47404
Number of symbolic links  0
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 362904
Number of ids (unique uids + gids) 1
Number of uids 1
	chungy (1000)
Number of gids 1
	chungy (1000)

real	153m19.319s
user	89m22.676s
sys	2m14.197s
$ time wimcapture --unix-data --solid wine wine.wim
Scanning "wine"
64 GiB scanned (3117413 files, 362904 directories)    
Using LZMS compression with 12 threads
Archiving file data: 11 GiB of 11 GiB (100%) done

real	79m20.722s
user	42m30.817s
sys	1m37.350s
$ du wine.*
184M	wine.dwarfs
2.0G	wine.squashfs
173M	wine.wim

wimlib is significantly faster to create this massive archive than DwarFS, and the resulting file size is marginally smaller. Git itself stores the Wine history in about 310MB, though that's not the fairest of comparisons given git's delta-based storage and the inclusion of every interim commit between the releases too.

DwarFS still beats this particular WIM archive for performance as a mounted file system, because I used solid compression, and random access into a solid wimlib archive is not fast. I also think (correct me if I'm wrong!) that a solid archive was the fairer comparison, since DwarFS seems to group similar files together and compress them as one unit (311 blocks in this particular file system). wimcapture's default (non-solid) mode compresses each stream individually; the archive size balloons to 2.4 GB while random access becomes much quicker.

Create empty files when unable to access them

It would be useful to have an additional command-line option telling mkdwarfs to create empty files when it can't access them. mksquashfs does this by default:

Failed to read file dir/filename, creating empty file

Useful for preserving the file structure of an input directory even when not all files are readable (for instance, if a file is owned by another user).

Not a highly important feature, of course, but it would be nice to have it, if it's not too hard to implement.

Mkdwarfs crashes when creating images using compression levels 6+

Hello!

mkdwarfs crashes for me when I use compression levels 6 and higher; lower levels work fine. This is the error I get:

*** Aborted at 1616416371 (Unix time, try 'date -d @1616416371') ***
*** Signal 4 (SIGILL) (0x4b9ec2) received by PID 24703 (pthread TID 0x25a2140) (linux TID 24703) (code: illegal operand), stack trace: ***
Illegal instruction (core dumped)

mkdwarfs-coredump.zip

I'm using the latest (0.4.1) static executables from the releases page. OS is Arch Linux.
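
SIGILL from a prebuilt binary often means the CPU lacks an instruction set the binary was compiled for; listing the relevant CPU flags can help narrow that down (a diagnostic sketch, not a confirmed root cause for this crash):

grep -o -w -e sse4_2 -e avx -e avx2 -e bmi2 /proc/cpuinfo | sort -u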

dwarfs attempts to download googletest during compilation even if it's already installed

If tests are enabled with -DWITH_TESTS=ON and googletest (dev-cpp/gtest) is installed globally, dwarfs' CMake still tries to download it, which causes a failure under the network sandbox.

-- Configuring done
-- Generating done
-- Build files have been written to: /var/tmp/portage/sys-fs/dwarfs-0.5.2-r1/work/dwarfs-0.5.2/googletest-download
[1/9] Creating directories for 'googletest'
[2/9] Performing download step (git clone) for 'googletest'
FAILED: googletest-prefix/src/googletest-stamp/googletest-download 
cd /var/tmp/portage/sys-fs/dwarfs-0.5.2-r1/work/dwarfs-0.5.2 && /usr/bin/cmake -P /var/tmp/portage/sys-fs/dwarfs-0.5.2-r1/work/dwarfs-0.5.2/googletest-download/googletest-prefix/tmp/googletest-gitclone.cmake && /usr/bin/cmake -E touch /var/tmp/portage/sys-fs/dwarfs-0.5.2-r1/work/dwarfs-0.5.2/googletest-download/googletest-prefix/src/googletest-stamp/googletest-download
Cloning into 'googletest-src'...
fatal: unable to access 'https://github.com/google/googletest.git/': Could not resolve host: github.com
Cloning into 'googletest-src'...
fatal: unable to access 'https://github.com/google/googletest.git/': Could not resolve host: github.com
Cloning into 'googletest-src'...
fatal: unable to access 'https://github.com/google/googletest.git/': Could not resolve host: github.com
-- Had to git clone more than once:
          3 times.
CMake Error at googletest-download/googletest-prefix/tmp/googletest-gitclone.cmake:31 (message):
  Failed to clone repository: 'https://github.com/google/googletest.git'

Solution: it should use the system googletest library.

build failed with boost-1.77.0: block_compressor.cpp:400:10: error: no type named 'mutex' in namespace 'std'

build.log

Tail of error:

/var/tmp/portage/sys-fs/dwarfs-0.5.6-r1/work/dwarfs-0.5.6/src/dwarfs/block_compressor.cpp:400:10: error: no type named 'mutex' in namespace 'std'
    std::mutex mx_;
    ~~~~~^
/var/tmp/portage/sys-fs/dwarfs-0.5.6-r1/work/dwarfs-0.5.6/src/dwarfs/block_compressor.cpp:433:15: error: no type named 'mutex' in namespace 'std'
  static std::mutex s_mx;
         ~~~~~^
/var/tmp/portage/sys-fs/dwarfs-0.5.6-r1/work/dwarfs-0.5.6/src/dwarfs/block_compressor.cpp:386:22: error: expected ';' after expression
      std::lock_guard lock(mx_);
                     ^
                     ;
/var/tmp/portage/sys-fs/dwarfs-0.5.6-r1/work/dwarfs-0.5.6/src/dwarfs/block_compressor.cpp:386:12: error: no member named 'lock_guard' in namespace 'std'
      std::lock_guard lock(mx_);
      ~~~~~^

It happens after the boost update:

[I] dev-libs/boost
Available versions: 1.76.0-r1(0/1.76.0)^t (~)1.77.0-r2(0/1.77.0)^t {bzip2 context debug doc icu lzma mpi +nls numpy python static-libs +threads tools zlib zstd ABI_MIPS="n32 n64 o32" ABI_S390="32 64" ABI_X86="32 64 x32" PYTHON_TARGETS="python3_8 python3_9 python3_10"}
Installed versions: 1.77.0-r2(0/1.77.0)^t(16:50:23 09/18/21)(bzip2 context icu lzma nls zlib zstd -debug -doc -mpi -numpy -python -tools ABI_MIPS="-n32 -n64 -o32" ABI_S390="-32 -64" ABI_X86="64 -32 -x32" PYTHON_TARGETS="python3_9 -python3_8 -python3_10")
Homepage: https://www.boost.org/
Description: Boost Libraries for C++

Ability to mount images by offset

It would be useful (for situations where a dwarfs image is part of another file) to be able to mount dwarfs images at an offset, like:

dwarfs -o offset=123 file mountpoint
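
Until such an option exists, one workaround is to carve the embedded image out first (a sketch; 123 is just the example offset from above, and iflag=skip_bytes requires GNU dd):

dd if=file of=image.dwarfs bs=1M iflag=skip_bytes skip=123
dwarfs image.dwarfs mountpoint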

The release bundles don't seem to contain dependencies

I've just downloaded both the 0.2.2 and 0.2.3 tar.gz bundles from GitHub (the releases pane on the right), and when trying to configure them with cmake it complains that it can't find CMakeLists.txt in either folly or fbthrift. Inspecting those folders shows they are empty.

Doing a git clone and submodules update does seem to fix the issue.
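
For reference, the clone-based route looks like this:

git clone --recurse-submodules https://github.com/mhx/dwarfs.git
# or, inside an already-cloned checkout:
git submodule update --init --recursive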

Thus it would be a good idea to either include those two dependencies in the bundles, or update the README to point out how users can populate those folders.

mkdwarfs aborted with SIGBUS after around 13 hours of runtime

Here's the log:

nabla@satella /media/veracrypt1/squash $ mkdwarfs -i /media/veracrypt1/squash/mp/ -o "/run/media/nabla/General Store/TEMP/everything.dwarfs"
I 17:46:07.266160 scanning /media/veracrypt1/squash/mp/
E 18:14:36.699276 error reading entry: readlink('/media/veracrypt1/squash/mp//raid0array0-2tb-2018.sqsh/Program Files (x86)/Internet Explorer/ExtExport.exe'): Invalid argument
I 19:27:28.763515 assigning directory and link inodes...
I 19:27:29.319281 waiting for background scanners...
⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯
scanning: /media/veracrypt1/squash/mp//pucktop-echidna-dec2020.sqsh/.local/share/Steam/steamapps/common/Half-Life 2/hl2/bin/server.so
694746 dirs, 299340/1488 soft/hard links, 1254591/5749940 files, 0 other
original size: 1.352 TiB, dedupe: 200.6 GiB (364325 files), segment: 0 B
filesystem: 0 B in 0 blocks (0 chunks, 888778/5384127 inodes)
compressed filesystem: 0 blocks/0 B written
▏                                                                                                                                ▏  0% /
*** Aborted at 1619766236 (Unix time, try 'date -d @1619766236') ***
*** Signal 7 (SIGBUS) (0x7fe8f3af8000) received by PID 15018 (pthread TID 0x7fe94c3e8640) (linux TID 15042) (code: nonexistent physical address), stack trace: ***
/usr/lib64/libfolly.so.0.58.0-dev(+0x2b64bf)[0x7fe9599e54bf]
/usr/lib64/libfolly.so.0.58.0-dev(_ZN5folly10symbolizer21SafeStackTracePrinter15printStackTraceEb+0x31)[0x7fe959924471]
/usr/lib64/libfolly.so.0.58.0-dev(+0x1f6112)[0x7fe959925112]
/lib64/libc.so.6(+0x396cf)[0x7fe9592286cf]
/usr/lib64/libxxhash.so.0(XXH3_64bits_update+0x774)[0x7fe958c6d584]
/usr/lib64/libdwarfs.so(+0x788cd)[0x7fe959e4f8cd]
/usr/lib64/libdwarfs.so(_ZN6dwarfs4file4scanERKSt10shared_ptrINS_4mmifEERNS_8progressE+0x95)[0x7fe959e5b525]
/usr/lib64/libdwarfs.so(+0xe9a89)[0x7fe959ec0a89]
/usr/lib64/libdwarfs.so(+0xf7f6b)[0x7fe959ecef6b]
/usr/lib/gcc/x86_64-pc-linux-gnu/10.2.0/libstdc++.so.6(+0xd315f)[0x7fe95949f15f]
/lib64/libpthread.so.0(+0x7fbd)[0x7fe959142fbd]
/lib64/libc.so.6(clone+0x3e)[0x7fe9592ee26e]
(safe mode, symbolizer not available)
Bus error

I left this running while trying to compress over 8 TiB of data, and after about 13 hours of scanning, it just sorta crashed and gave up. I don't really want to run it again to debug it or anything, so I'm just going to leave this here.

Running Gentoo Linux on a Ryzen 5 3600 with 64 GB of memory, if that helps.

Sorry about the lack of information. I'd really like to provide more - and if there's anything you'd like me to try to resolve this, let me know (I really like dwarfs, and was hoping it would work for this obscenely large dataset too!) Just uhhh.. keep in mind that I'm prooobably not going to wait 13 hours again unless I know it works :/

EDIT: Forgot to specify my version number. Whoops. I'm using 0.5.4-rc2 from the GURU repository here https://github.com/gentoo/guru/blob/master/sys-fs/dwarfs/dwarfs-0.5.4-r2.ebuild - I did build with -O3, but dwarfs seems to work just fine with smaller inputs, so I dunno. Specifically I'm using this: https://github.com/InBetweenNames/gentooLTO

Decompress/extracting dwarfs images?

I apologize if there is a way to do this and I'm just not seeing it after reading the documentation a few times over, but there doesn't appear to be any direct way to extract created dwarfs images, which is a serious missing feature.

Of course, this can be done by instead mounting the FUSE filesystem and copying out, but it seems like a big oversight (assuming there is actually no way to do this and I'm not blind).
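
The mount-and-copy workaround mentioned above, spelled out (image and directory names are examples):

mkdir -p mnt extracted
dwarfs image.dwarfs mnt
cp -a mnt/. extracted/
fusermount3 -u mnt        # or fusermount -u with FUSE 2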

Possibility to build against system libraries (unbundle them)

At the moment dwarfs uses its own bundled libraries. That's bad practice when building on a source-based system like Gentoo: I've just found out that revdep-rebuild (a utility that checks that all packages are consistent) flags dwarfs:
RarogCmexDell ~ # revdep-rebuild --pretend

  • This is the new python coded version
  • Please report any bugs found using it.
  • The original revdep-rebuild script is installed as revdep-rebuild.sh
  • Please file bugs at: https://bugs.gentoo.org/
  • Collecting system binaries and libraries
  • Checking dynamic linking consistency
  • Assign files to packages

emerge --pretend --oneshot --complete-graph=y sys-fs/dwarfs:0

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ~] sys-fs/dwarfs-0.3.1-r2

So it would be good to unbundle at least the xxhash and zstd sources (at the moment they are pulled in via CMake, and I'm not able to patch the CMake files because I don't know CMake), like the previous 0.2.4 version, which did not rely on bundled libraries.

The ebuild to play around with is in the GURU repository:
eselect repository enable guru

Extra non-flag arguments are ignored (instead of issuing a warning or error)

At least for mkdwarfs, if one passes an extra argument, for example mkdwarfs -i ... -o ... Z (i.e. the Z), it is just ignored without failing.

Although this is not a major issue, in the case of wrapper scripts and automation tools a small bug in the user's code could, for example, fail to prepend -S before the user input, and that input would then just be silently ignored by the tool.

Bug report: dwarfs fails on tiny window sizes

I'm having a consistent problem with all file systems created with -W values smaller than 8. When I try to copy the mounted file system to another location, or read several files sequentially, the process soon hangs indefinitely and the process manager shows dwarfs at 0% CPU usage. I'm attaching a small sample file created with the options -S 26 -B 8 -W 4.
I'm using dwarfs (v0.5.6-16-g7345578, fuse version 35)

Garbled output due to progress bar and mismatched `TERM`

For some reason, when running mkdwarfs over SSH under screen from urxvt (i.e. urxvt -e ssh user@host, then run screen, then run mkdwarfs), the progress bar gets garbled and outputs â¯â¯â¯â¯â¯â¯.

I've run strace and the progress bar seems to be written as:

[pid 17539] write(2, "\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216\257\342\216"..., 1206)

Running with an empty environment (i.e. env -i) and even setting TERM to any of vt100, linux, rxvt-unicode, screen, screen.rxvt, xterm doesn't seem to fix it.

Granted, if one doesn't use screen, the progress bar looks OK. (Although locally on my laptop, with a newer screen, it does seem to work just fine.)

My assumption is that the progress bar characters trick screen into displaying wrong characters.

Perhaps add an option to use an ASCII-only progress bar, or only VT100-compliant codes. Alternatively, add an option to print the progress from time to time as simple print statements, as opposed to the current nice progress dashboard.
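
The bytes in the strace output above are the UTF-8 encoding of U+23AF, the character the progress bar uses for its separator line; printing them at each hop (local terminal, SSH session, inside screen) can show where the mangling happens, and running screen in UTF-8 mode is also worth a try:

printf '\342\216\257\342\216\257\342\216\257\n'   # should render as a thin horizontal line
screen -U                                         # start screen with UTF-8 enabled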

Enhance `mkdwarfs` to support file permission normalization

Just like --no-owner and --no-time are useful to build "generic" images, it would also be useful to have an option that normalizes the file system permissions. (At the moment they are taken verbatim.)

Perhaps the easiest solution is the following (a rough approximation with standard tools is sketched after the list):

  • add an option like --perms-norm that only cares whether any executability bit is set (be it user, group or others), and thus creates entries like r-x r-x r-x or r-- r-- r--;
  • add another option like --perms-umask that takes an octal value and basically caps the permissions; for example --perms-umask 007 would only generate r-x r-x --- or r-- r-- ---;
  • (one could use each option independently of the other;)
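
A rough present-day approximation of the --perms-norm idea above, applied to a staging copy of the input tree before running mkdwarfs (staging is a placeholder; 555/444 mirror the r-x/r-- examples):

find staging -type d -exec chmod 555 {} +               # directories -> r-x r-x r-x
find staging -type f -perm /111 -exec chmod 555 {} +    # anything executable -> r-x r-x r-x
find staging -type f ! -perm /111 -exec chmod 444 {} +  # everything else -> r-- r-- r--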

Segfault encountered while creating a large DwarFS image

I was trying to create an image containing the latest tip of the CDNJS repository (https://github.com/cdnjs/cdnjs), and I encountered the following error twice:

waiting for block compression to finish
scanned/found: 544213/544213 dirs, 121/121 links, 7511842/7511842 files
original size: 235.1 GiB, dedupe: 77.02 GiB (6225274 files), segment: 63.22 GiB
filesystem: 94.81 GiB in 6068 blocks (9108158 chunks, 1286568/1286568 inodes)
compressed filesystem: 6068 blocks/9.235 GiB written
ERROR: std::out_of_range: _Map_base::at
Command exited with non-zero status 1

(If you want to check out the repository yourself, I strongly suggest using a shallow checkout, --depth 1 --single-branch, and preparing for a lot of waiting...) :)

Add an option for static executables

After successfully building 0.2.0, I tried to see how many dynamic libraries dwarfs depends on. Unfortunately there are quite a few, most of which aren't installed by default...

It would be lovely to have an option to build a statically linked executable that can easily be moved from one Linux instance to another.

For example, one important use case for dwarfs would be server deployments, where one could build a static image of the application (imagine a large Python / Ruby virtualenv and an application with lots of files and assets).

$ readelf -d ./dwarfs

 0x0000000000000001 (NEEDED)             Shared library: [libboost_date_time.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_filesystem.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_program_options.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_system.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libfmt.so.7]
 0x0000000000000001 (NEEDED)             Shared library: [libdouble-conversion.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libgflags.so.2.2]
 0x0000000000000001 (NEEDED)             Shared library: [libglog.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libunwind.so.8]
 0x0000000000000001 (NEEDED)             Shared library: [liblz4.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [liblzma.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libzstd.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libfuse3.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
$ readelf -d ./mkdwarfs

 0x0000000000000001 (NEEDED)             Shared library: [libboost_date_time.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_filesystem.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_program_options.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libboost_system.so.1.74.0]
 0x0000000000000001 (NEEDED)             Shared library: [libfmt.so.7]
 0x0000000000000001 (NEEDED)             Shared library: [libdouble-conversion.so.3]
 0x0000000000000001 (NEEDED)             Shared library: [libgflags.so.2.2]
 0x0000000000000001 (NEEDED)             Shared library: [libglog.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libcrypto.so.1.1]
 0x0000000000000001 (NEEDED)             Shared library: [libunwind.so.8]
 0x0000000000000001 (NEEDED)             Shared library: [liblz4.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [liblzma.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libzstd.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]

Speeding up mount times without sacrificing compression

So, I've been using DwarFS for a while now, and I'm loving it - I've had success with compressing my huge multi-terabyte backups with version 0.5.5, and all is well, except for one kind of massive problem: The mount times.... are atrocious!

I have a directory containing 1.5 TiB worth of separate DwarFS archives, but I'm going to focus on just one of them specifically for this example, 3TBDRV-PartC-05-Jun-2021.dwarfs, which is 129.1 GiB large, and 191.5 GiB uncompressed, with 452854 files. And oh boy, look at this:

I 17:17:22.057967 file system initialized [4460s]

That took over an hour to mount! (Ryzen 5 3600, 64 GB of RAM, DwarFS archive was stored on a Toshiba PC P300 3 TB hard disk)

In fact, this kind of extremely slow mounting time was consistent with all of the other archives too, even some stored on other drives:

I 16:03:03.018227 file system initialized [709.9ms]
I 16:03:03.849098 file system initialized [1.53s]
I 16:03:33.590178 file system initialized [31.27s]
I 16:03:55.387080 file system initialized [53.07s]
I 16:04:05.065723 file system initialized [62.75s]
I 16:04:29.313615 file system initialized [87s]
I 16:04:31.894768 file system initialized [89.57s]
I 16:04:37.869410 file system initialized [95.55s]
I 16:04:47.800754 file system initialized [105.5s]
I 16:07:38.591166 file system initialized [276.3s]
I 16:10:38.510593 file system initialized [456.2s]
I 16:11:29.503998 file system initialized [507.2s]
I 16:13:05.969340 file system initialized [603.7s]
I 16:26:54.248746 file system initialized [1432s]
I 16:27:27.300756 file system initialized [1465s]
I 16:30:56.814199 file system initialized [1675s]
I 16:45:29.011485 file system initialized [2547s]
I 16:45:40.813002 file system initialized [2558s]
I 16:46:33.700585 file system initialized [2611s]
I 17:12:16.627675 file system initialized [4154s]
I 17:13:47.702170 file system initialized [4245s]
I 17:17:22.057967 file system initialized [4460s]
I 17:27:46.640421 file system initialized [5084s]

I found this in the mkdwarfs documentation:

The metadata has been optimized for very little redundancy and leaving it uncompressed, the default for all levels below 7, has the benefit that it can be mapped to memory and used directly. This improves mount time for large file systems compared to e.g. an lzma compressed metadata block. If you don't care about mount time, you can safely choose lzma compression here, as the data will only have to be decompressed once when mounting the image.

If I'm reading this right, for all compression levels above or equal to 7, DwarFS is taking all of the metadata and decompressing it all at once at mount time. Is there any way this could be improved? Decompressing everything at once seems to be kind of a bad idea. I don't want to outright disable metadata compression, as there's kind of a lot of it and I get the feeling it would benefit from being compressed, but these mount times really are excessive - it actually kind of makes SquashFS preferable to DwarFS for the goal of having a compressed read-only filesystem that mounts quickly and is accessible quickly.

I'll admit I don't know much about DwarFS' actual internals, but how about this: what if the metadata was compressed in multiple separate chunks/blocks of a fixed size, and only the blocks that are actually needed get decompressed at any given time? Perhaps this could be made optional, or even the default at level 7 while 8 and 9 could compress the metadata all at once?

I'm not entirely sure on the specifics of how these mount times could be improved, but I feel like if level 7 is going to be the default then it should at least try to optimize the metadata for fast access (or at least, faster than this) somehow, without completely disabling compression, as compressing the metadata probably helps a lot with DwarFS' excellent space efficiency.

Or maybe compressing the metadata isn't worth it? The statement "The metadata has been optimized for very little redundancy" in the documentation seems to imply that compressing the metadata doesn't really help that much. Are there any comparisons between uncompressed and compressed metadata? How worthwhile is compressing it, really? Should it continue to be enabled by default?

unbundling

Please provide a way to unbundle folly, fbthrift, fsst and parallel-hashmap. I'm packaging this for Gentoo and I'd really appreciate it.

dwarfs throws link errors on arm64

make[1]: *** [CMakeFiles/Makefile2:459: CMakeFiles/dwarfsextract.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
/usr/bin/ld: metadata_v2.cpp:(.text._ZN6dwarfs9metadata_INS_19debug_logger_policyEEC2ERNS_6loggerEN5folly5RangeIPKhEES9_RKNS_16metadata_optionsEib[_ZN6dwarfs9metadata_INS_19debug_logger_policyEEC2ERNS_6loggerEN5folly5RangeIPKhEES9_RKNS_16metadata_optionsEib]+0x65c): undefined reference to `fmt::v8::vformat[abi:cxx11](fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<fmt::v8::appender, char> >)'
/usr/bin/ld: metadata_v2.cpp:(.text._ZN6dwarfs9metadata_INS_19debug_logger_policyEEC2ERNS_6loggerEN5folly5RangeIPKhEES9_RKNS_16metadata_optionsEib[_ZN6dwarfs9metadata_INS_19debug_logger_policyEEC2ERNS_6loggerEN5folly5RangeIPKhEES9_RKNS_16metadata_optionsEib]+0x6d0): undefined reference to `fmt::v8::vformat[abi:cxx11](fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<fmt::v8::appender, char> >)'
/usr/bin/ld: libdwarfs.a(metadata_v2.cpp.o):metadata_v2.cpp:(.text._ZNK6dwarfs9metadata_INS_19debug_logger_policyEE4dumpERSoiRKNS_15filesystem_infoERKSt8functionIFvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEjEE[_ZNK6dwarfs9metadata_INS_19debug_logger_policyEE4dumpERSoiRKNS_15filesystem_infoERKSt8functionIFvRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEjEE]+0x4b8): more undefined references to `fmt::v8::vformat[abi:cxx11](fmt::v8::basic_string_view<char>, fmt::v8::basic_format_args<fmt::v8::basic_format_context<fmt::v8::appender, char> >)' follow
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [CMakeFiles/dwarfsbench.dir/build.make:118: dwarfsbench] Error 1
make[1]: *** [CMakeFiles/Makefile2:599: CMakeFiles/dwarfsbench.dir/all] Error 2
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionStackTraceLib.cpp.o)(.debug_info+0x38): R_AARCH64_ABS64 used with TLS symbol _ZN12_GLOBAL__N_17invalidE
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionStackTraceLib.cpp.o)(.debug_info+0x5a): R_AARCH64_ABS64 used with TLS symbol _ZN12_GLOBAL__N_118uncaughtExceptionsE
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionStackTraceLib.cpp.o)(.debug_info+0x74): R_AARCH64_ABS64 used with TLS symbol _ZN12_GLOBAL__N_116caughtExceptionsE
/usr/bin/ld: folly/libfolly.a(CacheLocality.cpp.o)(.debug_info+0x13c2b): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly18SequentialThreadId3getEvE5local
/usr/bin/ld: folly/libfolly.a(AsyncStack.cpp.o)(.debug_info+0x64): R_AARCH64_ABS64 used with TLS symbol _ZN5folly12_GLOBAL__N_127currentThreadAsyncStackRootE
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionTracerLib.cpp.o)(.debug_info+0x132bf): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb0EvSt6atomicNS_24SharedMutexPolicyDefaultEE26tls_lastDeferredReaderSlotEvE2tl
/usr/bin/ld: folly/folly/experimental/exception_tracer/libfolly_exception_tracer.a(ExceptionTracerLib.cpp.o)(.debug_info+0x1355d): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb0EvSt6atomicNS_24SharedMutexPolicyDefaultEE21tls_lastTokenlessSlotEvE2tl
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [CMakeFiles/mkdwarfs.dir/build.make:118: mkdwarfs] Error 1
make[1]: *** [CMakeFiles/Makefile2:564: CMakeFiles/mkdwarfs.dir/all] Error 2
/usr/bin/ld: folly/libfolly.a(SharedMutex.cpp.o)(.debug_info+0x6105): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb1EvSt6atomicNS_24SharedMutexPolicyDefaultEE21tls_lastTokenlessSlotEvE2tl
/usr/bin/ld: folly/libfolly.a(SharedMutex.cpp.o)(.debug_info+0x613c): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb1EvSt6atomicNS_24SharedMutexPolicyDefaultEE26tls_lastDeferredReaderSlotEvE2tl
/usr/bin/ld: folly/libfolly.a(SharedMutex.cpp.o)(.debug_info+0x61a3): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb0EvSt6atomicNS_24SharedMutexPolicyDefaultEE21tls_lastTokenlessSlotEvE2tl
/usr/bin/ld: folly/libfolly.a(SharedMutex.cpp.o)(.debug_info+0x61da): R_AARCH64_ABS64 used with TLS symbol _ZZN5folly15SharedMutexImplILb0EvSt6atomicNS_24SharedMutexPolicyDefaultEE26tls_lastDeferredReaderSlotEvE2tl
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [CMakeFiles/dwarfs_compat_test.dir/build.make:121: dwarfs_compat_test] Error 1
make[1]: *** [CMakeFiles/Makefile2:528: CMakeFiles/dwarfs_compat_test.dir/all] Error 2
make: *** [Makefile:163: all] Error 2

How to disable building binaries for fuse 2?

If both the fuse2 and fuse3 versions are installed, DwarFS will build binaries for both (/usr/sbin/dwarfs for fuse3 and /usr/sbin/dwarfs2 for fuse2). How can this behaviour be overridden so that dwarfs is built against only fuse3 or only fuse2?

Investigate memory consumption when compressing large files

I have to admit I've been doing most of my compression tasks on machines with 64GB of memory, so optimizing for low memory consumption hasn't really been a priority yet. There are some knobs you might be able to turn, though. I'm not sure large files per se are an issue, but a large number of files definitely is. You might be able to tweak --memory-limit a bit, which determines how many uncompressed blocks can be queued. If you lower this limit, the compressor pool may run out of blocks more quickly, resulting in overall slower compression. Reducing the number of workers (-N) might also help a bit.

A small update on this (apologies that this is on an unrelated issue). I did some experimentation and found that lowering the memory limit and the number of workers works in some instances but not in others. Large files seem to be the biggest holdup, in particular an instance where I tried to add a 3.1 GB file that seemingly had no way of being compressed via dwarfs with my 16 GB of memory (even with very low options like -L1m -N1).

What I did find instead is that using the -l0 option and then recompressing the image works in these cases without issue. Creating the initial image with -S24 results in very well recompressed files in these instances: the 3.1 GB file compressed down to 2.3 GB, whereas the default block size for -l0 resulted in a 2.6 GB file (which is approximately what mksquashfs -comp zstd -b 1M -Xcompression-level 22 also gave me).

Originally posted by @Phantop in #33 (comment)
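
A sketch of the two-pass approach described above, using only the options mentioned in this thread; the second step assumes mkdwarfs' --recompress mode accepts an existing image as input (check mkdwarfs --help on your version for the exact syntax):

# Pass 1: build an uncompressed image with small blocks and low memory use.
mkdwarfs -i input_dir -o image-l0.dwarfs -l0 -S24 -L1m -N1
# Pass 2: recompress the finished image (no scanning or segmentation needed).
mkdwarfs -i image-l0.dwarfs -o image.dwarfs --recompress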

docker container

It took me 20 minutes to install everything for the project on one of my computers as a test.
I do not want to do it again.

A simple Docker container with a volume would solve all of these problems.
A separate project for just the mounting part would also help.

CMake fails to detect missing `sparsehash` dependency

I'm trying to build 0.2.0 on openSUSE Tumbleweed, and after successfully running cmake and starting the build, it broke stating that it can't find the sparsehash dependency:

[ 92%] Linking CXX static library libfolly.a
/tmp/dwarfs-0.2.0/src/dwarfs/block_manager.cpp:32:10: fatal error: sparsehash/dense_hash_map: No such file or directory
   32 | #include <sparsehash/dense_hash_map>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [CMakeFiles/dwarfs.dir/build.make:108: CMakeFiles/dwarfs.dir/src/dwarfs/block_manager.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
[ 92%] Built target folly
make[1]: *** [CMakeFiles/Makefile2:284: CMakeFiles/dwarfs.dir/all] Error 2
make: *** [Makefile:171: all] Error 2

I think an extra check in CMake should solve this.
