Comments (3)
mold does not use SIMD instructions explicitly, but SIMD is used in many places in mold, because many library functions are implemented using SIMD. For example, glibc's strlen is implemented using SSE 4.2 instructions, I believe. Another example is xxHash's XXH3. There might be other places where I could use SIMD to improve mold's performance, but I can't come up with anything right now.
As for cryptographic hashing, I actually tried BLAKE3. We currently use SHA-256 for Identical Comdat Folding and Build-ID computation. For the former, we need to compute a cryptographic hash of small data (typically less than 100 bytes). For the latter, we compute a SHA-256 hash of the entire output file, which can be multiple gigabytes in size.
It looks like BLAKE3 is slower than SHA-256 for small data, at least on my machine. This is perhaps due to its high initialization and finalization cost. For large data, BLAKE3 is indeed faster than SHA-256 by a factor of two. However, with enough cores, build-id computation is bounded by memory bandwidth even with SHA-256, so I don't see an immediate need to switch to BLAKE3.
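To illustrate why a whole-file hash can scale with the number of cores, here is a minimal sketch of sharded hashing. It is not mold's actual code: `toy_hash` and `sharded_id` are hypothetical names, and FNV-1a is only a dependency-free stand-in for a real cryptographic hash such as SHA-256. The buffer is split into fixed-size shards, each shard is hashed on its own, and the list of shard digests is hashed once more.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a real cryptographic hash such as SHA-256 (NOT secure).
// FNV-1a 64-bit is used only to keep the sketch dependency-free.
uint64_t toy_hash(const uint8_t *p, size_t n) {
  uint64_t h = 14695981039346656037ull; // FNV-1a 64-bit offset basis
  for (size_t i = 0; i < n; i++) {
    h ^= p[i];
    h *= 1099511628211ull;              // FNV-1a 64-bit prime
  }
  return h;
}

// Hypothetical helper: hash fixed-size shards, then hash the list of
// shard digests to get one ID for the whole buffer.
// Precondition: shard_size > 0.
uint64_t sharded_id(const std::vector<uint8_t> &buf, size_t shard_size) {
  size_t num_shards = (buf.size() + shard_size - 1) / shard_size;
  std::vector<uint64_t> digests(num_shards);

  // Each iteration reads its own shard and writes its own slot, so the
  // iterations are independent and could run on separate threads.
  for (size_t i = 0; i < num_shards; i++) {
    size_t begin = i * shard_size;
    size_t len = std::min(shard_size, buf.size() - begin);
    digests[i] = toy_hash(buf.data() + begin, len);
  }

  // Combine the per-shard digests into the final ID.
  return toy_hash(reinterpret_cast<const uint8_t *>(digests.data()),
                  digests.size() * sizeof(uint64_t));
}
```

The shard loop is written sequentially here to avoid extra dependencies; in a real linker it would presumably be handed to a thread pool. Note that the result depends on the shard size and differs from hashing the buffer in a single pass.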
from mold.
SIMD can speed up large loops where each iteration does the same thing (no input-dependent branches), and each iteration does not depend on the previous one. For example, SIMD can speed up most kinds of non-cryptographic checksum calculation.
SIMD cannot speed up anything else, including branchy code, recursion, loops that are too short, non-loops, and most data structure operations (arrays are fine, few other things are). Sometimes it's possible to rewrite a function into a SIMD-friendlier form, but it's rare.
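As a rough illustration of that distinction, here are two small loops. Both functions are made up for this example and have nothing to do with mold's code. The first has independent iterations, so a compiler may auto-vectorize it at higher optimization levels (not guaranteed). The second carries a dependency from one iteration to the next, so it has to run in order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// SIMD-friendly: every iteration does the same work, and the partial
// sums can be combined in any order, so the loop can be split across
// vector lanes.
uint32_t byte_sum(const std::vector<uint8_t> &data) {
  uint32_t sum = 0;
  for (size_t i = 0; i < data.size(); i++)
    sum += data[i];
  return sum;
}

// Not SIMD-friendly: each step needs the value of h produced by the
// previous step (a loop-carried dependency), so iterations run in order.
uint32_t fnv1a(const std::vector<uint8_t> &data) {
  uint32_t h = 2166136261u; // FNV-1a 32-bit offset basis
  for (uint8_t b : data) {
    h ^= b;
    h *= 16777619u;         // FNV-1a 32-bit prime
  }
  return h;
}
```

Hash functions designed with SIMD in mind usually avoid the second shape by keeping several independent accumulators and merging them at the end.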
I think mold spends most of its time in TBB hashmaps. The hash calculation may be SIMDable, unless TBB already does that; the rest of the hashmap is, as far as I know, not SIMDable.
SIMD can also only speed up your own code. Disk access belongs to the kernel, not to mold.
I agree with you. I was mostly referring to the symbol resolution step. simdjson uses some clever SIMD tricks, discussed in their paper, to avoid branching while still being able to resolve symbols, but I'm not certain how applicable they would be here.
As for hashing, BLAKE3 (https://github.com/BLAKE3-team/BLAKE3; 6.8 GiB/s versus SHA-1's 1 GiB/s in their benchmark) would be a good candidate, but I imagine @rui314 isn't keen on adding more dependencies unless they're absolutely necessary.