Coder Social home page Coder Social logo

OMT: maybe optimize by using faster hash algo combination, possibly using wyhash, XXH128, etc. as an alternative to SipHash in front of BLAKE3 about atree HOT 7 CLOSED

onflow avatar onflow commented on June 18, 2024
OMT: maybe optimize by using faster hash algo combination, possibly using wyhash, XXH128, etc. as an alternative to SipHash in front of BLAKE3

from atree.

Comments (7)

fxamacker avatar fxamacker commented on June 18, 2024 1

Preliminary Comparison of Go Hash Libraries

XXH128 is faster than SipHash-2-4-128. However, the XXH128 library (github.com/zeebo/xxh3) doesn't have API with seed, and adding an 8-byte seed as prefix to the input data makes it slower than SipHash-2-4-128.

Differences between hash libraries of 20 ns/op here might translate to very little in OMT performance. For example, XXH128 (unseeded) might outperform SipHash128 by a lot here (250%) but only by 2% in OMT.

hash seed input bytes ns/op allocs/op max digest size
wyhash_v3 64-bit 3-11 11.8 - 13.4 0 64-bit
xxh128 missing 3-11 13.3 - 15.7 0 128-bit
SipHash-2-4-128 128-bit 3-11 36.0 - 40.0 0 128-bit
xxh128 64-bit prefix 3-11 47.7 - 53.2 1 128-bit
BLAKE3_256 not needed 3-11 139.7 - 140.0 0 256-bit
SHA3_256 not needed 3-11 1321.4 - 1336.4 4 256-bit

Comparing SipHash2-4-128 and WyHashV3:

name                 old time/op    new time/op    delta
Hash/WYv3_3bytes-4     36.2ns ± 0%    11.8ns ± 0%  -67.34%  (p=0.000 n=9+10)
Hash/WYv3_4bytes-4     36.0ns ± 0%    12.4ns ± 0%  -65.38%  (p=0.000 n=7+9)
Hash/WYv3_5bytes-4     36.6ns ± 0%    12.5ns ± 1%  -65.76%  (p=0.000 n=10+10)
Hash/WYv3_7bytes-4     37.9ns ± 0%    12.5ns ± 0%  -67.15%  (p=0.000 n=10+9)
Hash/WYv3_11bytes-4    40.0ns ± 0%    13.4ns ± 2%  -66.53%  (p=0.000 n=10+10)

name                 old alloc/op   new alloc/op   delta
Hash/WY64_3bytes-4      0.00B          0.00B          ~     (all equal)
Hash/WY64_4bytes-4      0.00B          0.00B          ~     (all equal)
Hash/WY64_5bytes-4      0.00B          0.00B          ~     (all equal)
Hash/WY64_7bytes-4      0.00B          0.00B          ~     (all equal)
Hash/WY64_11bytes-4     0.00B          0.00B          ~     (all equal)

name                 old allocs/op  new allocs/op  delta
Hash/WY64_3bytes-4       0.00           0.00          ~     (all equal)
Hash/WY64_4bytes-4       0.00           0.00          ~     (all equal)
Hash/WY64_5bytes-4       0.00           0.00          ~     (all equal)
Hash/WY64_7bytes-4       0.00           0.00          ~     (all equal)
Hash/WY64_11bytes-4      0.00           0.00          ~     (all equal)

Comparing SipHash2-4-128 and XXH128 without prefix:

name                   old time/op    new time/op    delta
Hash/XXH128_3bytes-4     36.2ns ± 0%    13.4ns ± 0%  -63.02%  (p=0.000 n=9+10)
Hash/XXH128_4bytes-4     36.0ns ± 0%    13.3ns ± 0%  -62.88%  (p=0.000 n=7+8)
Hash/XXH128_5bytes-4     36.6ns ± 0%    13.4ns ± 0%  -63.44%  (p=0.000 n=10+10)
Hash/XXH128_7bytes-4     37.9ns ± 0%    13.4ns ± 0%  -64.77%  (p=0.000 n=10+10)
Hash/XXH128_11bytes-4    40.0ns ± 0%    15.7ns ± 0%  -60.68%  (p=0.000 n=10+10)

name                   old alloc/op   new alloc/op   delta
Hash/XXH128_3bytes-4      0.00B          0.00B          ~     (all equal)
Hash/XXH128_4bytes-4      0.00B          0.00B          ~     (all equal)
Hash/XXH128_5bytes-4      0.00B          0.00B          ~     (all equal)
Hash/XXH128_7bytes-4      0.00B          0.00B          ~     (all equal)
Hash/XXH128_11bytes-4     0.00B          0.00B          ~     (all equal)

name                   old allocs/op  new allocs/op  delta
Hash/XXH128_3bytes-4       0.00           0.00          ~     (all equal)
Hash/XXH128_4bytes-4       0.00           0.00          ~     (all equal)
Hash/XXH128_5bytes-4       0.00           0.00          ~     (all equal)
Hash/XXH128_7bytes-4       0.00           0.00          ~     (all equal)
Hash/XXH128_11bytes-4      0.00           0.00          ~     (all equal)

Comparing SipHash2-4-128 and XXH128 with 8-byte prefix:

name                    old time/op    new time/op    delta
Hash/XXH128P_3bytes-4     36.2ns ± 0%    47.8ns ± 0%  +31.87%  (p=0.000 n=9+10)
Hash/XXH128P_4bytes-4     36.0ns ± 0%    47.7ns ± 1%  +32.61%  (p=0.000 n=7+10)
Hash/XXH128P_5bytes-4     36.6ns ± 0%    48.6ns ± 0%  +32.94%  (p=0.000 n=10+10)
Hash/XXH128P_7bytes-4     37.9ns ± 0%    48.7ns ± 0%  +28.39%  (p=0.000 n=10+9)
Hash/XXH128P_11bytes-4    40.0ns ± 0%    53.2ns ± 1%  +33.09%  (p=0.000 n=10+10)

name                    old alloc/op   new alloc/op   delta
Hash/XXH128P_3bytes-4      0.00B         16.00B ± 0%    +Inf%  (p=0.000 n=10+10)
Hash/XXH128P_4bytes-4      0.00B         16.00B ± 0%    +Inf%  (p=0.000 n=10+10)
Hash/XXH128P_5bytes-4      0.00B         16.00B ± 0%    +Inf%  (p=0.000 n=10+10)
Hash/XXH128P_7bytes-4      0.00B         16.00B ± 0%    +Inf%  (p=0.000 n=10+10)
Hash/XXH128P_11bytes-4     0.00B         24.00B ± 0%    +Inf%  (p=0.000 n=10+10)

name                    old allocs/op  new allocs/op  delta
Hash/XXH128P_3bytes-4       0.00           1.00 ± 0%    +Inf%  (p=0.000 n=10+10)
Hash/XXH128P_4bytes-4       0.00           1.00 ± 0%    +Inf%  (p=0.000 n=10+10)
Hash/XXH128P_5bytes-4       0.00           1.00 ± 0%    +Inf%  (p=0.000 n=10+10)
Hash/XXH128P_7bytes-4       0.00           1.00 ± 0%    +Inf%  (p=0.000 n=10+10)
Hash/XXH128P_11bytes-4      0.00           1.00 ± 0%    +Inf%  (p=0.000 n=10+10)

Comparing SipHash2-4-128 and BLAKE3:

name                   old time/op    new time/op    delta
Hash/BLAKE3_3bytes-4     36.2ns ± 0%   140.0ns ± 0%  +286.51%  (p=0.000 n=9+9)
Hash/BLAKE3_4bytes-4     36.0ns ± 0%   139.7ns ± 0%  +288.71%  (p=0.000 n=7+10)
Hash/BLAKE3_5bytes-4     36.6ns ± 0%   139.9ns ± 0%  +282.68%  (p=0.000 n=10+10)
Hash/BLAKE3_7bytes-4     37.9ns ± 0%   140.0ns ± 0%  +269.33%  (p=0.000 n=10+9)
Hash/BLAKE3_11bytes-4    40.0ns ± 0%   139.8ns ± 0%  +249.42%  (p=0.000 n=10+10)

name                   old alloc/op   new alloc/op   delta
Hash/BLAKE3_3bytes-4      0.00B          0.00B           ~     (all equal)
Hash/BLAKE3_4bytes-4      0.00B          0.00B           ~     (all equal)
Hash/BLAKE3_5bytes-4      0.00B          0.00B           ~     (all equal)
Hash/BLAKE3_7bytes-4      0.00B          0.00B           ~     (all equal)
Hash/BLAKE3_11bytes-4     0.00B          0.00B           ~     (all equal)

name                   old allocs/op  new allocs/op  delta
Hash/BLAKE3_3bytes-4       0.00           0.00           ~     (all equal)
Hash/BLAKE3_4bytes-4       0.00           0.00           ~     (all equal)
Hash/BLAKE3_5bytes-4       0.00           0.00           ~     (all equal)
Hash/BLAKE3_7bytes-4       0.00           0.00           ~     (all equal)
Hash/BLAKE3_11bytes-4      0.00           0.00           ~     (all equal)

Comparing SipHash2-4-128 and SHA3-256:

name                 old time/op    new time/op    delta
Hash/SHA3_256_3bytes-4     36.2ns ± 0%  1323.1ns ± 0%  +3553.70%  (p=0.000 n=9+8)
Hash/SHA3_256_4bytes-4     36.0ns ± 0%  1321.5ns ± 0%  +3575.94%  (p=0.000 n=7+10)
Hash/SHA3_256_5bytes-4     36.6ns ± 0%  1321.4ns ± 0%  +3514.23%  (p=0.000 n=10+10)
Hash/SHA3_256_7bytes-4     37.9ns ± 0%  1327.9ns ± 3%  +3403.97%  (p=0.000 n=10+10)
Hash/SHA3_256_11bytes-4    40.0ns ± 0%  1336.4ns ± 0%  +3240.44%  (p=0.000 n=10+8)

name                 old alloc/op   new alloc/op   delta
Hash/SHA3_256_3bytes-4      0.00B        960.00B ± 0%      +Inf%  (p=0.000 n=10+10)
Hash/SHA3_256_4bytes-4      0.00B        960.00B ± 0%      +Inf%  (p=0.000 n=10+10)
Hash/SHA3_256_5bytes-4      0.00B        960.00B ± 0%      +Inf%  (p=0.000 n=10+10)
Hash/SHA3_256_7bytes-4      0.00B        960.00B ± 0%      +Inf%  (p=0.000 n=10+10)
Hash/SHA3_256_11bytes-4     0.00B        960.00B ± 0%      +Inf%  (p=0.000 n=10+10)

name                 old allocs/op  new allocs/op  delta
Hash/SHA3_256_3bytes-4       0.00           4.00 ± 0%      +Inf%  (p=0.000 n=10+10)
Hash/SHA3_256_4bytes-4       0.00           4.00 ± 0%      +Inf%  (p=0.000 n=10+10)
Hash/SHA3_256_5bytes-4       0.00           4.00 ± 0%      +Inf%  (p=0.000 n=10+10)
Hash/SHA3_256_7bytes-4       0.00           4.00 ± 0%      +Inf%  (p=0.000 n=10+10)
Hash/SHA3_256_11bytes-4      0.00           4.00 ± 0%      +Inf%  (p=0.000 n=10+10)

from atree.

fxamacker avatar fxamacker commented on June 18, 2024

XXH3 (the 64-bit hash algo) fails one or more smhasher tests, so we should use an alternative that passes all tests.

One alternative for 64-bit digests is to use XXH128low (the low 64-bits of XXH128 in the same library as XXH3), because XXH128low is nearly as fast and passes all smhasher tests.

Another alternative for 64-bit digests is wyhash, which is used by languages like Zig, Nim, V, and Go (since Go 1.17 released in August 2021). However, some versions of wyhash algo requires implementations to avoid using bad seed values.

from atree.

fxamacker avatar fxamacker commented on June 18, 2024

My response on July 19 (in Slack) to whether this design is similar to Double Hashing or Cuckoo Hashing was:

It's different in that we only hash once with SipHash128. SipHash128 produces a 128 bit hash at about the same speed as 64 bit SipHash. We only use the second half of 128 bit if there is a collision with the first half. This forces attackers to produce 128 bit collision. If there is no collision, we only store 64 bit.

Currently, the preliminary default config uses SipHash128 for the first 2 levels, followed by BLAKE3 for the next 2 levels. If there's no collision, only the first 64 bits of SipHash128 is stored in the map for each kay.

Unfortunately, we're only storing 64-bits of each map's seed to save space and worse, it cannot be secret (for reasons outside the scope of this discussion). So we replace the missing 64-bits of the 128-bit seed with the "typical random constant" 0x1BD11BDAA9FC1A22 as the nothing-up-my-sleeve number. However, I'm wondering if we might want to replace this const with the 64-bit digest of the prior hash level.

For example, if we use XXH128low in front of SipHash128, then the 128-bit seed provided to SipHash128 can be 64-bit map seed (hash of storageid128) and 64-bit XXH128low digest. Generating collisions would involve a different 128-bit seed for each map key using an attack that is specific to Flow -- this aspect of not being able to reuse their attack on other systems or blockchains might be annoying enough to discourage attempts even if it turns out to be trivial on paper.

As another example, we can use XXH128low in front of SipHash128 but use XXH128high as 64-bits of the 128-bit seed to SipHash128. In this scenario, all 128-bits of XXH128 is utilized to handle a collision but only 64-bits of it is stored.

from atree.

fxamacker avatar fxamacker commented on June 18, 2024

Some notes from XXH128low hash collision example in Go.

XXH3 failed smhasher.
XXH128 and XXH128low passed smhasher.
XXH128low provides the low 64-bits of XXH128.

Example, collision on 64-bit input for XXH128low:
data=3f1d4bb70c38aa9f, digest128={54ca4b07dcfa8704 ffddad0ba4c9ace9}
data=ec524de4468a8921, digest128={ce414f5214c1fa36 ffddad0ba4c9ace9}

from atree.

fxamacker avatar fxamacker commented on June 18, 2024

Preliminary OMT Benchmark Comparisons for Lookups

For simplicity, keys hashed are uint64 encoded to CBOR.
Random keys are generated with new seed (time) before any benchmarks begin and are reused by all benchmarks.

BenchmarkMapHashCombo/wyv3_10elems-4         	 1000000	      1049 ns/op	     542 B/op	      11 allocs/op
BenchmarkMapHashCombo/wyv3_1000elems-4       	  911642	      1260 ns/op	     544 B/op	      10 allocs/op
BenchmarkMapHashCombo/wyv3_10000elems-4      	  740029	      1619 ns/op	     549 B/op	      10 allocs/op
BenchmarkMapHashCombo/wyv3_100000elems-4     	  477040	      2237 ns/op	     591 B/op	      11 allocs/op

BenchmarkMapHashCombo/XXH128_10elems-4       	 1000000	      1055 ns/op	     542 B/op	      11 allocs/op
BenchmarkMapHashCombo/XXH128_1000elems-4     	  912883	      1285 ns/op	     544 B/op	      10 allocs/op
BenchmarkMapHashCombo/XXH128_10000elems-4    	  718939	      1637 ns/op	     549 B/op	      10 allocs/op
BenchmarkMapHashCombo/XXH128_100000elems-4   	  465313	      2239 ns/op	     592 B/op	      11 allocs/op

BenchmarkMapHashCombo/Sip128_10elems-4       	 1000000	      1087 ns/op	     542 B/op	      11 allocs/op
BenchmarkMapHashCombo/Sip128_1000elems-4     	  863414	      1286 ns/op	     544 B/op	      10 allocs/op
BenchmarkMapHashCombo/Sip128_10000elems-4    	  713320	      1668 ns/op	     549 B/op	      10 allocs/op
BenchmarkMapHashCombo/Sip128_100000elems-4   	  462944	      2324 ns/op	     592 B/op	      11 allocs/op

BenchmarkMapHashCombo/XXH128p_10elems-4      	 1000000	      1153 ns/op	     574 B/op	      13 allocs/op
BenchmarkMapHashCombo/XXH128p_1000elems-4    	  790964	      1354 ns/op	     577 B/op	      12 allocs/op
BenchmarkMapHashCombo/XXH128p_10000elems-4   	  708225	      1738 ns/op	     582 B/op	      12 allocs/op
BenchmarkMapHashCombo/XXH128p_100000elems-4  	  435376	      2402 ns/op	     628 B/op	      13 allocs/op

from atree.

fxamacker avatar fxamacker commented on June 18, 2024

Comparison of using xxh3.Hash128 vs recently added xxh3.Hasher (github.com/zeebo/xxh3) with variation of no prefix vs prefix.

BenchmarkHash/XXH128_3bytes-4           83888863                13.37 ns/op            0 B/op          0 allocs/op
BenchmarkHash/XXH128_4bytes-4           89450071                13.37 ns/op            0 B/op          0 allocs/op
BenchmarkHash/XXH128_5bytes-4           89697273                13.35 ns/op            0 B/op          0 allocs/op
BenchmarkHash/XXH128_7bytes-4           89772805                13.35 ns/op            0 B/op          0 allocs/op
BenchmarkHash/XXH128_11bytes-4          76188787                15.75 ns/op            0 B/op          0 allocs/op

BenchmarkHash/XXH128P_3bytes-4          24256177                47.66 ns/op           16 B/op          1 allocs/op
BenchmarkHash/XXH128P_4bytes-4          24003128                47.45 ns/op           16 B/op          1 allocs/op
BenchmarkHash/XXH128P_5bytes-4          24355566                48.42 ns/op           16 B/op          1 allocs/op
BenchmarkHash/XXH128P_7bytes-4          23710158                48.57 ns/op           16 B/op          1 allocs/op
BenchmarkHash/XXH128P_11bytes-4         21515864                52.87 ns/op           24 B/op          1 allocs/op

BenchmarkHash/XXH128Hasher_3bytes-4      3496035               328.9 ns/op          1280 B/op          1 allocs/op
BenchmarkHash/XXH128Hasher_4bytes-4      3254538               326.3 ns/op          1280 B/op          1 allocs/op
BenchmarkHash/XXH128Hasher_5bytes-4      3394720               321.4 ns/op          1280 B/op          1 allocs/op
BenchmarkHash/XXH128Hasher_7bytes-4      3437254               322.6 ns/op          1280 B/op          1 allocs/op
BenchmarkHash/XXH128Hasher_11bytes-4             3447572               331.4 ns/op          1280 B/op          1 allocs/op

BenchmarkHash/XXH128PHasher_3bytes-4             3441559               339.7 ns/op          1280 B/op          1 allocs/op
BenchmarkHash/XXH128PHasher_4bytes-4             3453799               331.1 ns/op          1280 B/op          1 allocs/op
BenchmarkHash/XXH128PHasher_5bytes-4             3405379               339.9 ns/op          1280 B/op          1 allocs/op
BenchmarkHash/XXH128PHasher_7bytes-4             3358226               339.1 ns/op          1280 B/op          1 allocs/op
BenchmarkHash/XXH128PHasher_11bytes-4            3346489               344.4 ns/op          1280 B/op          1 allocs/op

from atree.

fxamacker avatar fxamacker commented on June 18, 2024

This issue is closed by PR #145

from atree.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.