onflow / atree

Atree provides scalable arrays and scalable ordered maps.

Home Page: https://onflow.org

License: Apache License 2.0

Go 99.97% Shell 0.03%

array data-structures ordered-map

atree's Issues

Implement more benchmarks compatible with benchstat

Implement benchmarks compatible with benchstat and make it similar to Ramtin's benchmarks used for the spreadsheets. E.g. measure 100 ops rather than 1 op, and group by array size.

Additionally:

make long-running new benchmarks skip in short mode
use PCG random number generator with deterministic mode for benchmark comparisons

OMT: reduce memory allocation by reusing Digester

Digester is being created and destroyed in every OMT operation.

Refactor to change names to match current design doc

Some of the names are from an old design doc, current design doc, or neither. Make naming more consistent and match the latest design doc.

slab flag changes

Values used for slab flags should be changed to:

slab flag (8bits)
- one bit - set if this is a root slab
- one bit - set if it has pointers to other slabs (for fastcopy)
- one bit - set if no limit on the size (external collision group current implementation)

- rest for types  (32 types)
	- [00000] ArrayData		
	- [00001] ArrayMeta
	- [00010] LargeImmutableArray (includes large strings) (not used right now)
	- [01000] MapData
	- [01001] MapMeta
	- [01010] LargeMapEntry (key/value pair)
	- [01011] LargeCollisionGroup (External)
	- [11111] Storable (general purpose for external objects for now)

We also need to add methods on Slab that check the first few bits with bit masking and expose:

IsRootOfAnObject()
HasPointers()
HasSizeLimit()
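A minimal sketch of the proposed layout follows. The bit positions are an assumption (the three flag bits are placed in the high bits and the five type bits in the low bits); the issue does not fix the order. The constant names are illustrative, and the methods are defined on a flag type rather than on Slab to keep the example short.

```go
package main

import "fmt"

// slabFlag is one byte: 3 flag bits (assumed to be the high bits)
// followed by 5 type bits.
type slabFlag byte

const (
	flagRoot        slabFlag = 1 << 7 // set if this is a root slab
	flagHasPointers slabFlag = 1 << 6 // set if it points to other slabs
	flagNoSizeLimit slabFlag = 1 << 5 // set if there is no size limit
	slabTypeMask    slabFlag = 0b0001_1111
)

// Slab types from the list above (5 bits, up to 32 types).
const (
	slabArrayData           slabFlag = 0b00000
	slabArrayMeta           slabFlag = 0b00001
	slabLargeImmutableArray slabFlag = 0b00010
	slabMapData             slabFlag = 0b01000
	slabMapMeta             slabFlag = 0b01001
	slabLargeMapEntry       slabFlag = 0b01010
	slabLargeCollisionGroup slabFlag = 0b01011
	slabStorable            slabFlag = 0b11111
)

func (f slabFlag) IsRootOfAnObject() bool { return f&flagRoot != 0 }
func (f slabFlag) HasPointers() bool      { return f&flagHasPointers != 0 }

// HasSizeLimit is the inverse of the "no limit" bit.
func (f slabFlag) HasSizeLimit() bool { return f&flagNoSizeLimit == 0 }

// slabType masks off the flag bits and returns only the type bits.
func (f slabFlag) slabType() slabFlag { return f & slabTypeMask }

func main() {
	f := flagRoot | flagHasPointers | slabMapData
	fmt.Println(f.IsRootOfAnObject(), f.HasPointers(), f.HasSizeLimit())
	fmt.Println(f.slabType() == slabMapData)
}
```

Note that the stored bit means "no limit on the size", so HasSizeLimit() reports true when that bit is clear.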

Optimize by replacing all division-by-2 with shift right

The compiler should optimize this but new code is simpler and easier to read.

Example change from:

midPoint := uint32(math.Ceil(float64(dataSize) / 2))

to:

midPoint := (dataSize + 1) >> 1

Map: Use dedicated slab flags for external data slabs

For both security reasons and future migrations, we should use dedicated slab flags for:

LargeCollisionGroup (External)
LargeMapEntry (Large key/value pair)

OMT: change type info from string to cbor.RawMessage

A similar change was made for array, so the same change should be made for map.

fxamacker/cbor: Add NumBytesDecoded() to StreamDecoder

See fxamacker/cbor#299

Add partial deltas map reset during `Commit`

Issue To Be Solved

PR #165, merged to fix issue #164, disables the deltas map reset during Commit. The fix was rushed to address a testnet issue related to storageUsed.

As a result, all delta slabs are encoded and saved again in the next Commit.

OMT: firstKey isn't updated when empty slab is merged or rebalanced

When firstKey isn't updated, inserts can go to the wrong data slab.

We need to:

update firstKey when empty slab is merged or rebalanced
update slab's parent's firstKey if data slab is the first child

Delay encoding TypeInfo to enable optimizing integration into Cadence

@turbolent identified this nice optimization on Friday. It was discussed extensively on Slack today.

Basically, computing type info can be expensive, and type info for some maps and arrays isn't necessary because they aren't saved to storage.

By deferring the retrieval of type info until encoding, we can also take advantage of Storage.FastCommit which uses multiple goroutines for encoding slabs.
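One way to sketch the deferral in Go is to store a function that produces the type info and call it only when encoding needs it. The lazyTypeInfo type and its fields below are hypothetical, not atree's API. sync.Once is used because encoding may run on multiple goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// lazyTypeInfo is a hypothetical wrapper that defers an expensive
// type info computation until the first time it is needed.
type lazyTypeInfo struct {
	once    sync.Once
	compute func() []byte // expensive; supplied by the caller
	encoded []byte
}

// Encoded runs compute at most once, even if called from several
// goroutines, and returns the cached result afterwards.
func (t *lazyTypeInfo) Encoded() []byte {
	t.once.Do(func() { t.encoded = t.compute() })
	return t.encoded
}

func main() {
	calls := 0
	ti := &lazyTypeInfo{compute: func() []byte {
		calls++
		return []byte("type-info")
	}}

	// A value that is never saved to storage never pays for compute.
	fmt.Println("calls before encoding:", calls)

	ti.Encoded()
	ti.Encoded()
	fmt.Println("calls after encoding twice:", calls)
}
```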

Make long-running benchmarks skip when -short is specified

Some benchmarks take over 11 minutes. Make them skip running if -short is specified on the command line.

SAT: add validation check for serialization

Test array serialization by comparing results of encoding, decoding, and re-encoding again.
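The check can be sketched as: encode, decode, encode again, and require the two encodings to be byte-identical. The encoder and decoder below are toy stand-ins (a count followed by fixed-width values), not atree's slab format, and all names are illustrative.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeArray is a toy encoder: a 4-byte element count followed by
// 8-byte big-endian values.
func encodeArray(values []uint64) []byte {
	buf := make([]byte, 4+8*len(values))
	binary.BigEndian.PutUint32(buf, uint32(len(values)))
	for i, v := range values {
		binary.BigEndian.PutUint64(buf[4+8*i:], v)
	}
	return buf
}

// decodeArray reverses encodeArray and rejects malformed input.
func decodeArray(data []byte) ([]uint64, error) {
	if len(data) < 4 {
		return nil, errors.New("short header")
	}
	n := int(binary.BigEndian.Uint32(data))
	if len(data) != 4+8*n {
		return nil, errors.New("length does not match element count")
	}
	values := make([]uint64, n)
	for i := range values {
		values[i] = binary.BigEndian.Uint64(data[4+8*i:])
	}
	return values, nil
}

// checkRoundTrip encodes, decodes, re-encodes, and compares the bytes.
func checkRoundTrip(values []uint64) error {
	first := encodeArray(values)
	decoded, err := decodeArray(first)
	if err != nil {
		return err
	}
	second := encodeArray(decoded)
	if !bytes.Equal(first, second) {
		return errors.New("re-encoded bytes differ from original encoding")
	}
	return nil
}

func main() {
	fmt.Println(checkRoundTrip([]uint64{1, 2, 3}))
}
```

Comparing bytes rather than decoded values also catches encoders that are not deterministic.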

Intermittent test failure on number of external slabs containing hash collision

There is an intermittent test failure on TestMapHashCollision/deterministic. Test fails on the number of external slabs containing hash collision.

OMT: e.value should be value in singleElement.Set()

This bug affects the pointer flag during encoding.

OMT: maybe optimize by using faster hash algo combination, possibly using wyhash, XXH128, etc. as an alternative to SipHash in front of BLAKE3

Maybe we can use a significantly faster hash such as XXH128 without sacrificing much security (as long as we handle collisions) because SipHash relies on its key being secret for security.

XXHash v2 is used by many software projects. XXHash v3 (XXH3) is substantially faster and was finalized in v0.8 of its reference implementation, so its digest output won't change in the future.

The preferred Go implementation of XXH3 and XXH128 is https://github.com/zeebo/xxh3. Coincidentally, our BLAKE3 library in Go is by the same author (zeebo). zeebo's BLAKE3 library is used by Jean-Philippe Aumasson (the co-creator of BLAKE/BLAKE2/BLAKE3 and SipHash) in multi-party-sig (a Taurus Group SA project).

The API of XXH3 and XXH128 by zeebo doesn't support seeds/keys, but we can work around that by hashing the seed as a prefix in front of the data. The overhead of doing this is offset by XXH3 and XXH128 being much faster than SipHash.

I'm not proposing we implement a bunch of alternatives, but I thought it would be useful to document some MAC/Hash combinations that are possible using OMT.

BASELINE: OMT's hash algo combination is currently:

SipHash128 -- first 64 bits of digest, with 128-bit key (not secret for now, hopefully secret key soon)
SipHash128 -- next 64 bits of digest, with 128-bit key (not secret for now, hopefully secret key soon)
BLAKE3 -- first 64 bits of digest, without key
BLAKE3 -- next 64 bits of digest, without key
no more hashing, linear search all collisions

ALT0: Use WyHash for super fast first level, then fall back to SipHash128 and BLAKE3 on collisions:

WyHashv3 -- 64-bit digest, with 64-bit key
SipHash128 -- first 64 bits of digest, with 128-bit key (not secret for now, hopefully secret key soon)
SipHash128 -- next 64 bits of digest, with 128-bit key (not secret for now, hopefully secret key soon)
BLAKE3 -- first 64 bits of digest, without key
BLAKE3 -- next 64 bits of digest, without key
no more hashing, linear search all collisions

ALT1: Replace SipHash128 with XXH128 for super fast first two levels:

XXH128 -- first 64 bits of digest, with 128-bit key (not secret for now, hopefully secret key soon)
XXH128 -- next 64 bits of digest, with 128-bit key (not secret for now, hopefully secret key soon)
BLAKE3 -- first 64 bits of digest, without key
BLAKE3 -- next 64 bits of digest, without key
no more hashing, linear search all collisions

ALT2: Using only 64-bit digests for the first 2 levels, so the total bits remain 256 bits:

XXH128low -- 64-bit digest, with 64 bit seed hashed as a prefix to the data (this allows use of libraries that don't allow seeds/keys)
SipHash -- 64-bit digest, with 128-bit key (not secret for now, hopefully secret key soon)
BLAKE3 -- first 64 bits of digest, without key
BLAKE3 -- next 64 bits of digest, without key
no more hashing, linear search all collisions

ALT3: Use XXH3 first as the fast & happy path, before the baseline:

XXH128low -- 64-bit digest, with 64 bit seed hashed as a prefix to the data (this allows use of libraries that don't allow seeds/keys)
SipHash128 -- first 64 bits of digest, with 128-bit key (not secret for now, hopefully secret key soon)
SipHash128 -- next 64 bits of digest, with 128-bit key (not secret for now, hopefully secret key soon)
BLAKE3 -- first 64 bits of digest, without key
BLAKE3 -- next 64 bits of digest, without key
no more hashing, linear search all collisions

OMT: replace GetHashInput() with HashInputProvider

This is required for Cadence integration.

OMT: decoding empty map creates singleElements instead of hkeyElements

This bug was discovered by @turbolent during integration with Cadence.

CI: add go 1.17.x to ci.yml when go 1.17 is released this month

When go 1.17 is released, add it to ci.yml.

CI currently uses go 1.16 (other versions were removed to speed up CI).

OMT: add validation check for serialization

Test map serialization by comparing results of encoding, decoding, and re-encoding again.

OMT: change Set() to return storables for existing key and value

change Set() to return storables for existing key and value
overwrite storable for existing key to prevent memory leak.

SAT: Optimize rebalancing algorithm

The rebalancing algorithm can be optimized for speed and to reduce the number of slabs accessed/loaded.

Check right sibling first (instead of left sibling) to see if it has enough data to lend
Check left sibling only if (instead of always) the right sibling doesn't have enough data to lend. The benefit of borrowing from the larger sibling is probably not worth the extra cost of accessing 2 slabs.
Move data within the right sibling's underlying array after it lends so it has capacity for future ops
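The first two points can be sketched as a lender-selection routine that loads the left sibling only when the right one cannot lend. Everything below is hypothetical (the sibling type, the size thresholds, and the loader callbacks); it only illustrates the order of slab accesses, not atree's actual rebalancing code.

```go
package main

import "fmt"

// sibling is a hypothetical stand-in for a loaded data slab.
type sibling struct {
	size uint32
}

// canLend reports whether s would stay at or above minSize after
// lending need bytes.
func canLend(s *sibling, need, minSize uint32) bool {
	return s != nil && s.size >= minSize+need
}

// pickLender loads the right sibling first. The left sibling is loaded
// only if the right one cannot lend, so the common case touches one slab.
func pickLender(loadRight, loadLeft func() *sibling, need, minSize uint32) (string, *sibling) {
	if r := loadRight(); canLend(r, need, minSize) {
		return "right", r
	}
	if l := loadLeft(); canLend(l, need, minSize) {
		return "left", l
	}
	return "", nil // neither can lend; the caller would merge instead
}

func main() {
	loads := 0
	right := func() *sibling { loads++; return &sibling{size: 100} }
	left := func() *sibling { loads++; return &sibling{size: 100} }

	side, _ := pickLender(right, left, 10, 50)
	fmt.Println(side, "slabs loaded:", loads)
}
```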

SAT: root slab size can overflow or underflow on Set()

This can happen when a new element triggers the slab to split or merge, causing the root slab size to overflow/underflow without adjustment.

We need to check for such a scenario in Array.Set() to adjust the root slab properly.

Add nested array and nested map in stress test

Issue To Be Solved

Add nested arrays and nested maps to the array and map stress tests so that they support arbitrary nested structures.

Supporting this may require implementing deep remove for array and map (currently done in Cadence).

Create program to stress test to determine if data loss is possible

This is meant to run for hours/days using randomized data and operations in random sequences.

Maybe add versioning to slab layouts

It can be useful to serialize version information in case the layout changes in the future.

Consider adding 1 or 2 bytes to hold layout version info.

With 2 bytes, we can have major version 0-255 and minor version 0-255.
With 1 byte, we can have either:
- major version 0-15 and minor version 0-15, or
- major version 0-255 and no minor versions, or
- make the high bit tell us we're using 2 bytes, etc.
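As an illustration of the first 1-byte option (major 0-15 and minor 0-15), the version can be packed into two 4-bit fields. The function names are hypothetical and this is only one of the layouts listed above.

```go
package main

import "fmt"

// packVersion stores major in the high 4 bits and minor in the low
// 4 bits of a single byte. Both must be in the range 0-15.
func packVersion(major, minor uint8) (byte, error) {
	if major > 15 || minor > 15 {
		return 0, fmt.Errorf("version %d.%d does not fit in 4+4 bits", major, minor)
	}
	return major<<4 | minor, nil
}

// unpackVersion reverses packVersion.
func unpackVersion(b byte) (major, minor uint8) {
	return b >> 4, b & 0x0f
}

func main() {
	b, err := packVersion(3, 7)
	if err != nil {
		panic(err)
	}
	major, minor := unpackVersion(b)
	fmt.Printf("byte=0x%02x major=%d minor=%d\n", b, major, minor)
}
```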

Optimize speed of lookups

Lookup speed can be faster.

Remove deltas map reset during Commit

During Commit, delta maps are reset so all slabs with empty Address are wiped out.

EDIT: I was too hasty labeling this as "Bug" instead of "Enhancement". This issue arose because Commit was called in the middle of a transaction by another project (logic error). My apologies for the confusion. I relabeled this as "Enhancement".

Add `Storable.ChildStorables() []Storable` for nested Storable

Issue To Be Solved

Cadence's SomeStorable has a nested Storable and we need a way to expose it.

OMT: maybe optimize integer map keys if feasible

We can potentially skip SipHash for integer key types to gain speed.

Maybe this optimization can be evaluated after OMT is feature complete and we have more test coverage (and fuzzing) in place.

Thanks @ramtinms for mentioning this possible optimization!

Fix unused `prev` field not being updated in some cases by removing it entirely

A new validation check found that in some edge cases, the prev sibling ids can get out of place.

The bug happens when slab A splits and we need to update slab A's old "right" sibling slab B, so slab B's prev can point to the newly split slab.

Fixing this may require loading an extra data slab. However, prev doesn't appear to be used in SAT and OMT. It only appears to be encoded/decoded.

So we can just remove prev since it's unused, which would give us back 16 bytes in each data slab for both SAT and OMT.

Tests/Benchmarks: Add -seed flag to override `time.Now().UnixNano()`

Add -seed flag to override time.Now().UnixNano().
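A minimal sketch of the idea, shown as a standalone program rather than as test code: a seed flag whose zero value means "use the current time". Treating 0 as "not set" and the newRand helper are assumptions made for this example. Printing the seed that was actually used makes a failing random run reproducible.

```go
package main

import (
	"flag"
	"fmt"
	"math/rand"
	"time"
)

// newRand returns a generator for the given seed. A seed of 0 is
// treated as "not set" and replaced with time.Now().UnixNano().
// The seed actually used is returned so it can be logged.
func newRand(seed int64) (*rand.Rand, int64) {
	if seed == 0 {
		seed = time.Now().UnixNano()
	}
	return rand.New(rand.NewSource(seed)), seed
}

func main() {
	seedFlag := flag.Int64("seed", 0, "random seed; 0 means use the current time")
	flag.Parse()

	r, seed := newRand(*seedFlag)
	fmt.Printf("seed: %d, first value: %d\n", seed, r.Intn(100))
}
```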

SAT and OMT: improve tests by adding more validation checks for tree integrity

Tests can be improved by adding more checks to confirm tree integrity.

For example, element size, count, first key, etc. in slab header should be confirmed to be correct after various operations.

SAT and OMT: improve tests by adding more validation checks for tree integrity

More validation checks, such as encoding size, can be added to validArray() and validMap()

Return error from StoredValue() for non-root slabs

Slabs implement Storable interface, which has func StoredValue(SlabStorage) (Value, error).

Only root slabs have enough information to recreate Array/OrderedMap. Non-root slabs should return an error from the StoredValue function.

This affects SAT and OMT.

OMT: add batch sets feature to enable optimizing Cadence's DeepCopy()

Copying performance in Cadence can be improved by using batch ops.

OMT: add reproducer for issue #87 resolved by PR #88

SAT: add batch append feature to enable optimizing Cadence's DeepCopy()

Copying performance in Cadence can be improved by using batch append.

OMT: evaluate using fxamacker/circlehash for level 1 hash

SipHash is too slow to use as a level 1 hash.

XXH3 is a very fast alternative but there are some problems with XXH3_64 and XXH3_128.

Current version of github.com/zeebo/xxh3 doesn't support seeds. Adding support for seeds in a way that is compatible with XXH3 and XXH128 requires many lines of changes.
XXH3_64 failed smhasher tests. We'll focus on XXH128 and XXH128low because they both passed rurban/smhasher. UPDATE: XXH128low fails demerphq/smhasher (Strict Avalanche Criterion).
UPDATE: some of these issues may have been fixed upstream and not yet updated in rurban/smhasher.
XXH128 is bijective for 128-bit input -> 128-bit digest but is not bijective for 64-bit input -> 64-bit digest, which can make collisions faster to find.

Example collision on 64-bit input for XXH128low (low 64 bits of XXH128):
data=3f1d4bb70c38aa9f, digest128={54ca4b07dcfa8704 ffddad0ba4c9ace9}
data=ec524de4468a8921, digest128={ce414f5214c1fa36 ffddad0ba4c9ace9}

One workaround for inputs <= 8 bytes is to prefix the data with a 64-bit key. Unfortunately, this approach using unmodified zeebo/xxh3 is slower than SipHash128 and also introduces an extra allocation.

Another solution is to use fxamacker/xxh128p when it becomes available shortly. It's optimized for using a 64-bit prefix key with XXH3_128. This solution doesn't change the algorithm of XXH3_128, so it's easy to verify results with existing XXH3_128 implementations. Although this solution would be faster than SipHash128, it would be slower than unmodified XXH3_128.

The simplest solution appears to be to use fxamacker/circlehash. It passes every test in smhasher (both rurban/smhasher and demerphq/smhasher), supports a uint64 seed, produces a 64-bit digest, is faster than cespare/XXH64, and (for some data sizes) is faster than unseeded XXH3.

The biggest advantage of CircleHash64 would be simplicity and ease-of-audit given our release schedule.

CI: add linting (golangci-lint)

OMT: optimize binary search

Binary search can be faster. Use the same optimization done for SAT, which is to use linear scan if there are insufficient elements to justify the overhead of binary search.
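A sketch of the hybrid search over sorted keys follows. The threshold value of 32 is an assumption for illustration (the right cutoff would come from benchmarks), and searchKeys is a hypothetical name operating on plain uint64 keys rather than atree's element types.

```go
package main

import (
	"fmt"
	"sort"
)

// linearScanThreshold is an assumed cutoff below which a linear scan
// is used instead of binary search.
const linearScanThreshold = 32

// searchKeys returns the index of the first key >= target in the sorted
// slice keys, and whether that key equals target.
func searchKeys(keys []uint64, target uint64) (int, bool) {
	if len(keys) < linearScanThreshold {
		// Few elements: a simple scan avoids binary search overhead.
		for i, k := range keys {
			if k >= target {
				return i, k == target
			}
		}
		return len(keys), false
	}
	i := sort.Search(len(keys), func(i int) bool { return keys[i] >= target })
	return i, i < len(keys) && keys[i] == target
}

func main() {
	small := []uint64{0, 2, 4, 6, 8}
	fmt.Println(searchKeys(small, 4))
	fmt.Println(searchKeys(small, 5))
}
```

Both paths return the same insertion index, so callers do not need to know which one ran.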

OMT: Optimize GetHashInput()

Optimize GetHashInput():

rename HashCode to GetHashInput
optimize CBOR encoding
reduce memory allocation

OMT: Storable() is used incorrectly in some edge cases

Storable is used incorrectly if:

There's a hash collision: the existing key, held as a Storable, is converted to Value and then back to Storable.
Set() is used to modify a value: the key's storable is created but never used.
Set() returns an error: the value's storable is created but never used.

Definition of Done

Export newDefaultDigesterBuilder (@fxamacker)
Export storage field of OrderedMap (@turbolent)
Refactor ComparableValue to comparator (@fxamacker)
~~Refactor OrderedMap.Set to return existing storable for key and value~~ (not needed)
Add PopIterate to OMT to enable optimized Cadence DictionaryValue.DeepRemove (@fxamacker)
Iterator for just keys (@turbolent)
Iterator for just values (@turbolent)
Refactor Hashable interface to HashInputProvider function (@fxamacker)

Linear scanning

                      root
                   /        \
             meta1a          meta1b
             /    \          /    \
        meta2a  meta2b  meta2c  meta2d
         /  \    /  \    /  \    /  \
        d1  d2  d3  d4    ...  dn-1  dn

Document serialization format and add more code comments

update Cadence storage notion page with serialization format
add more code comments in this repo
