rainblock / merkle-patricia-tree Goto Github PK

View Code? Open in Web Editor NEW

26.0 5.0 3.0 22.64 MB

☔️🌲 A fast, in-memory optimized merkle patricia tree

License: Mozilla Public License 2.0

JavaScript 31.94% TypeScript 68.06%

ethereum merkle patricia tree javascript typescript

merkle-patricia-tree's People

Contributors

Stargazers

Watchers

Forkers

nethereum-ci informartin zhuchengyu04

merkle-patricia-tree's Issues

Original tree does not pass official test 'branch-value-update'

It seems that the original tree's tests were built against an old version of ethereumjs-testing which did not contain this test. This test fails against the original tree and should be fixed. We probably should provide an upstream patch as well.

VerifyWitness doesn't properly handle multiple embedded nodes

An "embedded" node may contain multiple embedded nodes:

For example, the tree from

    tree.put(Buffer.from('12345'), Buffer.from('1'));
    tree.put(Buffer.from('123456'), Buffer.from('2'));
    tree.put(Buffer.from('1234'), Buffer.from('3'));

Results in a tree:

[extension (469732) -(31323334)-> 486339]
[branch (486339) 3: d9680c val: 33]
[extension (d9680c) -(5)-> 9075f1]
*[branch (9075f1) 3: 391d39 val: 31]
*[leaf (391d39) -(6)-> val: 32]

Where the nodes with an asterisk are embedded in the extension node d9680c, since 9075f1 contains the leaf node. VerifyWitness currently assumes embedded nodes contain contain other nodes and will fail to verify these proofs with a:

 Error: Key does not match the proof (embeddedNode)

getFromCache() sometimes erroneously returns null

Sometimes, when getFromCache() in the CachedMerkleTree is used with an empty node map, null is returned instead of the correct value.

See PR#83 for a test case using test data.

Batch ops are reducing the overall throughput

Current benchmarks to test batch operations: batchGet and batchPut (in pipeline), mask the overheads involved. Benchmarks execute a test multiple times and report the mean and deviation for all the runs. batchGet operation includes sorting batch_ops using radix_sort and traversing the tree once from left to right while bookkeeping and collecting responses. So while benchmarking batchGet, the test is run multiple times where the overhead of sorting the keys is encountered only in the first run keeping the mean values look okay. The same is the case with batchPut.

However, take a look at the following where a function is ran only once and is timed. batchPut and batchGet are both implemented and being tested.
generate 15k-1-32-ran : 3665.27 ms (batchPut-1)
generate 15k-1-32-ran : 347.50 ms (put-15k)
getFrom 15k-1-32-ran : 10246.64 ms (batchGet-1)
getFrom 15k-1-32-ran : 27.40 ms (get-15k)

Just the radix_sort for 15k keys is taking around 30-32 ms.

Need API support for copy function

rlpToMerkleNode should be static

Currently, it seems that rlpToMerkleNode needs to have an instance of a merkle tree. Does it actually need any fields of the Merkle tree to work? It seems it shouldn't, so it should either be static, or just an exported function

batchCOW: doesn't return a valid tree if putOps and delOps are empty

tree.batchCOW(putOps, delOps): If putOps and delOps are empty; batchCOW should return the tree. Currently batchCOW returns a new instance of MerklePatriciaTree

Experiment with new StateManager interface of EthereumJS VM

Hi guys, Holger from EthereumJS here. We just released a new version v2.5.0 of the VM with a new (beta labelled) StateManager API which now completely encapsulate all accesses to the trie, goal for a next breaking release is to completely separate StateManager and VM.

Maybe it's worth to experiment with this a bit, I think your trie implementation might be a very interesting use case, would be excited to see to exchange the trie with yours and e.g. see how fast our tests will run or stuff like that. As a mid-term goal it might be also worth to think about supporting your trie implementation in a then separate StateManager repo.

Also circling @mattdean-digicatapult into the conversation, who did all the StateManager refactoring work. Matthew, if you find some time, you might also want to have a closer look how realistic the above scenario is and how well the current API (or this is somewhat more the internal code structure as far as I see) is suited to do such a trie replacement?

Cheers
Holger

Webpack generated file output is large

Currently, the webpack browser package is quite large (~256KB). This could be because of inline source maps or other dependencies, we should investigate and minimize it.

VerifyWitness fails to account for embedded branch value

If a value is within an embedded node, VerifyWitness will fail to verify it with an error since original nodes are used, and the value of branch nodes is an array.

Bug in del

Delete all the existing keys in the merkle-tree and the tree's root node remains to be a BranchNode, where as the rootNode should be a NullNode.

Pruned tree doesn't throw error

It appears that a pruned CachedMerklePatriciaTree returns null if the key could not be found in getFromCache instead of throwing an error. This behavior is incorrect, if the key could possibly be present (i.e., we've reached a HashNode), an error should be thrown. If we can actually determine the key does not exist, only then can we return null.

A test can be found in #87

CoW API needs to return CachedMerkleTree and return nodes used.

In order to fully support the CachedMerkleTree, we need a CoW API that uses nodeBags and provides nodesUsed, and returns a CachedMerkleTree so that it can be pruned.

For example:

batchCOW(batchOps: BatchOp[], nodesUsed : Set<bigint>, nodeBag:<bigint, MerkleTreeNode<V>>) : CachedMerklePatriciaTree<K,V>

getFromCache needs to return which nodes were used

We need to return which nodes were actually used from the cache, so we can save space by not re-transmitting nodes which were not used.

I suggest actually changing the signature of getFromCache to this:

interface CachedTraversal<V> {
   value : V;
   nodesUsed: bigint[];
}

getFromCache(key: K, nodesBags... : Map<bigint, MerklePatriciaTreeNode<V>) : CachedTraversal<V>

nodeBags should be variable arguments, so we can support a simple cache in the future of frequently used accounts. Example:

getFromCache("buffer", nodeBag, cache);

[Bug] Inserting into an Extension Node

Value not being inserted into the tree
(The root hash is different in the cases where same key value pairs are put into the tree in different order.)

Example:

tree.put(Buffer.from("12345"), Buffer.from("1"))
tree.put(Buffer.from("123456"), Buffer.from("2"))
tree.put(Buffer.from("1234"), Buffer.from("3"))
tree.get(Buffer.from("1234"))

returns :
expected : <Buffer 33>

CachedMerklePatriciaTree doesn't take options?

It seems that the CachedMerklePatriciaTree doesn't take an options object. This means that it can't be typed as anything other than <Buffer, Buffer>, since both key conversion and value conversion won't work.

It seems the class currently takes the default options object, which discards all this information.

The class should be updated to take an options object.