rainblock / merkle-patricia-tree Goto Github PK
View Code? Open in Web Editor NEWโ๏ธ๐ฒ A fast, in-memory optimized merkle patricia tree
License: Mozilla Public License 2.0
โ๏ธ๐ฒ A fast, in-memory optimized merkle patricia tree
License: Mozilla Public License 2.0
It seems that the original tree's tests were built against an old version of ethereumjs-testing which did not contain this test. This test fails against the original tree and should be fixed. We probably should provide an upstream patch as well.
An "embedded" node may contain multiple embedded nodes:
For example, the tree from
tree.put(Buffer.from('12345'), Buffer.from('1'));
tree.put(Buffer.from('123456'), Buffer.from('2'));
tree.put(Buffer.from('1234'), Buffer.from('3'));
Results in a tree:
[extension (469732) -(31323334)-> 486339]
[branch (486339) 3: d9680c val: 33]
[extension (d9680c) -(5)-> 9075f1]
*[branch (9075f1) 3: 391d39 val: 31]
*[leaf (391d39) -(6)-> val: 32]
Where the nodes with an asterisk are embedded in the extension node d9680c, since 9075f1 contains the leaf node. VerifyWitness currently assumes embedded nodes contain contain other nodes and will fail to verify these proofs with a:
Error: Key does not match the proof (embeddedNode)
Sometimes, when getFromCache() in the CachedMerkleTree is used with an empty node map, null is returned instead of the correct value.
See PR#83 for a test case using test data.
Current benchmarks to test batch
operations: batchGet
and batchPut
(in pipeline), mask the overheads involved. Benchmarks execute a test multiple times and report the mean and deviation for all the runs. batchGet
operation includes sorting batch_ops
using radix_sort
and traversing the tree once from left to right while bookkeeping and collecting responses. So while benchmarking batchGet
, the test is run multiple times where the overhead of sorting the keys is encountered only in the first run keeping the mean values look okay. The same is the case with batchPut
.
However, take a look at the following where a function is ran only once and is timed. batchPut
and batchGet
are both implemented and being tested.
generate 15k-1-32-ran : 3665.27 ms (batchPut-1)
generate 15k-1-32-ran : 347.50 ms (put-15k)
getFrom 15k-1-32-ran : 10246.64 ms (batchGet-1)
getFrom 15k-1-32-ran : 27.40 ms (get-15k)
Just the radix_sort
for 15k keys is taking around 30-32 ms.
Currently, it seems that rlpToMerkleNode needs to have an instance of a merkle tree. Does it actually need any fields of the Merkle tree to work? It seems it shouldn't, so it should either be static, or just an exported function
tree.batchCOW(putOps, delOps)
: If putOps
and delOps
are empty; batchCOW
should return the tree. Currently batchCOW returns a new instance of MerklePatriciaTree
Hi guys, Holger from EthereumJS
here. We just released a new version v2.5.0 of the VM with a new (beta
labelled) StateManager
API which now completely encapsulate all accesses to the trie, goal for a next breaking release is to completely separate StateManager
and VM
.
Maybe it's worth to experiment with this a bit, I think your trie implementation might be a very interesting use case, would be excited to see to exchange the trie with yours and e.g. see how fast our tests will run or stuff like that. As a mid-term goal it might be also worth to think about supporting your trie implementation in a then separate StateManager
repo.
Also circling @mattdean-digicatapult into the conversation, who did all the StateManager
refactoring work. Matthew, if you find some time, you might also want to have a closer look how realistic the above scenario is and how well the current API (or this is somewhat more the internal code structure as far as I see) is suited to do such a trie replacement?
Cheers
Holger
Currently, the webpack browser package is quite large (~256KB). This could be because of inline source maps or other dependencies, we should investigate and minimize it.
If a value is within an embedded node, VerifyWitness will fail to verify it with an error since original nodes are used, and the value of branch nodes is an array.
Delete all the existing keys in the merkle-tree and the tree's root node remains to be a BranchNode, where as the rootNode should be a NullNode.
It appears that a pruned CachedMerklePatriciaTree
returns null if the key could not be found in getFromCache
instead of throwing an error. This behavior is incorrect, if the key could possibly be present (i.e., we've reached a HashNode), an error should be thrown. If we can actually determine the key does not exist, only then can we return null.
A test can be found in #87
In order to fully support the CachedMerkleTree, we need a CoW API that uses nodeBags and provides nodesUsed, and returns a CachedMerkleTree so that it can be pruned.
For example:
batchCOW(batchOps: BatchOp[], nodesUsed : Set<bigint>, nodeBag:<bigint, MerkleTreeNode<V>>) : CachedMerklePatriciaTree<K,V>
We need to return which nodes were actually used from the cache, so we can save space by not re-transmitting nodes which were not used.
I suggest actually changing the signature of getFromCache to this:
interface CachedTraversal<V> {
value : V;
nodesUsed: bigint[];
}
getFromCache(key: K, nodesBags... : Map<bigint, MerklePatriciaTreeNode<V>) : CachedTraversal<V>
nodeBags should be variable arguments, so we can support a simple cache in the future of frequently used accounts. Example:
getFromCache("buffer", nodeBag, cache);
Value not being inserted into the tree
(The root hash is different in the cases where same key value pairs are put into the tree in different order.)
Example:
tree.put(Buffer.from("12345"), Buffer.from("1"))
tree.put(Buffer.from("123456"), Buffer.from("2"))
tree.put(Buffer.from("1234"), Buffer.from("3"))
tree.get(Buffer.from("1234"))
returns :
expected : <Buffer 33>
It seems that the CachedMerklePatriciaTree doesn't take an options object. This means that it can't be typed as anything other than <Buffer, Buffer>, since both key conversion and value conversion won't work.
It seems the class currently takes the default options object, which discards all this information.
The class should be updated to take an options object.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.