Comments (8)

sgkim126 commented on June 1, 2024

The current TrieDB and TrieDBMut are Send and Sync. However, they could be made non-Send and non-Sync, since CodeChain doesn't pass them to other threads. I think RefCell might be the solution.
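
As a side note, here is a minimal, self-contained sketch of the auto-trait behaviour behind this suggestion (NodeCache and its field are illustrative, not from the repository): a RefCell field removes Sync from the containing type, so the compiler itself prevents sharing it between threads, while the type can still be Send.

use std::cell::RefCell;

// Illustrative stand-in for a trie that holds an interior-mutable cache.
struct NodeCache {
    inner: RefCell<Vec<u8>>, // placeholder for an LRU cache
}

fn main() {
    // RefCell<T> stays Send when T is Send, so the value can still be
    // moved to another thread...
    fn assert_send<T: Send>() {}
    assert_send::<NodeCache>();

    // ...but RefCell<T> is never Sync, so this would fail to compile:
    // fn assert_sync<T: Sync>() {}
    // assert_sync::<NodeCache>();
}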

somniumism commented on June 1, 2024

@sgkim126 @majecty I'm implementing a cache to reduce DB operations. CodeChain already uses the lru_cache crate (https://contain-rs.github.io/lru-cache/lru_cache/struct.LruCache.html#method.contains_key), so I'm trying to add the cache using the same crate.

Like this:

// triedbmut.rs
pub(crate) struct TrieDB<'db> {
    db: &'db dyn HashDB,
    root: &'db H256,
    cache: LruCache<H256, Vec<u8>>,
}
// ...
// Write the encoded leaf node to the DB and mirror it in the cache:
let node = RlpNode::Leaf(path, insert_value);
let node_rlp = RlpNode::encoded(node);
let hash = self.db.insert(&node_rlp);
self.cache.insert(hash, node_rlp);

Each (hash, rlp_node) pair is stored in the cache whenever it is written to or read from the DB. Before a DB get(), we first look the value up in the cache, so the overhead of DB operations is reduced.
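
To make the described read path concrete, here is a self-contained sketch using stand-in types (a [u8; 32] key in place of H256, a HashMap in place of the HashDB; CachedDb is hypothetical and the repository's actual signatures may differ). The cache is consulted before the DB, and because LruCache::get_mut takes &mut self, even a read needs mutable access, which is exactly the problem described next.

use lru_cache::LruCache;
use std::collections::HashMap;

struct CachedDb {
    db: HashMap<[u8; 32], Vec<u8>>,
    cache: LruCache<[u8; 32], Vec<u8>>,
}

impl CachedDb {
    // Consult the cache before the DB. Note &mut self: LruCache::get_mut
    // updates the recency order, so even a read needs mutable access.
    fn get(&mut self, hash: &[u8; 32]) -> Option<Vec<u8>> {
        if let Some(rlp) = self.cache.get_mut(hash) {
            return Some(rlp.clone());
        }
        let rlp = self.db.get(hash)?.clone();
        self.cache.insert(*hash, rlp.clone());
        Some(rlp)
    }
}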

However, there is a problem. In fn get_aux() in triedb.rs, the compiler reports the error "'self' is a '&' reference, so the data it refers to cannot be borrowed as mutable", because get_aux() only takes &self while updating the LRU cache requires &mut self. It can be solved by using RefCell, but then the struct can no longer be shared across threads.
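
For comparison, here is a hedged sketch of the RefCell variant mentioned above, with the same stand-in types as before: interior mutability lets the getter keep &self, at the cost of the struct no longer being Sync.

use lru_cache::LruCache;
use std::cell::RefCell;
use std::collections::HashMap;

struct CachedDb {
    db: HashMap<[u8; 32], Vec<u8>>,
    cache: RefCell<LruCache<[u8; 32], Vec<u8>>>,
}

impl CachedDb {
    // &self is enough now: the RefCell hands out mutable access to the cache
    // at runtime. The trade-off is that RefCell is !Sync, so the struct can
    // no longer be shared between threads.
    fn get(&self, hash: &[u8; 32]) -> Option<Vec<u8>> {
        let mut cache = self.cache.borrow_mut();
        if let Some(rlp) = cache.get_mut(hash) {
            return Some(rlp.clone());
        }
        let rlp = self.db.get(hash)?.clone();
        cache.insert(*hash, rlp.clone());
        Some(rlp)
    }
}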

Could you advise me, or tell me your opinion on this matter?

majecty commented on June 1, 2024

I don't know whether TrieDB and TrieDBMut were used in a multi-threaded environment or not. IMO, they are just views of the internal KeyValueDB, so using them in a single thread is natural. Using RefCell for the cache is the right choice.

somniumism commented on June 1, 2024

@sgkim126 @majecty I implemented the cache in our Merkle trie. As I mentioned, I used the lru_cache crate. In short, when reading values from the trie, performance improved by up to about 35 percent, depending on cache size. However, when writing values to the trie, there was no change in performance.

I tested the performance using the benchmark we already have; it looks like this:

// Requires nightly, with #![feature(test)] and extern crate test; at the crate root.
// Run with: cargo +nightly bench
use rand::random;
use test::Bencher;

const DB_SIZE: usize = 10000;
const BATCH: usize = 10000;

#[bench]
fn bench_read_multiple(b: &mut Bencher) {
    // TestDB is the existing fixture: a trie pre-populated with DB_SIZE entries
    // keyed by the big-endian bytes of 0..DB_SIZE.
    let db = TestDB::new(DB_SIZE);
    b.iter(|| {
        let trie = db.trie();
        for _ in 0..BATCH {
            // Read a random existing key on each iteration.
            let item = random::<usize>() % DB_SIZE;
            let _ = trie.get(&item.to_be_bytes()).unwrap().unwrap();
        }
    });
}

The results are as follows:

Result of benchmark tests for bench_read_multiple
Control : 174,522,399 ns/iter
Cache Size = 500 : 149,356,897 ns/iter => down 14.419 %
Cache Size = 1000 : 135,082,059 ns/iter => down 22.599 %
Cache Size = 2000 : 132,718,845 ns/iter => down 23.953 %
Cache Size = 3000 : 123,239,539 ns/iter => down 29.384 %
Cache Size = 4000 : 120,926,292 ns/iter => down 30.710 %
Cache Size = 5000 : 117,794,169 ns/iter => down 32.504 %
Cache Size = 6000 : 116,277,524 ns/iter => down 33.373 %
Cache Size = 7000 : 112,805,251 ns/iter => down 35.363 %

In my opinion, we need new benchmarks that measure memory usage, as well as tests that reflect real-world workloads. As the cache size increases, memory usage grows, and that extra memory pressure can itself degrade performance. Could you tell me your opinion on the current situation?
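
One rough way to quantify the memory cost could be something like the sketch below (approx_cache_bytes is hypothetical, H256-like keys are assumed to be 32 bytes, and it relies on the lru_cache crate's iter() method): sum the sizes of the cached RLP payloads, ignoring allocator overhead and the cache's own bookkeeping.

use lru_cache::LruCache;

// Hypothetical helper: a rough lower bound on the cache's heap usage,
// counting 32 bytes per key plus each cached RLP payload.
fn approx_cache_bytes(cache: &LruCache<[u8; 32], Vec<u8>>) -> usize {
    cache.iter().map(|(_, rlp)| 32 + rlp.len()).sum()
}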

majecty commented on June 1, 2024

It looks good to me.

It is interesting that the performance improved even though the test uses random keys. It seems that some common branch nodes are being cached: the branch nodes near the root are shared by many keys, so even random lookups hit them.

IMO, it is good to implement it. @sgkim126 What is your opinion?

sgkim126 commented on June 1, 2024

@somniumism Could I see the source? Did you cache the key-value pair or the address-node pair?

somniumism commented on June 1, 2024

@sgkim126 I pushed the commit, but I think it still has things to fix: (1) there is no change in performance when writing values to the trie, so I should exclude the cache from triedbmut.rs; (2) I wanted to see the results quickly, so there may be unnecessary code. Of course, that's just my guess, so I need your advice. Additionally, I cached the address-node pair. :)

This is the pull request: #7
I'd appreciate your advice.

sgkim126 commented on June 1, 2024

This was closed by #7.
