Comments (4)

alikhtarov commented on May 3, 2024

Every value over the size N is split into smaller chunks of size N (the last chunk may be smaller than N). The original key becomes the 'index' key, which stores a random suffix and the total number of chunks; the chunks are stored at modified keys that include the same random suffix and the chunk id.

The random suffix is needed for consistency: you can simply remove the original key and the chunks will be 'deleted', since there is no way to access them without knowing the random suffix. It also takes care of simultaneous sets, since only one index key will win the race, and only its random suffix will be valid.
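
A minimal sketch of the scheme described above, assuming a dict-like cache; the key layout, suffix length, and INDEX marker are illustrative assumptions, not mcrouter's actual format (and it ignores the edge case of an unsplit value that happens to start with the marker):

import random
import string

N = 524288  # split threshold: values larger than this get chunked

def set_big_value(cache, key, value):
    # Values at or under the threshold are stored as-is.
    if len(value) <= N:
        cache[key] = value
        return
    # A fresh random suffix per set: concurrent sets pick different
    # suffixes, and only the index written last 'wins' the race.
    suffix = ''.join(random.choices(string.ascii_lowercase, k=8))
    chunks = [value[i:i + N] for i in range(0, len(value), N)]
    for i, chunk in enumerate(chunks):
        cache[f'{key}:{suffix}:{i}'] = chunk  # hypothetical chunk-key layout
    # The original key becomes the index: suffix plus chunk count.
    cache[key] = f'INDEX:{suffix}:{len(chunks)}'

def get_big_value(cache, key):
    entry = cache.get(key)
    if entry is None or not entry.startswith('INDEX:'):
        return entry  # ordinary, unsplit value
    _, suffix, count = entry.split(':')
    parts = [cache.get(f'{key}:{suffix}:{i}') for i in range(int(count))]
    if any(p is None for p in parts):
        return None  # a chunk was evicted or expired: treat as a miss
    return ''.join(parts)

Deleting the original key removes only the index; the orphaned chunks become unreachable (nothing else knows their suffix) and simply age out of the cache.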

The way we deployed it on a live system was in stages. First we deployed reads only: if you set N to some very large value (like 1000000000), the logic is still enabled on the read path, but no values are actually split. This makes sure that all clients can understand split values once we start writing them.
The second stage was lowering N to actually start splitting values. The exact value we use is 524288 (512 KiB).
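
Concretely, the two stages differ only in the threshold value; a sketch using the same flags that appear in a later comment in this thread (the config string is elided here):

# Stage 1: read path enabled, but nothing is actually split yet
mcrouter --big-value-split-threshold=1000000000 --config-str='...' -p 5000

# Stage 2: lower the threshold so writes start splitting
mcrouter --big-value-split-threshold=524288 --config-str='...' -p 5000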

Note that all chunks are sent to the same memcached box as the original key would be, so you're still transferring the same amount of data from a single memcached box to the client. If you transfer huge values this way, you still have to wait for the individual chunks to arrive serially, which might explain the timeouts you're seeing. Can you share the size of the values you're setting/fetching and the value of N you tried?
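
As a back-of-the-envelope illustration of that serial cost (the 10 MB value size is a hypothetical example):

N = 524288                     # the split threshold mentioned above
value_size = 10 * 1024 * 1024  # hypothetical 10 MB value
chunks = -(-value_size // N)   # ceiling division: 20 chunks
print(chunks)                  # 20 serial gets, plus one get for the index key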

jamescarr commented on May 3, 2024

Thanks, this clears it all up.

My timeouts, it turned out, were unrelated.

jamescarr commented on May 3, 2024

If anyone wants some fun: it turned out that the cached value for the view context of our blog posts was weighing in at 6.5 MB. Each. Those sets were getting rejected, but I think once we turned on big value splitting they overloaded our cache servers. ;-)

marko-jovicic commented on May 3, 2024

The random suffix is needed for consistency: you can simply remove the original key and the chunks will be 'deleted', since there is no way to access them without knowing the random suffix. It also takes care of simultaneous sets, since only one index key will win the race, and only its random suffix will be valid.

I have a few questions about the big-value-split-threshold option:

  1. When the index key is deleted, I found that the related parts/chunks are not deleted. Is this expected behaviour?
  2. In the case of simultaneous sets, the last key set references its own parts/chunks, which is fine. However, chunks belonging to the earlier sets are not deleted; shouldn't they be?

mcrouter is version 1.0 (built using the Dockerfile). The memcached version is 1.4.13.
mcrouter is started with this command:

# a 5-byte threshold, just for test purposes
mcrouter --big-value-split-threshold=5 --config-str='{"pools":{"A":{"servers":["127.0.0.1:5001"]}},"route":"PoolRoute|A"}' -p 5000
