Comments (4)

alikhtarov commented on May 3, 2024

Every value over the size N is split into smaller chunks of size N (the last chunk may be smaller than N). The original key becomes the 'index' key, which stores a random suffix and the total number of chunks; the chunks are stored at modified keys that include the same random suffix and the chunk id.

The random suffix is needed for consistency: you can simply remove the original key and the chunks will be 'deleted', since there is no way to access them without knowing the random suffix. It also takes care of simultaneous sets, since only one index key will win the race, and only its random suffix will be valid.
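
A minimal sketch of the scheme described above, assuming a dict-like cache; the key layout, suffix length, and INDEX marker are illustrative assumptions, not mcrouter's actual format (and it ignores the edge case of an unsplit value that happens to start with the marker):

import random
import string

N = 524288  # split threshold: values larger than this get chunked

def set_big_value(cache, key, value):
    # Values at or under the threshold are stored as-is.
    if len(value) <= N:
        cache[key] = value
        return
    # A fresh random suffix per set: concurrent sets pick different
    # suffixes, and only the index written last 'wins' the race.
    suffix = ''.join(random.choices(string.ascii_lowercase, k=8))
    chunks = [value[i:i + N] for i in range(0, len(value), N)]
    for i, chunk in enumerate(chunks):
        cache[f'{key}:{suffix}:{i}'] = chunk  # hypothetical chunk-key layout
    # The original key becomes the index: suffix plus chunk count.
    cache[key] = f'INDEX:{suffix}:{len(chunks)}'

def get_big_value(cache, key):
    entry = cache.get(key)
    if entry is None or not entry.startswith('INDEX:'):
        return entry  # ordinary, unsplit value
    _, suffix, count = entry.split(':')
    parts = [cache.get(f'{key}:{suffix}:{i}') for i in range(int(count))]
    if any(p is None for p in parts):
        return None  # a chunk was evicted or expired: treat as a miss
    return ''.join(parts)

Deleting the original key removes only the index; the orphaned chunks become unreachable (nothing else knows their suffix) and simply age out of the cache.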

The way we deployed it on a live system was in stages. First we deployed reads only: if you set N to some very large value (like 1000000000), the logic is still enabled on the read path, but no values are actually split. This makes sure that all clients can understand split values once we start writing them.
The second stage was lowering N to actually start splitting values. The exact value we use is 524288 (512 KiB).
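
Concretely, the two stages differ only in the threshold value; a sketch using the same flags that appear in a later comment in this thread (the config string is elided here):

# Stage 1: read path enabled, but nothing is actually split yet
mcrouter --big-value-split-threshold=1000000000 --config-str='...' -p 5000

# Stage 2: lower the threshold so writes start splitting
mcrouter --big-value-split-threshold=524288 --config-str='...' -p 5000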

Note that all chunks are sent to the same memcached box as the original key would be, so you're still transferring the same amount of data from a single memcached box to the client. If you transfer huge values this way, you still have to wait for the individual chunks to arrive serially, which might explain the timeouts you're seeing. Can you share the size of the values you're setting/fetching and the value of N you tried?
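
As a back-of-the-envelope illustration of that serial cost (the 10 MB value size is a hypothetical example):

N = 524288                     # the split threshold mentioned above
value_size = 10 * 1024 * 1024  # hypothetical 10 MB value
chunks = -(-value_size // N)   # ceiling division: 20 chunks
print(chunks)                  # 20 serial gets, plus one get for the index key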

jamescarr commented on May 3, 2024

Thanks, this clears it all up.

My timeouts, it turned out, were unrelated.

jamescarr commented on May 3, 2024

If anyone wants some fun: it turned out that the cached value for the view context of our blog posts was weighing in at 6.5 MB. Each. Those sets were getting rejected, but I think once we turned on big value splitting they overloaded our cache servers. ;-)

marko-jovicic commented on May 3, 2024

The random suffix is needed for consistency: you can simply remove the original key and the chunks will be 'deleted', since there is no way to access them without knowing the random suffix. It also takes care of simultaneous sets, since only one index key will win the race, and only its random suffix will be valid.

I have a few questions about the big-value-split-threshold option:

  1. When the index key is deleted, I found that the related parts/chunks are not deleted. Is this expected behaviour?
  2. In the case of simultaneous sets, the last key set references its own parts/chunks, which is fine. However, chunks belonging to the earlier sets are not deleted; shouldn't they be?

mcrouter is version 1.0 (built using the Dockerfile). The memcached version is 1.4.13.
mcrouter is started with this command:

# a 5-byte threshold, just for test purposes
mcrouter --big-value-split-threshold=5 --config-str='{"pools":{"A":{"servers":["127.0.0.1:5001"]}},"route":"PoolRoute|A"}' -p 5000
