Coder Social home page Coder Social logo

Comments (3)

evanmcc avatar evanmcc commented on August 17, 2024

Haven't gone through all of the design docs yet, but would love two see two things added:

  • time complexities for the various operations and a comparison to the current setup
  • references to the literature, where it has inspired various features, etc.

from bitcask.

engelsanchez avatar engelsanchez commented on August 17, 2024

Unfortunately I don't have any reference paper for this. I would love to know if you know of any to avoid wasting my time on bad paths. My naive search for "spill to disk hash table" didn't really turn up anything relevant. Searching for "hot data and virtual memory" got me some pages of Readings in Database Systems, which I promptly bought. But the paper in question was about executing joins and no other paper in there seem similar enough. So I'm down $60, but no reference.

As for the time complexity: without considering the fact that some pages are memory mapped and thus there is the probability of a heavy penalty for operations once you consume the entire memory buffer, the data structure is a very plain hashtable where items with the same hash are laid out sequentially in pages instead of in a list, for example. The table is never rehashed. So in theory it is still O(1) + cost of scanning a series of pages which is proportional to the data set size. I am not sure how to compute time complexities that are more meaningful for the interesting scenarios where any number of the pages might generate I/O. If you assume that we can not recommend its use if the hot data does not fit in memory, because you fall off the memory/disk cliff, you could say that it should stay in the vicinity of O(1) + k * num_keys or something.

from bitcask.

jsvisa avatar jsvisa commented on August 17, 2024

@engelsanchez Looks forward to it, but is there any news about the implement?

I just come up with a simple implement, replace the keydir data struct with a LevelDB instance, the implement is easy, but not for the time complex and any other criticals.

from bitcask.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.