spacejam / sled Goto Github PK
View Code? Open in Web Editor NEWthe champagne of beta embedded databases
License: Apache License 2.0
the champagne of beta embedded databases
License: Apache License 2.0
take the arguments, and implement these operations:
Right now some of the core components such as tree
, log
, etc. are larger files which include a couple of different structs
, enums
and their corresponding methods and functions.
Those components should be modularized so that a clear file hierarchy is present which makes it easier to understand and reason about the code, its different components and their relationships.
@spacejam already started the process for the Log
component in https://github.com/spacejam/rsdb/pull/27.
this was skipped to simplify the in-memory prototype, but we need to use a frag so that we can linearize key updates with splits
should have similar basic structure as Arc
:
AtomicUsize
the use of a dictionary dramatically improves zstd compression and decompression performance.
when executing operations, we want to know about:
For latency, we care about the latency distribution, which means we need to collect percentiles. This is actually a pretty interesting problem, since almost all existing solutions REALLY suck and use probabilistic approches that destroy accuracy. One approach that I like is:
Vec
of AtomicUsize
that is pretty big.AtomicUsize
and fetch_add
1 to it.here's an example of a function that converts from a number to a slot number, and back: https://github.com/spacejam/loghisto/blob/master/metrics.go#L316
For throughput, just atomically increment a counter for each of the ops, and have a thread spit out the values every second, resetting it to 0. The latency percentiles can be for the duration of the entire benchmark, for now.
basic idea:
https://github.com/spacejam/tla-rust#lock-free-epoch-based-gc
it may be sufficient to use ping-pong epochs for this to simplify some logic. we keep a single word "header" that is CAS'd by threads to coordinate:
1 bit | 1 bit | 7 bits | 1 bit | 7 bits
left/right active | left state | left active thread count | right state | right active thread count
the current radix implementation is super naive, and all nodes contain 64 (1 << 6) atomic pointers to possibly nonexistant children. because this is a dense keyspace, the first few layers will be full, with the bottom leaves (of which there are 64x more than the above layer...) possibly being quite sparse. we want to keep the overall structure lock-free, since this structure is in the hot-path for every operation. look into ways of using less memory for sparse nodes, and growing/shrinking nodes with a CAS to keep sparsity in check.
naive recovery, no handling of LRU cache-related stuff yet. no page table snapshotting. read it all into memory for reconstruction.
Log
traitPageCache
during all page operations:
calls to the Lru bookkeeping structures are inefficient. should use simpler data structures.
reconstructing pages in-memory involves chasing linked-list pointers around
implement a linked-array structure that minimizes cache thrashing during page reconstruction
set/del only
I am trying to port rsdb
to crossbeam-epoch
: https://github.com/crossbeam-rs/crossbeam-epoch/
In doing so, I found out that Stack
probably should require T: Send
, because an item can be sent to elsewhere. In turn, I guess PageCache
's P
should be Send
because of inner: Radix<Stack<CacheEntry<P>>>
, and then Frag
should be Send
. What do you think?
io buffer claim
io buffer seal
io buffer unique write
gc enter epoch
gc write garbage to (possibly different) epoch
gc clear garbage guaranteed not to be in use by thread in epoch or any earlier epochs
tree ops...
etc...
https://github.com/spacejam/tla-rust
LogID
in CacheEntry
non-optionalin the future we may want to buffer things before claiming a log slot for updates, but for now let's just keep it simple
we can't compress on the log-buffer level without destroying the property of a LogID
being its physical offset, so we need to compress before writing here. Desired properties:
the pagetable should know how to retrieve frags from the log to satisfy non-resident reads
calls to the page materializer take a lot of time, try to cache results from it so we don't spend as much cpu calling it on the same hot pages over and over.
to improve data locality and reduce written data, write adjacent chunks of frags in a single stack at once to the log
the benchmark tool should accept arguments for these parameters:
currently the pagecache recovers itself, but does not communicate to the tree on top about updates it encounters to the tree root, since it doesn't know what a tree root is. We should support an interface that lets trees inject custom logic into the recovery (and eventually the snapshotting) process.
on recovery, we should be able to recover a snapshot of the pagetable, then read the log after the pagetable was created for full reconstruction. we only want to read and recover the structure of pages and the offsets of chunks of frags in the log for future retrieval, without pulling in any kv data.
initially this can be done by forking and snapshotting, similar to redis
define structures around persisted fragments and base nodes, and a way of easily generating a block of serialized state from a Frags
struct.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.