MemTable WAL implementation (filodb issue, closed)

filodb commented on June 12, 2024
MemTable WAL implementation

from filodb.

Comments (5)

velvia commented on June 12, 2024

Now that we've switched to FiloMemTable, we need fast in-memory, off-heap storage that can also be persisted. Some options:

Chronicle-java

https://github.com/xerial/larray


velvia commented on June 12, 2024

@parekuti here are some guidelines for the write-ahead log implementation for the memtable.

Requirements

  • Must be able to save the state of FiloAppendStore and FiloMemTable so that it can be restored after a crash
  • Must be able to save new Filo chunks to disk as FiloAppendStore appends them
  • Must handle the case where FiloAppendStore rewrites the most current chunk: instead of appending new chunks, it replaces the most recently appended chunks
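
The third requirement (replacing the most recently appended chunks rather than appending) could be handled in several ways. Below is a minimal Python sketch of one option, truncate-and-rewrite, and not FiloDB's actual implementation. `ChunkLog`, `append`, `replace_last`, and `read_all` are hypothetical names, and the log is simplified to one length-prefixed chunk per record.

```python
import io
import struct


class ChunkLog:
    """Length-prefixed chunk records on a seekable binary stream (hypothetical helper)."""

    def __init__(self, stream):
        self.stream = stream
        self.last_offset = None  # start offset of the most recently written record

    def _write_record(self, chunk):
        # 4-byte little-endian length followed by the chunk bytes.
        self.stream.write(struct.pack("<I", len(chunk)) + chunk)
        self.stream.flush()

    def append(self, chunk):
        self.stream.seek(0, io.SEEK_END)
        self.last_offset = self.stream.tell()
        self._write_record(chunk)

    def replace_last(self, chunk):
        # Assumption: a rewrite is modeled as dropping the last record
        # and writing the new chunk in its place.
        if self.last_offset is None:
            raise ValueError("no record to replace")
        self.stream.seek(self.last_offset)
        self.stream.truncate()
        self._write_record(chunk)

    def read_all(self):
        self.stream.seek(0)
        data = self.stream.read()
        chunks, pos = [], 0
        while pos < len(data):
            (size,) = struct.unpack_from("<I", data, pos)
            pos += 4
            chunks.append(data[pos:pos + size])
            pos += size
        return chunks
```

Truncating in place keeps the file compact, but a crash between the truncate and the rewrite loses the last chunk. An append-only variant with an explicit "replace" marker record would avoid that at the cost of a slightly more complex replay.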

If a crash happens, the on-disk file must allow restoring all the state of the FiloAppendStore as well as the partSegKeyMap in the FiloMemTable. However, the thought is that the partSegKeyMap itself does not need to be preserved on disk, because the partition and segment keys for each row could be recovered from the chunks themselves.

At a higher level, we must be able to restore the state of all the active NodeCoordinatorActors. That means the active and flushing memtables and, for each NodeCoordinatorActor, the dataset, version, and ingestion schema / columns. This needs to be persisted somewhere.

Write-Ahead Log File Format

While the FiloMemTable already uses binary Filo chunks, we still need a file format to contain the chunks. This is a proposal for that format.

File Header

The file header consists of the following bytes. The + signifies an offset in hex. Everything is written little endian.

  • +0000: 8 bytes: The UTF8 bytes of the string "FiloWAL" followed by 0x00
  • +0008: 2 bytes: 0x0001 - signifying a header with column definitions
  • +000a: 2 bytes: the number of columns
  • +000c: 2 bytes: the number of bytes of column definitions (NN)
  • +000e: NN bytes: The output of DataColumn.toString for each column, UTF8-encoded / written using DataWriter.write(string)
  • +000e + NN: 2 bytes: 0x0002 - signifying a section holding Filo chunks, which should contain one chunk per column
  • +0010 + NN: 4 bytes: number of bytes of first Filo columnar chunk
  • +0014 + NN: first Filo columnar chunk
    The length-then-chunk pattern above repeats for each columnar chunk.
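
To make the byte layout above concrete, here is a minimal Python sketch of an encoder and decoder for the proposed header and one chunk section. It illustrates the layout only and is not the FiloDB code. `encode_wal` and `decode_wal` are hypothetical names, and joining the column definitions with newlines is an assumption, since the proposal does not say how the individual strings are delimited inside the NN bytes.

```python
import struct

MAGIC = b"FiloWAL\x00"   # +0000: 8 bytes, "FiloWAL" followed by 0x00
HEADER_TAG = 0x0001      # +0008: header with column definitions
CHUNKS_TAG = 0x0002      # section holding Filo chunks


def encode_wal(column_defs, chunks):
    """Build the header plus one chunk section; everything is little endian."""
    if len(column_defs) != len(chunks):
        raise ValueError("expected one chunk per column")
    # Assumption: column definition strings are newline-joined UTF-8.
    defs = "\n".join(column_defs).encode("utf-8")
    out = bytearray(MAGIC)
    out += struct.pack("<HHH", HEADER_TAG, len(column_defs), len(defs))
    out += defs
    out += struct.pack("<H", CHUNKS_TAG)
    for chunk in chunks:
        out += struct.pack("<I", len(chunk))  # 4-byte chunk length
        out += chunk
    return bytes(out)


def decode_wal(data):
    """Parse bytes produced by encode_wal; raise ValueError on malformed input."""
    if data[:8] != MAGIC:
        raise ValueError("bad magic bytes")
    tag, ncols, nn = struct.unpack_from("<HHH", data, 8)
    if tag != HEADER_TAG:
        raise ValueError("unexpected header tag")
    pos = 0x0E
    defs = data[pos:pos + nn].decode("utf-8").split("\n") if nn else []
    pos += nn
    (section_tag,) = struct.unpack_from("<H", data, pos)
    if section_tag != CHUNKS_TAG:
        raise ValueError("unexpected section tag")
    pos += 2
    chunks = []
    for _ in range(ncols):
        (size,) = struct.unpack_from("<I", data, pos)
        pos += 4
        chunk = data[pos:pos + size]
        if len(chunk) != size:
            raise ValueError("truncated chunk, possibly a partial write")
        chunks.append(chunk)
        pos += size
    return defs, chunks
```

The truncation check matters for crash recovery: a write interrupted mid-chunk leaves a length prefix that promises more bytes than the file holds, and the reader needs to detect that rather than return a short chunk.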


velvia commented on June 12, 2024

Directory structure:

${memtable-wal-dir} / $dataset_$version / $timestamp.wal
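
As an illustration of this layout, here is a small Python sketch that builds and parses such paths. `wal_path` and `parse_wal_path` are hypothetical helpers, and treating the timestamp as an integer (for example epoch milliseconds) and the version as an integer are assumptions, since the comment does not specify either format.

```python
from pathlib import Path


def wal_path(wal_dir, dataset, version, timestamp):
    """Return <wal_dir>/<dataset>_<version>/<timestamp>.wal."""
    return Path(wal_dir) / f"{dataset}_{version}" / f"{timestamp}.wal"


def parse_wal_path(path):
    """Recover (dataset, version, timestamp) from a path built by wal_path."""
    path = Path(path)
    # Split on the last underscore so dataset names may contain underscores.
    dataset, version = path.parent.name.rsplit("_", 1)
    return dataset, int(version), int(path.stem)
```

For example, `wal_path("/var/filodb/wal", "my_dataset", 0, 1466000000000)` yields `/var/filodb/wal/my_dataset_0/1466000000000.wal` on a POSIX system, and `parse_wal_path` maps that path back to `("my_dataset", 0, 1466000000000)`.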

We also need to store the list of datasets being written somewhere.


velvia commented on June 12, 2024

@parekuti is working on this issue, but for some reason I cannot assign it to her.


velvia commented on June 12, 2024

The PR for this has been merged.

