Coder Social home page Coder Social logo

learndb-py's Introduction

LearnDB

What I Cannot Create, I Do Not Understand -Richard Feynman

In the spirit of Feynman's immortal words, the goal of this project is to better understand the internals of databases by implementing a database management system (DBMS) (sqlite clone) from scratch.

Parts

The following are some parts of a DBMS (In no particular order):

  • Storage Layer

    • how is the data stored (physically and data-structure) and accessed?
    • is the data partitioned or replicated?
  • Logical Modelling

    • what is the syntax, semantics, and expressibility of the language/interface for storing data and expressing computations on it
    • sql syntax, relational semantics, declarative style, is the historically dominant choice
    • increasingly, alternatives are becoming popular e.g. functional (hadoop map-reduce, spark) and imperative
    • many different storage models, e.g. key-value, document, graph
    • these alternatives also tend to be more piece-meal, unlike sql
    • and sometimes the distinction between a database and non-database information management/processing system is not clear
  • Query Execution

    • how is query plan generated
    • what is the basic unit of work?
    • how is the work distributed and resources managed?
  • Query Optimization

    • for a user given query, are there many plans that satisfy it?
    • how do we search this plan-space for the lowest cost plan?
  • Transactions and Concurrency

    • how do multiple users concurrently use the system?
    • how does the system ensure concurrent access doesn't leave the system in an inconsistent state?
    • how does the system ensure system failures, e.g. power failure don't leave in an unrecoverable/inconsistent state
    • what is an inconsistent state?
    • what is a recoverable state?
    • How are ACID properties guaranteed

Todo

  • immanent work/issues are tracked in tasks.md
  • long-term ideas are tracked in docs/future-work.md

References consulted

Development

  • To install for development, i.e. src can be edited from without having to reinstall:

    • cd <repo_root>
    • create virtualenv: python3 -m venv venv
    • activate: source venv\bin\activate
    • python -m pip install -r requirements.txt
    • install in edit mode: python3 -m pip install -e .
  • NOTE: run.py provides a script to run learndb; this requires learndb be installed first.

Run

  • Requires > python 3.10 and pytest

  • Run btree tests: -python -m pytest -s tests/btree_tests.py # stdout

  • python -m pytest tests/btree_tests.py # suppressed out

  • Run end-to-end tests: ... e2e_tests.py

  • Run serde tests: ... serde_tests.py

  • Run language parser tests: ... lang_tests.py

  • Run specific test: python -m pytest tests.py -k test_name

  • Clear pytest cache python -m pytest --cache-clear

  • Run REPL: python learndb.py repl

Generate Docs

e.g. python -m pydoc -w .\btree.py

Getting Started- Tutorial

TODO: update and move to docs/tutorial.md

Let's consider how we can use the repl

start the REPL (Only supports a global, implicit table)

python learndb.py repl

Currently, supported commands include

insert 3 into tree

insert 3

select and output all rows (no filtering support for now)

select

delete 3 from tree

delete 3

Supported meta-commands: quit REPl

.quit

print btree

.btree

performs internal consistency checks on tree

.validate

Deleting the db.file will effectively drop the entire database.

Supported Grammar

The supported grammar can be found at: <repo_root>/learndb/lang_parser/grammar.py

Misc Notes

  • LEAF_NODE_MAX_CELLS, INTERNAL_NODE_MAX_CELLS control how many max children each node type can support

Gotchas

learndb-py's People

Contributors

spandanb avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.