optd

optd (pronounced as op-dee) is a database optimizer framework. It is a cost-based optimizer that searches the plan space using the rules that the user defines and derives the optimal plan based on the cost model and the physical properties.
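To make "rules plus a cost model" concrete, here is a minimal sketch of the idea. It is not optd's API and not a Cascades implementation (there is no memo table, no pruning, and rules fire only at the plan root). The `Plan` type, the 1% join selectivity, and the "sum of rows produced" cost function are all assumptions made for illustration.

```rust
/// Toy logical plan: a table scan or an inner join of two sub-plans.
#[derive(Clone, Debug, PartialEq)]
enum Plan {
    Scan { table: &'static str, rows: u64 },
    Join(Box<Plan>, Box<Plan>),
}

/// Assumed cardinality model: every join keeps 1% of the cross product.
fn card(plan: &Plan) -> u64 {
    match plan {
        Plan::Scan { rows, .. } => *rows,
        Plan::Join(l, r) => card(l) * card(r) / 100,
    }
}

/// Assumed cost model: total number of rows produced by all operators.
fn cost(plan: &Plan) -> u64 {
    match plan {
        Plan::Scan { rows, .. } => *rows,
        Plan::Join(l, r) => cost(l) + cost(r) + card(plan),
    }
}

/// Rule: (A join B) join C => A join (B join C).
fn associate(plan: &Plan) -> Option<Plan> {
    if let Plan::Join(left, c) = plan {
        if let Plan::Join(a, b) = &**left {
            let inner = Plan::Join(b.clone(), c.clone());
            return Some(Plan::Join(a.clone(), Box::new(inner)));
        }
    }
    None
}

/// Rule: A join B => B join A.
fn commute(plan: &Plan) -> Option<Plan> {
    match plan {
        Plan::Join(l, r) => Some(Plan::Join(r.clone(), l.clone())),
        Plan::Scan { .. } => None,
    }
}

/// Apply the rules at the root until no new plan appears, then return the
/// cheapest plan seen. A real optimizer explores far more cleverly.
fn optimize(start: Plan) -> Plan {
    let rules: [fn(&Plan) -> Option<Plan>; 2] = [associate, commute];
    let mut seen = vec![start];
    let mut next = 0;
    while next < seen.len() {
        for rule in rules.iter() {
            if let Some(candidate) = rule(&seen[next]) {
                if !seen.contains(&candidate) {
                    seen.push(candidate);
                }
            }
        }
        next += 1;
    }
    seen.into_iter().min_by_key(|p| cost(p)).unwrap()
}
```

With tables `a` and `c` at 1000 rows each and `b` at 10 rows, the plan `(a join c) join b` costs 13010 under this model, while `optimize` returns a plan costing 3110 because it joins the small table first.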

The primary objective of optd is to explore the potential challenges involved in effectively implementing a cost-based optimizer for real-world production usage. optd implements the Columbia Cascades optimizer framework based on Yongwen Xu's master's thesis. Besides Cascades, optd also provides a heuristic optimizer implementation for testing purposes.

The other key objective is to implement a flexible optimizer framework that supports adaptive query optimization (a.k.a. reoptimization) and adaptive query execution. optd executes a query, captures runtime information, and uses this data to guide subsequent plan space searches and cost model estimations. This progressive approach lets each run of a query refine the plan chosen for the next one, and allows the optimizer to explore a large plan space over time.
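The feedback loop described above can be sketched as follows. This is an illustration only, not optd's implementation: the `Stats` cache, `choose_build_side`, and `execute` are hypothetical names, and "execution" is simulated by handing over the true row counts.

```rust
use std::collections::HashMap;

/// Hypothetical cache of row counts observed during earlier executions.
struct Stats {
    default_rows: u64,
    observed: HashMap<String, u64>,
}

impl Stats {
    fn new(default_rows: u64) -> Self {
        Stats { default_rows, observed: HashMap::new() }
    }

    /// Use the observed row count if there is one, else a fixed guess.
    fn estimate(&self, table: &str) -> u64 {
        *self.observed.get(table).unwrap_or(&self.default_rows)
    }

    fn record(&mut self, table: &str, rows: u64) {
        self.observed.insert(table.to_string(), rows);
    }
}

/// Planning decision: build the hash table on the input estimated to be
/// smaller (ties go to the left input).
fn choose_build_side<'a>(stats: &Stats, left: &'a str, right: &'a str) -> &'a str {
    if stats.estimate(left) <= stats.estimate(right) {
        left
    } else {
        right
    }
}

/// Stand-in for query execution: records the true row count of each input.
fn execute(true_rows: &[(&str, u64)], stats: &mut Stats) {
    for (table, rows) in true_rows {
        stats.record(table, *rows);
    }
}
```

On the first run nothing has been observed, so both inputs get the default estimate and the left input is chosen. After one call to `execute`, the same planning call picks whichever input was actually smaller.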

Currently, optd is integrated into Apache Arrow DataFusion as a physical optimizer. It receives the logical plan from DataFusion, applies various physical optimizations (e.g., determining the join order), and then converts the result back into a DataFusion physical plan for execution.

optd is a research project and is still evolving. It should not be used in production. The code is licensed under MIT.

Get Started

There are two demos you can run with optd. More information is available in the docs.

cargo run --release --bin optd-adaptive-tpch-q8
cargo run --release --bin optd-adaptive-three-join

You can also run the DataFusion CLI to experiment with optd interactively.

cargo run --bin datafusion-optd-cli

Documentation

The documentation is available in the mdbook format in the docs directory.

Structure

  • datafusion-optd-cli: The patched Apache Arrow DataFusion (version=32) CLI that calls into optd.
  • datafusion-optd-bridge: Implementation of the Apache Arrow DataFusion query planner as a bridge between optd and Apache Arrow DataFusion.
  • optd-core: The core framework of optd.
  • optd-datafusion-repr: Representation of Apache Arrow DataFusion plan nodes in optd.
  • optd-adaptive-demo: Demo of the adaptive optimization capabilities of optd. More information is available in the docs.
  • optd-sqlplannertest: Planner test of optd based on risinglightdb/sqlplannertest-rs.
  • gungnir: Scalable, memory-efficient, and parallelizable statistical methods for cardinality estimation (e.g., TDigest, HyperLogLog).
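
For readers unfamiliar with HyperLogLog, the sketch below shows the general technique for estimating the number of distinct values in fixed memory. It is a textbook-style illustration and does not reflect gungnir's actual API or implementation. The struct name, the use of the standard library's `DefaultHasher`, and the precision bounds are assumptions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal HyperLogLog with 2^p one-byte registers.
struct HyperLogLog {
    p: u32,
    registers: Vec<u8>,
}

impl HyperLogLog {
    fn new(p: u32) -> Self {
        // The bias constant used in `estimate` assumes at least 128 registers.
        assert!((7..=16).contains(&p));
        HyperLogLog { p, registers: vec![0; 1 << p] }
    }

    fn add<T: Hash>(&mut self, item: &T) {
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        let x = hasher.finish();
        // The top p bits select a register.
        let idx = (x >> (64 - self.p)) as usize;
        // The remaining bits give the rank: the 1-based position of the
        // first set bit, capped when all remaining bits are zero.
        let rest = x << self.p;
        let rank = (rest.leading_zeros().min(64 - self.p) + 1) as u8;
        if rank > self.registers[idx] {
            self.registers[idx] = rank;
        }
    }

    fn estimate(&self) -> f64 {
        let m = self.registers.len() as f64;
        let alpha = 0.7213 / (1.0 + 1.079 / m);
        let sum: f64 = self.registers.iter().map(|&r| 2f64.powi(-(r as i32))).sum();
        let raw = alpha * m * m / sum;
        let zeros = self.registers.iter().filter(|&&r| r == 0).count() as f64;
        // Small-range correction: fall back to linear counting.
        if raw <= 2.5 * m && zeros > 0.0 {
            m * (m / zeros).ln()
        } else {
            raw
        }
    }
}
```

With `p = 12` the sketch uses 4096 bytes of registers and has a typical relative error of roughly 1.04 / sqrt(4096), about 1.6%. Adding the same value more than once never changes the estimate, which is what makes the structure suitable for distinct-count statistics.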

Related Works

Contributors

skyzh, averyqi115, gun9nir, yliang412, wangpatrick57, jurplel, alschlo, xiaguan, xzhseh, sweetsuro
