calcite-tutorial's Introduction

Apache Calcite

A tutorial of Apache Calcite for the BOSS'21 VLDB workshop.

In this tutorial, we demonstrate the main components of Calcite and how they interact with each other. To do this, we build, step by step, a fully fledged query processor for data residing in Lucene indexes, and gradually introduce extensions covering common use cases that appear in practice.

The project has three modules:

  • indexer, containing the necessary code to populate some sample dataset(s) into Lucene to demonstrate the capabilities of the query processor;
  • solution, containing the material of the tutorial fully implemented along with a few unit tests ensuring the correctness of the code;
  • template, containing only the skeleton and documentation of selected classes, which the attendees can use to follow the real-time implementation of the Lucene query processor.

Requirements

  • JDK version >= 8

Quickstart

To compile the project, run:

./mvnw package -DskipTests 

To load/index the TPC-H dataset in Lucene, run:

java -jar indexer/target/indexer-1.0-SNAPSHOT-jar-with-dependencies.jar

The indexer creates the data under the target/tpch directory. The TPC-H dataset was generated using the dbgen command-line utility (dbgen -s 0.001) provided in the original TPC-H tools bundle.

To execute SQL queries over the data in Lucene, and get a feel for what the finished query processor looks like, run:

java -jar solution/target/solution-1.0-SNAPSHOT-jar-with-dependencies.jar SIMPLE queries/tpch/Q0.sql
java -jar solution/target/solution-1.0-SNAPSHOT-jar-with-dependencies.jar ADVANCED queries/tpch/Q0.sql
java -jar solution/target/solution-1.0-SNAPSHOT-jar-with-dependencies.jar PUSHDOWN queries/tpch/Q0.sql

The finished query processor provides three execution modes, representing the three main sections which are covered in this tutorial.

You can use one of the predefined queries under the queries/tpch directory, or create a new file and write your own.

In SIMPLE mode, the query processor performs no advanced optimization. It shows how to build an adapter from scratch with very few lines of custom code by relying on the ScannableTable interface and the built-in operators of the EnumerableConvention.
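
The idea behind this mode can be illustrated without Calcite on the classpath. The following is a minimal stdlib-only sketch, not the tutorial's code: `SimpleModeSketch`, `RowSource`, `filter`, and `sampleTable` are hypothetical names, and the sample rows are made up. The table only knows how to return all of its rows as `Object[]`, loosely mirroring the role a scannable table plays, while a generic operator owned by the engine does the filtering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Stdlib-only sketch; every name here is hypothetical, not Calcite's API. */
final class SimpleModeSketch {

  /** A table that can only hand back all of its rows. */
  interface RowSource {
    List<Object[]> scan();
  }

  /** Generic filter operator owned by the engine, not by the table. */
  static List<Object[]> filter(RowSource source, Predicate<Object[]> condition) {
    List<Object[]> result = new ArrayList<>();
    for (Object[] row : source.scan()) { // always a full scan
      if (condition.test(row)) {
        result.add(row);
      }
    }
    return result;
  }

  /** Toy three-row table with columns (id, label); the values are made up. */
  static RowSource sampleTable() {
    List<Object[]> rows = Arrays.asList(
        new Object[] {1, "a"},
        new Object[] {2, "b"},
        new Object[] {3, "c"});
    return () -> rows;
  }

  public static void main(String[] args) {
    // Roughly: SELECT * FROM t WHERE id > 1
    for (Object[] row : filter(sampleTable(), r -> (Integer) r[0] > 1)) {
      System.out.println(Arrays.toString(row));
    }
  }
}
```

The table needs no knowledge of filters, projections, or joins, which keeps the adapter small. The cost is that every query reads every row.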

In ADVANCED mode, the query processor combines operators with different characteristics, demonstrating the most common implementation pattern of an adapter and laying the groundwork for building federated query engines with Calcite. In this mode, we combine two kinds of operators, using the built-in EnumerableConvention and the custom LuceneRel#LUCENE convention, along with some basic optimization rules.
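
As a rough mental model of mixing two conventions, consider the toy below. It is a stdlib-only sketch under stated assumptions, not the tutorial's implementation: `ConventionSketch`, `Node`, `bridge`, and the `Converter` operator are hypothetical names. In Calcite this job belongs to the planner and its converter rules; the sketch hard-codes it to show only the outcome, namely that a converter sits at every boundary where a parent consumes a child from a different convention.

```java
/** Stdlib-only sketch; every name here is hypothetical, not Calcite's API. */
final class ConventionSketch {

  /** Where an operator runs: inside the source or in the generic engine. */
  enum Convention { LUCENE, ENUMERABLE }

  /** A plan operator with at most one input. */
  static final class Node {
    final String name;
    final Convention convention;
    final Node input; // null for a leaf such as a scan

    Node(String name, Convention convention, Node input) {
      this.name = name;
      this.convention = convention;
      this.input = input;
    }

    /** Renders the plan as nested operator names. */
    String explain() {
      return input == null ? name : name + "(" + input.explain() + ")";
    }
  }

  /** Inserts a converter wherever parent and child conventions differ. */
  static Node bridge(Node node) {
    if (node.input == null) {
      return node;
    }
    Node child = bridge(node.input);
    if (child.convention != node.convention) {
      // The converter reads rows from the child and exposes them
      // in the convention the parent expects.
      child = new Node("Converter", node.convention, child);
    }
    return new Node(node.name, node.convention, child);
  }

  /** A filter running in the engine on top of a scan running in the source. */
  static Node samplePlan() {
    Node scan = new Node("Scan", Convention.LUCENE, null);
    return new Node("Filter", Convention.ENUMERABLE, scan);
  }

  public static void main(String[] args) {
    System.out.println(bridge(samplePlan()).explain());
  }
}
```

Running `main` prints `Filter(Converter(Scan))`: the scan stays in the source's convention, and everything above the converter runs in the generic engine.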

In PUSHDOWN mode, the query processor combines operators with different characteristics and is also capable of pushing simple filtering conditions to the underlying engine by introducing custom rules, expression transformations, and additional operators.
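
The benefit of pushing a filter down can be sketched with plain Java collections. This is a minimal illustration under stated assumptions, not the tutorial's rules or operators: `PushdownSketch`, `IndexedSource`, `filterInEngine`, and `filterWithPushdown` are hypothetical names, the sample rows are made up, and a `HashMap` loosely plays the role of an index in the underlying engine. The "rule" here is a single `if`: an equality condition on the indexed column is handed to the source, and anything else falls back to scanning and filtering in the engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stdlib-only sketch; every name here is hypothetical, not Calcite's API. */
final class PushdownSketch {

  /** A source that can scan everything or look up rows by column 0. */
  static final class IndexedSource {
    private final List<Object[]> rows;
    private final Map<Object, List<Object[]>> indexOnColumn0 = new HashMap<>();
    int rowsRead = 0; // rows handed over to the engine so far

    IndexedSource(List<Object[]> rows) {
      this.rows = rows;
      for (Object[] row : rows) {
        indexOnColumn0.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row);
      }
    }

    List<Object[]> scanAll() {
      rowsRead += rows.size();
      return rows;
    }

    List<Object[]> lookup(Object key) {
      List<Object[]> hits =
          indexOnColumn0.getOrDefault(key, Collections.<Object[]>emptyList());
      rowsRead += hits.size();
      return hits;
    }
  }

  /** No pushdown: the engine reads every row, then filters. */
  static List<Object[]> filterInEngine(IndexedSource source, int column, Object value) {
    List<Object[]> result = new ArrayList<>();
    for (Object[] row : source.scanAll()) {
      if (value.equals(row[column])) {
        result.add(row);
      }
    }
    return result;
  }

  /** Pushdown: equality on the indexed column is answered by the source. */
  static List<Object[]> filterWithPushdown(IndexedSource source, int column, Object value) {
    if (column == 0) {
      return source.lookup(value);
    }
    return filterInEngine(source, column, value); // condition cannot be pushed
  }

  /** Toy three-row source with columns (id, label); the values are made up. */
  static IndexedSource sampleSource() {
    return new IndexedSource(Arrays.asList(
        new Object[] {1, "a"},
        new Object[] {2, "b"},
        new Object[] {3, "c"}));
  }

  public static void main(String[] args) {
    IndexedSource plain = sampleSource();
    filterInEngine(plain, 0, 2);
    IndexedSource pushed = sampleSource();
    filterWithPushdown(pushed, 0, 2);
    System.out.println("rows read without pushdown: " + plain.rowsRead);
    System.out.println("rows read with pushdown: " + pushed.rowsRead);
  }
}
```

Both variants return the same single row, but the engine receives three rows without pushdown and one row with it.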

Contributors

  • zabetak
  • julianhyde
