
pql's Introduction

pipelined query language


This Go library compiles a pipeline-based query language (inspired by the Kusto Query Language) into SQL. It has been tested specifically against the ClickHouse SQL dialect, but the generated SQL is intended to be database-agnostic. This repository contains the Go library and a CLI that invokes it.

For example, the following expression:

StormEvents
| where DamageProperty > 5000 and EventType == "Thunderstorm Wind"
| top 3 by DamageProperty

will be compiled to SQL that is similar to:

SELECT *
FROM StormEvents
WHERE DamageProperty > 5000 AND EventType = 'Thunderstorm Wind'
ORDER BY DamageProperty DESC
LIMIT 3;

Getting Started

If you'd like to see a demo along with some examples, check out https://pql.dev.

To use pql in your Go code, a minimal example might look like this:

package main

import (
	"github.com/runreveal/pql"
)

func main() {
	sql, err := pql.Compile("users | project id, email | limit 5")
	if err != nil {
		panic(err)
	}
	println(sql)
}

Running this program should produce the following output:

$ go run test.go

WITH "__subquery0" AS (SELECT "id" AS "id", "email" AS "email" FROM "users")
SELECT * FROM "__subquery0" LIMIT 5;

Documentation

The following tabular operators are supported. The Microsoft KQL documentation is representative of the current pql API.

The following scalar functions are implemented within pql. Functions not in this list will be passed through to the underlying SQL engine. This allows the usage of the full APIs implemented by the underlying engine.

Column names with special characters can be escaped with backticks.

Get involved

License

Apache 2.0

pql's People

Contributors

abraithwaite, ejcx, jamesejr, shellcromancer, zombiezen


pql's Issues

Create parser

  • Tabular expression structure
  • #23
  • Parse count tabular operator
  • Parse where tabular operator
  • Parse take tabular operator
  • Parse project tabular operator
  • Parse summarize tabular operator
  • Parse sort tabular operator

Question on Validation

How can we validate the generated query? For example:

Allowed databases, allowed tables, allowed fields for each table, required conditions in where clauses, minimum and maximum values for the limit a user can pass, and so on.

Users will be writing their queries in PQL. We want to check that they reference valid tables, and also restrict how they can query rather than allowing them to query everything.
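
One possible shape for such a check, sketched with the standard library only. This is not part of the pql API: the Policy type and its Check method are hypothetical names, and the pre-check is purely textual (it splits on '|' and looks at the first stage and any limit/take stage), so a real implementation would more likely work from the parsed pipeline.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// Policy is a hypothetical allow-list; it is not part of the pql API.
type Policy struct {
	AllowedTables map[string]bool // tables a user may query
	MaxLimit      int             // largest row limit a user may request
}

// limitRE matches a whole "limit N" or "take N" pipeline stage.
var limitRE = regexp.MustCompile(`^(?:limit|take)\s+(\d+)$`)

// Check performs a naive textual pre-check of a pipeline query.
// It splits on '|', so it would misbehave on string literals that contain '|'.
func (p Policy) Check(query string) error {
	stages := strings.Split(query, "|")
	table := strings.TrimSpace(stages[0])
	if !p.AllowedTables[table] {
		return fmt.Errorf("table %q is not allowed", table)
	}
	for _, stage := range stages[1:] {
		m := limitRE.FindStringSubmatch(strings.TrimSpace(stage))
		if m == nil {
			continue // not a limit/take stage; this sketch does not inspect it
		}
		n, err := strconv.Atoi(m[1])
		if err != nil {
			return err
		}
		if n > p.MaxLimit {
			return fmt.Errorf("limit %d exceeds maximum %d", n, p.MaxLimit)
		}
	}
	return nil
}

func main() {
	p := Policy{AllowedTables: map[string]bool{"users": true}, MaxLimit: 100}
	fmt.Println(p.Check("users | project id, email | limit 5"))
	fmt.Println(p.Check("secrets | limit 5"))
	fmt.Println(p.Check("users | limit 5000"))
}
```

Running the check before compiling keeps rejected queries from ever reaching the database, but per-table field rules and where-clause conditions would need access to the parsed query rather than its text.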

Support for Parameterized Queries

We should figure out how to support parameterized queries for databases.

Databases implement them in several distinct ways, and we should be able to support each of them.

One idea: we may be able to use the passthrough mechanisms plus some side-channel loading of parameters.
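
As a rough illustration of the side-channel idea, the sketch below rewrites named placeholders in already-generated SQL into positional "?" markers and collects the values in order. Everything here is an assumption: the @name placeholder syntax, the Bind function, and the choice of "?" as the target style are hypothetical and not something pql provides.

```go
package main

import (
	"fmt"
	"regexp"
)

// paramRE matches hypothetical @name placeholders in generated SQL.
var paramRE = regexp.MustCompile(`@([A-Za-z_][A-Za-z0-9_]*)`)

// Bind replaces each @name with "?" and returns the matching values in
// placeholder order. It is purely textual, so it would also rewrite an
// @name that appears inside a SQL string literal.
func Bind(sql string, params map[string]any) (string, []any, error) {
	var args []any
	var missing error
	out := paramRE.ReplaceAllStringFunc(sql, func(m string) string {
		name := m[1:] // strip the leading '@'
		v, ok := params[name]
		if !ok {
			if missing == nil {
				missing = fmt.Errorf("missing parameter %q", name)
			}
			return m
		}
		args = append(args, v)
		return "?"
	})
	if missing != nil {
		return "", nil, missing
	}
	return out, args, nil
}

func main() {
	sql := `SELECT * FROM "users" WHERE "id" = @id AND "org" = @org`
	out, args, err := Bind(sql, map[string]any{"id": 7, "org": "acme"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	fmt.Println(args...)
}
```

The rewritten string and argument slice could then be handed to database/sql, for example db.Query(out, args...). Drivers that expect $1-style or named parameters would need a different rewrite.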

Add an LSP / Autocompletion support

Not sure what the best way to approach this problem is, but support for an LSP would be huge for obvious reasons.

Theoretically, this should be more straightforward than SQL because we have better context at the point of typing.

That is, context flows through the pipe, whereas in SQL's SELECT a, b, c FROM X we don't know the possible values for a, b, or c until the table X is specified.
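
To make the "context flows through the pipe" point concrete, here is a minimal sketch that tracks which columns are visible at the cursor. It assumes a hypothetical schema map and a columnsAt helper, and it only understands project stages with plain column names. It does not use pql's parser.

```go
package main

import (
	"fmt"
	"strings"
)

// columnsAt returns the columns visible at the end of a (possibly
// unfinished) pipeline. Only "project a, b" stages change the column set
// in this sketch; every other stage passes the columns through unchanged.
func columnsAt(schema map[string][]string, query string) []string {
	stages := strings.Split(query, "|")
	cols := schema[strings.TrimSpace(stages[0])]
	for _, stage := range stages[1:] {
		stage = strings.TrimSpace(stage)
		if !strings.HasPrefix(stage, "project ") {
			continue
		}
		var next []string
		for _, c := range strings.Split(strings.TrimPrefix(stage, "project "), ",") {
			next = append(next, strings.TrimSpace(c))
		}
		cols = next
	}
	return cols
}

func main() {
	// Hypothetical schema; a real language server would load this from the database.
	schema := map[string][]string{"users": {"id", "email", "created_at"}}
	// The user has typed up to "where " and is asking for completions.
	fmt.Println(columnsAt(schema, "users | project id, email | where "))
}
```

Because the table name comes first, the candidate list is known at every later stage, which is the advantage over SQL's SELECT-before-FROM ordering.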

Decide on name

Keeping an issue open here to decide on the language/repository name and rename things as appropriate.

Implement `as` operator

Reference

Since we're already extracting CTEs anyway, we can just have a no-op that captures aliases at particular points of a pipeline.

How does it differ from PRQL?

This seems very similar to the effort done by the PRQL team.

How does this project differ, and have you considered joining forces instead of creating a new SQL competitor?

Map key access

From @abraithwaite:

Support for map column type would be helpful I think. Not sure the best way to do this, but the syntax for a map key access is: mapcol['strkey'].

Functions in project

Not sure if expected or not. I want to do some post-processing of columns that are selected.

runreveal_logs
	| project toLower(eventName), notEmpty(eventName)
	| limit 5

Are things like this currently supported? I couldn't quite figure out whether they are, or whether another syntax is needed. The way I'm currently doing it, I get this error:

pql: parse pipeline query language: 2:26: expected '|', got '('
pql: one or more statements could not be compiled

Decide how to approach Nix dependencies in GitHub Actions CI

In other projects, I use Nix to manage the CI steps (example). This guarantees that running nix flake check locally reproduces the CI build exactly, but puts Nix in a more prominent place in the build process, which may not be desirable.

Alternatively, we could use Nix just as a package manager and use it to install ClickHouse (or possibly Go as well) in CI.
