Coder Social home page Coder Social logo

sigpwned / jdbq Goto Github PK

View Code? Open in Web Editor NEW
3.0 2.0 1.0 188 KB

JDBI-inspired Database Access Framework for Java + BigQuery

License: Apache License 2.0

ANTLR 1.23% Java 98.77%
java bigquery data-lake data-access-layer data-access-library persistence persistence-framework persistence-layer data-access-framework

jdbq's Introduction

jdbq tests Maven Central

A JDBI-inspired database access framework for Java 8+ and Google BigQuery.

Motivation

BigQuery is one of the best data lake implementations available on the market today. While its official Java client supports several useful features for working with data stores, such as parameter bindings, there are many important patterns it does not support, such as modular row and column mappers. JDBI is the state-of-the-art tool for working with persistence stores. JDBQ is a library for using BigQuery that makes the best features of JDBI available to BigQuery users.

Goals

  • To work directly with the official Java client
  • To provide the most important features from the JDBI framework
  • To improve the QOL and productivity of Java developers using BigQuery

Non-Goals

  • To provide all features from the JDBI framework

Example Usage

For the below examples, imagine a table with the following structure:

CREATE TABLE sales(
    buyer STRING NOT NULL,
    invoice STRING NOT NULL,
    sku STRING NOT NULL,
    quantity INT64 NOT NULL,
    timestamp TIMESTAMP NOT NULL
);

Initial Setup

First, the user should create a Jdbq instance:

BigQuery client=createBigQueryClient();
Jdbq jdbq=new Jdbq(client);

Users may also find it useful to add a customizer to set default dataset, SQL dialect, etc.

jdbq.getConfig().get(SqlStatements.class).addCustomizer(new StatementCustomizer() {
    @Override
    public void beforeExecution(QueryJobConfiguration.Builder stmt, StatementContext ctx) {
        stmt.setDefaultDataset(DEFAULT_DATASET_NAME).setUseLegacySql(false);
    }
});

Column Mapped Results

Consider the following query:

long quantity=jdbq.createQuery("""
        SELECT
            SUM(quantity) AS quantity
        FROM sales
        WHERE sku=:sku
            AND timestamp BETWEEN :since AND :until""")
    .bind("sku", "abcd1234")
    .bind("since", LocalDate.of(2023, 1, 1))
    .bind("until", LocalDate.of(2023, 3, 31))
    .mapTo(Long.class)
    .one();

This query computes how many units of SKU abcd1234 were sold in Q1 2023.

In this example, we see that we have named parameters in the query (e.g., :sku) with values provided using the bind method later. Next, the result is mapped to Long values, and then exactly one value is retrieved, otherwise an exception is thrown. This works because there are built-in ColumnMapper classes for most builtin types, such as Integer, Long, Double, String, LocalDate, and so on.

Row Mapped Results

Now consider this code:

Jdbq jdbq=createJdbq();

record SkuSales(String sku, long quantity) {}

jdbq.getConfig(RowMappers.class).register(new RowMapper<SkuSales>() {
    @Override
    public SkuSales map(FieldValueList fvs, StatementContext ctx) {
        String sku=fvs.get("sku").getStringValue();
        long quantity=fvs.get("quantity").getLongValue();
        return new SkuSales(sku, quantity);
    }
});

List<SkuSales> sales=jdbq.createQuery("""
        SELECT
            sku AS sku,
            SUM(quantity) AS quantity
        FROM sales
        WHERE timestamp BETWEEN :since AND :until
        GROUP BY 1
        ORDER BY 2 DESC
        LIMIT 10""")
    .bind("since", LocalDate.of(2023, 1, 1))
    .bind("until", LocalDate.of(2023, 3, 31))
    .mapTo(SkuSales.class)
    .list();

This query computes the top 10 SKUs with the most sales in Q1 2023.

In this example, we see our first RowMapper, which is custom code used to map a SQL query result row to a Java bean. In this case, each row is mapped to a SkuSales object. Note that the registering the RowMapper for the SkuSales class during initialization effectively decouples the serialization of records from business logic.

DML

The library also supports DML operations.

Jdbq jdbq=createJdbq();

long deleted=jdbq.createUpdate("""
        DELETE FROM sales
        WHERE sku=:sku AND quantity=0""")
    .bind("sku", "1234")
    .execute();

This query deletes all sales records with sku 1234 and quantity 0.

QueryFragment

JDBQ does have one important innovation over the rote JDBI feature set: the QueryFragment. A QueryFragment allows users to bundle SQL along with attributes and arguments for use in a query, which may contain other QueryFragment instances, and so on. For example:

Jdbq jdbq=createJdbq();
jdbq.get(SqlStatements.class).setTemplateEngine(new QueryFragmentTemplateEngine());

record SkuSales(String sku, long quantity) {}

jdbq.getConfig(RowMappers.class).register(new RowMapper<SkuSales>() {
    @Override
    public SkuSales map(FieldValueList fvs, StatementContext ctx) {
        String sku=fvs.get("sku").getStringValue();
        long quantity=fvs.get("quantity").getLongValue();
        return new SkuSales(sku, quantity);
    }
});

QueryFragment buyerPredicate;
if(filterToBuyer != null) {
    buyerPredicate = new QueryPredicate("buyer=:buyer").bind("buyer", filterToBuyer);
}
else {
    buyerPredicate = new QueryPredicate("TRUE");
}

List<SkuSales> sales=jdbq.createQuery("""
        SELECT
            sku AS sku,
            SUM(quantity) AS quantity
        FROM sales
        WHERE timestamp BETWEEN :since AND :until
            AND (<BUYER>)
        GROUP BY 1
        ORDER BY 2 DESC
        LIMIT 10""")
    .define("BUYER", buyerPredicate)
    .bind("since", LocalDate.of(2023, 1, 1))
    .bind("until", LocalDate.of(2023, 3, 31))
    .mapTo(SkuSales.class)
    .list();

This query computes the top 10 SKUs with the most sales in Q1 2023 from the given optional buyer.

Note that the buyer predicate includes an argument. Using a QueryFragment, the entire predicate is self-contained because it supports not only SQL but also attributes and arguments, and is therefore reusable. If this were handled without QueryFragment, then the builder of the overall query would have to know about how the predicate works, which violates encapsulation and reduces reusability.

The QueryFragment feature allows users to divide and conquer query generation, as well as to reuse components of query generation more freely. This style of query generation is sometimes referred to as the specification pattern.

Each QueryFragment has its own logical "namespace," which means that users don't have to worry about attribute or argument name overlap between QueryFragment instances, even when used in the same query.

Extensibility

The following key features have been brought over from JDBI:

  • Column mappers
  • Row mappers
  • Query customizers
  • Pluggable template engines
  • Custom arguments

FAQ

Why not just use JDBI with the Simba JDBC driver?

That is a fine option, and may work for many businesses. However, not all BigQuery features are available through the JDBC driver, and not all users are comfortable using a closed-source JDBC driver for their data lake. In short: feel free to use the JDBC option if it works for you, but it does not work for everyone.

Roadmap

More features, such as JDBI-style annotated methods, may be added to the library if there is demand.

jdbq's People

Contributors

dependabot[bot] avatar sigpwned avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

gokepler

jdbq's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.