Coder Social home page Coder Social logo

proofrock / ws4sqlite Goto Github PK

View Code? Open in Web Editor NEW
500.0 7.0 14.0 275 KB

Query sqlite via json+http

Home Page: https://germ.gitbook.io/ws4sqlite/

License: ISC License

Go 93.53% Makefile 3.20% Dockerfile 0.24% Java 2.63% Shell 0.40%
sqlite ws isc golang database sql

ws4sqlite's Introduction

๐ŸŒฑ ws4sqlite

๐Ÿ“ข This project's next version was forked, that will support more databases than sqlite (hence the new name, if you noticed ๐Ÿ˜‰). It will take some time, through improvements and, alas, breaking changes. The development is happening in the fork/ws4sql branch, and you can find the changes and the steps needed to migrate a regular installation in the ROAD_TO_WS4SQL.md document.

I recently started a discussion over the future direction for this project. Take a look, and chip in if you want!

ws4sqlite is a server application that, applied to one or more sqlite files, allows to perform SQL queries and statements on them via REST (or better, JSON over HTTP).

Possible use cases are the ones where remote access to a sqlite db is useful/needed, for example a data layer for a remote application, possibly serverless or even called from a web page (after security considerations of course).

Client libraries are available, that will abstract the "raw" JSON-based communication. See here for Java/JVM, here for Go(lang); others will follow.

As a quick example, after launching

ws4sqlite --db mydatabase.db

It's possible to make a POST call to http://localhost:12321/mydatabase, e.g. with the following body:

// Set Content-type: application/json
{
    "resultFormat": "map", // "map" or "list"; if omitted, "map"
    "transaction": [
        {
            "statement": "INSERT INTO TEST_TABLE (ID, VAL, VAL2) VALUES (:id, :val, :val2)",
            "values": { "id": 1, "val": "hello", "val2": null }
        },
        {
            "query": "SELECT * FROM TEST_TABLE"
        }
    ]
}

Obtaining an answer of

{
    "results": [
        {
            "success": true,
            "rowsUpdated": 1
        },
        {
            "success": true,
            "resultSet": [
                { "ID": 1, "VAL": "hello", "VAL2": null }
            ]
        }
    ]
}

Features

Docs, a Tutorial, a Discord.

  • Aligned to SQLite 3.46.0;
  • A single executable file (written in Go);
  • HTTP/JSON access, with client libraries for convenience;
  • Directly call ws4sqlite on a database (as above), many options available using a YAML companion file;
  • In-memory DBs are supported;
  • Serving of multiple databases in the same server instance;
  • Batching of multiple value sets for a single statement;
  • Parameters may be passed to statements positionally (lists) or by name (maps);
  • Results of queries may be returned as key-value maps, or as values lists;
  • All queries of a call are executed in a transaction;
  • For each query/statement, specify if a failure should rollback the whole transaction, or the failure is limited to that query;
  • "Stored Statements": define SQL in the server, and call it from the client;
  • CORS mode, configurable per-db;
  • Scheduled tasks, cron-like and/or at startup, also configurable per-db;
  • Scheduled tasks can be: backup (with rotation), vacuum and/or a set of SQL statements;
  • Provide initialization statements to execute when a DB is created;
  • WAL mode enabled by default, can be disabled;
  • Quite fast!
  • Embedded web server to directly serve web pages that can access ws4sqlite without CORS;
  • Compact codebase;
  • Comprehensive test suite (make test);
  • 11 os's/arch's directly supported;
  • Docker images, for amd64, arm and arm64.

Security Features

  • Authentication can be configured
    • on the client, either using HTTP Basic Authentication or specifying the credentials in the request;
    • on the server, either by specifying credentials (also with hashed passwords) or providing a query to look them up in the db itself;
    • customizable Not Authorized error code (if 401 is not optimal)
  • A database can be opened in read-only mode (only queries will be allowed);
  • It's possible to enforce using only stored statements, to avoid some forms of SQL injection and receiving SQL from the client altogether;
  • CORS Allowed Origin can be configured and enforced;
  • It's possible to bind to a network interface, to limit access.

Design Choices

Some design choices:

  • Very thin layer over SQLite. Errors and type translation, for example, are those provided by the SQLite driver;
  • Doesn't include HTTPS, as this can be done easily (and much more securely) with a reverse proxy;
  • Doesn't support SQLite extensions, to improve portability.

Contacts and Support

Let's meet on Discord!

Credits

Many thanks and all the credits to these awesome projects:

Kindly supported by JetBrains for Open Source development

ws4sqlite's People

Contributors

proofrock avatar thanhnguyen2187 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

ws4sqlite's Issues

Queries fail when an empty array is passed

Hi!

On testing #38, I found a subtle bug where ws4sqlite would not be able to process if an empty array is passed. In more details:

# start the server
ws4sqlite --db /tmp/crypta.db

# sending request
curl \
    -X POST \
    --header 'Content-Type: application/json' \
    --data @/tmp/data.json \
    --fail-with-body \
    127.0.0.1:12321/crypta \
| jq

Where /tmp/data's content is:

{
    "transaction": [
        {
            "query": "SELECT 1",
            "values": []
        }
    ]
}

The response is:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   257  100   156  100   101   420k   272k --:--:-- --:--:-- --:--:--  250k
curl: (22) The requested URL returned error: 400
{
  "reqIdx": -1,
  "error": "in parsing body: json: cannot unmarshal array into Go struct field requestItem.transaction.values of type map[string]json.RawMessage"
}

A workaround for this is to change from empty array [] to empty object/dictionary {}.

I'll see if I can help with this in another PR.

Thanks!


EDIT: this is expected. ws4sqlite differs from sqliterg in this regard ๐Ÿ˜….

Compilation on Windows (PowerShell)

Last thing I noticed! Not the most important thing, but can be surprising!

I compiled your project on a Windows machine on a PowerShell console.
As PowerShell has aliased rm (rm -> Remove-Item), rm -rf doesn't work, but rm -r does.
So in Makefile, the default was broken:

cleanup:
	rm -rf bin
	rm -f src/ws4sqlite

and using rm.exe was required (or I could have unalias rm).

My question is : is force -f parameter (rm -rf) necessary in some cicumstances? if not, could it be removed?
Thanks!

Typo in Documentation ?

Hi,

thanks for this project.
Shouldn't storedQueries be replaced by storedStatements in stored-statements documentation (or added)?
With storedQueries, I get :

{
  "reqIdx": 0,
  "error": "a stored statement is required, but did not find it"
}

and no With 2 stored statements at startup.

Enhancement: keep the db connection open

Instead of opening and closing it for every operation. This doesn't have much impact on the code (an explicit mutex is already present) and would probably give a boost in terms of performances (measurements needed).

When checking file/dir existence, manage all errors

It only happened once, when testing wolfi as docker base: the mounting of resources in docker was somehow faulty, and this code:

func fileExists(filename string) bool {
	info, err := os.Stat(filename)
	if os.IsNotExist(err) {
		return false
	}
	return !info.IsDir()
}

(also the similar dirExists right below) would panic with a "pointer dereference", because there was an err not of the "right" type and it wasn't caught, so info was nil. Give a fault and exit if this happen, so that the reason is more evident.

Rework build system

Tasks/goals:

  • create a BUILDING.md file with clear instructions
  • clarify/fix compilation with statically or dinamically linking (make build/make build-nostatic)
  • compile binaries under docker (after tests) so that docker images and binaries are compiled the same way
  • remove useless Makefile targets

Strategies for higher write throughput.

Sharing a few strategies here. Can be converted to a discussion at your discretion.

  • Enabling WAL2: #30
  • Write batching / bulk INSERT
    • Used by rqlite (sacrifices durability for throughput).
    • Not all writes may be suitable for batching.
  • Multiple databases for writes, then ATTACH for reads.
# main.yaml
scheduledTasks:
  - atStartup: true
    statements:
      - ATTACH 'sessions.db' as sessions

You can then:

SELECT count(*) FROM sessions.users;

Perform JOIN, etc.

Multiple databases enable you to multiply your throughput.

Share your strategies!

Feature: statements to be run at every startup

A new field in the config file (modeled on initStatements) could be added, with statements/queries to be run at each startup of the application. Good to do cleanups, for example, or to generate "run id"s.

Feature: multiple maintenance tasks

Because #20 and #19, maintenance is now much more flexible and covers different use cases; for this reason, it's good to be able to specify multiple maintenance tasks, with different goals and/or different timing.

It can be tricky to maintain retrocompatibility.

Allow different path for db companion YAML file

Currently, the companion YAML file is inferred from the db file, and must be in the same directory. But actually it's part of the "logic", so in some cases it's good to keep them in separate folders.

For example, if I am building a docker container based on ws4sqlite, the db file probably will be mounted externally, but the YAML file will probably be part of the image.

Allow for a syntax similar to the one that's used for --mem-db:

--db <db_path>:<YAML_path>

Out-of-order startup reporting of server configs

When starting with both a db and a serve-dir this happens:

./ws4sqlite -db my.db --serve-dir public

ws4sqlite 0.12.7
- Based on SQLite v3.40.1
- Serving database 'sagramat' from my.db?_pragma=journal_mode(WAL)
  + Parsed companion config file
  + Using WAL
  + Strictly using only stored statements
  + With 24 stored statements
  + Authentication enabled, with 1 credentials
- Serving directory 'public'
  + CORS Origin set to *
- Web Service listening on 0.0.0.0:12321

The CORS Origin set to * line should be above the Serving directory 'public' line, as it refers to my.db.

Any plans for WAL2 / BEGIN CONCURRENT?

Firstly, awesome project! ๐Ÿ‘ Love the backup / vacuum jobs and driver-free access via http! Very flexible tool.

WAL2 would provide some nice advantages to ws4sqlite:

  • More concurrent writes per second without batching.
  • No infinitely growing WAL file.
  • No writing until COMMIT.
  • More granular "page level" locking (instead of "table level" with WAL).

Middlewares activation is not correct

The current implementation creates several middlewares for each database, to do CORS and authentication, but uses a convoluted method to actually enable the correct ones for a database; basically, all the handlers are registered, and if the URL doesn't match the database in each one they are skipped. The problem is that the match is done on the name of the database, but the URL may contain e.g. a trailing slash and the match fails. This mean that for example the authentication could not trigger if the URL have trailing slashes.

Better let Fiber do the match, by registering the relevant middlewares on the POST; this also should have performances benefits, even if probably negligible.

Hardcoded path in error message

$ ./ws4sqlite -db db -serve-dir .
ws4sqlite 0.12.3
- Based on SQLite v3.39.4
panic: runtime error: slice bounds out of range [:-1]

goroutine 1 [running]:
main.parseCLI()
        /home/mano/ws4sqlite/src/cli.go:87 +0xdf9
main.main()
        /home/mano/ws4sqlite/src/ws4sqlite.go:67 +0x11b

Small fixes to commandline args parsing

  • the default value of some args is printed twice in the help
  • the sqlite version could be rendered better
  • the error returned by the parse method is not managed

This is potentially a break of retrocompatibility, but I cannot really see it as a serious one.

Allow positional parameters

For example, we have this query:

INSERT INTO table VALUES (?, ?, ?)

Binding values to those questions marks are not supported by ws4sqlite (#39).

I know the documentation wrote that:

If the query needs to be parametrized, named parameters can be defined in the statement using SQLite (e.g. :id or @id), and the proper values for them must be specified here.

However, it is a syntax supported by SQLite itself:

In the SQL statement text input to sqlite3_prepare_v2() and its variants, literals may be replaced by a parameter that matches one of following templates:

?
?NNN
:VVV
@VVV
$VVV 

In the templates above, NNN represents an integer literal, and VVV represents an alphanumeric identifier. The values of these parameters (also called "host parameter names" or "SQL parameters") can be set using the sqlite3_bind_*() routines defined here.

ws4duck?

There is a ๐Ÿฆ† figuring here so I wonder if this nice web service would need considerable refactoring to work with the sqlite-like database duckdb?

There is a go binding and also a C++ REST server component... and if I understand correctly, SQLite databases can also be attached.

Would be interesting to hear your thoughts on this.

Remove encryption framework

It's basically tailor-made, and it's just not the way to put a feature inside a platform IMHO. It can be done on the client.

Feature: statements to be run as scheduled

A new field in the config file (modeled on initStatements) could be added, with statements/queries to be run in a scheduled fashion, with the same syntax as the maintenance. Good to do cleanups.

Create and delete databases over REST

Would it be possible to start ws4sqlite without specifying a db?

ws4sqlite

Then a db could be managed with REST:

PUT http://localhost:12321/mynewdatabase # Creates db
POST http://localhost:12321/mynewdatabase # Queries db
DELETE http://localhost:12321/mynewdatabase # Deletes db

JavaScript/TypeScript client

Hi!

Many thanks for the awesome tool!

I'm planning to use it in my web application written in Svelte/TypeScript. Unfortunately, there is no JavaScript/TypeScript client yet. Are you planning to do it soon?

Regards.

Rebase docker image(s) on distroless/static-debian11

A less opinionated and more stable base than alpine. Build on golang's official docker image, then publish on distroless.

In the past it happended that alpine's edge version broke golang builds for a while, and I understand it can happen, being... edge. This should be more consistent.

Option to return results as a list, instead of a map

The request may be:

{
        
        "resultFormat": "list",
        "transaction": [{
            "query": "SELECT * FROM folders",
            ...
        }, ...]
...

with values of list or map, by default/if absent map would be used, for retrocompatibility. If list, your result would become:

{
  "results": [
    {
      "success": true,
      "resultHeaders": ["id", "name", "position", "updated_at", "created_at"],
      "resultSet": [
        [ "default", "Default", 0.0, "2024-02-04 12:23:08", "2024-02-04 12:23:08" ]
      ]
    }
  ]
}

The new node resultHeaders would always be displayed, even in maps: currently there's no way to tell the column order, and this would give it.

Feature: internal webserver

Could be useful when ws4sqlite is used directly from a webpage, to avoid setting up CORS. Of course, beware of security implications of doing so.

Fiber's internal web server is more than enough for the task, just expose it.

ws4sqlite does not listen on IPv6 socket even when started with an IPv6 address

$ ws4sqlite --bind-host '[::]' --mem-db mem1 &
ws4sqlite 0.11.1
- Serving database 'mem1' from :memory:?_journal=WAL
  + No config file loaded, using defaults
  + Using WAL
- Web Service listening on [::]:12321

$ curl https://[::1]:12321
curl: (7) Failed to connect to ::1 port 12321: Connection refused

$ netstat -l -n -p | grep 12321
tcp        0      0 0.0.0.0:12321           0.0.0.0:*               LISTEN      4060941/ws4sqlite

runtime error: slice bounds out of range [:-1]

My database name is db which causes this error.

$ ./ws4sqlite -db db -serve-dir .
ws4sqlite 0.12.3
- Based on SQLite v3.39.4
panic: runtime error: slice bounds out of range [:-1]

goroutine 1 [running]:
main.parseCLI()
        /home/mano/ws4sqlite/src/cli.go:87 +0xdf9
main.main()
        /home/mano/ws4sqlite/src/ws4sqlite.go:67 +0x11b

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.