Comments (40)
I have a few thoughts on this.
Thoughts on an appropriate API
I designed better-sqlite3
to be easy to use for people who have never used SQLite's C API. I tried to capture the higher-level concepts into the simplest API I could think of, while still exposing most of what advanced users would want via options and utility methods. That said, I don't necessarily think that philosophy aligns with existing APIs within Node.js core. For example, node:fs
is actually pretty hard to understand unless the user is already familiar with the C API of Unix/Linux-style filesystems (with the exception of a few high-level convenience methods like fs.readFile
). Of course, for something built into the runtime, that low-level exposure makes sense. Therefore, I do think it would be appropriate for the API to (as much as possible) be a 1-to-1 mapping with SQLite's low-level C API, but with a few convenience methods built-in. I see no reason to depart from the method/methodSync
naming convention used by Node.js's core libraries.
However, it should be noted that many asynchronous APIs would simply break or result in undefined behavior or unsafe memory accesses when combined with much of the behavior provided by SQLite. For example, SQLite allows you to register user-defined SQL functions in the host language (i.e., JavaScript in this case). These function would be invoked during certain SQL queries, but if those SQL queries were executed in a thread pool, it will result in unsafe cross-thread usage of JavaScript contexts (which is very, very bad). Similarly, transactions become prone to error and terrible performance when performed in an asynchronous manner, since transactions in SQLite lock the entire database. Therefore, any asynchronous API for SQLite in Node.js would need to be highly limited, exposing only the most basic functionality that SQLite offers.
To safely separate the synchronous and asynchronous worlds, I recommend exposing two different classes, SQLiteDatabaseAsync
and SQLiteDatabaseSync
, each of which represents a connection to a SQLite database, with SQLiteDatabaseSync
providing access to the entire (or almost the entire) SQLite C API, while SQLiteDatabaseAsync
would only provide basic functionality like making queries. The goal here is to make it impossible for a single database connection to use both async/non-blocking methods AND advanced SQLite APIs that would have detrimental effects when invoked from the libuv thread pool.
I believe an approach like this would align with the existing patterns/philosophy of Node.js's core libraries, and would satisfy the memory safety that is expected of the Node.js runtime. Sprinkle in a few high-level convenience methods, and let user-land do the rest.
Thoughts on SQLite versions
Perhaps most importantly (much more important than the API, in my opinion), is how to deal with the issue of SQLite versions. Besides the fact that the SQLite developers release new versions many times per year, SQLite is designed to be customized at compile-time, giving the user the power to enable certain advanced features, disable deprecated features that may harm performance or be unsafe, or even change the behavior of basic SQL expressions (like case-sensitivity, unicode, etc.). In better-sqlite3
, it's very common for users to need a different build of SQLite than the one provided by default. I do provide a way for users to compile a custom version of SQLite, but the process is relatively painful. If Node.js exposed SQLite to users, I think it would be crucial to solve this need. Perhaps it can be solved with dynamic linking or something. Just be aware that you will inevitably run into this.
from node.
I imagine this idea will get a lot of support and a lot of opposition. If someone else handles the consensus building aspect, I'd be happy to work on the implementation side.
from node.
FWIW, SQLite has followed semver since version 3.9.0 in 2015, and the current major version is still 3:
The current value for X is 3, and the SQLite developers plan to support the current SQLite database file format, SQL syntax, and C interface through at least the year 2050. Hence, one can expect that all future versions of SQLite for the next several decades will begin with "3.".
from node.
One note: if you do decide to include SQLite in Node.js, please statically compile an up-to-date SQLite library instead of relying on the host's SQLite library!
This is a problem with Python's SQLite library and Bun's SQLite library: By default they use whatever SQLite version the OS has installed, which leads to headaches. Some default SQLite builds omit features like SQLite extensions, JSON, or math functions. Or they are stuck on old version of SQLite which lack window functions, JSONB
, datetime utils, or STRICT
tables. This is even documented in Bun's docs and Python's docs.
So if you statically compile a recent SQLite version, say 3.46
, you'd have full control of what SQLite features are available across different platforms. And if someone wanted to use a different/more up-to-date SQLite version, they can always reach for better-sqlite3
or another 3rd party package. This is what we do for Datasette, which is in Python - we use the standard Python SQLite library and tell people to use the 3rd party pysqlite3 package if their builtin SQLite is too old.
from node.
For reference:
- https://bun.sh/docs/api/sqlite
- https://docs.deno.com/runtime/manual/basics/connecting_to_databases#sqlite
from node.
One suggestion from the peanut gallery: I’d suggest this API live in a node:sqlite package (like bun:sqlite) rather than node:util, just to avoid the “util is a dumping ground” trope.
This seems to be the consensus so far.
from node.
@cjihrig honestly since naming is very bikesheddy in this case I think you should just pick one. Here is an example color for our bike shed:
require("node:sqlite").sync.query(...)
for synchronous queriesrequire("node:sqlite").promises.query(...)
for asynchronous queries
But I think that's not one of the things that will prevent a PR from landing so I wouldn't worry too much about it.
from node.
We already see different opinions on the API. If there is no "one size fits all" API
Currently the different opinions are basically sync vs async.
The basic API could be something like this
class Database {
// path can be :memory: or the file path
constructor(path: string, options?: { readonly?: boolean, create?: boolean }) {}
query(statement: string): Statement; // works like prepare, like bun is doing right know
open(): void; // maybe we can just open automatically? and add a option to disable auto-open
close(): void;
}
type AllResult = Array<any>;
type RunResult = void; // maybe boolean?
type GetResult = any;
class Statement {
all(...params: any[]): Promise<AllResult>;
run(...params: any[]): Promise<RunResult>;
get(...params: any[]): Promise<GetResult>;
// we can also introduce sync methods, like fs
allSync(...params: any[]): AllResult;
runSync(...params: any[]): RunResult;
getSync(...params: any[]): GetResult;
}
Sync has advantages, and async too, so ship both and be happy.
In the first iteration, we can stick with the easiest/simplest implementation, which is probably the sync version.
So we can iterate and add the asynchronous versions and then add more methods (pluck, iterate, values, etc...)
EDIT: I didn't talk about transactions because it didn't even matter in this first iteration, is better to discuss this method when we have at least the basic methods, then we will have more knowledge on what could be the best API for exposing transactions.
If we make it possible for native addons to link against the SQLite dependency that's embedded in Node.js, then that should let third-party packages define whatever API they want without having to ship SQLite.
The point is that you don't need to use third-party packages to be able to use sqlite, but you can still do that even if we have our own API.
from node.
I think it makes the most sense to do something similar to the fs
module here and have synchronous and asynchronous (Promise based, no callbacks) APIs. For the majority of cases, synchronous APIs are good enough, but there are almost certainly cases where async will be preferred (and it is more Node-like). I have started writing some code for the DB connection already.
EDIT: Very early, but here is the branch.
from node.
@asg017 Node bundles 3.46.0 for web storage support already so presumably that's the one that will be used in this case.
from node.
Proper SQLite support within Node.js would be splendid!
A couple thoughts:
- I’d suggest not skipping a Transaction API in the first iteration: many use cases require ACID behavior.
PRAGMA
support is another possibly-surprising feature of SQLite that is a hard prerequisite for reasonable performance and stability for many use cases.- A codebase that uses both sync and async calls concurrently could be prone to deadlock/busy timeouts, especially when mixing different transaction flavors (like BEGIN DEFERRED in the sync call, and BEGIN IMMEDIATE in the async call). A “solution” to this issue could simply be a warning in the documentation to not do that.
I like the idea of using better-sqlite3’s API wholecloth, as that could mean switching between the “native” node api, bun, and better-sqlite3 would be a simple import twiddle. Async API support would break this, of course. Edit: as @cjihrig mentions below, this already isn't the case: I think better-sqlite3
API's main value-add for this effort should be informative with respect to what SQLite functions should be considered. Also know that better-sqlite3
's API is not a comprehensive set of sqlite3's API.
One suggestion from the peanut gallery: I’d suggest this API live in a node:sqlite
package (like bun:sqlite
) rather than node:util
, just to avoid the “util is a dumping ground” trope.
from node.
Another question regarding the API - if we were to start off following the better-sqlite3 API and wanted to add async support in the future, what would that look like? Node has generally had async APIs, and appended Sync
to the name for synchronous equivalents. The better-sqlite3 API would sort of flip that naming pattern around.
from node.
it's very common for users to need a different build of SQLite than the one provided by default
I second this assertion of @JoshuaWise and think it is very important to build this capability in the node's integrated version of SQLite. I want the same (possibly customized) version of SQLite powering my node application as well as my CLI so there are no surprises when I am building my application or analyzing my data. For this reason itself, it is very important that not only can I customize my SQLite with the extensions I need, but do it in multiple locations, and as easily as possible because I am not a C programmer. Even the most obvious steps for a C programmer can be very confusing for me.
from node.
I think at the very least let's land the localStorage
PR which would get SQLite into the codebase. After that, my concern would be semver: if we expose SQLite, then we can't update it any faster than our release cycle, unless we make some kind of exception for it.
from node.
Any reason to put it in node:util?
Not particularly. A new core module is a bigger undertaking - particularly since the decision to require the node:
scheme - so I thought it was at least worth raising the point. I'm fine either way.
from node.
I like the idea of using better-sqlite3’s API wholecloth, as that could mean switching between the “native” node api, bun, and better-sqlite3 would be a simple import twiddle.
I'm not trying to knock the better-sqlite3 API at all, but my understanding is that this is already not the case. The Bun docs mention that their API was inspired by better-sqlite3, but they already seem to have diverged from it some (and probably will continue to do so). That approach also would not offer any compatibility with other sqlite modules like Deno's or node-sqlite3 (which has roughly twice the weekly downloads according to npm).
from node.
If someone wanted something outside of that, they should use a 3rd party package like
better-sqlite3
. This would include:
- A different SQLite version
- Custom compilation arguments
- A completely different SQLite library (ie sqlciper or libsql)
FWIW, the SQLite developers strongly recommend against having more than a single copy of SQLite per process, see How To Corrupt An SQLite Database File.
from node.
Shall we use the same interface as https://github.com/TryGhost/node-sqlite3?
from node.
@gabrielschulhof On the rare occasion that I voluntarily write JavaScript code, I personally prefer the better-sqlite3
API, which also tends to be significantly faster.
That being said, the non-blocking behavior of node-sqlite3
better matches the Node.js I/O model. If you have a slow storage medium or if you use a slow VFS, e.g., with a network resource as its backend, then the asynchronous node-sqlite3
API is a better choice. Similarly, applications that do not utilize SQLite transactions efficiently will spend a lot of time waiting for each change to be committed, which is slow even if the underlying storage is fast, so using synchronous APIs in those cases unnecessarily blocks the event loop. Applications that use well-designed transactions and that run on a very fast storage backend, on the other hand, will see a lot of unnecessary overhead from asynchronous APIs.
Perhaps Node.js could allow native addons to link against the bundled SQLite dependency instead and leave the API to third-party packages.
from node.
I would like to bring attention to esqlite by @mscdex, which outperform(s/ed) better-sqlite3 in a number of cases while being asynchronous. https://github.com/mscdex/esqlite
from node.
It looks like we have consensus around experimenting with this (i.e. prototyping a sqlite binding in core), just for further visibility I'll also ping the TSC in chat.
As for the API I personally really prefer a synchronous one over an asynchronous one since SQLite is built to run in-process. I think in the common use case this is significantly simpler and likely faster. That said, an asynchronous promise based API should also be fine.
Funnily enough 2 people asked me about this feature today - a good luck omen it's something users find interesting.
from node.
We already see different opinions on the API. If there is no "one size fits all" API, should we have one in core? If we make it possible for native addons to link against the SQLite dependency that's embedded in Node.js, then that should let third-party packages define whatever API they want without having to ship SQLite.
from node.
We already see different opinions on the API
We've had very little api discussion so far, I think it's fine to talk about stuff and see if we can reach consensus?
from node.
@tniessen looping back to your comment here: #53264 (comment) I tend to think the better-sqlite3 sync API is better for most cases and given SQLite's execution model and the motivation of users asking to add it to Node.js (not as a DB server replacement but for local configuration, at least that's what I've been hearing). I see your point though.
There is interesting discussion here: https://sqlite.org/forum/forumpost/7c90893579 as well.
I think it may be a good idea to reach out to the SQLite team and ask, so I went ahead and opened https://sqlite.org/forum/forumpost/96f9f78fc1
from node.
Another thing I think is worth discussing is whether or not it makes sense to expose this as its own module (node:sqlite
) or as part of another module such as node:util
.
from node.
Also, regarding sync vs async: the better-sqlite3
README says:
node-sqlite3 uses asynchronous APIs for tasks that are either CPU-bound or serialized. That's not only bad design, but it wastes tons of resources. It also causes mutex thrashing which has devastating effects on performance.
I've never benchmarked it myself, but this is the sole reason why I always use better-sqlite3
over node-sqlite
in my projects.
I know a lot of Node.js developers would expect an async API (like node:fs/promises
), but if they knew of potential performance problems, I'd imagine they'd let it slide.
from node.
Just to add my 2-cents, sync makes sense for simple queries. But analytical queries, things like computeing averages and sum can take a fair amount of time.
Given that some users will want to use this for things like dashboards, that can issue a bunch of aggregation queries, an async API would make more sense.
from node.
I agree with that point from better-sqlite3. A counterpoint is that sqlite might not be the only thing the application is doing. But if a sqlite call is blocking, then nothing else can happen. This is why both sync and async APIs should be provided.
from node.
This is also my conclusion based on https://sqlite.org/forum/forumpost/96f9f78fc1
from node.
I'll also escalate the questions to the TSC chat to make sure we have consensus.
from node.
Another thing I think is worth discussing is whether or not it makes sense to expose this as its own module (
node:sqlite
) or as part of another module such asnode:util
.
Any reason to put it in node:util
?
from node.
This is also my conclusion based on https://sqlite.org/forum/forumpost/96f9f78fc1
FWIW, that discussion also raised this point:
it's our collective opinion that there neither can be, nor should be, One True Wrapper (...)
Having multiple wrappers is, in our considered opinions, healthy for the ecosystem and provides the most all-around flexibility when selecting which set of trade-offs (compared to the C API) a project wants to use.
This reinforces my opinion that we should make it easy for third-party packages to link against the bundled version of SQLite, assuming we are committing to shipping SQLite forever anyway.
from node.
With asynchronous APIs, I think it's important to very carefully design a good transaction interface. better-sqlite3
and Bun's SQLite API use wrapped functions for that, which is somewhat reasonable in a purely synchronous setting. node-sqlite3
, on the other hand, relies on the user manually distinguishing between calls that may run in parallel and call that may not. Other bindings, such as Rusqlite
have created much better interfaces in my opinion, but those don't tend to work that well in a messy language like JavaScript.
from node.
This reinforces my opinion that we should make it easy for third-party packages to link against the bundled version of SQLite, assuming we are committing to shipping SQLite forever anyway.
Yes and that sounds like a good idea. That does not preclude us from shipping our own wrapper as well. I think a major theme in recent Node features (the test runner, task runner, watch mode, util.parseArgs etc) isn't to be the best or have the richest functionality in those spaces - it's to provide sensible and safe defaults for most projects.
I think no matter what API we ship, userland will still provide alternatives and projects using sqlite heavily for more than configuration are likely to still use those wrappers. (Similarly to projects needing browser testing, advanced task runner features, rich argument parsing, more fs convenience methods etc).
So I am not that concerned about our API not being "feature complete"
Other bindings, such as Rusqlite have created much better interfaces in my opinion, but those don't tend to work that well in a messy language like JavaScript.
Assuming you had no messy language constraints whatsoever - what API would you pick?
I personally think we might want to stick with disposers (Symbol.dispose) and not the disposer pattern - but honestly even if we have very rudimentary transaction support and leave richer APIs to userland I'm fine with that.
from node.
I personally recommend using most of the amazing API of https://github.com/WiseLibs/better-sqlite3, i.e. use synchronous APIs. They generically offer better performance with SQLite on modern HW. I'm not opposed to have sync and async version of those.
cc @JoshuaWise @mceachen for visiblity.
from node.
I have a slightly different view: node:sqlite
should be a no-nonsense "just works" SQLite package that application developers use when they want to read/write to a SQLite database. They would mostly care about:
- Having an up-to-date SQLite version
- Ability change a few settings (like
PRAGMA
and runtime configuration options) - Using SQLite extensions to add new functionality (either as application defined functions or dynamically loadable extensions)
If someone wanted something outside of that, they should use a 3rd party package like better-sqlite3
. This would include:
- A different SQLite version
- Custom compilation arguments
- A completely different SQLite library (ie sqlciper or libsql)
This is kindof how Python works: the builtin sqlite3
library is a "just works" version of SQLite, while if someone wanted a customized build, they'd need to reach for a 3rd party package like pysqlite3
.
That means that the SQLite library offered in node:sqlite
needs to precisely define which compilation arguments to include and default settings, which is difficult. But I think offering an API that can dynamically handle multiple SQLite libraries would be more difficult
from node.
FWIW, the SQLite developers strongly recommend against having more than a single copy of SQLite per process
That seems like it would raise a more immediate question about WebStorage and its impact on code already using a userland sqlite module. How do Deno/Bun/Python etc. deal with that?
from node.
@tniessen ooh thanks for sharing, hadn't seen that before!
But in this case, if Node.js statically links a "golden" SQLite library for node:sqlite
, would other dynamically linked SQLite libraries from 3rd party npm packages (better-sqlite3, libsql, sqlcipher, etc) be effected as well? It's hard to tell from that link, they only say "linked" but don't specify if this would happen in dynamically linked version of SQLite with a statically linked version.
My hunch is that there could be only 1 statically linked copy of SQLite, but other dynamically linked SQLite copies would be fine? Especially since this pattern is used in Python/Deno (which uses SQLite for localStorage + KV), but I'm no expert here
from node.
would other dynamically linked SQLite libraries from 3rd party npm packages (better-sqlite3, libsql, sqlcipher, etc) be effected as well?
I would think that Node’s internal SQLite would use different files on disk for storing the database than the files on disk used by these various packages? And if that’s the case, then I don’t see why they would interfere with each other.
from node.
right, as long as I continue to use the SQLite driver with which I created my db, I would not expect any problem. So, if I use node:sqlite
to create my files, I continue to use it with those files. And, if I use a custom version of SQLite with, say, better-sqlite3
, I would continue to happily use that combo. Using the correct driver with the files they created would be my responsibility
from node.
Related Issues (20)
- Mocha-like root hooks for test runner HOT 21
- Sharing AsyncLocalStorage instance with customization hook HOT 4
- OOM - Segmentation fault (not ulimit, not cgroups, not max-space, not exhausted RAM) HOT 6
- URL.canParse('https://*.example.com') incorrectly returns true HOT 5
- Inconsistent behavior with crypto.KeyObject.from() and key usage flags for RSA encryption/decryption HOT 1
- [Bug]: Vite Wallet does not run correctly after building it on Microsoft Windows. HOT 1
- Random wrong encoding of strings returned by .replace function HOT 1
- nodejs crashes at stream.c due to queue size assertion HOT 5
- Flaky text encoding regression in Node 22 HOT 1
- Reuse Blob Registry in Workers
- exec command 'playerctl status' does not work HOT 1
- test_runner: `latest.js` file is included by default HOT 1
- `cpSync` symlink handling regression HOT 3
- Node-API V8 Fast API HOT 2
- Nominating @abmusse to be a collaborator HOT 1
- Node 22.1+ crashes process when a lot of VM processes are created/destroyed in parallel workers HOT 6
- build warning with gcc 12 HOT 4
- Buffer to string conversion fails after a few KB of load. HOT 1
- Use the compilation cache when running typescript files through `--experimental-transform-types` HOT 5
- Encoding Issue with Special Characters in Node.js 22 Docker Image HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from node.