Info on Concurrency about tinydb HOT 7 CLOSED

fny commented on May 26, 2024

Info on Concurrency

from tinydb.

Comments (7)

p-baum commented on May 26, 2024

I would like to know as well please.


VermiIIi0n commented on May 26, 2024

It would be great to have info on whether tiny supports concurrent reads and writes. This is not clear from the README.

Unfortunately, I believe it's not possible after viewing the source code.

Concurrent writing/reading almost certainly leads to data corruption.

And I think this piece of info is already presented in the docs.


fny commented on May 26, 2024

Yeah, it's a shame.

…

On Tue, Oct 11, 2022 at 10:21 AM Mashir0 wrote:

> It would be great to have info on whether tiny supports concurrent reads and writes. This is not clear from the README.
>
> Unfortunately, I believe it's not possible after viewing the source code.
>
> Concurrent writing/reading almost certainly leads to data corruption.


msiemens commented on May 26, 2024

You can always add your own locks (e.g. Python's threading.Lock) to ensure that concurrent reads and writes within the same Python process work correctly. If you have multiple programs accessing the same file, you can add some form of file locking (see e.g. https://stackoverflow.com/questions/489861/locking-a-file-in-python) to tell the other processes that the file is currently in use.
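A minimal sketch of the in-process case, assuming every thread shares one lock object. To keep it runnable without extra dependencies, a plain JSON file stands in for the TinyDB calls; `DB_LOCK`, `add_record`, and `run_demo` are hypothetical names for illustration, not part of TinyDB.

```python
import json
import os
import tempfile
import threading

# One lock shared by every thread that touches the database file.
DB_LOCK = threading.Lock()


def add_record(path, record):
    """Append a record, holding the lock for the whole read-modify-write."""
    with DB_LOCK:
        # With TinyDB, this body would be the insert/update/search calls.
        with open(path) as fh:
            records = json.load(fh)
        records.append(record)
        with open(path, "w") as fh:
            json.dump(records, fh)


def run_demo(n_threads=8, per_thread=25):
    """Write from several threads and return the number of stored records."""
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as fh:
        json.dump([], fh)

    def worker(tid):
        for i in range(per_thread):
            add_record(path, {"thread": tid, "n": i})

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    with open(path) as fh:
        count = len(json.load(fh))
    os.remove(path)
    return count
```

The lock covers the read and the write together, so no thread can slip in between them. This only protects threads inside one process; separate processes each get their own `DB_LOCK`.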

In general, TinyDB doesn't make any assumptions about the data storage mechanism (due to the ability to drop in your own data storage class), so there is no generic locking in TinyDB (because file-based locking might e.g. not work on network file systems or something like S3). Leaving this to the user is the most flexible solution to me even though it requires some work from users who just use the default JSONStorage.
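For the multi-process case with a file-based storage, one option is an advisory lock on a sidecar file. The sketch below assumes a POSIX system (`fcntl` is not available on Windows) and a local file system; as noted above, such locks may not work on network file systems. `file_lock`, `bump_counter`, and the `.lock` suffix are illustrative names, not TinyDB API, and a plain JSON file again stands in for the database calls.

```python
import fcntl
import json
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    # Append mode creates the lock file if needed without truncating it.
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def bump_counter(db_path):
    """Read-modify-write the data file while holding the lock."""
    with file_lock(db_path + ".lock"):
        # With TinyDB, open the database and do the reads/writes here.
        with open(db_path) as fh:
            data = json.load(fh)
        data["count"] += 1
        with open(db_path, "w") as fh:
            json.dump(data, fh)
        return data["count"]
```

The lock is advisory: it only helps if every process that touches the file goes through the same `file_lock` call.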


FeralRobot commented on May 26, 2024

documentation update needed:
class tinydb.middlewares.ConcurrencyMiddleware(storage_cls)
Makes TinyDB working with multithreading.
Uses a lock so write/read operations are virtually atomic.


msiemens commented on May 26, 2024

class tinydb.middlewares.ConcurrencyMiddleware(storage_cls)
Makes TinyDB working with multithreading.
Uses a lock so write/read operations are virtually atomic.

Actually, the ConcurrencyMiddleware has been removed in TinyDB 2.0.0 due to an incorrect implementation (#18).


andryyy commented on May 26, 2024

I am using TinyDB in a project right now and came across this problem.

Since I'm already using Redis for synchronizing some application states in my cluster, I found it easiest to reuse it for distributed locking.

My application uses a Storage class that's pretty much a copy of JSONStorage with some personal tweaks like automatic backups and custom caching.

In the custom storage's __init__() definition I added this (slightly modified):

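...
        self._access_id = kwargs.pop("access_id")
        try:
            # Skip acquisition if this access_id already holds the lock.
            # Assumes `r` was created with decode_responses=True, so get() returns str.
            if r.get("DATABASE_LOCK") != self._access_id:
                # SET with nx=True succeeds only if the key does not exist yet.
                while not r.set(
                    "DATABASE_LOCK",
                    self._access_id,
                    px=4000,  # Max lock time in ms
                    nx=True,
                ):
                    time.sleep(0.01)  # back off instead of busy-waiting (needs `import time`)
        except Exception:
            r.delete("DATABASE_LOCK")
            raise
...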

The close method looks like this:

...
        self._handle.close()
        r.delete("DATABASE_LOCK")
...

TinyDB is used like this:

TINYDB = {
    "storage": JSONStorageLocked,
    "path": "database/data.json",
    "access_id": str(uuid4()),
}
with TinyDB(**defaults.TINYDB) as db:
    ...

Since the storage instance is reinitialized during operations, I was not able to use id(self) but had to define a fixed ID to use inside the whole context.

At first I tried to only hold a lock inside the read and write definitions, but in stress tests it of course failed sometimes. The application would lock for a data read, unlock, lock for a write and append the data. But between read and write there is a tiny window for another worker to lock and write data, so I dropped that idea.

Of course I could switch back to that logic and check for the last writer's ID, but I'm fine with it now. :) I also use some caching in Redis, so it does not really impact the performance anyway for the few writes I'm doing.
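The "check for the last writer's ID" alternative could look roughly like the optimistic scheme below: lock only around individual reads and writes, but let each write commit only if nobody else wrote since the read, and retry otherwise. This is a sketch under stated assumptions, not the code from my project: the in-memory `Store` class, its `last_write_id` field, and `append_item` are hypothetical, and a `threading.Lock` stands in for the Redis lock.

```python
import threading
import uuid


class Store:
    """In-memory stand-in for the locked storage; remembers the last write."""

    def __init__(self):
        self._lock = threading.Lock()  # stand-in for the Redis lock
        self._data = []
        self.last_write_id = None

    def read(self):
        # Short lock: return a snapshot plus the ID of the last write.
        with self._lock:
            return list(self._data), self.last_write_id

    def write_if_unchanged(self, data, seen_write_id):
        # Short lock: commit only if no one wrote since our read.
        with self._lock:
            if self.last_write_id != seen_write_id:
                return False
            self._data = data
            self.last_write_id = uuid.uuid4().hex  # fresh ID for every write
            return True


def append_item(store, item):
    """Retry the read-modify-write until it commits without interference."""
    while True:
        data, seen = store.read()
        data.append(item)
        if store.write_if_unchanged(data, seen):
            return
```

A fresh ID per write (rather than one per worker) is used here so that two consecutive writes by the same worker are still distinguishable. The trade-off is retries under contention instead of one long-held lock.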

