Question: Will datahike allow arbitrary "forking" of db? (datahike, closed)

replikativ commented on July 25, 2024

Question: Will datahike allow arbitrary "forking" of db?

from datahike.

Comments (7)

Engelberg commented on July 25, 2024

One of the things I'm curious to explore is the idea of making an implementation of https://github.com/Engelberg/ubergraph that uses something like datahike behind the scenes so that you can create really large graphs and aren't limited by memory. My goal would be to make the durable aspect as invisible to the user as possible, so you can use it exactly as you would an in-memory data structure. That would mean that every single change would need to be its own equally valid "snapshot". There would be no concept of the "current graph", just as there isn't one when you work with it in memory.

whilo commented on July 25, 2024

Well, you are absolutely right. I have deliberately used the atom to linearize the write operations through locking in a simple way. datahike still has the "with" form, which should be applicable to the connection atom in memory, so you can either manually flush or transact afterwards, which would overwrite the datahike identity. There is no reason that you cannot also make these parallel versions durable, and they would even do structural sharing in the store. Do you have specific requirements? I can provide a dedicated flush-db routine and allow concurrent forks in one store. Honestly, I tried to make it useful by having a comfortable entry point and am very happy to figure out what people would like to do with it.
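
To make the idea of in-memory forks concrete, here is a minimal sketch that uses only clojure.core. It is not the datahike API: the atom named conn and the plain maps standing in for database values are assumptions made for illustration. It shows that a snapshot is just an immutable value, that several forks can be derived from it independently, and that the identity only changes when the atom is updated.

```clojure
;; Hypothetical stand-in for a connection: an atom holding an immutable map.
;; This is NOT the datahike API, only an illustration with clojure.core.
(def conn (atom {:alice {:age 30}}))

;; Taking a snapshot is a deref; the resulting value never changes.
(def snapshot @conn)

;; Two independent in-memory forks derived from the same snapshot.
(def fork-a (assoc-in snapshot [:alice :age] 31))
(def fork-b (assoc snapshot :bob {:age 25}))

;; Writes to the identity are linearized through the atom.
(swap! conn assoc :carol {:age 40})

;; Promoting a fork overwrites the identity with that fork's value,
;; discarding the :carol write made in between.
(reset! conn fork-a)
```

Both forks share the untouched parts of the snapshot rather than copying them, which is the in-memory analogue of the structural sharing mentioned above.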

Since I have worked on CRDTs before, which are by their nature forkable and joinable, I am at the moment rather trying to expose the internals and explore different approaches myself. One idea that I have is to add a conflict-resolution mechanism to attributes in the schema, so they would be automatically joinable. Of course, this would not work for the whole database.

whilo commented on July 25, 2024

That sounds cool! I still have to incorporate my complex network analysis algorithms for loom. Yesterday I also thought about doing something like this for clara and factui. You can just use https://github.com/replikativ/datahike/blob/master/src/datahike/core.cljc#L204 for instance; this will give you in-memory forks, and I can provide a simple flush routine separate from the datomic-like API.

Note that in effect this means just passing through the flush functionality of the hitchhiker-tree. I have not done a lot of fanciness for datahike, just glue code between datascript and the hitchhiker-tree. My main point at the moment is to get across how great these persistent durable data structures are for building complex state management systems that store and distribute state (read: distributed databases :) ) through simple recomposition of good in-memory libraries. The hitchhiker-tree might have limits when you want to write more than 10000 txs/sec, but I think Clojure has a huge potential in this space also beyond that.

whilo commented on July 25, 2024

So maybe it is easier to just put the node-map and maybe attrs here:
https://github.com/Engelberg/ubergraph/blob/master/src/ubergraph/core.clj#L124
in a hitchhiker-tree (which is just a sorted map on disk). It is persistent, so you can decide on your own when and how to make it durable. If you have questions about that, feel free to ping me.
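
As a rough sketch of "persistent, and durable only when you decide", the following uses a plain clojure.core sorted map and an EDN file. It is not the hitchhiker-tree API: flush!, load-map, and the tiny adjacency map are hypothetical names invented for this illustration, and a real tree would write only changed nodes instead of the whole map.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical helpers, NOT the hitchhiker-tree API.
;; flush! writes the whole map as EDN; load-map reads it back as a sorted map.
(defn flush! [path m]
  (spit path (pr-str m))
  m)

(defn load-map [path]
  (into (sorted-map) (edn/read-string (slurp path))))

;; Every version of the graph's node map is its own immutable value.
(def g0 (sorted-map :a #{:b}))
(def g1 (assoc g0 :b #{:c}))
(def g2 (assoc g1 :c #{}))

;; We choose to make only g1 durable; g0 and g2 stay in memory.
(def path (str (java.io.File/createTempFile "graph" ".edn")))
(flush! path g1)

(def restored (load-map path))
```

Here restored equals g1, while g2 remains a valid in-memory version that was never written out.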

Engelberg commented on July 25, 2024

Thanks. I'll be investigating this more in a couple weeks. What did you do to make datahike work out of the box, as opposed to hitchhiker-tree which seems to require installing and starting redis separately?

whilo commented on July 25, 2024

redis is one backend for the hitchhiker-tree. I have ported the hitchhiker-tree to support core.async based IO for cljs and supplied a konserve backend a year ago:

https://github.com/datacrypt-project/hitchhiker-tree/blob/master/src/hitchhiker/konserve.cljc

In the case of datahike I just map the different URL schemes to konserve backends, but I think the redis backend for datahike might be interesting as well. It is in effect two backend abstraction layers, that of the hitchhiker-tree and that of konserve, which might be confusing. If you have any questions, I am happy to help in porting ubergraph. The hitchhiker-tree is solid, but using it in more projects would definitely help with things like garbage collection, performance improvements and serialization options.
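
Mapping URL schemes to backends could look roughly like the sketch below. This is an assumption-laden illustration, not datahike's or konserve's actual code: open-store, the "mem" and "file" schemes, and the returned maps are all hypothetical placeholders for whatever store constructors a real implementation would call.

```clojure
;; Hypothetical dispatch, NOT the datahike or konserve API.
(defn uri-scheme [uri]
  (.getScheme (java.net.URI. uri)))

;; Pick a backend based on the scheme part of the URI.
(defmulti open-store uri-scheme)

(defmethod open-store "mem" [uri]
  {:backend :memory :uri uri})

(defmethod open-store "file" [uri]
  {:backend :filesystem :path (.getPath (java.net.URI. uri))})

(defmethod open-store :default [uri]
  (throw (ex-info "Unsupported scheme" {:uri uri})))
```

Adding another backend, such as redis, would then only require one more defmethod rather than a change to the callers.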

whilo commented on July 25, 2024

I will close this for now. Feel free to reopen it if you have questions or suggestions on how to improve the current libraries for your use cases.
