Irmin is a distributed database that follows the same design principles as Git

Home Page: https://irmin.org

License: ISC License

Makefile 0.02% OCaml 95.86% C 3.07% Perl 0.08% Standard ML 0.31% Raku 0.11% HTML 0.55%
ocaml git storage mirageos database irmin

irmin's Introduction

A Distributed Database Built on the Same Principles as Git


Irmin is an OCaml library for building mergeable, branchable distributed data stores.

Irmin is based on distributed version-control systems (DVCSs), which are used extensively in software development to track the provenance of changes and expose modifications to source code. Irmin applies DVCS principles to large-scale distributed data and exposes functions similar to Git's (clone, push, pull, branch, rebase). It is highly customizable: users can define their own types to store application-specific values and define custom storage layers (in memory, on disk, in a remote Redis database, in the browser, etc.). The Git workflow was initially designed for humans managing changes to source code. Irmin scales this workflow to automated programs performing a very high number of operations per second, with fully automated handling of update conflicts. Finally, Irmin exposes an event-driven API to define programmable dynamic behaviours and to program distributed dataflow pipelines.

Irmin was created at the University of Cambridge in 2013 to be the default storage layer for MirageOS applications (both to store and orchestrate unikernel binaries and the data that these unikernels are using). As such, Irmin is not, strictly speaking, a complete database engine. Instead, similarly to other MirageOS components, it is a collection of libraries designed to solve different flavours of the challenges raised by the CAP Theorem. Each application can select the right combination of libraries to solve its particular distributed problem.

Irmin consists of a core of well-defined low-level data structures that specify how data should be persisted and be shared across nodes. It defines algorithms for efficient synchronization of those distributed low-level constructs. It also builds a collection of higher-level data structures that developers can use without knowing precisely how Irmin works underneath. Some of these components even have a formal semantics, including Conflict-free Replicated Data-Types (CRDT). Since it's a part of MirageOS, Irmin does not make strong assumptions about the OS environment that it runs in. This makes the system very portable: it works well for in-memory databases and slower persistent serialization such as SSDs, hard drives, web browser local storage, or even the Git file format.

Irmin is primarily developed and maintained by Tarides, with contributions from many people across various organizations. External maintainers and contributors are welcome.

Features

  • Built-in Snapshotting - backup and restore
  • Storage Agnostic - you can use Irmin on top of your own storage layer
  • Custom Datatypes - (de)serialization for custom data types, derivable via ppx_irmin
  • Highly Portable - runs anywhere from Linux to web browsers and Xen unikernels
  • Git Compatibility - irmin-git uses an on-disk format that can be inspected and modified using Git
  • Dynamic Behavior - allows the users to define custom merge functions, use in-memory transactions (to keep track of reads as well as writes) and to define event-driven workflows using a notification mechanism
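To make the idea of a custom merge function more concrete, here is a minimal sketch in plain OCaml. It does not use Irmin's merge API: the type `merge_result` and the functions `merge_counter` and `merge_string` are hypothetical names chosen for illustration. It only shows the three-way shape of such a function, which receives the common ancestor (`old`) and the two diverged values.

```ocaml
(* Hypothetical three-way merge functions; none of these names come
   from Irmin. *)
type 'a merge_result = Ok_merged of 'a | Conflict of string

(* Counters: [old] is the common ancestor, [a] and [b] are the two
   diverged values. The increment made on each side is preserved. *)
let merge_counter ~old a b = Ok_merged (a + b - old)

(* Strings: accept a change made on one side only, and report a
   conflict when both sides changed to different values. *)
let merge_string ~old a b =
  if String.equal a b then Ok_merged a
  else if String.equal a old then Ok_merged b
  else if String.equal b old then Ok_merged a
  else Conflict (Printf.sprintf "both sides changed: %S vs %S" a b)
```

A counter that started at 10 and was moved to 12 on one branch and 15 on the other merges to 17, whereas two different edits of the same string are reported as a conflict for the application to resolve.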

Documentation

API documentation can be found online at https://mirage.github.io/irmin

Installation

Prerequisites

Please ensure that you have installed at least the minimum required versions of opam and OCaml. Find the latest version and install instructions on ocaml.org.

To install Irmin with the command-line tool and all unix backends using opam:

  opam install irmin-cli

A minimal installation containing the reference in-memory backend can be installed by running:

  opam install irmin

The following packages are available on opam:

  • irmin - the base package, plus an in-memory storage implementation
  • irmin-chunk - chunked storage
  • irmin-cli - a simple command-line tool
  • irmin-fs - filesystem-based storage using bin_prot
  • irmin-git - Git compatible storage
  • irmin-graphql - GraphQL server
  • irmin-mirage - mirage compatibility
  • irmin-mirage-git - Git compatible storage for mirage
  • irmin-mirage-graphql - mirage compatible GraphQL server
  • irmin-pack - compressed, on-disk, posix backend
  • ppx_irmin - PPX deriver for Irmin content types (see README_PPX.md)
  • irmin-containers - collection of simple, ready-to-use mergeable data structures

To install a specific package, simply run:

  opam install <package-name>

Development Version

To install the development version of Irmin in your current opam switch, clone this repository and opam install the packages inside:

  git clone https://github.com/mirage/irmin
  cd irmin/
  opam install .

Usage

Example

Below is a simple example of setting a key and getting the value out of a Git-based, filesystem-backed store.

open Lwt.Syntax

(* Irmin store with string contents *)
module Store = Irmin_git_unix.FS.KV (Irmin.Contents.String)

(* Database configuration *)
let config = Irmin_git.config ~bare:true "/tmp/irmin/test"

(* Commit author *)
let author = "Example <example@example.com>"

(* Commit information *)
let info fmt = Irmin_git_unix.info ~author fmt

let main =
  (* Open the repo *)
  let* repo = Store.Repo.v config in

  (* Load the main branch *)
  let* t = Store.main repo in

  (* Set key "foo/bar" to "testing 123" *)
  let* () =
    Store.set_exn t ~info:(info "Updating foo/bar") [ "foo"; "bar" ]
      "testing 123"
  in

  (* Get key "foo/bar" and print it to stdout *)
  let+ x = Store.get t [ "foo"; "bar" ] in
  Printf.printf "foo/bar => '%s'\n" x

(* Run the program *)
let () = Lwt_main.run main

The example is contained in examples/readme.ml. It can be compiled and executed with dune:

$ dune build examples/readme.exe
$ dune exec examples/readme.exe
foo/bar => 'testing 123'

The examples directory also contains more advanced examples, which can be executed in the same way.

Command-line

The same thing can also be accomplished using irmin, the command-line application installed with irmin-cli, by running:

$ echo "root: ." > irmin.yml
$ irmin init
$ irmin set foo/bar "testing 123"
$ irmin get foo/bar
testing 123

irmin.yml allows for irmin flags to be set on a per-directory basis. You can also set flags globally using $HOME/.irmin/config.yml. Run irmin help irmin.yml for further details.

Also see irmin --help for a list of all commands, and either irmin <command> --help or irmin help <command> for more help with a specific command.

Context

Irmin's initial design was directly inspired by XenStore, with:

  • the need for efficient optimistic concurrency control, so that thousands of virtual machines can concurrently access and modify a central configuration database (the Xen stack uses XenStore as an RPC mechanism to set up VM configuration on boot). Very early on, the focus was on specifying and handling the conflicts that arise when the optimistic assumptions do not hold.
  • the need for a convenient way to debug and audit possible issues that might happen in that system. Our initial experiments showed that it was possible to design a reliable system using Git as a backend to persist configuration data (and safely restart after a crash), while keeping system debugging easy and fast thanks to an efficient merging strategy.

In 2014, the first release of Irmin was announced as part of the MirageOS 2.0 release here. Since then, several projects have started using and improving Irmin. These can roughly be split into 3 categories: (i) using Irmin as a portable, structured key-value store (with expressive, mergeable types); (ii) using Irmin as a distributed database (with customizable consistency semantics); and (iii) using Irmin as an event-driven dataflow engine.

Irmin as a portable and efficient structured key-value store

  • XenStored is an information storage space shared between all the Xen virtual machines running on the same host. Each virtual machine gets its own path in the store. When values are changed in the store, the appropriate drivers are notified. The initial OCaml implementation was later extended to use Irmin here. More details here.
  • Jitsu is an experimental orchestrator for unikernels. It uses Irmin to store the unikernel configuration (and manage dynamic DNS entries). See more details here.
  • Cuekeeper is a web-based GTD (a fancy TODO list) that runs entirely in the browser. It uses Irmin in the browser to store data locally, with support for structured concurrent editing and snapshot export and import. More details here.
  • Canopy and Unipi both use Irmin to serve static websites pulled from Git repositories and deployed as unikernels.
  • Caldav is using Irmin to store calendar entries and back them into a Git repository. More information here.
  • Datakit was developed at Docker and provided a 9p interface to the Irmin API. It was used to manage the configuration of Docker for Desktop, with merge policies on upgrade, full auditing, and snapshot/rollback capabilities.
  • Tezos started using Irmin in 2017 to store the ledger state. The first prototype used irmin-git before switching to irmin-lmdb and irmin-leveldb (and now irmin-pack). More details here.

Irmin as a distributed store

  • An IMAP server using Irmin to store emails. More details here. The goal of that project was both to use Irmin to store emails (so using Irmin as a local key-value store) but also to experiment with replacing the IMAP on-wire protocol by an explicit Git push/pull mechanism.
  • irmin-ARP uses Irmin to store and audit ARP configuration. It uses Irmin as a local key-value store for very low-level information (which is normally stored very deep in the kernel layers), but the main goal was to replace the broadcasting on-wire protocol with point-to-point pull/push synchronisation primitives, with a full audit log of ARP operations over a network. More details here.
  • Banyan uses Irmin to implement a distributed cache over a geo-replicated cluster. It's using Cassandra as a storage backend. More information here.
  • irmin-fdb implements an Irmin store backed by FoundationDB. More details here.

Irmin as a dataflow scheduler

  • Datakit CI is a continuous integration service that monitors GitHub projects and tests each branch, tag and pull request. It displays the test results as status indicators in the GitHub UI. It keeps all of its state and logs in DataKit, rather than a traditional relational database, allowing review with the usual Git tools. The core of the project is a scheduler that manages dataflow pipelines across Git repositories. It was used for a few years as the CI system to test Docker for Desktop on bare-metal and virtual machines, as well as all the new opam package submissions to ocaml/opam-repository. More details here.
  • Causal RPC implements an RPC framework using Irmin as a network substrate. More details here.
  • CISO is an experimental (distributed) Continuous Integration engine for OPAM. It was designed as a replacement of Datakit-CI and finally turned into ocurrent.

Issues

Feel free to report any issues using the GitHub bugtracker.

License

See the LICENSE file.

Acknowledgements

Development of Irmin was supported in part by the EU FP7 User-Centric Networking project, Grant No. 611001.

irmin's People

Contributors

adatario, andreas, art-w, avsm, clecat, craigfe, dinosaure, djs55, dsheets, g2p, gpetiot, gs0510, hannesm, hnrgrgr, icristescu, jonludlam, liautaud, maiste, mattiasdrp, metanivek, ngoguey42, niksu, patricoferris, samoht, shonfeder, talex5, tomjridge, vbmithr, yomimono, zshipko

irmin's Issues

dump requires `dot`

amir$ irmin dump thestate
sh: dot: command not found

I don't know if I'm misunderstanding some Unix thing or if I have a missing dependency for the initial tutorial. What do I do?

bounded memory/storage usage

Some users want to have some guarantee on the memory/storage consumption of the system. They want only a bounded part of the history to be stored persistently.

We need to have a better story for this. #71 might be a possible solution.

Use XMetaRequires in _oasis file

When one uses only BuildDepends in the _oasis file with syntax extensions, Oasis copies everything to the META file:
see https://github.com/samoht/irminsule/blob/master/_oasis#L32
and https://github.com/samoht/irminsule/blob/master/lib/core/META#L6

That forces users of the library to use camlp4, and to have only extensions that are compatible with type_conv (that breaks with Eliom for instance, last time I tried at least).

Please, add a XMETARequires field, like here:
https://github.com/janestreet/core_kernel/blob/master/_oasis#L243

add heterogeneous policies for backends

It would be good to add a proof-of-concept backend which dispatches its keys to different backend types, maybe using a key -> Irmin.S dispatch function. That would make it easier to store some parts in memory and other parts on disk.
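The dispatch idea can be sketched in plain OCaml as follows. This is only an illustration under stated assumptions: the `backend` record, `table_backend`, and the routing rule on the `"cache"` prefix are all hypothetical, and both backends are simulated with hash tables rather than real Irmin stores.

```ocaml
(* Hypothetical sketch: none of these names come from Irmin. *)
type key = string list

(* A backend is reduced here to two closures over some private state. *)
type backend = {
  name : string;
  find : key -> string option;
  set : key -> string -> unit;
}

(* Build a backend on top of a hash table; an on-disk backend would
   expose the same two operations over files instead. *)
let table_backend name =
  let tbl : (key, string) Hashtbl.t = Hashtbl.create 16 in
  { name; find = Hashtbl.find_opt tbl; set = Hashtbl.replace tbl }

let memory = table_backend "memory"
let disk = table_backend "disk (simulated)"

(* The dispatch function: keys under "cache" stay in memory,
   everything else goes to the simulated disk backend. *)
let dispatch : key -> backend = function
  | "cache" :: _ -> memory
  | _ -> disk

let set key value = (dispatch key).set key value
let find key = (dispatch key).find key
```

Callers only see `set` and `find`; which backend holds a given key is decided entirely by `dispatch`, so the policy can change without touching the call sites.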

Questions About the Design

Is there any documentation about the current or planned design of Irminsule?

I understand that it follows the same principles as Git and can use it as actual storage backend, but that's a very general description.

How will the data be structured and related? Will it be possible to make queries?
Can the database store large binary blobs?

Thank you in advance for the answers!

Build and Install: ocamlfind: Package `git.fs' not found and more errors

(*
  * After a successful installation of
  *  opam install ezjsonm ocamlgraph lwt cryptokit \
   *       re dolog mstruct core_kernel \
   *        uri cohttp ssl core_kernel \
   *         cmdliner alcotest
    *
    *  I downloaded irminsule-master.zip from GitHub and unziped on my Linux
    *  Ubuntu
    *)

cm@pursuit2:~/local/irminsule/irminsule-master$ ls -la
total 280
drwxr-xr-x 5 cm cm   4096 Apr  9 17:12 .
drwxr-xr-x 3 cm cm   4096 Apr  9 17:14 ..
-rw-r--r-- 1 cm cm   2737 Apr  8 19:22 CHANGES
-rwxr-xr-x 1 cm cm    363 Apr  8 19:22 configure
drwxr-xr-x 2 cm cm   4096 Apr  8 19:22 examples
-rw-r--r-- 1 cm cm   122 Apr  8 19:22 .gitignore
drwxr-xr-x 6 cm cm   4096 Apr  8 19:22 lib
drwxr-xr-x 2 cm cm   4096 Apr  8 19:22 lib_test
-rw-r--r-- 1 cm cm    892 Apr  8 19:22 Makefile
-rw-r--r-- 1 cm cm  20530 Apr  8 19:22 myocamlbuild.ml
-rw-r--r-- 1 cm cm   3359 Apr  8 19:22 _oasis
-rw-r--r-- 1 cm cm   1752 Apr  8 19:22 README.md
-rw-r--r-- 1 cm cm 193207 Apr  8 19:22 setup.ml
-rw-r--r-- 1 cm cm  7926 Apr  8 19:22 _tags
-rwxr-xr-x 1 cm cm   488 Apr  8 19:22 .travis-ci.sh
-rw-r--r-- 1 cm cm    43 Apr  8 19:22 .travis.yml

cm@pursuit2:~/local/irminsule/irminsule-master$ make
ocaml setup.ml -configure  --enable-tests --prefix /usr/local
ocamlfind: Package `git.fs' not found
W: Field 'pkg_git_fs' is not set: Command ''/home/cm/ocamlbrew/ocaml-4.01.0/.opam/system/bin/ocamlfind' query -format %d git.fs > '/tmp/oasis-99ee3d.txt'' terminated with error code 2
ocamlfind: Package `git.memory' not found
W: Field 'pkg_git_memory' is not set: Command ''/home/cm/ocamlbrew/ocaml-4.01.0/.opam/system/bin/ocamlfind' query -format %d git.memory > '/tmp/oasis-49a28a.txt'' terminated with error code 2
E: Cannot find findlib package git.fs
E: Cannot find findlib package git.memory
E: Failure("2 configuration errors")
make: *** [setup.data] Error 1

Issue with deleted values

from @gregtatcam

I ran into some Irmin issues in my IMAP server which I think are related to my misunderstanding of the API. I implemented the interface to Irmin as transactions via the views. I have attached an example where I create the top level key ["imaplet";"user";"mailboxes";"Test"], then I get the view and create keys/values in that view: ["messages";"1"], ["messages";"1";"header"], ["messages";"1";"content"]. I commit the view with update and then read the values back for verification. Then I remove the top level key ["messages";"1"] and test each key with mem and read, and I also list the whole tree under ["messages";"1"]. I get results which I don't quite understand. Here is the matrix that I get:

                      mem     read
messages;1            true    no
messages;1;header     false   yes
messages;1;content    false   yes

And then the listing under messages;1 returns the header and content keys. If I delete messages;1;content and messages;1;header then I can't read them any longer, but mem still returns true and the listing doesn't return the header and content keys. In my IMAP server I have the messages structured in such a way that there is a bunch of sub keys under the message UID keys, like header, content, etc. So, ideally I would like to remove the top level key, basically messages;UID, and not bother with the removal of all the sub keys. I also would have expected mem to return false for the removed key. Would you have some time to take a look at the example and explain if and what I'm doing wrong?

https://gist.github.com/samoht/fdf3895bdec18c078c8f

optimize the amount of data to send on pull/push

We already send only a partial view of the store, but this could still be very much improved. I have some ideas already obviously, but this will certainly need more thought (and this is really the core of irminsule).

Implement a simple GC

If we want to have Irmin running on real systems we need a story about data recollection:

  1. unreachable blocks in the store should be collected
  2. very old objects might need to be removed

For simplicity, 2. can be transformed into 1. by rebasing the history. A downside with this is that two rebased stores are not able to sync anymore (if they pruned different parts of their history). Solving 2. without transforming it into 1. means supporting partial fetch/push, which the Git protocol doesn't handle very well (not sure if it is a limitation of the git command-line or of the protocol itself, need to investigate a bit more).
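Point 1 above amounts to a mark-and-sweep pass over the object graph. The following is a minimal sketch in plain OCaml, assuming a hypothetical in-memory store that maps each object hash to the hashes it references; it is not how Irmin stores or collects objects, and the names `store`, `reachable` and `collect` are made up for the example.

```ocaml
(* Hypothetical in-memory object store: a hash maps to the hashes it
   references. A real store would hold commits, nodes and contents. *)
module S = Set.Make (String)

type store = (string, string list) Hashtbl.t

(* Mark phase: collect every hash reachable from the given roots. *)
let reachable (store : store) roots =
  let rec visit seen h =
    if S.mem h seen then seen
    else
      let children = Option.value (Hashtbl.find_opt store h) ~default:[] in
      List.fold_left visit (S.add h seen) children
  in
  List.fold_left visit S.empty roots

(* Sweep phase: drop every object that was not marked and return the
   number of removed objects. *)
let collect (store : store) roots =
  let live = reachable store roots in
  let dead =
    Hashtbl.fold
      (fun h _ acc -> if S.mem h live then acc else h :: acc)
      store []
  in
  List.iter (Hashtbl.remove store) dead;
  List.length dead
```

The roots would typically be the heads of the branches to keep. Rebasing the history, as suggested for point 2, changes which objects are reachable from those roots, after which the same pass applies.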

support structured values

Currently the values are raw blobs. It could be interesting to easily support more structured values. The only "interesting" idea here is that a structured value can also contain keys, which can be understood by Irminsule when it computes the export set.

Ideally this structure should be quite easy to define for the user (for instance, using a schema stored somewhere in the database).

Add a "max_path" operation

Not sure what's the exact semantics of that for Irmin, though, but that's necessary to populate an IMAP server. We could ask the user to provide an ordering on the path? On the keys in the path?

JSON CRUD interface error

I followed the "Getting Started", when I get to:
"You can also access the daemon state by opening your browser to http://127.0.0.1:8080"

I get:
{"error":"(Failure "to_json: HTML node")"}

instead of a "list of possible operations"

Improve the error message when `dot` is not installed on the system

I have been exploring the 'irmin' tutorial.

I followed 'Using the In-Memory Backend' instructions as suggested, except that I created the 'irmin-mem' dir within my personal space instead of '/tmp' as indicated in the tutorial.

I uncovered the following error message:

cm770@pursuit2:~/local/irminsule/testing/irmin-mem$ irmin dump thestate
sh: 1: dot: not found
2014-04-23 10:27:07.942 IRMIN ERROR: The thestate.dot is corrupted
cm770@pursuit2:~/local/irminsule/testing/irmin-mem$

Notice that the thestate.dot file is still created!

cm770@pursuit2:~/local/irminsule/testing/irmin-mem$ ls -la
total 16
drwxr-xr-x 2 cm770 cm770 4096 Apr 23 10:27 .
drwxr-xr-x 4 cm770 cm770 4096 Apr 23 10:33 ..
-rw-r--r-- 1 cm770 cm770 5864 Apr 23 10:27 thestate.dot

It might help to take into account that before the 'irmin dump thestate' instruction I executed on my interaction terminal:

  1. export IRMIN=r:http://127.0.0.1:8080
  2. irmin write person/name carlos
  3. irmin write person/surname molina
  4. STATE=`irmin snapshot`
  5. irmin write person/surname perez
  6. irmin revert $STATE

These are the last lines produced by irmin main terminal:

2014-04-23 10:26:51.761 HTTP INFO : Request received: PATH=/contents/read/19e370dcbf0f82eba30d835f45c7c1900c0ed66b
2014-04-23 10:27:07.927 HTTP INFO : Request received: PATH=/contents/dump
2014-04-23 10:27:07.928 HTTP INFO : Request received: PATH=/node/dump
2014-04-23 10:27:07.929 HTTP INFO : Request received: PATH=/commit/dump
2014-04-23 10:27:07.930 HTTP INFO : Request received: PATH=/ref/dump

full backend format compat

currently, we are cheating a little bit when using the different backends: we always assume that we picked the right binary format (eg. if you want to access a Git repo, you need to start your irmin database with a Git format or you are screwed). This will be useful when we'll add the convergent encryption backend. Not very hard to do though, but we need some way to cache key translations somewhere if this will be very costly.

Issue when too many open fd

We currently have an issue when too many processes try to open /.git/refs/heads/master. This can happen when you have a lot of active connections. One fix is to have a bounded pool of open fds (this is already done for reading objects in .git/objects but, for some reason, that's not yet the case with tags).

installation error from Irminsule Build and Install

The following error messages are produced when running opam on Linux:

% opam install ezjsonm ocamlgraph lwt cryptokit \
              re dolog mstruct core_kernel \
              uri cohttp ssl core_kernel \
              cmdliner alcotest

....
=-=-= Installing cryptokit.1.9 =-=-=
Building cryptokit.1.9:
  make
  make install
[ERROR] The compilation of cryptokit.1.9 failed.
Removing cryptokit.1.9.
  ocamlfind remove cryptokit

...
# opam-version 1.1.1
# os           linux
# command      ./configure --prefix /home/cm770/ocamlbrew/ocaml-4.01.0/.opam/system
# path         /auto/homes/cm770/ocamlbrew/ocaml-4.01.0/.opam/system/build/ssl.0.4.6
# compiler     system (4.01.0)
# exit-code    1
# env-file     /auto/homes/cm770/ocamlbrew/ocaml-4.01.0/.opam/system/build/ssl.0.4.6/ssl-7317-548b09.env
 ...
# opam-version 1.1.1
# os           linux
# command      ./configure --prefix /home/cm770/ocamlbrew/ocaml-4.01.0/.opam/system
# path         /auto/homes/cm770/ocamlbrew/ocaml-4.01.0/.opam/system/build/ssl.0.4.6
# compiler     system (4.01.0)
# exit-code    1
# env-file     /auto/homes/cm770/ocamlbrew/ocaml-4.01.0/.opam/system/build/ssl.0.4.6/ssl-7317-548b09.env
# stdout-file  /auto/homes/cm770/ocamlbrew/ocaml-4.01.0/.opam/system/build/ssl.0.4.6/ssl-7317-548b09.out
# stderr-file  /auto/homes/cm770/ocamlbrew/ocaml-4.01.0/.opam/system/build/ssl.0.4.6/ssl-7317-548b09.err
### stdout ###
# ...[truncated]
# checking for ocamldep... /home/cm770/ocamlbrew/ocaml-4.01.0/bin/ocamldep
# checking for ocamllex... /home/cm770/ocamlbrew/ocaml-4.01.0/bin/ocamllex
# checking for ocamlyacc... /home/cm770/ocamlbrew/ocaml-4.01.0/bin/ocamlyacc
# checking for ocamldoc... /home/cm770/ocamlbrew/ocaml-4.01.0/bin/ocamldoc
# checking for ocamlmktop... /home/cm770/ocamlbrew/ocaml-4.01.0/bin/ocamlmktop
# checking for gcc... (cached) gcc
# checking whether we are using the GNU C compiler... (cached) yes
# checking whether gcc accepts -g... (cached) yes
# checking for gcc option to accept ISO C89... (cached) none needed
# checking for SSL_new in -lssl... no
### stderr ###
# configure: error: Cannot find libssl.

Unable to store marshalled caml strings in the in-memory store.

utop # CC.update h ["tables";"mysupertable";"auieauie"] (CC.Value.of_bytes_exn "auieauie");;               
- : unit = ()

but

utop # CC.update h ["tables";"mysupertable";"auieauie"] (Marshal.to_string "auieauie" [] |> CC.Value.of_bytes_exn);;                                                                                                  
Exception: IrminCRUD.Make(Client).Error "(IrminHTTP.Invalid)". 

In the second case, the daemon outputs more lines than in the first.

Serialize operations on local mutable stores

Currently, committing a complex operation to the store involves:

  1. reading the HEAD commit, reading the root of the immutable store
  2. adding new nodes in the immutable store
  3. modifying the HEAD commit to points to the new immutable root

So if someone else modifies HEAD between 1. and 3., we can lose some operations on the main branch (but no data corruption ...) -- the added data/commits will be in the store, but not in the branch anymore, which is bad.

A solution to this is to serialize local operations on the mutable store. The idea is to clone the database, be the only one to modify it locally and then push back the result. So in this case, local locking is not too bad. Another possibility is to automatically merge/rebase the operation when HEAD has been changed between 1. and 3., but this requires having transactions.
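One way to picture the race between steps 1 and 3, and the retry alternative, is a test-and-set on HEAD. The sketch below is plain OCaml over a hypothetical model in which HEAD is a mutable reference to an immutable association list; `head`, `test_and_set` and `update` are illustrative names only, and the code is single-threaded, so the atomicity that a concurrent setting would need is stated as an assumption rather than implemented.

```ocaml
(* Hypothetical model: HEAD is a mutable reference to an immutable root,
   here an association list. No name below is taken from Irmin. *)
type root = (string * string) list

let head : root ref = ref []

(* Swap HEAD only if it still points to the root read in step 1.
   Physical equality is enough because roots are never mutated.
   Assumption: in a concurrent program this function runs atomically. *)
let test_and_set ~expected ~next =
  if !head == expected then (
    head := next;
    true)
  else false

(* Steps 1 to 3 in a retry loop: a lost race re-reads HEAD and re-applies
   the update instead of silently overwriting the other writer. *)
let rec update key value =
  (* step 1: read HEAD *)
  let old_root = !head in
  (* step 2: build the new immutable root *)
  let new_root = (key, value) :: List.remove_assoc key old_root in
  (* step 3: move HEAD, or start over if it changed in the meantime *)
  if not (test_and_set ~expected:old_root ~next:new_root) then
    update key value
```

A writer holding a stale root fails the `test_and_set` check and has to redo its work on top of the new HEAD, which is the "automatically rebase" behaviour described above in its simplest form.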

irminMain.ml missing?

The master HEAD doesn't build for me. it looks like irminMain.ml is missing (perhaps inadvertently omitted from c9e49a0):

$ make
ocamlbuild -Is src,src/lib,src/lwt -use-ocamlfind -cflags "-bin-annot" -no-links -pkgs cryptokit,jsonm,uri,ocamlgraph,cmdliner,lwt,ocplib-endian,cstruct -tags "syntax(camlp4o)" -pkgs lwt.syntax,cohttp.lwt,cstruct.syntax irminMain.native
Solver failed:
  Ocamlbuild knows of no rules that apply to a target named src/irminMain.mly. This can happen if you ask Ocamlbuild to build a target with the wrong extension (e.g. .opt instead of .native) or if the source files live in directories that have not been specified as include directories.
Backtrace:
  - Failed to build the target irminMain.native
      - Failed to build all of these:
          - Building src/irminMain.native:
              - Building src/irminMain.cmx:
                  - Failed to build all of these:
                      - Building src/irminMain.ml:
                          - Failed to build all of these:
                              - Building src/irminMain.mly
                              - Building src/irminMain.mll
                      - Building src/irminMain.mlpack
          - Building irminMain.native:
              - Building irminMain.cmx:
                  - Failed to build all of these:
                      - Building irminMain.ml:
                          - Failed to build all of these:
                              - Building irminMain.mly
                              - Building irminMain.mll
                      - Building irminMain.mlpack
make: *** [_build/src/irminMain.native] Error 6

Build and Install: usr/local/bin/irmin'' terminated with error code 1") make: *** [install] Error 1"

"0) I'm installing on Linux Ubuntu computer

  1. make executed successfully--no errors

cm@pursuit2:~/local/irminsule/irminsule-master$ make

ocaml setup.ml -configure  --enable-tests --prefix /usr/local
...
pkg_alcotest: ........................................ /home/cm/ocamlbrew/ocaml-4.01.0/.opam/system/lib/alcotest

echo "let current = \"0.6.0\"" > lib/core/irminVersion.ml
ocaml setup.ml -build
Finished, 1 target (0 cached) in 00:00:00.
Finished, 143 targets (0 cached) in 00:01:50.
  2. However 'make install' failed
cm@pursuit2:~/local/irminsule/irminsule-master$ make install
ocaml setup.ml -install
Installed /home/cm/ocamlbrew/ocaml-4.01.0/.opam/system/lib/irminsule/irmin.mli
... 

Installed /home/cm/ocamlbrew/ocaml-4.01.0/.opam/system/lib/irminsule/irminFS.cmx
Installed /home/cm/ocamlbrew/ocaml-4.01.0/.opam/system/lib/irminsule/META
cp: cannot create regular file `/usr/local/bin/irmin': Permission denied
E: Failure("Command ''cp' '/auto/homes/cm770/local/irminsule/irminsule-master/_build/lib/driver/irminMain.native' '/usr/local/bin/irmin'' terminated with error code 1")
make: *** [install] Error 1"
  3. A second run of 'make install' produces a slightly different error
cm@pursuit2:~/local/irminsule/irminsule-master$ make install
ocaml setup.ml -install 
ocamlfind: Package irminsule is already installed
 (file /home/cm/ocamlbrew/ocaml-4.01.0/.opam/system/lib/irminsule/META already exists)
 E: Failure("Command ''/home/cm/ocamlbrew/ocaml-4.01.0/.opam/system/bin/ocamlfind'

...

lib/core/irminReference.mli lib/core/irminValue.mli lib/core/irminStore.mli lib/core/irminCommit.mli lib/core/irminMisc.mli lib/core/irminKey.mli lib/core/irminGraph.mli lib/core/irmin.mli' terminated with error code 2")
make: *** [install] Error 1
"

partial clone / fetch

Would be good to have a way to query a remote database for a partial state:

  • clone the subtree under a/b/
  • pull everything matching a/b/*.txt
  • ...
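The two queries listed above can be sketched as filters over paths. This is a plain OCaml illustration over a hypothetical flat list of (path, value) bindings, not a description of any Irmin sync protocol; `is_prefix`, `subtree` and `matching` are made-up names, and the suffix test is a simplified stand-in for real glob patterns.

```ocaml
(* Hypothetical flat view of a store: full paths mapped to values. *)
type path = string list

(* [is_prefix prefix path] holds when [path] lives under [prefix]. *)
let rec is_prefix (prefix : path) (path : path) =
  match (prefix, path) with
  | [], _ -> true
  | x :: xs, y :: ys -> String.equal x y && is_prefix xs ys
  | _ :: _, [] -> false

(* Keep only the bindings under [prefix]: the "clone the subtree" case. *)
let subtree prefix bindings =
  List.filter (fun (path, _) -> is_prefix prefix path) bindings

(* Keep the bindings directly under [prefix] whose last step ends with
   [suffix]: a simplified stand-in for patterns such as a/b/*.txt. *)
let matching prefix ~suffix bindings =
  List.filter
    (fun (path, _) ->
      match List.rev path with
      | last :: rev_dir ->
          List.rev rev_dir = prefix && Filename.check_suffix last suffix
      | [] -> false)
    bindings
```

In a real implementation the filter would run on the remote side, so that only the matching part of the state crosses the network.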

support for merge/rebase operations

Currently, the push/pull are a bit dumb as they simply synchronize the store and move the HEAD tag. We do not rebase/merge currently, as the first use-case we want to support is a kind of auto-updating cache (which will never modify the store itself).

backend specific code for push/pull

currently the push/pull logic is handled using a custom binary / REST protocol. would be good if the backends could also express some way to push/pull (so we can use Git directly for instance).

Read values from stdin

I have noticed that irmin can't store actual public keys unless you wrap them within " ".

  1. You can see that irmin is ok with plain strings

cm770@pursuit2:/local/irminsule/testing/irmin-mem$ irmin write person/pubKey 123abc
cm770@pursuit2:/local/irminsule/testing/irmin-mem$ irmin tree
/person/name........................................................................................."carlos"
/person/nationality..............................................................................."universal"
/person/pubKey......................................................................................."123abc"
/person/surname......................................................................................"molina"

  2. However, it can't handle actual public keys

cm770@pursuit2:~/local/irminsule/testing/irmin-mem$ irmin write person/pubKey ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAvaRbn64jCX5Ne+FS4lmNNn+p+umx8/vcJ1+0oFF7oyXg55pFHJEBNhL0g8UXXzvuMdmFgHCBapAq9Efb4dUv45ffnoc5PQqYToO1Sy8eMSSxnVqI8oQeH81/yzgBelKiHVcLx5FNXyuBesIpvxd7VhrN4VZy2I1eWBE3EefmmfigQ9ISEeIHnIjyK6O922xPWkjiWmd83CAC6DGzCChC13Q6eFFArEgUmmNupm+A5Beu4qz6BkvP6xo0vJoOCI+WruWPsYhjxg13LE17LvSLmm1DQz2ghk25rRGPzNKVeJi7xy1LcM1i4vzIMJbTya9g86q5COvT5MqnIHvlbFi0GQ== [email protected]
irmin: internal error, uncaught exception:
(Failure "Too many arguments")

  3. The problem goes away if you wrap the key in " ".

cm770@pursuit2:~/local/irminsule/testing/irmin-mem$ irmin write person/pubKey "ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAvaRbn64jCX5Ne+FS4lmNNn+p+umx8/vcJ1+0oFF7oyXg55pFHJEBNhL0g8UXXzvuMdmFgHCBapAq9Efb4dUv45ffnoc5PQqYToO1Sy8eMSSxnVqI8oQeH81/yzgBelKiHVcLx5FNXyuBesIpvxd7VhrN4VZy2I1eWBE3EefmmfigQ9ISEeIHnIjyK6O922xPWkjiWmd83CAC6DGzCChC13Q6eFFArEgUmmNupm+A5Beu4qz6BkvP6xo0vJoOCI+WruWPsYhjxg13LE17LvSLmm1DQz2ghk25rRGPzNKVeJi7xy1LcM1i4vzIMJbTya9g86q5COvT5MqnIHvlbFi0GQ== [email protected]"
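The failure above is consistent with the shell splitting the unquoted key into several words, so that the command receives more arguments than it expects. A minimal sketch of a more forgiving front end follows, assuming a hypothetical `value_of_cli` helper (not an existing irmin function) that joins the remaining words, or reads the value from stdin when none is given.

```ocaml
(* [value_of_args args] rebuilds a value from the words produced by the
   shell. Assumption: a single space is an acceptable separator, since
   the shell has already collapsed the original whitespace. *)
let value_of_args args = String.concat " " args

(* [read_all ic] reads a channel until end of file. *)
let read_all ic =
  let buf = Buffer.create 4096 in
  (try
     while true do
       Buffer.add_channel buf ic 1
     done
   with End_of_file -> ());
  Buffer.contents buf

(* Hypothetical dispatch: no value on the command line means
   that the value is read from [ic] (stdin by default). *)
let value_of_cli ?(ic = stdin) args =
  match args with
  | [] -> read_all ic
  | _ -> value_of_args args
```

Reading from stdin also avoids any quoting problem for values containing characters that the shell treats specially.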

`irmin tree` is broken

This seems to be a regression introduced in 0.6.0, probably related to the API change of `list`.

Which branch reference?

I am looking for the function:

S.t -> S.branch which tells the current branch

Does it exist somewhere? I am tracking this in a product with the S.t but surely the S.t knows the reference it tracks internally?

The example provided in the README file doesn't seem to work

I was trying to run the example of using the Irmin API in the README file.

After running module G = IrminGit.Make(IrminGit.Memory);;

I get:

Error: Signature mismatch: 

The field `disk' is required but not provided
The field `bare' is required but not provided 
The field `Sync' is required but not provided  
The field `Store' is required but not provided 
The field `root' is required but not provided 

Documentation on JSON CRUD interface

I'm interested in using Irmin from other languages (Clojure and ClojureScript, specifically) via the JSON CRUD interface.

The only documentation I can find on the JSON CRUD interface is a single GET statement - it'd be great to see the other API endpoints have a bit of documentation with (or just) an example payload/CURL line, ideally linking to the source where the endpoint is implemented. I looked through the source a bit, but I've never worked with OCaml, so couldn't quite pin down the shape of the data I should use.

Stack overflow in toplevel

─( 00:00:00 )─< command 0 >────────────────────────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # #require "irmin";;
        Camlp4 Parsing version 4.01.0                                                                                                                                                                                                                       
─( 10:00:34 )─< command 1 >────────────────────────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # #require "irmin.unix";;
─( 10:00:42 )─< command 2 >────────────────────────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # open Irmin_unix;;
─( 10:00:47 )─< command 3 >────────────────────────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # module Git = IrminGit.FS(struct
    let root = Some "/tmp/db"
    let bare = true
  end);;
module Git : Irmin.BACKEND
─( 10:00:51 )─< command 4 >────────────────────────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # module DB = Git.Make(IrminKey.SHA1)(IrminContents.String)(IrminTag.String);;
Fatal error: exception Stack overflow            

If I put the commands into a file and "#use" that, it does work.

Support the Git protocol

Currently, Irminsule speaks two languages:

  • JSON for all the remote (currently HTTP(s) only) CRUD backends
  • a custom binary protocol with decent read/write performance (needs to be benchmarked at some point)

Adding a different binary protocol is not totally trivial but is definitely doable. Tip: use cagit.
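As a small taste of what speaking Git's wire format involves, the sketch below frames a payload as a pkt-line, the length-prefixed unit used by Git's transfer protocols. The function name is an assumption, and this covers only framing, not negotiation or packfiles.

```ocaml
(* [pkt_line payload] frames [payload] as a Git pkt-line: four hex digits
   giving the total length (payload plus the 4-byte header), followed by
   the payload itself. *)
let pkt_line payload =
  let len = String.length payload + 4 in
  if len > 65520 then invalid_arg "pkt_line: payload too long";
  Printf.sprintf "%04x%s" len payload

(* The special flush packet, which marks the end of a section. *)
let flush_pkt = "0000"
```

For example, `pkt_line "a\n"` gives `"0006a\n"` and `pkt_line "foobar\n"` gives `"000bfoobar\n"`.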

Writing to [] makes a broken git repo

[vagrant@localhost examples]$ cat >irmin_test.ml <<EOT
open Lwt

let t =
 let open Irmin_unix in
  let module Git = IrminGit.FS(struct
    let root = Some "/tmp/foo"
    let bare = true
  end) in
  let module DB = Git.Make(IrminKey.SHA1)(IrminContents.String)(IrminTag.String) in
  DB.create () >>= fun db ->
  DB.View.create () >>= fun v ->
  DB.View.update v [  ] "root" >>= fun () ->
  DB.View.merge_path_exn db [] v

let () = Lwt_main.run t
EOT
[vagrant@localhost examples]$ ocamlbuild -use-ocamlfind -no-hygiene -tag "syntax(camlp4o)" -package irmin.unix,sexplib.syntax,comparelib.syntax,bin_prot.syntax irmin_test.native

[vagrant@localhost examples]$ ./irmin_test.native 

[vagrant@localhost examples]$ cd /tmp/foo

[vagrant@localhost foo]$ git show
fatal: unable to read root tree (d98125eebf0495e8a2b455578fb261f553a27db5)

[vagrant@localhost foo]$ git log
commit d98125eebf0495e8a2b455578fb261f553a27db5
Author: 448 <[email protected]>
Date:   Thu Jan 1 00:00:01 1970 +0000

    Merge view to

    Actions:
    - write

It works ok if I change the path from [] to [ "foo" ].
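One plausible reading of this report, not confirmed here, is that writing at the empty path replaces the root node with a plain value, whereas a Git commit must point to a tree object. The toy model below uses hypothetical types (not Irmin's own) to show how such a state can arise and how it could be detected before committing.

```ocaml
(* Toy model of a store: a node with named children, or a leaf value. *)
type tree = Contents of string | Node of (string * tree) list

(* [update t path v] writes [v] at [path], creating nodes as needed.
   With an empty path, the whole tree becomes a single leaf. *)
let rec update t path v =
  match path, t with
  | [], _ -> Contents v
  | step :: rest, Node children ->
      let child =
        (try List.assoc step children with Not_found -> Node [])
      in
      Node ((step, update child rest v) :: List.remove_assoc step children)
  | step :: rest, Contents _ -> Node [ (step, update (Node []) rest v) ]

(* A Git commit can only reference a tree, so a leaf is not a valid root. *)
let valid_git_root = function Node _ -> true | Contents _ -> false
```

In this model, `update (Node []) [] "root"` yields a root that fails `valid_git_root`, while `update (Node []) ["foo"] "root"` yields a valid one, which matches the observed behaviour.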

review tag store writes

It seems that the tls webserver is writing to .git/HEAD (which means that it is somehow switching branches), but this should not happen because there is only one branch. I suspect we update the head tag a bit too often -- need to review that.

More robust merges

Currently, there are two situations that Irmin does not handle very well:

  • trying to merge when there are no parents in common. This usually does not happen, but why not try to support it (even if it is not very efficient)? Also, it might be necessary to handle this if we deal with partial history (see #21). This means adding a merge2 function in user-defined contents.
  • trying to merge when multiple common ancestors exist. Currently, Irmin fails (see https://github.com/mirage/irmin/blob/master/lib/core/irminCommit.ml#L180) but we should be able to 3-way merge all the pairs of parents recursively to get a unique ancestor.
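The second situation can be sketched as follows. This is not Irmin's implementation: `merge3` uses structural equality on values, and `lca` is a hypothetical callback returning the value at the common ancestor of two candidates.

```ocaml
(* Three-way merge of a single value, where [old] is the common ancestor.
   Returns [Error] when both sides changed the value in different ways. *)
let merge3 ~old a b =
  if a = b then Ok a
  else if a = old then Ok b
  else if b = old then Ok a
  else Error (`Conflict (a, b))

(* Collapse several common ancestors into one "virtual" ancestor by
   three-way merging them pairwise, from left to right. *)
let rec virtual_ancestor ~lca = function
  | [] -> Error `No_common_ancestor
  | [ x ] -> Ok x
  | x :: y :: rest -> (
      match merge3 ~old:(lca x y) x y with
      | Ok m -> virtual_ancestor ~lca (m :: rest)
      | Error _ as e -> e)
```

Once a unique ancestor is available, the ordinary three-way merge can proceed as in the single-ancestor case. The empty list corresponds to the first situation, which would need a user-defined two-way merge instead.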
