sirixdb / sirix

SirixDB is an embeddable, bitemporal, append-only database system and event store that stores immutable, lightweight snapshots. It keeps the full history of each resource. Every commit stores a space-efficient snapshot through structural sharing. It is log-structured and never overwrites data. SirixDB uses a novel page-level versioning approach.

Home Page: https://sirix.io

License: BSD 3-Clause "New" or "Revised" License

Java 93.45% XSLT 0.10% XQuery 0.19% Kotlin 6.12% Dockerfile 0.03% Shell 0.06% HTML 0.06%
xquery java temporal-data storage snapshot comparison ssd json versioning hashing

sirix's Introduction

An Embeddable, Bitemporal, Append-Only Database System and Event Store

SirixDB stores small, immutable snapshots of your data in an append-only manner. It facilitates querying and reconstructing the entire history, as well as easy audits.


Download ZIP | Join us on Discord | Community Forum | Documentation | Architecture & Concepts

Working on your first Pull Request? You can learn how from this free series, How to Contribute to an Open Source Project on GitHub, and another tutorial: How YOU can contribute to OSS, a beginner's guide

"Remember that you're lucky, even if you don't think you are because there's always something that you can be thankful for." - Esther Grace Earl (http://tswgo.org)

We want to build the database system together with you. Help us and become a maintainer yourself. Why? You may like the software and want to help us improve it. Furthermore, you'll learn a lot. You may want to fix a bug or add a feature. Do you want to add an awesome project to your portfolio? Do you want to grow your network? All of these are valid reasons, besides probably many more: Collaborating on Open Source Software

SirixDB appends data to an indexed log file without the need for a WAL. It can be embedded and used as a library from your favorite language on the JVM to store and query data locally, or via a simple CLI. An asynchronous HTTP server, which adds the core and query modules as dependencies, can interact with SirixDB over the network, using Keycloak for authentication/authorization. One file stores the data with all revisions and, possibly, secondary indexes. A second file stores offsets into this file so that a revision can be quickly looked up by a given timestamp using an in-memory binary search. Furthermore, a few maintenance files exist, which store the configuration of a resource and the definitions of secondary indexes (if any are configured). Other JSON files keep track of changes in delta files, if enabled.
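A minimal sketch of embedded usage, based on the (older) low-level API snippets quoted in the issue reports further down; class names, packages, and builder methods differ between releases, and the paths and resource names here are made up, so treat this as an illustration of the embedding idea rather than up-to-date API documentation:

import java.io.File;

// Packages follow the old snippets and stack traces quoted below and may differ in current releases.
import org.sirix.access.conf.DatabaseConfiguration;
import org.sirix.access.conf.ResourceConfiguration;
import org.sirix.access.Databases;

final class EmbeddedUsageSketch {
  public static void main(final String[] args) throws Exception {
    final File location = new File("/tmp/sirix-data", "mydocs.col");   // made-up location

    // Bootstrap the database directory and create a resource inside it.
    final DatabaseConfiguration dbConf = new DatabaseConfiguration(location);
    Databases.createDatabase(dbConf);
    final var database = Databases.openDatabase(dbConf.getFile());
    database.createResource(ResourceConfiguration.newBuilder("resource", dbConf).build());

    // Open a session on "resource" and begin read-only/read-write transactions,
    // as shown in the sketch after the design goals and in the issue snippets below.
  }
}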

It currently supports the storage and (time travel) querying of XML and JSON data in a binary encoding tailored to support versioning. The index structures and the whole storage engine have been written from scratch to support versioning natively. We might also implement storing and querying other data formats, such as relational data.

SirixDB uses a huge persistent (in the functional sense) tree of tries, wherein the committed snapshots share unchanged pages and even common records in changed pages. To reduce write amplification, the system only stores page fragments during a copy-on-write out-of-place operation instead of full pages during a commit. During read operations, the system reads the page fragments in parallel to reconstruct an in-memory page. Thus, a fast, random-access storage device such as a PCIe SSD (or, soon, even byte-addressable storage such as Intel Optane DC memory) is best suited, as SirixDB stores fine-granular, cache-size-aligned (not page-aligned) modifications in a single file.

Please consider sponsoring our Open Source work if you like the project.

Note: Let us know if you'd like to build a brand-new frontend with, for instance, Svelte, D3.js, and TypeScript.

Discuss it in the Community Forum.


Keeping All Versions of Your Data By Sharing Structure

We could write a lot about why keeping all states of your data in a storage system is of great value. In a nutshell, it's all about looking at the evolution of your data, finding trends, doing audits, and implementing efficient undo-/redo-operations. The Wikipedia page has a bunch of examples. We recently also added use cases over here.

We firmly believe that a temporal storage system must address the issues that arise from keeping past states far better than traditional approaches do. Storing time-varying, temporal data in database systems that do not support it natively results in many unwanted hurdles: they waste storage space, query performance for retrieving past states of your data is far from ideal, and temporal operations are usually missing altogether.

The DBS must store data so that storage space is used as effectively as possible while supporting the reconstruction of each revision, as the database saw it during the commits. All this should be handled in linear time, whether it's the first revision or the most recent revision. Ideally, the query time of old/past revisions and the most recent revision should be in the same runtime complexity (logarithmic when querying for specific records).

SirixDB not only supports snapshot-based versioning on a record granular level through a novel versioning algorithm called sliding snapshot, but also time travel queries, efficient diffing between revisions, and storing semi-structured data.

Executing the following time-travel query on our binary JSON representation of Twitter sample data gives an initial impression of the possibilities:

let $statuses := jn:open('mycol.jn','mydoc.jn', xs:dateTime('2019-04-13T16:24:27Z')).statuses
let $foundStatus := for $status in $statuses
  let $dateTimeCreated := xs:dateTime($status.created_at)
  where $dateTimeCreated > xs:dateTime("2018-02-01T00:00:00") and not(exists(jn:previous($status)))
  order by $dateTimeCreated
  return $status
return {"revision": sdb:revision($foundStatus), $foundStatus{text}}

The query opens a database/resource in a specific revision based on a timestamp (2019-04-13T16:24:27Z) and searches for all statuses that have a created_at timestamp greater than the 1st of February 2018 and did not exist in the previous revision. The dot (.) is a dereferencing operator used to dereference keys in JSON objects. Array values can be accessed as shown, by looping over the values, or by specifying an index starting at zero: array[0], for instance, selects the first value of the array. Brackit, our query processor, also supports Python-like array slices to simplify tasks.

JSONiq examples

To verify changes in a node or its subtree, first select the node in the revision and then query for changes using our stored Merkle hash tree, which builds and updates a hash for each node and its subtree; the hash of an item can be checked with sdb:hash($item). The function jn:all-times delivers the node in all revisions in which it exists. jn:previous delivers the node in the previous revision or an empty sequence if there's none.

let $node := jn:doc('mycol.jn','mydoc.jn').fieldName[1]
let $result := for $node-in-rev in jn:all-times($node)
               let $nodeInPreviousRevision := jn:previous($node-in-rev)
               return
                 if ((not(exists($nodeInPreviousRevision)))
                      or (sdb:hash($node-in-rev) ne sdb:hash($nodeInPreviousRevision))) then
                   $node-in-rev
                 else
                   ()
return [
  for $jsonItem in $result
  return { "node": $jsonItem, "revision": sdb:revision($jsonItem) }
]

Emit all diffs between the revisions in a JSON format:

let $maxRevision := sdb:revision(jn:doc('mycol.jn','mydoc.jn'))
let $result := for $i in (1 to $maxRevision)
               return
                 if ($i > 1) then
                   jn:diff('mycol.jn','mydoc.jn',$i - 1, $i)
                 else
                   ()
return [
  for $diff at $pos in $result
  return {"diffRev" || $pos || "toRev" || $pos + 1: jn:parse($diff).diffs}
]

We support easy updates as in

let $array := jn:doc('mycol.jn','mydoc.jn')
return insert json {"bla":true} into $array at position 0

to insert a JSON object into a resource whose root node is an array, at the first position (0). The transaction is implicitly committed. Thus, a new revision is created, and a specific revision can be queried using a single third argument, either a simple integer ID or a timestamp. The following query reads the first revision (thus without the changes):

jn:doc('mycol.jn','mydoc.jn',1)

Omitting the third argument opens the resource in the most recent revision, but you could, in this case, also specify revision number 2. You can also use a timestamp as in:

jn:open('mycol.jn','mydoc.jn',xs:dateTime('2022-03-01T00:00:00Z'))

A simple join (joins are optimized in our query processor, Brackit):

(* first: store stores in a stores resource *)
sdb:store('mycol.jn','stores','
[
  { "store number" : 1, "state" : "MA" },
  { "store number" : 2, "state" : "MA" },
  { "store number" : 3, "state" : "CA" },
  { "store number" : 4, "state" : "CA" }
]')


(* second: store sales in a sales resource *)
sdb:store('mycol.jn','sales','
[
  { "product" : "broiler", "store number" : 1, "quantity" : 20  },
  { "product" : "toaster", "store number" : 2, "quantity" : 100 },
  { "product" : "toaster", "store number" : 2, "quantity" : 50 },
  { "product" : "toaster", "store number" : 3, "quantity" : 50 },
  { "product" : "blender", "store number" : 3, "quantity" : 100 },
  { "product" : "blender", "store number" : 3, "quantity" : 150 },
  { "product" : "socks", "store number" : 1, "quantity" : 500 },
  { "product" : "socks", "store number" : 2, "quantity" : 10 },
  { "product" : "shirt", "store number" : 3, "quantity" : 10 }
]')

let $stores := jn:doc('mycol.jn','stores')
let $sales := jn:doc('mycol.jn','sales')
let $join :=
  for $store in $stores, $sale in $sales
  where $store."store number" = $sale."store number"
  return {
    "nb" : $store."store number",
    "state" : $store.state,
    "sold" : $sale.product
  }
return [$join]

SirixDB, through Brackit, also supports array slices. The start index is 0, the end index is 1 (exclusive), and the step is 1 in the next query:

let $array := [{"foo": 0}, "bar", {"baz": true}]
return $array[0:1:1]

The query returns the first object {"foo":0}.

With the function sdb:nodekey you can find out the internal, unique node key of a node, which will never change. You might, for instance, be interested in the revision in which it was removed. The following query uses the function sdb:select-item, which takes a context node as the first argument and the key of the item or node to select as the second. jn:last-existing finds the most recent version, and sdb:revision retrieves the revision number.

sdb:revision(jn:last-existing(sdb:select-item(jn:doc('mycol.jn','mydoc.jn',1), 26)))

Index types

SirixDB has three types of indexes along with a path summary tree, which is basically a tree of all distinct paths:

  • name indexes, to index a set of object fields
  • path indexes, to index a set of paths (or all paths in a resource)
  • CAS indexes, so-called content-and-structure indexes, which index paths and typed values (for instance, all xs:integer values). In this case, only integer values on the specified paths are indexed, no other types

We base the indexes on the following serialization of three revisions of a very small SirixDB resource.

{
  "sirix": [
    {
      "revisionNumber": 1,
      "revision": {
        "foo": [
          "bar",
          null,
          2.33
        ],
        "bar": {
          "hello": "world",
          "helloo": true
        },
        "baz": "hello",
        "tada": [
          {
            "foo": "bar"
          },
          {
            "baz": false
          },
          "boo",
          {},
          []
        ]
      }
    },
    {
      "revisionNumber": 2,
      "revision": {
        "tadaaa": "todooo",
        "foo": [
          "bar",
          null,
          103
        ],
        "bar": {
          "hello": "world",
          "helloo": true
        },
        "baz": "hello",
        "tada": [
          {
            "foo": "bar"
          },
          {
            "baz": false
          },
          "boo",
          {},
          []
        ]
      }
    },
    {
      "revisionNumber": 3,
      "revision": {
        "tadaaa": "todooo",
        "foo": [
          "bar",
          null,
          23.76
        ],
        "bar": {
          "hello": "world",
          "helloo": true
        },
        "baz": "hello",
        "tada": [
          {
            "foo": "bar"
          },
          {
            "baz": false
          },
          "boo",
          {},
          [
            {
              "foo": "bar"
            }
          ]
        ]
      }
    }
  ]
}
The following query creates a name index on the object fields "foo" and "bar":

let $doc := jn:doc('mycol.jn','mydoc.jn')
let $stats := jn:create-name-index($doc, ('foo','bar'))
return {"revision": sdb:commit($doc)}

The index is created for the "foo" and "bar" object fields. You can then query for "foo" fields, for instance:

let $doc := jn:doc('mycol.jn','mydoc.jn')
let $nameIndexNumber := jn:find-name-index($doc, 'foo')
for $node in jn:scan-name-index($doc, $nameIndexNumber, 'foo')
order by sdb:revision($node), sdb:nodekey($node)
return {"nodeKey": sdb:nodekey($node), "path": sdb:path($node), "revision": sdb:revision($node)}

Second, whole paths are indexable.

The following path index is applicable to both of these query paths: .sirix[].revision.tada[].foo and .sirix[].revision.tada[][4].foo. Essentially, both foo nodes are indexed, and the first child has to be fetched afterwards. For the second query, the array index 4 also has to be checked to verify that the indexed node really is at index 4.

let $doc := jn:doc('mycol.jn','mydoc.jn')
let $stats := jn:create-path-index($doc, '/sirix/[]/revision/tada//[]/foo')
return {"revision": sdb:commit($doc)}

The index might be scanned as follows:

let $doc := jn:doc('mycol.jn','mydoc.jn')
let $pathIndexNumber := jn:find-path-index($doc, '/sirix/[]/revision/tada//[]/foo')
for $node in jn:scan-path-index($doc, $pathIndexNumber, '/sirix/[]/revision/tada//[]/foo')
order by sdb:revision($node), sdb:nodekey($node)
return {"nodeKey": sdb:nodekey($node), "path": sdb:path($node)}

CAS indexes index a path plus its values. The values are typed (in the following case, we index only decimal values on the given path).

let $doc := jn:doc('mycol.jn','mydoc.jn')
let $stats := jn:create-cas-index($doc, 'xs:decimal', '/sirix/[]/revision/foo/[]')
return {"revision": sdb:commit($doc)}

We can do an index range scan, for instance via the next query (2.33 and 100 are the min and max; the next two arguments are booleans that denote whether the min and max themselves should be included in the result or whether the bounds are exclusive, i.e. >min and <max). The last argument is usually a path if more than one path is indexed in the same index (in this case, we only index /sirix/[]/revision/foo/[]).

let $doc := jn:doc('mycol.jn','mydoc.jn')
let $casIndexNumber := jn:find-cas-index($doc, 'xs:decimal', '/sirix/[]/revision/foo/[]')
for $node in jn:scan-cas-index-range($doc, $casIndexNumber, 2.33, 100, false(), true(), ())
order by sdb:revision($node), sdb:nodekey($node)
return {"nodeKey": sdb:nodekey($node), "node": $node}

You can also create a CAS index on all string values on all paths (all object fields: //*; all arrays: //[]):

let $doc := jn:doc('mycol.jn','mydoc.jn')
let $stats := jn:create-cas-index($doc,'xs:string',('//*','//[]'))
return {"revision": sdb:commit($doc)}

To query for string values equal to "bar" on all paths (the empty sequence () means no path restriction):

let $doc := jn:doc('mycol.jn','mydoc.jn')
let $casIndexNumber := jn:find-cas-index($doc, 'xs:string', '//*')
for $node in jn:scan-cas-index($doc, $casIndexNumber, 'bar', '==', ())
order by sdb:revision($node), sdb:nodekey($node)
return {"nodeKey": sdb:nodekey($node), "node": $node, "path": sdb:path(sdb:select-parent($node))}

The argument == checks for string equality. Other comparison operators, which make more sense for integers and decimals, are <, <=, >=, and >.

SirixDB Features

SirixDB is a log-structured, temporal JSON and XML database system, which stores evolutionary data. It never overwrites any data on disk. Thus, we're able to restore and query the full revision history of a resource in the database.

Design Goals

Some of the most important core principles and design goals are:

Embeddable
Similar to SQLite and DuckDB, SirixDB is embeddable at its core. Other APIs, such as the non-blocking REST API, are built on top.
Minimize Storage Overhead
SirixDB shares unchanged data pages as well as records between revisions, depending on the versioning algorithm chosen during the initial bootstrapping of a resource. SirixDB aims to balance read and write performance in its default configuration.
Concurrent
SirixDB contains very few locks and aims to be as suitable for multithreaded systems as possible.
Asynchronous
Operations can happen independently; each transaction is bound to a specific revision, and only one read/write transaction on a resource is permitted concurrently with N read-only transactions (see the sketch after this list of design goals).
Versioning/Revision history
SirixDB stores a revision history of every resource in the database without imposing extra overhead. It uses a huge persistent, durable page tree for indexing revisions and data.
Data integrity
SirixDB, like ZFS, stores full checksums of the pages in their parent pages. That means that almost all data corruption can be detected upon reading; we aim to partition and replicate databases in the future.
Copy-on-write semantics
Similar to the file systems Btrfs and ZFS, SirixDB uses CoW semantics, meaning that SirixDB never overwrites data. Instead, database-page fragments are copied/written to a new location. SirixDB does not simply copy whole pages; it only copies changed records plus records that fall out of a sliding window.
Per revision and page versioning
SirixDB versions not only per revision but also per page. Thus, whenever we change a potentially small fraction of records in a data page, it does not have to copy the whole page and write it to a new location on a disk or flash drive. Instead, we can specify one of several versioning strategies known from backup systems, or a novel sliding snapshot algorithm, during the creation of a database resource. SirixDB uses the specified versioning type to version data pages.
Guaranteed atomicity and consistency (without a WAL)
The system will never enter an inconsistent state (unless there is a hardware failure), meaning that an unexpected power-off won't ever damage the system. This is accomplished without the overhead of a write-ahead log (WAL).
Log-structured and SSD friendly
SirixDB batches writes and syncs everything sequentially to a flash drive during commits. It never overwrites committed data.
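A minimal sketch of the transaction model mentioned under "Asynchronous", again using the (older) low-level API that appears in the issue snippets further down; the package and method names are assumptions based on those snippets and on the quoted stack traces, and they differ in current releases:

// Types as they appear in the stack traces quoted in the issues below.
import org.sirix.api.NodeReadTrx;
import org.sirix.api.NodeWriteTrx;
import org.sirix.api.Session;

final class TransactionModelSketch {
  static void illustrate(final Session session) throws Exception {
    // Any number of read-only transactions may be open, each bound to one revision.
    final NodeReadTrx readOnly = session.beginNodeReadTrx();   // most recent revision

    // ...but only a single read/write transaction per resource at a time.
    final NodeWriteTrx wtx = session.beginNodeWriteTrx();
    // ... apply changes through wtx ...
    wtx.commit();                                              // the commit creates a new revision

    wtx.close();
    readOnly.close();
  }
}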

Revision Histories

Keeping the revision history is one of the main features of SirixDB. You can revert to any earlier version or back up the system automatically without the overhead of copying. SirixDB only ever copies changed database pages and, depending on the versioning algorithm chosen during the creation of a database/resource, only page fragments and ancestor index pages, to create a new revision.

You can reconstruct every revision in O(n), where n denotes the number of nodes in the revision. Binary search is used on an in-memory (linked) map to load the revision, thus finding the revision root page has an asymptotic runtime complexity of O(log n), where n, in this case, is the number of stored revisions.
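For illustration (these are not SirixDB's actual classes), the timestamp lookup boils down to a binary search for the newest revision committed at or before the requested point in time:

import java.time.Instant;
import java.util.List;

final class RevisionLookupSketch {
  // One entry per revision, as it might be kept in the in-memory map built from the offsets file.
  record RevisionInfo(int revisionNumber, Instant commitTimestamp, long revisionRootOffset) {}

  // Returns the newest revision committed at or before pointInTime, or null if none exists.
  static RevisionInfo find(final List<RevisionInfo> revisionsSortedByTime, final Instant pointInTime) {
    int low = 0, high = revisionsSortedByTime.size() - 1, match = -1;
    while (low <= high) {
      final int mid = (low + high) >>> 1;
      if (!revisionsSortedByTime.get(mid).commitTimestamp().isAfter(pointInTime)) {
        match = mid;    // committed at or before the requested time
        low = mid + 1;  // a later revision might still qualify
      } else {
        high = mid - 1;
      }
    }
    return match >= 0 ? revisionsSortedByTime.get(match) : null;
  }
}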

Currently, SirixDB offers two built-in native data models, namely a binary XML store and a JSON store.


Articles published on Medium:

Status

SirixDB as of now has not been tested in production. It is recommended for experiments, testing, benchmarking, etc., but is not recommended for production usage. Let us know if you'd like to use SirixDB in production and get in touch. We'd like to test real-world datasets and fix issues we encounter along the way.

Please also get in touch if you like our vision and want to sponsor us, help with manpower, or use SirixDB as a research system. We'd be glad to get input from the database and scientific communities.

Getting started

Download ZIP or Git Clone

git clone https://github.com/sirixdb/sirix.git

or use the following dependencies in your Maven or Gradle project.

SirixDB uses Java 21; thus, you need an up-to-date Gradle (if you want to work on SirixDB) and an IDE (for instance, IntelliJ or Eclipse). Also, make sure to use the provided Gradle wrapper.

Maven artifacts

At this stage of development, you should use the latest SNAPSHOT artifacts from the OSS snapshot repository to get the most recent changes.

Just add the following repository section to your POM or build.gradle file:

<repository>
  <id>sonatype-nexus-snapshots</id>
  <name>Sonatype Nexus Snapshots</name>
  <url>https://oss.sonatype.org/content/repositories/snapshots</url>
  <releases>
    <enabled>false</enabled>
  </releases>
  <snapshots>
    <enabled>true</enabled>
  </snapshots>
</repository>
or, for Gradle:

repositories {
    maven {
        url "https://oss.sonatype.org/content/repositories/snapshots/"
        mavenContent {
            snapshotsOnly()
        }
    }
}

Note that we changed the groupId from com.github.sirixdb.sirix to io.sirix.

Maven artifacts are deployed to the central Maven repository (however, please use the SNAPSHOT variants as of now). Currently, the following artifacts are available:

Core project:

<dependency>
  <groupId>io.sirix</groupId>
  <artifactId>sirix-core</artifactId>
  <version>0.x.y-SNAPSHOT</version>
</dependency>
implementation group: 'io.sirix', name: 'sirix-core', version: '0.x.y-SNAPSHOT'

Brackit binding:

<dependency>
  <groupId>io.sirix</groupId>
  <artifactId>sirix-query</artifactId>
  <version>0.x.y-SNAPSHOT</version>
</dependency>
implementation group: 'io.sirix', name: 'sirix-query', version: '0.x.y-SNAPSHOT'

Asynchronous, RESTful API with Vert.x, Kotlin and Keycloak (the latter for authentication via OAuth2/OpenID-Connect):

<dependency>
  <groupId>io.sirix</groupId>
  <artifactId>sirix-rest-api</artifactId>
  <version>0.x.y-SNAPSHOT</version>
</dependency>
implementation group: 'io.sirix', name: 'sirix-rest-api', version: '0.x.y-SNAPSHOT'

Other modules are currently not available (namely the GUI, the distributed package as well as an outdated Saxon binding).

Currently, you have to add the following JVM parameters:

-ea
--enable-preview
--add-exports=java.base/jdk.internal.ref=ALL-UNNAMED
--add-exports=java.base/sun.nio.ch=ALL-UNNAMED
--add-exports=jdk.unsupported/sun.misc=ALL-UNNAMED
--add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED
--add-opens=jdk.compiler/com.sun.tools.javac=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED
--add-opens=java.base/java.io=ALL-UNNAMED
--add-opens=java.base/java.util=ALL-UNNAMED

We also recommend using the Shenandoah GC or ZGC (if possible, the generational versions in the future):

-XX:+UseZGC
-Xlog:gc
-XX:+AlwaysPreTouch
-XX:+UseLargePages
-XX:-UseBiasedLocking
-XX:+DisableExplicitGC

We've also had perfect results using GraalVM, possibly due to its JIT compiler and the improved escape analysis.

Setup of the SirixDB HTTP-Server and Keycloak to use the REST-API

The REST-API is asynchronous at its very core. We use Vert.x, a toolkit built on top of Netty. It is heavily inspired by Node.js but made for the JVM. As such, it uses event loops, that is, threads that should never be blocked by long-running CPU tasks or disk-bound I/O. We are using Kotlin with coroutines to keep the code simple. SirixDB uses OAuth2 (Password Credentials / Resource Owner Flow) with a Keycloak authorization server instance.
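A hedged sketch of a client interaction, using only the JDK's HTTP client: obtain an access token from Keycloak via the OAuth2 password (resource owner) flow and pass it as a Bearer token to the SirixDB HTTP server. The realm name, client id/secret, ports, and the database/resource path are assumptions based on the setup steps below; adapt them to your sirix-conf.json and your Keycloak version (older Keycloak versions prefix the token endpoint with /auth):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class RestApiClientSketch {
  public static void main(final String[] args) throws Exception {
    final HttpClient client = HttpClient.newHttpClient();

    // OAuth2 password grant against the (assumed) "sirixdb" realm.
    final HttpRequest tokenRequest = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8080/realms/sirixdb/protocol/openid-connect/token"))
        .header("Content-Type", "application/x-www-form-urlencoded")
        .POST(HttpRequest.BodyPublishers.ofString(
            "grant_type=password&client_id=sirix&client_secret=<client.secret>"
                + "&username=admin&password=admin"))
        .build();
    final String tokenJson = client.send(tokenRequest, HttpResponse.BodyHandlers.ofString()).body();
    final String accessToken = extractAccessToken(tokenJson);

    // Read a (hypothetical) database/resource from the SirixDB HTTP server (default port 9443;
    // the demo cert.pem is self-signed, so you may need to trust it first).
    final HttpRequest query = HttpRequest.newBuilder()
        .uri(URI.create("https://localhost:9443/mycol.jn/mydoc.jn"))
        .header("Authorization", "Bearer " + accessToken)
        .GET()
        .build();
    System.out.println(client.send(query, HttpResponse.BodyHandlers.ofString()).body());
  }

  private static String extractAccessToken(final String tokenJson) {
    // Use a real JSON library in practice; this naive extraction is only for the sketch.
    final int start = tokenJson.indexOf("\"access_token\":\"") + "\"access_token\":\"".length();
    return tokenJson.substring(start, tokenJson.indexOf('"', start));
  }
}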

Keycloak setup (Standalone Setup / Docker Hub Image)

You can set up Keycloak as described in this excellent tutorial. Our docker-compose file imports a sirix realm with a default admin user with all available roles assigned. You can skip steps 3 - 7 and 10, 11, and simply recreate a client secret and change oAuthFlowType to "PASSWORD". If you want to run or modify the integration tests, the client secret must not be changed. Make sure to delete the line "build: ." in the docker-compose.yml file for the server image if you want to use the Docker Hub image.

  1. Open your browser. URL: http://localhost:8080
  2. Login with username "admin" and password "admin"
  3. Create a new realm with the name "sirixdb"
  4. Go to Clients => account
  5. Change client-id to "sirix"
  6. Make sure access-type is set to confidential
  7. Go to the Credentials tab
  8. Put the client secret into the SirixDB HTTP-Server configuration file: change the value of "client.secret" to whatever Keycloak generated.
  9. If "oAuthFlowType" is specified in the same configuration file, change its value to "PASSWORD" (if not specified, the default is "PASSWORD").
  10. In Keycloak, the direct access grant on the settings tab must be enabled.
  11. Our (user-/group-)roles are "create" to allow creating databases/resources, "view" to allow querying database resources, "modify" to modify a database resource, and "delete" to allow deletion thereof. You can also assign ${databaseName}- prefixed roles.

Start Docker Keycloak-Container using docker-compose

For setting up the SirixDB HTTP-Server and a basic Keycloak-instance with a test realm:

  1. git clone https://github.com/sirixdb/sirix.git
  2. sudo docker-compose up keycloak

Start the SirixDB HTTP-Server and the Keycloak-Container using docker-compose

This section describes setting up Keycloak using Docker Compose. If you are looking to configure Keycloak from scratch, instructions are given in the previous section.

For setting up the SirixDB HTTP-Server and a basic Keycloak-instance with a test sirixdb realm:

  1. First, clone this repository with the following command (or download the ZIP):

    git clone https://github.com/sirixdb/sirix.git
    
  2. cd into the sirix folder that was just cloned.

    cd sirix/
    
  3. Run the Keycloak container using docker compose

    sudo docker compose up keycloak
    
  4. Visit http://localhost:8080 and login to the admin console using username: admin and password: admin

  5. From the navigation panel on the left, select Realm Settings and verify that the Name field is set to sirixdb

  6. Select the client with Client ID sirix

    1. Verify direct access grant is enabled.
    2. Verify that Access Type is set to confidential
    3. In the credentials tab
      1. Verify that the Client Authentication is set to Client Id and Secret
      2. Click on Regenerate Secret to generate a new secret. Set the value of the field named client.secret of the configuration file to this secret.
  7. Finally, run the SirixDB HTTP-Server and the Keycloak container with Docker Compose:

     docker compose up
    

SirixDB HTTP-Server Setup Without Docker/docker-compose

To create a fat-JAR, download our ZIP file (for instance), then:

  1. cd bundles/sirix-rest-api
  2. ./gradlew build -x test

A fat-JAR with all required dependencies should now have been created in your target folder.

Furthermore, a key.pem and a cert.pem file are needed. These two files have to be in your user home directory in a directory called "sirix-data", where Sirix stores the databases. For demo purposes, they can be copied from our resources directory.

Once Keycloak is also set up, we can start the server via:

java -jar -Duser.home=/opt/sirix sirix-rest-api-*-SNAPSHOT-fat.jar -conf sirix-conf.json -cp /opt/sirix/*

Use -Duser.home if you'd like to change the user home directory to /opt/sirix, for instance.

In the future, the fat-JAR will be downloadable from the Maven repository.

Run the Integration Tests

In order to run the integration tests under bundles/sirix-rest-api/src/test/kotlin, make sure that you assign your admin user all the user roles you have created in the Keycloak setup (last step). Make sure that Keycloak is running first, then execute the tests, for instance in your favorite IDE.

Note that the following VM parameters currently are needed: -ea --add-modules=jdk.incubator.foreign --enable-preview

Command-line tool

We ship a (very) simple command-line tool for the sirix-query bundle:

Get the latest sirix-xquery JAR with dependencies.

Documentation

We are currently working on the documentation. You may find first drafts and snippets in the documentation and in this README. Furthermore, you are kindly invited to ask any question you might have (and you likely have many) in the community forum (preferred) or in the Discord channel. Please also have a look at and play with our sirix-example bundle, which is available via Maven, or our new asynchronous RESTful API (shown above).

Getting Help

Community Forum

If you have any questions or are considering contributing or using Sirix, please use the Community Forum to ask them. Any kind of question is welcome, be it an API question, an enhancement proposal, or a question regarding use cases. Don't hesitate to ask questions or make suggestions for improvements. At the moment, API-related suggestions and critiques are especially important.

Join us on Discord

You may find us on Discord for quick questions.

Contributors ✨

SirixDB is maintained by

  • Johannes Lichtenberger

And the Open Source Community.

As the project was forked from a university project called Treetank, my deepest gratitude goes to Marc Kramis, who came up with the idea of building a versioned, secure, and energy-efficient data store that retains the history of resources during his Ph.D. Furthermore, Sebastian Graf came up with a lot of ideas and greatly improved the implementation during his Ph.D. Besides, a lot of students have worked on and improved the project considerably.

Thanks go to these wonderful people, who greatly improved SirixDB lately. SirixDB couldn't exist without the help of the Open Source community:


  • Ilias YAHIA 💻
  • BirokratskaZila 📖
  • Andrei Buiza 💻
  • Bondar Dmytro 💻
  • santoshkumarkannur 📖
  • Lars Eckart 💻
  • Jayadeep K M 📆
  • Keith Kim 🎨
  • Theofanis Despoudis 📖
  • Mario Iglesias Alarcón 🎨
  • Antonio Nuno Monteiro 📆
  • Fulton Browne 📖
  • Felix Rabe 📖
  • Ethan Willis 📖
  • Erik Axelsson 💻
  • Sérgio Batista 📖
  • chaensel 📖
  • Balaji Vijayakumar 💻
  • Fernanda Campos 💻
  • Joel Lau 💻
  • add09 💻
  • Emil Gedda 💻
  • Andreas Rohlén 💻
  • Marcin Bielecki 💻
  • Manfred Nentwig 💻
  • Raj 💻
  • Moshe Uminer 💻

Contributions of any kind are highly welcome!

License

This work is released under the BSD 3-clause license.


sirix's Issues

rtx.getType() returns the wrong thing.

Code:
rtx.moveTo(diffTuple.getNewNodeKey());
System.out.println("Is element: " +rtx.isElement());
System.out.println("Is attr: " +rtx.isAttribute());
System.out.println("Is text: " +rtx.isText());
System.out.println("type: "+rtx.getType());

Output:
Element in:
Is element: true
Is attr: false
Is text: false
type: xs:untyped

Expected:
type: xs:element

Attribute in:
Is element: false
Is attr: true
Is text: false
type: xs:untyped
type: xs:attribute

Text in:
Is element: false
Is attr: false
Is text: true
[error] play - Cannot invoke the action, eventually got an error: java.lang.IllegalStateException: No other node types supported!
at play.api.Application$class.handleError(Application.scala:293) ~[play_2.10.jar:2.2.2]
at play.api.DefaultApplication.handleError(Application.scala:399) [play_2.10.jar:2.2.2]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$2$$anonfun$applyOrElse$3.apply(PlayDefaultUpstreamHandler.scala:261) [play_2.10.jar:2.2.2]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$2$$anonfun$applyOrElse$3.apply(PlayDefaultUpstreamHandler.scala:261) [play_2.10.jar:2.2.2]
at scala.Option.map(Option.scala:145) [scala-library.jar:na]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$2.applyOrElse(PlayDefaultUpstreamHandler.scala:261) [play_2.10.jar:2.2.2]
Caused by: java.lang.IllegalStateException: No other node types supported!
at org.sirix.page.NamePage.getName(NamePage.java:159) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
at org.sirix.access.PageReadTrxImpl.getName(PageReadTrxImpl.java:484) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
at org.sirix.access.NodeReadTrxImpl.getType(NodeReadTrxImpl.java:375) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
at models.SirixHandler.getDiffs(SirixHandler.java:613) ~[na:na]
at models.SirixHandler.generateDiffs(SirixHandler.java:572) ~[na:na]
at models.SirixHandler.getDiff(SirixHandler.java:751) ~[na:na]

Expected: xs:text or xs:string

Static number of indirect pages should be dynamic

During insertion, we have to check whether a node exceeds the maximum number of nodes that can be stored in the record pages, and add an indirect page if the node to insert has the ID max-number-of-nodes-storable + 1. We then have to keep track of how many levels we have. This will improve performance considerably.
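For illustration only (this is not SirixDB code), with a fixed fan-out per indirect page, the number of indirect levels actually needed can be derived on demand from the highest record ID instead of being a static constant:

final class IndirectLevelsSketch {
  // How many indirect levels are needed so that maxRecordId is addressable, given the
  // number of records per record page and the fan-out of each indirect page.
  static int requiredLevels(final long maxRecordId, final int recordsPerPage, final int fanout) {
    long capacity = recordsPerPage;
    int levels = 0;
    while (maxRecordId >= capacity) {
      capacity *= fanout;  // each additional level multiplies the addressable records
      levels++;
    }
    return levels;
  }
}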

Features in a nutshell not clear.

Features in a nutshell

Import of differences between two XML-documents, that is after the first version of an XML-document is imported an algorithm tries to update the Sirix resource with a minimum of operations to change the first version into the new version.

By reading the first part, "Import of differences between two XML-documents", one assumes that one can import a 'patch file' and apply it to the existing version in the repository.

But from the second part, one gets the idea that when adding a second XML file, Sirix generates the 'patch' internally and applies the differences.

It would be quite nice to have the first one; is there any import of a 'patch' available or planned?

JSON node layer

Each node has a unique 64-bit identifier (long).

  • Object node, consisting of record nodes (simple name/value pairs). The value itself must be another node, referenced by a long.
  • Array node, consisting of array-item nodes, which can hold any type of node (long values). We can simply store pointers to the right sibling for each array-item node.
  • Null-node.
  • Boolean-node.
  • String-node.
  • Number-node, maybe the same as string but with a bit set, if it's a number or a string.

This simple design allows fine-granular copy-on-write operations.
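Purely as an illustration of the design sketched above (these are not the actual SirixDB node classes), the node layer could roughly look like this:

// Every node is identified by a 64-bit key and references other nodes only by key.
sealed interface JsonNode permits ObjectNode, ObjectRecordNode, ArrayNode, NullNode, BooleanNode, StringNode, NumberNode {
  long nodeKey();
}

record ObjectNode(long nodeKey, long firstChildKey) implements JsonNode {}
// A record is a (name, value) pair; the value is another node, referenced by its key.
record ObjectRecordNode(long nodeKey, String name, long valueNodeKey, long rightSiblingKey) implements JsonNode {}
// Array items are chained via right-sibling pointers.
record ArrayNode(long nodeKey, long firstChildKey) implements JsonNode {}
record NullNode(long nodeKey, long rightSiblingKey) implements JsonNode {}
record BooleanNode(long nodeKey, boolean value, long rightSiblingKey) implements JsonNode {}
record StringNode(long nodeKey, String value, long rightSiblingKey) implements JsonNode {}
record NumberNode(long nodeKey, double value, long rightSiblingKey) implements JsonNode {}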

How to reset the database in between XQuery queries

We have been resetting the database when using the low-level API with the following snippet:

//deletes the database (mydocs.col)
//uses org.apache.commons.io.FileUtils
File databaseLocation = new File(DATA_LOCATION, "mydocs.col");
FileUtils.deleteQuietly(databaseLocation);

//recreates the database
final DatabaseConfiguration dbConf = new DatabaseConfiguration(databaseLocation);
Databases.truncateDatabase(dbConf);
Databases.createDatabase(dbConf);
Database database;
try {
    database = Databases.openDatabase(dbConf.getFile());
    database.createResource(ResourceConfiguration.newBuilder(SirixHandler.RESOURCE, dbConf).useDeweyIDs(true).useTextCompression(true).buildPathSummary(false).build());
    final Session session = database.getSession(new SessionConfiguration.Builder(SirixHandler.RESOURCE).build());
} catch (SirixException e) { e.printStackTrace(); }

//DO OTHER THINGS...

We want to be able to do the same in between XQuery queries, but we seem to be failing miserably in that.
We can't use the previous code because, after removing and recreating the database, we get a
SirixIOException: org.sirix.exception.SirixIOException: java.io.EOFException
when running any query after the removal at run time.

I assume that we are not recreating the database properly for the query to be performed.

Any help would be, as always, highly appreciated 😃

Node.js GUI

Hi!

I want to build a Node.js application for selecting a JSON XPath via a drag-and-drop / element-selection GUI like the SunburstView. Is there a way to implement this in Node / JavaScript? It looks really great and would be purely intuitive for changing matching XPath elements.

Kind regards!

GUI errors

I have tried to build the GUI and got the project started, but as soon as I choose to open a resource and select a folder, it crashes. The same happens when opening an XML document to shredder.
I don't really know how to use the program, as some documentation is needed. I want to visualize the database in some way. Can you give me some pointers?

GUI

  • Introduce a presentation model, capturing all this ugly mutable state inside the GUIs.
  • Code review.
  • Fixing PApplet/PApplet issues (two or more processing views activated are not working as of now).
  • Unit tests for the models with very simple basic trees.

Transaction over many resource transactions

Finishing the work. For instance, fix truncate to the just-committed revision - 1. Add a resource lock manager to lock the just-created revision until all resource transactions have been committed, to prevent dirty reads. Add a method to explicitly retry failed transactions.

Refactoring of page references

Use the idea of a bitmap to mark which entries in an array are null/not null. See array-mapped tries in Clojure, for instance.
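For illustration (not SirixDB code), the bitmap idea works as in hash-array-mapped tries: a set bit marks an occupied logical slot, and counting the set bits below a slot yields its index into a compact array that stores only the non-null references:

final class BitmapReferencesSketch<T> {
  private long bitmap;                       // bit i set => logical slot i holds a reference
  private Object[] entries = new Object[0];  // compact array of the non-null references only

  boolean contains(final int slot) {
    return (bitmap & (1L << slot)) != 0;
  }

  @SuppressWarnings("unchecked")
  T get(final int slot) {
    if (!contains(slot)) {
      return null;
    }
    // The number of set bits below `slot` is the index into the compact array.
    final int index = Long.bitCount(bitmap & ((1L << slot) - 1));
    return (T) entries[index];
  }

  void put(final int slot, final T value) {
    final int index = Long.bitCount(bitmap & ((1L << slot) - 1));
    if (contains(slot)) {
      entries[index] = value;
    } else {
      final Object[] grown = new Object[entries.length + 1];
      System.arraycopy(entries, 0, grown, 0, index);
      grown[index] = value;
      System.arraycopy(entries, index, grown, index + 1, entries.length - index);
      entries = grown;
      bitmap |= 1L << slot;
    }
  }
}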

java/util/Optional

When I updated the library today I got this error, and I can't really find what you have updated for me to get it:
java.lang.NoSuchMethodError: org.sirix.node.interfaces.immutable.ImmutableNode.getDeweyID()Ljava/util/Optional;
at org.sirix.xquery.node.DBNode.(DBNode.java:105)
at org.sirix.xquery.node.SubtreeBuilder.startElement(SubtreeBuilder.java:195)
at org.brackit.xquery.node.parser.SAX2SubtreeHandlerAdapter.startElement(SAX2SubtreeHandlerAdapter.java:255)
at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
at org.apache.xerces.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.brackit.xquery.node.parser.DocumentParser.parse(DocumentParser.java:130)
at org.sirix.xquery.node.DBStore.create(DBStore.java:189)
at org.sirix.xquery.node.DBStore.create(DBStore.java:164)
at org.sirix.xquery.node.DBStore.create(DBStore.java:45)
at org.brackit.xquery.function.bit.Load.create(Load.java:139)
at org.brackit.xquery.function.bit.Load.execute(Load.java:98)
at org.brackit.xquery.function.FunctionExpr.evaluate(FunctionExpr.java:113)
at org.brackit.xquery.XQuery.run(XQuery.java:88)
at org.brackit.xquery.XQuery.evaluate(XQuery.java:79)
at models.XQueryUsage.init(XQueryUsage.java:218)
at models.XQueryUsage.loadDocumentAndQueryTemporal(XQueryUsage.java:241)
at controllers.Application.testSirix(Application.java:284)

Brackit(.org) binding.

Provide a brackit-binding for full XQuery/XQuery Update Facility support and integrate indexes into the query processor.

Eventually got an error: java.lang.IllegalStateException: Failed to move to nodeKey: X

XML in:
<log tstamp="Sun May 04 23:29:47 CEST 2014" severity="high" foo="bar"> <src>192.168.0.1</src> <content> <a> <a>init file</a> </a> </content> </log>

Query:
replace node doc('mydocs.col')/log/content with <a><div>hej</div></a>
correct result:
<log tstamp="Sun May 04 23:29:47 CEST 2014" severity="high" foo="bar"> <src>192.168.0.1</src> <a> <div>hej</div> </a> </log>

If I use the same input but change the query to:
replace node doc('mydocs.col')/log/content/a with <a><div>hej</div></a>

The next time I run the query doc('mydocs.col')/log/all-time::*
I get the error:
Eventually got an error: java.lang.IllegalStateException: Failed to move to nodeKey: X
I use the new SirixCompileChain, if that has something to do with it.

Provide indexes.

Providing revisioned indexes on the fly during update-operations.

Docker compose

As the RESTful API, and thus the Docker container, needs a running Keycloak instance, we could provide a Docker Compose file.

Correct Name + SVG Logo?

I've added your new system to my encyclopedia of databases:

https://dbdb.io/db/sirix

Two questions:

  1. Is the official name of the database Sirix or SirixDB? The Github repo uses both.

  2. Is there an SVG logo available?

-- Andy

Link NodePages to previous versions

Link NodePages to previous versions to skip the whole tree traversal (even if it is based on bit shifting), so that NodePage reconstruction should be faster.

Insert attribute in element

I am not sure I am using the correct syntax.
XML before in mydocs.col:

Running query:
insert node attribute { 'a' } { 5 } into doc('mydocs.col')/a
Is this wrong syntax or is this not implemented?

Running
insert node attribute into doc('mydocs.col')/a
Works.

With the first one I get the error:
[error] play - Cannot invoke the action, eventually got an error: java.lang.ClassCastException: org.sirix.access.NodeReadTrxImpl cannot be cast to org.sirix.api.NodeWriteTrx
[error] application -

! @6ie84fmpf - Internal server error, for (POST) [/commit] ->

play.api.Application$$anon$1: Execution exception[[ClassCastException: org.sirix.access.NodeReadTrxImpl cannot be cast to org.sirix.api.NodeWriteTrx]]
at play.api.Application$class.handleError(Application.scala:293) ~[play_2.10.jar:2.2.2]
at play.api.DefaultApplication.handleError(Application.scala:399) [play_2.10.jar:2.2.2]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$2$$anonfun$applyOrElse$3.apply(PlayDefaultUpstreamHandler.scala:261) [play_2.10.jar:2.2.2]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$2$$anonfun$applyOrElse$3.apply(PlayDefaultUpstreamHandler.scala:261) [play_2.10.jar:2.2.2]
at scala.Option.map(Option.scala:145) [scala-library.jar:na]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$2.applyOrElse(PlayDefaultUpstreamHandler.scala:261) [play_2.10.jar:2.2.2]
Caused by: java.lang.ClassCastException: org.sirix.access.NodeReadTrxImpl cannot be cast to org.sirix.api.NodeWriteTrx
at org.sirix.xquery.node.DBNode.setAttribute(DBNode.java:1371) ~[sirix-xquery-0.1.2-SNAPSHOT.jar:na]
at org.sirix.xquery.node.DBNode.setAttribute(DBNode.java:1) ~[sirix-xquery-0.1.2-SNAPSHOT.jar:na]
at org.brackit.xquery.update.op.InsertAttributesOp.doInsert(InsertAttributesOp.java:46) ~[brackit-0.1.3-SNAPSHOT.jar:na]
at org.brackit.xquery.update.op.AbstractInsertOp.apply(AbstractInsertOp.java:56) ~[brackit-0.1.3-SNAPSHOT.jar:na]
at org.brackit.xquery.update.UpdateList.apply(UpdateList.java:94) ~[brackit-0.1.3-SNAPSHOT.jar:na]
at org.brackit.xquery.QueryContext.applyUpdates(QueryContext.java:105) ~[brackit-0.1.3-SNAPSHOT.jar:na]

JSON transaction layer.

Either we use the XdmNodeRead / Write transactions or we provide a simple means to insert.

  • empty objects.
  • object-records.
  • empty arrays.
  • array-items (with an index).
  • null-nodes.
  • boolean-nodes.
  • string/number-nodes.

We also should be able to delete those in the same way.

Refactoring the XdmNodeReadTrx

Use the command pattern, basically a factory which creates several commands internally, such that more or less the content of each method is encapsulated in a separate class, so we can finally write simple unit tests (multiple small classes are way better than this monster class).
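A generic sketch of the proposed refactoring (not existing SirixDB code): each operation becomes a small command object produced by a factory, so it can be unit-tested in isolation:

// Generic command-pattern sketch; the state a command needs (cursor, page transaction)
// would be passed in via the constructor or the execute method.
interface TrxCommand<R> {
  R execute();
}

final class MoveToCommand implements TrxCommand<Boolean> {
  private final long nodeKey;

  MoveToCommand(final long nodeKey) {
    this.nodeKey = nodeKey;
  }

  @Override
  public Boolean execute() {
    // Resolve nodeKey against the page layer here; trivially testable in isolation.
    return nodeKey >= 0;
  }
}

final class TrxCommandFactory {
  TrxCommand<Boolean> moveTo(final long nodeKey) {
    return new MoveToCommand(nodeKey);
  }
}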

Cluster nodes.

Cluster nodes during full dumps. However, this is an open issue.

Revision the database itself, not only the resources

The idea is simply to create a metadata Sirix resource which stores the names of the resources, their current revision, and whether they are marked as deleted.

This depends on the database transaction issue, which needs to be finished first.

Fixing the docker image

Issue is:

docker run -t -i -p 9443:9443 sirixdb/sirix
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:632)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:844)
Caused by: java.net.ConnectException: Connection refused
... 11 more
Dec 27, 2018 3:09:07 PM io.vertx.core.impl.launcher.commands.VertxIsolatedDeployer
SEVERE: Failed in deploying verticle
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:327)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:632)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:844)
Caused by: java.net.ConnectException: Connection refused
... 11 more

Encryption of the page-fragments

I wanted to use Google Tink, but something is way off the rails, because it says that I'm using a different key for encrypting/decrypting.

Caused by: java.io.IOException: No matching key found for the ciphertext in the stream.
at com.google.crypto.tink.streamingaead.InputStreamDecrypter.read(InputStreamDecrypter.java:187)
at com.google.crypto.tink.streamingaead.InputStreamDecrypter.read(InputStreamDecrypter.java:130)
at com.google.crypto.tink.streamingaead.InputStreamDecrypter.read(InputStreamDecrypter.java:121)
at java.base/java.io.DataInputStream.readByte(DataInputStream.java:270)
at org.sirix.page.PagePersister.deserializePage(PagePersister.java:49)
at org.sirix.io.file.FileReader.read(FileReader.java:125)

TextValue/AttributeValue-indexes, Path-index

Keep in mind that once the text-index has been included as well as the binding to Brackit, that we have to make sure that Path-index-updates are executed before Text-index updates.

Docker setup / Fix README

  • pull the image from Docker Hub
  • create a docker container
  • update the container with an adapted sirix-conf.json file (the client.secret value for Keycloak auth must be set, and maybe the listening port for the HTTP(S) verticle)
  • start the docker container

Make it work (sadly, I'm by no means a Docker guru) and fix the README section accordingly.

JSON-resource manager

More or less the same as the XdmResourceManager, to start/end transactions on a JSON-resource.

XQuery replace followed by insert gives unexpected result

I want to do a replace followed by an insert query and am getting the wrong result.
Code:

try (DBStore store= DBStore.newBuilder().build();){
  CompileChain compileChain = new SirixCompileChain(store);
  QueryContext ctx1 = new SirixQueryContext(store);
  String query2="replace node doc('" + databaseName + "')/log/src with <node>aaa</node>";
  new XQuery(compileChain, query2).evaluate(ctx1);
  String query3="insert nodes <ab>abc</ab> into doc('" + databaseName+ "')/log/content";
  new XQuery(compileChain,query3 ).evaluate(ctx1);
}

Running the query doc('mydocs.col')/log/all-time::* yields:

<log tstamp="Tue May 06 12:56:10 CEST 2014" severity="high" foo="bar">
  <src>192.168.0.1</src>
  <content>
    <a.txt/>
  </content>
</log>

<log tstamp="Tue May 06 12:56:10 CEST 2014" severity="high" foo="bar">
  <node>aaa</node>
  <content>
    <a.txt/>
  </content>
</log>

<log tstamp="Tue May 06 12:56:10 CEST 2014" severity="high" foo="bar">
  <content>
    <a.txt/>
  </content>
</log>

The expectation is that the last one is:

<log tstamp="Tue May 06 12:56:10 CEST 2014" severity="high" foo="bar">
  <node>aaa</node>
  <content>
    <ab>abc</ab>
    <a.txt/>
  </content>
</log>

If we use two inserts, it works; or if I use a new store, it also works.

JSON

Implementing two node types, JSONArray and JSONObject. Implementing a JSONReadTrx and a JSONReadWriteTrx on top of the page reader/writer transactions just as in the XML case. We also need a second resource manager as well as methods in the database implementation to create a resource and open a resource manager.

ENTITY doesn't work

XML file:

]>

Create a volume with 8 GB of space, and specify the availability zone and image:

Giving error:
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,142]
Message: The entity "nbsp" was referenced, but not declared.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:601) ~[na:1.8.0_20]
at com.sun.xml.internal.stream.XMLEventReaderImpl.peek(XMLEventReaderImpl.java:276) ~[na:1.8.0_20]
at org.sirix.service.xml.shredder.XMLShredder.insertNewContent(XMLShredder.java:262) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
at org.sirix.service.xml.shredder.XMLShredder.call(XMLShredder.java:217) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
at org.sirix.diff.service.FMSEImport.shredder(FMSEImport.java:99) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
at org.sirix.diff.service.FMSEImport.dataImport(FMSEImport.java:122) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]

Moving a subtree inside its sibling fails with mention of DeletedNode.

I'm trying to move a subtree inside its sibling and am not succeeding.

my input xml:
<bar>
  <foz>
    <baz/>
  </foz>
  <foo>
    <bao/>
  </foo>
</bar>
what I intend as output xml:
<bar>
  <foz>
    <foo>
      <bao/>
    </foo>
    <baz/>
  </foz>
</bar>
function I am calling to achieve this:
NodeWriteTrx moveSubtreeToFirstChild(long fromKey) 
    throws SirixException
relevant calling code
// ...
NodeWriteTrx wtx = session.beginNodeWriteTrx();
// ... code that writes the input xml...
NodeReadTrx rtx = session.beginNodeReadTrx();
// .... code that moves rtx to 'foo' and wtx to 'foz'
long fromKey = rtx.getNodeKey();
wtx.moveSubtreeToFirstChild(fromKey).commit();
// ...
error output
java.lang.ClassCastException: org.sirix.node.DeletedNode cannot be cast to org.sirix.node.interfaces.StructNode

Caused by: java.lang.ClassCastException: org.sirix.node.DeletedNode cannot be cast to org.sirix.node.interfaces.StructNode
    at org.sirix.index.path.summary.PathSummaryWriter.removePathSummaryNode(PathSummaryWriter.java:660) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
    at org.sirix.index.path.summary.PathSummaryWriter.adaptPathForChangedNode(PathSummaryWriter.java:443) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
    at org.sirix.access.NodeWriteTrxImpl.moveSubtreeToFirstChild(NodeWriteTrxImpl.java:322) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]

Any help with this would be much appreciated 😃

PS: Some context on your users: I'm working on the same project with @bark; we're doing our master's thesis on XML versioning.

SirixUsageException when using FMSEImport

We're getting a SirixUsageException when trying to use the FMSEImport.

SirixUsageException: Resource could not be opened (since it was not created?) at location /.../sirix-data/mydocs.col/resources/shredded

We have an initial commit with:

<log tstamp='%s' severity='%s' foo='bar'>
<src></src>
<msg><bbb></bbb></msg>
<content></content>
</log>

And we're importing another file with:

<log><a></a></log>

Using the following code:

// XML document which should be imported as the new revision.
   File resNewRev = File.createTempFile("temp-file-name", ".tmp"); 
BufferedWriter bw = new BufferedWriter(new FileWriter(resNewRev));
   bw.write("<log><a></a></log>");
   bw.close();
// Determine and import differences between the sirix resource and the
// provided XML document.
final FMSEImport fmse = new FMSEImport();
fmse.main(new String[]{LOCATION.getAbsolutePath()+"/"+databaseName+"/",resNewRev.getAbsolutePath()}); 

The stack trace:

play.api.Application$$anon$1: Execution exception[[SirixUsageException: Resource could not be opened (since it was not created?) at location /home/bark/sirix-data/mydocs.col/resources/shredded ]]
at play.api.Application$class.handleError(Application.scala:293) ~[play_2.10.jar:2.2.2]
at play.api.DefaultApplication.handleError(Application.scala:399) [play_2.10.jar:2.2.2]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$2$$anonfun$applyOrElse$3.apply(PlayDefaultUpstreamHandler.scala:261) [play_2.10.jar:2.2.2]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$2$$anonfun$applyOrElse$3.apply(PlayDefaultUpstreamHandler.scala:261) [play_2.10.jar:2.2.2]
at scala.Option.map(Option.scala:145) [scala-library.jar:na]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$2.applyOrElse(PlayDefaultUpstreamHandler.scala:261) [play_2.10.jar:2.2.2]
Caused by: org.sirix.exception.SirixUsageException: Resource could not be opened (since it was not created?) at location /home/bark/sirix-data/mydocs.col/resources/shredded 
at org.sirix.access.DatabaseImpl.getSession(DatabaseImpl.java:249) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
at org.sirix.diff.service.FMSEImport.dataImport(FMSEImport.java:126) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
at org.sirix.diff.service.FMSEImport.main(FMSEImport.java:169) ~[sirix-core-0.1.2-SNAPSHOT.jar:na]
at models.SirixHandler.commit(SirixHandler.java:171) ~[na:na]
at controllers.Application.commit(Application.java:216) ~[na:na]
at Routes$$anonfun$routes$1$$anonfun$applyOrElse$14$$anonfun$apply$14.apply(routes_routing.scala:229) ~[na:na]

We are a bit out of ideas as to why it isn't able to shred the file; can you shed any light?
