
basex-lmdb's Introduction

BaseX over LMDB

A long-standing wish: make BaseX's on-disk storage more robust.

BaseX Core was stripped to its bare essentials (not really, more can be removed to make it even skinnier).

LmdbData, TableLmdbAccess, and the related builder and indexes were created to work on top of LMDB via lmdbjni.

The idea is to strengthen the BaseX store structure and later replicate it (with BookKeeper? jgroups-raft?) for high availability.

build

gradle clean install test

run

java -jar basex-lmdb.jar

from project basedir

simple usage

In a browser or with curl, issue an HTTP GET request to http://localhost:8080/doc('file://etc/books.xml')

Any XQuery expression placed after http://localhost:8080/ will be evaluated.

XQuery details

Query string parameters in the URL are passed to the XQuery context as external variables, except for the following two:

content-type: the content type of the response. Default is "text/xml".

indent-content: whether the resulting content should be indented. Default is "no". Accepts "yes"/"no" or "true"/"false".
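As an illustration of the two reserved parameters, here is a minimal sketch that builds the query string part of a request URL. `urlencode` and `query_string` are hypothetical helpers written for this example, not part of basex-lmdb. The sketch assumes ASCII input and assumes the server percent-decodes query parameters, which this README does not confirm.

```shell
# Percent-encode one string so it is safe inside a URL query string.
# Assumption: ASCII input only (multi-byte characters are not handled).
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;                 # unreserved: keep as is
      *) out=$out$(printf '%%%02X' "'$c") ;;         # anything else: %HH
    esac
  done
  printf '%s\n' "$out"
)

# Turn name=value arguments into "?name=value&name=value".
# Hypothetical helper; prints an empty line when called without arguments.
query_string() (
  qs=
  sep='?'
  for pair in "$@"; do
    name=${pair%%=*}     # text before the first "="
    value=${pair#*=}     # text after the first "="
    qs=$qs$sep$(urlencode "$name")=$(urlencode "$value")
    sep='&'
  done
  printf '%s\n' "$qs"
)
```

With the server running, a call such as `curl "http://localhost:8080/doc('file://etc/books.xml')$(query_string content-type=text/plain indent-content=yes)"` would request the same document as the simple usage example, asking for indented output served as text/plain. Any other name=value pair would be handed to the query as an external variable.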

 

All items below assume you are in the project basedir.

create a collection named etc:

curl -X PUT 'http://localhost:8080/etc'

 

create a document named factbook inside etc collection:

curl --upload-file ./db/xml/etc/factbook.xml 'http://localhost:8080/etc/factbook'

 

create a document named lakes inside etc collection as the result of an xquery:

curl -X PUT 'http://localhost:8080/etc/lakes/<lakes>\{doc("etc/factbook")//lake\}</lakes>'

 

remove the document named factbook from the collection etc:

curl -X DELETE 'http://localhost:8080/etc/factbook'

 

remove the etc collection:

curl -X DELETE 'http://localhost:8080/etc'

 

some updates to factbook document:

curl -d 'rename node doc("etc/factbook")//lake[1] as "LAKE"' -X POST http://localhost:8080
curl -d 'replace value of node doc("etc/factbook")//LAKE/@name with "Casper Sea"' -X POST http://localhost:8080
curl -d "insert node <lake name='Lago da Paz'/> into doc('etc/factbook')/mondial" -X POST http://localhost:8080

bigger things

If you want bigger examples, try db/xml/shakespeare.zip and db/xml/religion.zip from the base directory:

> cd db/xml
> unzip shakespeare.zip
> curl -X PUT 'http://localhost:8080/shakespeare'
> cd shakespeare
> ls | while read F; do N=`echo $F | cut -d '.' -f 1`; curl --upload-file $F "http://localhost:8080/shakespeare/$N" & done
> cd ..
> unzip religion.zip
> curl -X PUT 'http://localhost:8080/religion'
> cd religion
> ls | while read F; do N=`echo $F | cut -d '.' -f 1`; curl --upload-file $F "http://localhost:8080/religion/$N" & done
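The loops above start one background curl per file, which can flood the server on large collections and hides failed uploads. A minimal sequential sketch follows. `doc_name` and `upload_dir` are hypothetical helpers for this example, and the sketch assumes the server returns a non-2xx status when an upload fails, which this README does not state.

```shell
# Derive a document name from a file path: drop the directory, then drop
# everything from the first dot on (same result as `cut -d '.' -f 1`).
doc_name() (
  f=${1##*/}
  printf '%s\n' "${f%%.*}"
)

# Upload every regular file in a directory into a collection, one at a time.
# Arguments: directory, server base URL, collection name.
# Returns non-zero if any upload failed.
upload_dir() (
  dir=$1
  base_url=$2
  collection=$3
  failed=0
  for path in "$dir"/*; do
    [ -f "$path" ] || continue            # skip subdirectories and empty globs
    name=$(doc_name "$path")
    if ! curl --fail --silent --show-error \
         --upload-file "$path" "$base_url/$collection/$name" >/dev/null; then
      echo "upload failed: $path" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
)
```

After creating the collection with `curl -X PUT 'http://localhost:8080/shakespeare'`, running `upload_dir db/xml/shakespeare http://localhost:8080 shakespeare` from the project basedir would load the same files as the loop above, more slowly but with a usable exit status.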

even bigger things

This is probably not the hardest test for basex-lmdb yet, but it is a feasible real-world example at hand. Download the National Library of Medicine sample data (ftp://ftp.nlm.nih.gov/nlmdata/sample/medline/) and load it like the shakespeare example above. The biggest file there is over 150MB and has over 4.5 million XML nodes.

There's also XMark's Benchmark Data Generator if you want to get serious.

extra documentation

As the title says, this is nothing less than BaseX itself, so any BaseX documentation regarding XQuery and modules (with some exceptions yet to be listed) can be used as is.

todo

  • considering kafka for replication. can I embed it?
  • optimize XQuery updates by writing to an LSM-based store before writing to LMDB, thus freeing the sync client faster. Since the idea is to (maybe) replicate using jgroups-raft, and it uses LevelDB internally, would simply writing to the cluster do the trick?
  • create OS based maven profile for dealing with lmdbjni dependencies
  • need to port tests and improve LmdbDataManager tests
  • improve the return error codes in REST XQueryHandler
  • migrate XQueryHandler to a servlet and create a maven WAR packaged project
  • needs more documentation about configuration and running standalone or servlet
  • document the URI schemes used in fn:doc(): bxl://, file://, jdbc:// and related configurations where it fits
  • create new URI schemes accessed through fn:doc(): http:// with HtmlUnit and extras with commons VFS
  • replicate with jgroups-raft. Ideas?
  • assuming above replication is using raft and we have a good cluster, what about distributing XQuery queries amongst the cluster members for load balancing?
  • create a Camel component for basex-lmdb and use it as a solid integration database (in the end canonical messages passing by are all xml anyway... right?).

basex-lmdb's People

Contributors

christiangruen, leowoerteler, mauricioscastro, dimitarp, jenserat, micheee, holu, kissdanigh, dirkk, masoumeh, jb8748, charles-dyfis-net, adrianber, cfoster, siahr, amazingphil, hhv, andria009, kristiank, davidmathei, malamut2, mingarao, daveaglick, carlosmarcos, meersdavy, emchateau, godmar, lukasl, mscastro-jetstar, asura
