unixjunkie / bisec-tree
Bisector tree implementation in OCaml
License: BSD 3-Clause "New" or "Revised" License
make another implementation where only the index is kept in memory; the actual bucket
contents are stored in a persistent hash table on disk.
maybe just change the bucket type as a start, then add another config option (storage = Volatile | Persistent).
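A minimal sketch of what changing the bucket type could look like. Only `storage = Volatile | Persistent` comes from the note above; the `bucket` type, `create_bucket`, `bucket_points` and the use of an in-memory `Hashtbl` as a stand-in for the on-disk table are assumptions made for illustration, not the library's API.

```ocaml
(* Config option suggested in the note above. *)
type storage = Volatile | Persistent

(* Hypothetical bucket type: either the points themselves,
   or only a key into a persistent table. *)
type 'a bucket =
  | In_memory of 'a array
  | On_disk of int

(* [table] stands in for the persistent hash table on disk;
   a real implementation would use an on-disk store instead. *)
let create_bucket storage table key points =
  match storage with
  | Volatile -> In_memory points
  | Persistent ->
    Hashtbl.replace table key points;
    On_disk key

(* Retrieve the points of a bucket, wherever they are stored. *)
let bucket_points table = function
  | In_memory points -> points
  | On_disk key -> Hashtbl.find table key
```

With this layout the tree index only holds integer keys for persistent buckets, so the rest of the code could stay unchanged as long as it goes through `bucket_points`.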
this could create the cool plot of points also
if P.dist vp1 vp2 = 0.0
then all remaining points must be put into the same bucket; this bucket might have a size > k
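The degenerate case can be sketched as follows. The `split` type and function are hypothetical names, not the library's code; the assumption is that points are normally assigned to the closer of the two vantage points, which cannot separate anything when both vantage points are at distance zero from each other.

```ocaml
(* Hypothetical result of splitting points around two vantage points. *)
type 'a split =
  | Same_bucket of 'a list          (* degenerate: vp1 and vp2 coincide *)
  | Two_sides of 'a list * 'a list  (* points closer to vp1, points closer to vp2 *)

let split dist vp1 vp2 points =
  if dist vp1 vp2 = 0.0 then
    (* Every point is equidistant to both VPs: keep them together,
       even if the resulting bucket is larger than k. *)
    Same_bucket points
  else
    let left, right =
      List.partition (fun p -> dist p vp1 <= dist p vp2) points in
    Two_sides (left, right)
```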
use zmq for parallel/distributed computation
implement this instead of using heuristics
tag and push in opam once tests are OK
take a random partition (bootstrap) of all points instead of all points when doing two bands (when too many points to index)
to "grow" an existing tree with more points
because not Array.for_all
cf. code in vpt
since parallelization is poor
instead of marshalling sub trees.
Might scale better.
very easy if nprocs is a power of two
like check but print problems and their address
This is just too annoying for users of the library.
Pass explicitly the heuristic and the bucket size when needed instead.
To try finding a counter example to the triangular inequality when you have doubts about the distance function
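A brute-force sketch of such a check, assuming the points fit in an array and the distance returns a float. `find_counter_example` and the `eps` tolerance are names chosen for this example; they are not part of the library.

```ocaml
(* Look for a triple (a, b, c) such that
   dist a c > dist a b + dist b c (+ eps to absorb rounding errors).
   Cubic in the number of points: run it on a small sample only. *)
let find_counter_example ?(eps = 1e-9) dist points =
  let n = Array.length points in
  let exception Found of int * int * int in
  try
    for i = 0 to n - 1 do
      for j = 0 to n - 1 do
        for k = 0 to n - 1 do
          let direct = dist points.(i) points.(k) in
          let detour =
            dist points.(i) points.(j) +. dist points.(j) points.(k) in
          if direct > detour +. eps then raise (Found (i, j, k))
        done
      done
    done;
    None
  with Found (i, j, k) -> Some (points.(i), points.(j), points.(k))
```

For instance, the squared difference `fun a b -> (a -. b) ** 2.` is not a metric, and the search reports a violating triple on the points 0, 1 and 2.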
output in graphviz format
once I have a stable version released
for a 2D point set
algorithm:
boolean answer: yes/no your query point has neighbors in the tree at given distance;
i.e. the range query, but we don't care about the list of neighbors.
Such queries should terminate faster than the range query.
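A sketch of why the boolean query can stop early. The tree layout below (two vantage points per node, each with a covering radius and a subtree) is a simplified assumption for illustration and not the library's actual type; `has_neighbor` is a hypothetical name.

```ocaml
(* Simplified bisector tree: each node stores, for both vantage points,
   the VP itself, the radius covering its subtree, and the subtree. *)
type 'a tree =
  | Bucket of 'a list
  | Node of 'a * float * 'a tree * 'a * float * 'a tree

(* true iff some indexed point is within [tol] of [q].
   Stops at the first hit instead of collecting all neighbors. *)
let rec has_neighbor dist tol q = function
  | Bucket points -> List.exists (fun p -> dist q p <= tol) points
  | Node (l_vp, l_radius, left, r_vp, r_radius, right) ->
    let dl = dist q l_vp in
    dl <= tol ||
    (let dr = dist q r_vp in
     dr <= tol ||
     (* by the triangular inequality, a subtree can only hold a
        neighbor if the query ball intersects its covering ball *)
     (dl -. l_radius <= tol && has_neighbor dist tol q left) ||
     (dr -. r_radius <= tol && has_neighbor dist tol q right))
```

Because `||` and `List.exists` short-circuit, the search returns as soon as one point within `tol` is found, whereas a range query has to visit every intersecting subtree.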
now this makes a lot of sense with the add method having been added (to grow an existing tree)
might be useful
cut the number of distance computations when searching the bst
the code can be simpler:
without creating a potentially huge tmp list
upon construction of a large tree, users have a right to some feedback
With the regular functor, only exact distances are used.
If the user knows a cheap-to-compute upper bound, we could exploit it
as often as possible.
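One place where an upper bound could help, sketched under assumptions: in a range test, if the cheap upper bound is already within the tolerance, the exact distance does not need to be computed. The function `within` and its labels are hypothetical, not part of the library.

```ocaml
(* Is [p] within [tol] of [q]?
   [upper_bound q p] must always be >= [dist q p] and cheap to compute.
   The exact distance is only evaluated when the bound is inconclusive. *)
let within ~upper_bound ~dist tol q p =
  upper_bound q p <= tol || dist q p <= tol

(* Example pair: the Manhattan distance is an upper bound
   of the Euclidean distance in 2D. *)
let euclid (x1, y1) (x2, y2) =
  sqrt ((x1 -. x2) ** 2. +. (y1 -. y2) ** 2.)

let manhattan (x1, y1) (x2, y2) =
  abs_float (x1 -. x2) +. abs_float (y1 -. y2)
```

The upper bound can only confirm a match; rejecting a point still requires the exact distance (or a lower bound, which this sketch does not use).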
like neighbors, but will also keep the non matching points
in one_band and two_bands.
This is a performance bug since the two VPs are random in that case.
We should use a bucket
I'm a little puzzled by the wording in the README:
A bisector tree allows to do fast but exact nearest neighbor
isn't 'exact nearest neighbor' what we want, so it'd be better to say:
A bisector tree allows to do fast and exact nearest neighbor
or, and I've only read the abstract so it isn't clear, is it actually an inexact nearest neighbor?