Comments (28)
guys, is there any news on the multicore index building?
from annoy.
roughly, but it's a bit more complicated._make_tree
also modifies the datastructure. You also need a mutex around line 643-644 and line 702-703. Maybe that mutex could just be folded into _allocate_size
and we could rename the function to something like _add_item
... probably makes more sense
then you obviously need to create pthreads and collect them afterwards... honestly I haven't done that in like 10 years, but iirc it's not too hard
from annoy.
not afaik :(
from annoy.
I think what you are saying makes sense. Thanks for the tip on shared mutex - I hadn't seen that before. It was introduced in c++17 though which may be it's own problem.
I'll take another look and see what I can figure out.
from annoy.
It wouldn't be too hard – you would need a mutex in a couple of places and also watch out for reallocations, but other than that it should be easy
from annoy.
Cool thanks, I'll look into it (although my C++ is a little rusty :))
from annoy.
Has there been any further work on this?
from annoy.
No work on it from my end.
from annoy.
Bummer. I'd love to help out, but I don't think I'm familiar enough with annoy yet (or frankly competent enough with C++) to spearhead anything. If anyone starts working on this and wants a hand, I'm happy to help out.
from annoy.
Actually, I take it back. It looks like a pretty simple change. If I understand it properly, it's line 496 of annoylib.h
:
_roots.push_back(_make_tree(indices));
That can get parallelized, but with a mutex around _roots
to avoid simultaneous push_back
calls, right?
from annoy.
@erikbern Actually I was working on a branch to add this just this weekend. It doesn't quite work yet though (I can only get it to run the precision_test.cpp example, only a debug binary works and it crashes sometimes even then); should I submit a PR?
I didn't touch anything towards the bottom of annoylib.h (near those lines that you mentioned), that might be the problem...
from annoy.
sure, feel free to submit a PR, just make sure to put "WIP" in the subject or something
you definitely need a mutex around those lines, that's probably the issue :)
from annoy.
@tjrileywisc what a coincidence! I also started working on this over the weekend. Mine also doesn't quite work... but I've pushed a copy to my fork: https://github.com/thomas4g/annoy/tree/parallelize_build
I hope to keep working on it tonight, but if you're further along let me know if you'd like any help! Feel free to ping me here or shoot me an email [email protected]
from annoy.
@thomas4g
Just took a peak at your fork - I'm actually using std::thread instead of pthread . I think Windows doesn't have support for pthread built in. std::thread should be available for gcc and MSVC.
from annoy.
@tjrileywisc ahh, I wanted to use that but got thrown off by the build not supporting my #include <mutex>
by default. I didn't want to fiddle with build settings.
from annoy.
Just submitted a PR for my build_trees_threaded branch.
from annoy.
I thought I might have a go at this. The actual threading machinery is all straightforward but I'm having some trouble seeing exactly which bits of _make_tree()
modify common data and need to be guarded with a mutex. As per previous comments I wrapped some functionality into a thread safe _add_item()
function but I'm still getting segfaults indicating data is getting modified out from under my threads somewhere else.
S _add_item() {
std::lock_guard<std::mutex> guard(_mutex);
_allocate_size(_n_nodes + 1);
S item = _n_nodes++;
return item;
}
Any help with this is appreciated.
from annoy.
interesting – you're probably right that it should be fairly straightforward, but i can't think of other critical sections off the top of my head
from annoy.
Unless I'm misunderstanding the code this may be more complicated than I first thought. It appears that there are many places within _make_tree
that touch the underlying data. For example _get
would need a mutex since it could be attempting to access _nodes
while another thread is calling _add_item
. But worse, since it returns a pointer, anywhere that pointer is used would also need a mutex. I believe this is looking like you would wind up needing mutexes around quite a large portion of _make_tree
which would mostly undo the benefits of parallelizing it. It's possible I'm still not understanding the code fully though.
from annoy.
From what you say I think the main issue is that the underlying memory can be reallocated at any point in time, and that invalidates any pointers held by any other thread.
But those reallocations are actually pretty rare so there should be some way to fix.
I haven't dealt with concurrency code in C++ since maybe 2007 so my knowledge is a bit rusty but couldn't you use a shared lock for this? Almost all access will be nonexclusive (so near-zero overhead), but the few times when you need to reallocate the underlying storage, you would have to acquire an exclusive lock. Does that make sense?
from annoy.
I meant a shared mutex. This looks like the right concurrency primitive: https://en.cppreference.com/w/cpp/thread/shared_mutex
So basically acquire a shared lock when writing individual vectors, acquire an exclusive lock when you have to resize the underlying data storage.
But I'm mostly speculating, could be wrong :)
from annoy.
@os-gabe Hey, Is it solved? Any insights on how to parallelize build?
from annoy.
no, this would have to be implemented by someone
from annoy.
@erikbern That's sad. I am trying to build an index for over 1M vectors and it is crashing even with on_disk_build. The process took up more than 30 GB memory and crashed.
from annoy.
i'm not sure if parallelization would have helped, though. do you know what's causing it to crash?
from annoy.
Yeah agreed that parallelization probably would not help.
1M vectors isn't that much unless they are extremely high dimensional vectors. Did you try with a smaller number of vectors to see where it starts failing?
from annoy.
good point – annoy isn't meant for super high dimensionality, so if that's what you're facing then you should probably run dimensionality reduction outside of annoy first!
from annoy.
@os-gabe Hey, Is it solved? Any insights on how to parallelize build?
Unfortunately I had to move on to other things and did not get parallel build working
from annoy.
Related Issues (20)
- Build parallalization is not working. HOT 2
- Support annoy index loading from binary index data
- How many trees should I use? HOT 2
- Memory Leak in Annoy (get_nns_by_vector)? HOT 8
- Annoy Object Not Pickle'able HOT 1
- Add sample weights to distance metric? HOT 3
- Source distribution not availabe for 1.17.2 version HOT 2
- Unable to inherit the AnnoyIndex class HOT 2
- doesn't work correctly if torch tensor is input. But also doesn't throw error. Pls add an assertion that this only takes np arrays not torch tensors HOT 2
- _Vector should use position-only parameter for the index HOT 3
- How do you reduce a vector to 2 coordinates HOT 1
- [Distance] What did I do wrong?
- [MSVC] Annoy failed to run test on Windows HOT 1
- Some segment faults HOT 1
- Regarding updating an existing ANNOY model HOT 2
- Anyone tried storing trees and nodes in DynamoDB? HOT 1
- Is there any workaround to be able to use the Chebyshev distance with this library? HOT 1
- from annoy import AnnoyIndex
- Annoy build failed in MSVC x86 mode
- Using a built Annoy tree in a different device HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from annoy.