Write a Python wrapper about falconn HOT 9 CLOSED

falconn-lib commented on May 17, 2024

Write a Python wrapper

from falconn.

Comments (9)

ludwigschmidt commented on May 17, 2024

Merge 6f0ca80 introduces the first version of the Python wrapper. Currently it supports dense data with double and single precision.

The performance on the Python random data benchmark is within roughly 10% of the C++ random data benchmark. The main slow-down seems to be the candidate comparison, where the wrapped code takes roughly 10 - 20% more time, at least on my laptop. Not clear why this is the case, but it might have to do with the fact that we are not using "standard" Eigen vectors but Eigen's Map functionality.


ludwigschmidt commented on May 17, 2024

The performance issue was fixed in 5a69217 . The issue was related to passing Eigen vectors / memory maps correctly: http://eigen.tuxfamily.org/dox/TopicFunctionTakingEigenTypes.html


ilyaraz commented on May 17, 2024

Do we check dtype of the numpy array we pass? Would be nice to do it, if that's not the case yet.


ludwigschmidt commented on May 17, 2024

We are not doing it on the C++ side (the C++ side doesn't really see the numpy object). The swig-generated numpy wrapper might do it automatically (swig does it for "normal" objects at least). Definitely something we should test though.
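For reference, a dtype check could also live on the Python side, before the array ever reaches the generated bindings. The following is only a sketch: `validate_dataset` is a hypothetical helper, not part of the wrapper, and the requirement of a 2-D, C-contiguous array is an assumption about what the native side expects.

```python
import numpy as np

# Assumption: only single and double precision are supported, as stated above.
SUPPORTED_DTYPES = (np.float32, np.float64)


def validate_dataset(data):
    """Hypothetical helper: fail loudly before handing `data` to native code."""
    if not isinstance(data, np.ndarray):
        raise TypeError("dataset must be a numpy.ndarray, got %s"
                        % type(data).__name__)
    if data.ndim != 2:
        raise ValueError("dataset must be 2-D (points x dimensions), got %d-D"
                         % data.ndim)
    if data.dtype not in SUPPORTED_DTYPES:
        raise TypeError("dataset dtype must be float32 or float64, got %s"
                        % data.dtype)
    # Assumption: the native code reads the buffer as row-major memory.
    if not data.flags["C_CONTIGUOUS"]:
        raise ValueError("dataset must be C-contiguous; "
                         "consider np.ascontiguousarray(data)")
    return data
```

With a check like this, an integer array raises a `TypeError` with a readable message instead of being reinterpreted or rejected deep inside the bindings.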


ilyaraz commented on May 17, 2024

Another thing: let's not forget to write IN CAPS: DON'T LET YOUR NUMPY ARRAY GET GC'ED! Otherwise the C++ internals crash silently, which could be very confusing.

Would it be possible to enforce this or throw a more explicit exception if the memory is corrupt?


ludwigschmidt commented on May 17, 2024

Agreed! (it's a bold item in the to-do list above)

A check on the C++ side would be great, but I don't know how to check if a given float or double pointer is valid. I did a quick search and it's not clear if this is possible.

Another workaround would be to write a thin wrapper on the Python side that stores a reference to the numpy array with the LSH table. Then things won't get garbage collected accidentally.
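A minimal sketch of that workaround, assuming nothing about the real bindings: `TableHandle` and `make_handle` are hypothetical names, and a plain `object()` stands in for whatever swig would return. The only point is that the handle owns a strong reference to the array, so the buffer outlives the caller's own variable.

```python
import weakref

import numpy as np


class TableHandle:
    """Hypothetical thin wrapper: pairs a native table with its dataset."""

    def __init__(self, native_table, dataset):
        self._native_table = native_table
        # Strong reference: the array cannot be collected while the
        # handle (and therefore the native table) is still alive.
        self._dataset = dataset

    @property
    def dataset(self):
        return self._dataset

    def __getattr__(self, name):
        # Only called for attributes not found on the handle itself;
        # forwards them to the wrapped native object.
        return getattr(self._native_table, name)


def make_handle():
    """Build a handle from a local array that goes out of scope on return."""
    data = np.zeros((100, 8), dtype=np.float32)
    return TableHandle(object(), data), weakref.ref(data)
```

After `make_handle()` returns, the local name `data` is gone, but the weak reference still resolves because the handle keeps the array alive.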


ludwigschmidt commented on May 17, 2024

I just added the helper functions in 60cf7e4 .

So for the first version of the Python wrapper, only documentation (github wiki and the glove example) should be left now.


ilyaraz commented on May 17, 2024

Should we have a thin wrapper around whatever swig produces? It would serve two purposes:

- detect dtype automatically and call the appropriate version of the construction function
- hold a reference to the dataset so that it does not get garbage-collected
- (optionally) we can provide a flag for re-centering
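The dtype dispatch in the first item could look roughly like this. It is a sketch only: `_construct_single` and `_construct_double` are stand-ins for the two precision-specific construction functions, whose real names and signatures are not given here.

```python
import numpy as np


def _construct_single(data):
    # Stand-in for the single-precision construction function.
    return ("single", data)


def _construct_double(data):
    # Stand-in for the double-precision construction function.
    return ("double", data)


# Map each supported dtype to the matching constructor.
_CONSTRUCTORS = {
    np.dtype(np.float32): _construct_single,
    np.dtype(np.float64): _construct_double,
}


def construct_for(data):
    """Pick the constructor from the array's dtype, or fail with a clear error."""
    try:
        constructor = _CONSTRUCTORS[data.dtype]
    except KeyError:
        raise TypeError("unsupported dtype %s; expected float32 or float64"
                        % data.dtype) from None
    return constructor(data)
```

A lookup table keeps the supported precisions in one place, so adding another dtype later would be a one-line change.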

I think it would substantially improve the user experience.

I can draft the first version, which we can review together.


ludwigschmidt commented on May 17, 2024

Yes, I still think that a thin wrapper would be a good idea, especially for the dtype issue. It might be good to keep it as "thin" as possible (small additional functionality) so that the C++ and Python versions don't diverge too much.

Maybe we should only do re-centering with issue #7. That issue is more for "implicit" re-centering (no additional copy of the data), but it would be good to clearly separate the two (e.g., by saying that construct_table never copies the data set and offering a separate function for pre-processing that copies the data?)
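To make the separation concrete, a copying pre-processing function might look like the sketch below. `center_dataset` is a hypothetical name; it always returns a new array and leaves the input untouched, so table construction itself would never need to copy.

```python
import numpy as np


def center_dataset(data):
    """Return a re-centered copy of `data` together with the mean used.

    The input array is not modified. The mean is returned so that
    queries can be shifted by the same vector later.
    """
    mean = data.mean(axis=0)
    centered = data - mean  # allocates a new array
    return centered, mean
```

The caller would then shift each query by the returned `mean` before looking it up, since distances are only preserved when data and queries are shifted by the same vector.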
