demoriarty / torchpq
Approximate nearest neighbor search with product quantization on GPU in pytorch and cuda
License: MIT License
Hey there! I'm having perf issues because cuml.UMAP does not scale well with the size of my dataset, probably because it performs a greedy KNN search and thereby computes billions of distances.
Do you know if it's possible to use TorchPQ to pre-compute a sparse distance matrix I can pass to cuml.UMAP here?
Thanks a lot for the library. I never got cupy to work, so I had to use something else. It would be great to get rid of that dependency.
I see the following error when I try to import torchpq.clustering.
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
/tmp/ipykernel_7302/3376715144.py in <module>
----> 1 from torchpq import clustering
~/.local/lib/python3.8/site-packages/torchpq/__init__.py in <module>
18 from .CustomModule import CustomModule
19
---> 20 topk = fn.Topk()
~/.local/lib/python3.8/site-packages/torchpq/fn/Topk.py in __init__(self)
4 class Topk:
5 def __init__(self):
----> 6 self._top32_cuda = TopkSelectCuda(
7 tpb = 32,
8 queue_capacity = 4,
~/.local/lib/python3.8/site-packages/torchpq/kernels/TopkSelectCuda.py in __init__(self, tpb, queue_capacity, buffer_size)
23 self.buffer_size = buffer_size
24
---> 25 with open(get_absolute_path("kernels", "cuda", "topk_select.cu"),'r') as f: ###
26 self.kernel = f.read()
27
FileNotFoundError: [Errno 2] No such file or directory: '/home/XXXX/.local/lib/python3.8/site-packages/torchpq/kernels/cuda/topk_select.cu'
Installation details:
However, I am able to run from torchpq.index import IVFPQIndex without any issue.
Can you please help me fix this?
Thanks for the nice work!
But when I tried to import MultiKMeans using the command shown in README.md:
from torchpq.kmeans import MultiKMeans
it failed with:
ModuleNotFoundError: No module named 'torchpq.kmeans'
When I instead use:
from torchpq.clustering import MultiKMeans
the import works.
I wonder whether this is correct, since it differs from what README.md says.
topk is not a method of torchpq.index.IVFPQIndex. Either it should exist or the readme is wrong.
Hello,
Thank you very much for sharing the project. I am interested in using TorchPQ inside a deep net (implemented in PyTorch), where I would call TorchPQ in each forward pass. Is this possible?
Also, I saw https://ai.googleblog.com/2020/07/announcing-scann-efficient-vector.html. Have you tried any comparisons with other methods?
Thank you!
Hi, firstly thanks for your wonderful work.
I want to get the centroids of the clusters and visualize them. However, from your introduction, it seems I can only get the labels of all samples. Do you have any suggestions for how I can get the centroids?
Thanks again for helping me out.
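Independent of what the library exposes, centroids can be recovered from the labels by averaging the members of each cluster. The sketch below assumes Euclidean k-means and the (d_vector, n_data) column layout used in the examples on this page. centroids_from_labels is a hypothetical helper, not a TorchPQ function, and plain Python lists stand in for tensors.

```python
def centroids_from_labels(data, labels, n_clusters):
    """Mean of each cluster's members.

    data:   list of d rows, each with n values (column j is sample j)
    labels: list of n cluster ids in range(n_clusters)
    Returns a list of n_clusters centroids, each a list of d floats
    (None for clusters with no members).
    """
    d = len(data)
    sums = [[0.0] * d for _ in range(n_clusters)]
    counts = [0] * n_clusters
    for j, k in enumerate(labels):
        counts[k] += 1
        for i in range(d):
            sums[k][i] += data[i][j]
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else None
        for k in range(n_clusters)
    ]


# Four 2-D samples stored column-wise, assigned to two clusters.
data = [[0.0, 2.0, 10.0, 12.0],   # feature 0
        [1.0, 3.0, 5.0, 7.0]]     # feature 1
labels = [0, 0, 1, 1]
centroids = centroids_from_labels(data, labels, n_clusters=2)
```

For visualization, the resulting centroids can be projected together with the samples. For a non-Euclidean distance the plain mean may not be the right cluster representative.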
Hello,
I am trying to run your awesome CUDA-powered k-means. For testing purposes, I would like to make it runnable also on CPU, but I am getting errors during importing because of this:
which results in:
CUDARuntimeError: cudaErrorNoDevice: no CUDA-capable device is detected
Would you mind changing it to something like:
if torch.cuda.is_available():
    __device = cp.cuda.Device().id
else:
    __device = None
or hiding the imports of get_default_device and set_default_device (they seem to be imported after checking torch.cuda.is_available() anyway, so it should be possible)?
And also getting rid / hiding this:
Line 22 in b8bbadf
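One way to make the import CPU-safe is to defer the device lookup into a function and fall back to None on failure. This is a sketch, not the library's code: default_device_id is a hypothetical name, and catching a broad Exception is a simplification for illustration.

```python
def default_device_id():
    """Return the current CUDA device id, or None when CUDA is unusable."""
    try:
        # Deferred import: importing this module never requires a GPU.
        import cupy as cp
        return cp.cuda.Device().id
    except Exception:
        # Covers cupy not being installed as well as
        # cudaErrorNoDevice on machines without a GPU.
        return None


device_id = default_device_id()
```

Because the lookup runs only when the function is called, a module-level import no longer touches the CUDA runtime on CPU-only machines.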
Hello, I'm trying to run your README example and I get: __init__() got an unexpected keyword argument 'blocksize'. After removing blocksize, I then see: __init__() got an unexpected keyword argument 'init_size'.
Hi, thanks very much for sharing this project. I have been looking for a package supporting batch kmeans for a very long time. Very glad to find that TorchPQ supports that (MultiKMeans). Many thanks again.
But I have a question regarding the sm_size argument used when initializing MultiKMeans. I know it is the shared memory size of CUDA. I am not familiar with CUDA programming and cannot figure out what the default value 48 * 256 * 4 means (the comment in the code does not mention this argument), even after searching the internet. Could you briefly explain it here? Also, I guess increasing this value can speed up the computation. Am I right? Thanks for your time.
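The arithmetic behind the default can at least be checked: 48 * 256 * 4 is 49152 bytes, i.e. 48 KiB, which matches the default per-block shared memory limit on most NVIDIA GPUs. Reading the factors as 48 * 256 float32 values of 4 bytes each is an assumption, not something the TorchPQ source on this page confirms.

```python
BYTES_PER_FLOAT32 = 4

sm_size = 48 * 256 * 4                     # default quoted in the question, in bytes
sm_size_kib = sm_size // 1024              # the same amount in KiB
n_float32 = sm_size // BYTES_PER_FLOAT32   # float32 values that fit in it

print(sm_size, sm_size_kib, n_float32)
```

Whether a larger value helps depends on how the kernels use it, which this page does not show. A kernel launch that asks for more shared memory than the device provides per block fails rather than running faster.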
just tried this today
Traceback (most recent call last):
File "/datadrive/phd-projects/PiCIE/eval_minimal.py", line 18, in <module>
from torchpq.clustering import MinibatchKMeans
File "/anaconda/envs/py38_pytorch/lib/python3.8/site-packages/torchpq/__init__.py", line 11, in <module>
from . import experimental
ImportError: cannot import name 'experimental' from partially initialized module 'torchpq' (most likely due to a circular import) (/anaconda/envs/py38_pytorch/lib/python3.8/site-packages/torchpq/__init__.py)
I'm a beginner. How can I use multiple GPUs with MinibatchKMeans?
from torchpq.clustering import MinibatchKMeans
import torch
n_data = 10000 # number of data points
d_vector = 128 # dimensionality / number of features
x = torch.randn(d_vector, n_data, device="cuda")
minibatch_kmeans = MinibatchKMeans(n_clusters = 128)
minibatch_kmeans = torch.nn.DataParallel(minibatch_kmeans, device_ids=[0,1,2])
n_iter = 10
tol = 0.001
for i in range(n_iter):
    x = torch.randn(d_vector, n_data, device="cuda")
    minibatch_kmeans.fit_minibatch(x)
    if minibatch_kmeans.error < tol:
        break
And I get the following output:
Traceback (most recent call last):
File "kmean_torch.py", line 14, in <module>
minibatch_kmeans.fit_minibatch(x)
File "/data/home/dl/anaconda3/envs/clip/lib/python3.7/site-packages/torch/nn/modules/module.py", line 779, in __getattr__
type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'DataParallel' object has no attribute 'fit_minibatch'
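The error follows from how wrappers like torch.nn.DataParallel behave: the wrapper is a new object that keeps the original model under .module and does not expose the wrapped model's custom methods. The toy classes below are pure-Python stand-ins (no torch, no TorchPQ) that only mimic this attribute behaviour.

```python
class ToyKMeans:
    """Stand-in for a model with a custom training method."""

    def __init__(self):
        self.n_batches_seen = 0

    def fit_minibatch(self, batch):
        self.n_batches_seen += 1


class ToyWrapper:
    """Stand-in for a wrapper that stores the model under `.module`."""

    def __init__(self, module):
        self.module = module


wrapped = ToyWrapper(ToyKMeans())

# Custom methods are not forwarded by the wrapper ...
has_direct_access = hasattr(wrapped, "fit_minibatch")

# ... but they remain reachable on the wrapped object.
wrapped.module.fit_minibatch([1.0, 2.0, 3.0])
```

With the real DataParallel, calling minibatch_kmeans.module.fit_minibatch(x) would avoid the attribute error, but DataParallel only splits work for forward(), so that call would still run on a single GPU.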
This issue seems to come up when the tensor length (n_data) is greater than 8388480.
import torch
from torchpq.clustering import MultiKMeans

n_data = 8388481 # fails; works when n_data = 8388480
n_kmeans = 5
d_vector = 3
x = torch.randn(n_kmeans, d_vector, n_data, device="cuda")
kmeans = MultiKMeans(n_clusters=10, distance="euclidean")
labels = kmeans.fit(x)
Error message:
---------------------------------------------------------------------------
CUDADriverError Traceback (most recent call last)
<ipython-input-27-75b27aaadf4d> in <module>
6 #x = x.float()
7 kmeans = MultiKMeans(n_clusters=10, distance="euclidean")
----> 8 labels = kmeans.fit(x)
~/.local/lib/python3.8/site-packages/torchpq/clustering/MultiKMeans.py in fit(self, data, centroids)
432 for j in range(self.max_iter):
433 # 1 iteration of clustering
--> 434 maxsims, labels = self.get_labels(data, centroids) #top1 search
435 new_centroids = self.compute_centroids(data, labels)
436 error = self.calculate_error(centroids, new_centroids)
~/.local/lib/python3.8/site-packages/torchpq/clustering/MultiKMeans.py in get_labels(self, data, centroids)
323 # dim=2
324 # )
--> 325 maxsims, labels = self.max_sim_cuda(
326 data,
327 centroids,
~/.local/lib/python3.8/site-packages/torchpq/kernels/MaxSimCuda.py in __call__(self, A, B, dim, mode)
317 vals, inds = self._call_tt(A2, B2, dim)
318 elif mode == "tn":
--> 319 vals, inds = self._call_tn(A2, B2, dim)
320 elif mode == "nt":
321 vals, inds = self._call_nt(A2, B2, dim)
~/.local/lib/python3.8/site-packages/torchpq/kernels/MaxSimCuda.py in _call_tn(self, A, B, dim)
213 blocks_per_grid = (l, math.ceil(n/128), math.ceil(m/128))
214
--> 215 self._fn_tn(
216 grid=blocks_per_grid,
217 block=threads_per_block,
cupy/_core/raw.pyx in cupy._core.raw.RawKernel.__call__()
cupy/cuda/function.pyx in cupy.cuda.function.Function.__call__()
cupy/cuda/function.pyx in cupy.cuda.function._launch()
cupy_backends/cuda/api/driver.pyx in cupy_backends.cuda.api.driver.launchKernel()
cupy_backends/cuda/api/driver.pyx in cupy_backends.cuda.api.driver.check_status()
CUDADriverError: CUDA_ERROR_INVALID_VALUE: invalid argument
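The threshold is suggestive: 8388480 equals 65535 * 128, and the traceback shows the launch grid computed as (l, ceil(n/128), ceil(m/128)). CUDA limits the y and z grid dimensions to 65535 blocks, so one more sample pushes a grid dimension to 65536. The sketch below checks that arithmetic and shows one possible workaround, splitting the samples into chunks. That n_data maps to one of those two grid dimensions is an assumption, and chunk_ranges is a hypothetical helper, not part of TorchPQ.

```python
import math

MAX_GRID_DIM_YZ = 65535   # CUDA limit for gridDim.y and gridDim.z
TILE = 128                # tile width seen in the traceback (n/128, m/128)


def blocks_needed(n_items, tile=TILE):
    """Number of blocks along one grid dimension for n_items."""
    return math.ceil(n_items / tile)


def chunk_ranges(n_items, max_chunk=MAX_GRID_DIM_YZ * TILE):
    """Yield (start, stop) index ranges no longer than max_chunk."""
    for start in range(0, n_items, max_chunk):
        yield start, min(start + max_chunk, n_items)


ok = blocks_needed(8388480)        # largest size that works in the report
too_big = blocks_needed(8388481)   # first size that fails
ranges = list(chunk_ranges(8388481))
```

Chunking applies naturally to the label-assignment step, since each sample is assigned independently. Applying it inside fit() would require a change in the library itself.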
Hi,
TorchPQ runs well on a single GPU, but it fails when I switch to multiple GPUs. The error occurs in the synchronize step. Do you have any suggestions for multi-GPU usage?
Thanks!