
gpu-topk's People

Contributors

anilshanbhag


gpu-topk's Issues

cub question

I am not familiar with CUB's memory allocator. Could you please explain how performance on the GPU changes when CUB is not used?

Thanks

Can you provide a more general version?

Nice job, Anil.
I see that the repository provides the fully optimized version of the code from the paper, but it is less general. For example, if I change the shared memory size, or have each thread work on 2 elements instead of the current 8, the code no longer works. Would it be possible to provide a less optimized but more general version, like the one mentioned in the paper (e.g. before the many shared memory optimization techniques are applied)? Many people want generality and can accept some loss of performance. Thanks

VS faiss

Have you ever compared your GPU top-K solution with faiss? If I recall correctly, faiss has a very fast K-selection implementation built on CUDA.

Fails on Ubuntu 18: 1080Ti

I compiled the program with the CUDA 10.0 toolkit.
It doesn't seem to work:

Please enter the type of value you want to test:
1-float
2-double
3-uint
1
Please enter distribution type: 0
Please enter K: 32
Please enter number of tests to run per K: 3
Please enter start power (dataset size starts at 2^start)(max val: 29): 29
Please enter stop power (dataset size stops at 2^stop)(max val: 29): 29
NOW STARTING A NEW K

The distribution is: UNIFORM FLOATS
Running test 1 of 3 for size: 536870912 and k: 32
TESTING: 2 Bitonic TopK
TESTING: 1 Radix Select

At random moments it just seems to quit.
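
For scale, the run above uses 2^29 = 536870912 floats. A minimal sketch of the arithmetic follows; the helper name input_bytes is hypothetical and not part of gpu-topk. At 4 bytes per float the input buffer alone is 2 GiB, before any temporary buffers the algorithms allocate. The log does not establish whether memory is related to the exit.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from gpu-topk): bytes needed to hold
// 2^power elements of elem_size bytes each.
constexpr std::uint64_t input_bytes(unsigned power, std::size_t elem_size) {
    return (std::uint64_t{1} << power) * elem_size;
}
```

Calling input_bytes(29, sizeof(float)) gives the size of the input for the run shown, assuming a 4-byte float.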

Why does bitonicTopK use std::sort?

template<typename KeyT>
cudaError_t bitonicTopK(KeyT *d_keys_in, unsigned int num_items, unsigned int k, KeyT *d_keys_out,
    CachingDeviceAllocator&  g_allocator) {
...
  // Copy the remaining candidates (2 * numThreads * NUM_GROUPS keys) back to the host.
  KeyT* res_vec = (KeyT*) malloc(sizeof(KeyT) * 2 * numThreads * NUM_GROUPS);
  cudaMemcpy(res_vec, d_keys.Current(), 2 * numThreads * NUM_GROUPS * sizeof(KeyT), cudaMemcpyDeviceToHost);
  // Sort the candidates on the CPU in descending order, then copy the first k back to the device.
  std::sort(res_vec, res_vec + 2*numThreads*NUM_GROUPS, std::greater<KeyT>());
  cudaMemcpy(d_keys_out, res_vec, k * sizeof(KeyT), cudaMemcpyHostToDevice);

  // Release the second buffer if one was allocated.
  if (d_keys.d_buffers[1])
    CubDebugExit(g_allocator.DeviceFree(d_keys.d_buffers[1]));

  return cudaSuccess;
}
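
To make the question concrete, here is a host-only sketch of what the quoted final step does: it sorts a small candidate buffer in descending order and keeps the first k keys. Both function names are hypothetical and not part of gpu-topk. The second function uses std::partial_sort, a standard alternative that orders only the first k positions; it is shown for comparison, not as the author's method.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: k largest keys in descending order via a full sort,
// mirroring the std::sort + "take first k" step in the quoted snippet.
template <typename KeyT>
std::vector<KeyT> top_k_full_sort(std::vector<KeyT> candidates, std::size_t k) {
    k = std::min(k, candidates.size());
    std::sort(candidates.begin(), candidates.end(), std::greater<KeyT>());
    candidates.resize(k);
    return candidates;
}

// Hypothetical alternative: same result, but only the first k positions
// are put in order by std::partial_sort.
template <typename KeyT>
std::vector<KeyT> top_k_partial_sort(std::vector<KeyT> candidates, std::size_t k) {
    k = std::min(k, candidates.size());
    std::partial_sort(candidates.begin(),
                      candidates.begin() + static_cast<std::ptrdiff_t>(k),
                      candidates.end(), std::greater<KeyT>());
    candidates.resize(k);
    return candidates;
}
```

Both functions take the candidates by value, so the caller's buffer is left unchanged. Whether the full sort matters for performance depends on how large the candidate buffer is relative to the input, which the snippet alone does not show.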
