Comments (6)

levitation commented on August 15, 2024

For me, the error "what(): Failed to query occupancy." occurred when I had run cmake with a CUTLASS_NVCC_ARCHS value that was too high for my GPU before compiling and running the performance test.

from cutlass.

kerrmudgeon commented on August 15, 2024

Thank you for reporting this.

Could you verify that the CUDA build and execution environment is properly configured? You might be able to do this by simply executing cutlass_profiler --help to see if it prints any information about available GPU devices.

If CUDA isn't installed correctly, perhaps you can resolve this by installing the latest CUDA driver.

Beyond that, could you provide the steps you used to build the CUTLASS profiler?

d-k-b commented on August 15, 2024

Closing until further information is provided. Feel free to reopen if the issue still exists.

tomasohara commented on August 15, 2024

I also ran into this problem on a GTX 1080 Ti. My build log is attached; the build is based on https://github.com/NVIDIA/cutlass.

My two GPUs are detected OK, as shown in the --help output below. Plus, I believe the execution environment is installed correctly, because I am able to run TensorFlow jobs on the GPU (e.g., BERT and ALBERT).

_build-20Apr21.log

Best,
Tom

$ build/tools/profiler/cutlass_profiler --help
...
--device= CUDA Device ID

[0] - GeForce GTX 1080 Ti - SM 6.1, 28 SMs @ 1645 MHz, L2 cache: 2 MB, Global Memory: 10 GB
[1] - Quadro P400 - SM 6.1, 2 SMs @ 1252.5 MHz, L2 cache: 0 MB, Global Memory: 1 GB

hwu36 commented on August 15, 2024

This error is reported in these four places when calling cudaOccupancyMaxPotentialBlockSize:

./util/include/cutlass/util/reference/device/tensor_compare.h:133:      throw std::runtime_error("Failed to query occupancy.");
./util/include/cutlass/util/reference/device/tensor_compare.h:200:      throw std::runtime_error("Failed to query occupancy.");
./util/include/cutlass/util/reference/device/tensor_foreach.h:53:        throw std::runtime_error("Failed to query occupancy.");
./util/include/cutlass/util/reference/device/tensor_foreach.h:111:        throw std::runtime_error("Failed to query occupancy.");

Would you please 1) identify which one reports the error, 2) use cudaGetErrorString (https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__ERROR.html#group__CUDART__ERROR_1g4bc9e35a618dfd0877c29c8ee45148f1) to see what the cause is, and 3) write a tiny standalone CUDA file that calls cudaOccupancyMaxPotentialBlockSize?
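A standalone check along the lines of steps 2 and 3 might look like the sketch below. It is not taken from CUTLASS: the kernel dummy_kernel and the file layout are hypothetical placeholders, and the only runtime calls used are cudaOccupancyMaxPotentialBlockSize, cudaGetErrorString and cudaGetErrorName.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical placeholder kernel (not part of CUTLASS). Any __global__
// function can serve as the target of the occupancy query.
__global__ void dummy_kernel(float *data, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    data[i] = 0.0f;
  }
}

int main() {
  int min_grid_size = 0;
  int block_size = 0;

  // Same runtime call that the CUTLASS utilities wrap before throwing
  // "Failed to query occupancy."
  cudaError_t result = cudaOccupancyMaxPotentialBlockSize(
      &min_grid_size, &block_size, dummy_kernel);

  if (result != cudaSuccess) {
    // Print the underlying CUDA error instead of a generic message.
    std::fprintf(stderr, "Occupancy query failed: %s (%s)\n",
                 cudaGetErrorString(result), cudaGetErrorName(result));
    return 1;
  }

  std::printf("min grid size: %d, block size: %d\n",
              min_grid_size, block_size);
  return 0;
}
```

Compiling this with nvcc for the same architecture that was passed to CUTLASS_NVCC_ARCHS (for example -arch=sm_80) and running it on the target GPU should show whether the query itself fails. The exact error text depends on the CUDA version.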

tomasohara commented on August 15, 2024

It turns out that I only built it for the Ampere architecture (i.e., -DCUTLASS_NVCC_ARCHS=80). It works fine after recompiling for the Pascal architecture as shown below.

Best,
Tom


$ build/tools/profiler/cutlass_profiler --device=0 --operation=Gemm --m=1024 --n=1024 --k=128
=>
Bytes: 5242880 bytes
FLOPs: 270532608 flops

     Runtime: 0.0501654  ms
      Memory: 97.3342 GiB/s

        Math: 5392.81 GFLOP/s

...


build sequence (see attached log):

mkdir build && cd build
export CUDACXX=/usr/local/cuda-11.1/bin/nvcc
cmake .. -DCUTLASS_NVCC_ARCHS=61 -DCUTLASS_ENABLE_TESTS=OFF -DCUTLASS_UNITY_BUILD_ENABLED=ON
make cutlass_profiler -j12

_re-build-21apr21.log
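To avoid building for the wrong architecture, a small device query can report each GPU's compute capability before running cmake. The following is a minimal sketch, not part of CUTLASS; it assumes that the CUTLASS_NVCC_ARCHS value is the major and minor version written together, as in the 61 used above for SM 6.1.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  int device_count = 0;
  cudaError_t result = cudaGetDeviceCount(&device_count);
  if (result != cudaSuccess) {
    std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                 cudaGetErrorString(result));
    return 1;
  }

  for (int i = 0; i < device_count; ++i) {
    cudaDeviceProp prop;
    result = cudaGetDeviceProperties(&prop, i);
    if (result != cudaSuccess) {
      std::fprintf(stderr, "cudaGetDeviceProperties(%d) failed: %s\n", i,
                   cudaGetErrorString(result));
      return 1;
    }
    // Assumption: the arch value is major and minor concatenated,
    // e.g. SM 6.1 -> 61, SM 8.0 -> 80.
    std::printf("[%d] %s: SM %d.%d -> CUTLASS_NVCC_ARCHS=%d%d\n", i,
                prop.name, prop.major, prop.minor, prop.major, prop.minor);
  }
  return 0;
}
```

The same SM version also appears in the device list printed by cutlass_profiler --help, so this program is mainly useful before the profiler has been built.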
