Comments (5)

CoffeeVampir3 commented on July 18, 2024

I got things working by compiling the latest PyTorch from source against CUDA 12.4, using magma124. The issue appears to be related to the PyTorch + CUDA 12.1 binaries.
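
To check for this kind of version mismatch, one option is to compare the CUDA version PyTorch was built with against the toolkit release that nvcc reports. The following is a minimal sketch, assuming nvcc is on PATH and prints its usual "release X.Y" line. The helper names are hypothetical, and whether a mismatch is actually what breaks the build is not established here.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_nvcc_release(version_text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_cuda_version(version: Optional[str]) -> Optional[Tuple[int, int]]:
    """Parse a 'major.minor' string such as the value of torch.version.cuda.

    Returns None for an empty or missing value (for example a CPU-only build).
    """
    if not version:
        return None
    parts = version.split(".")
    if len(parts) < 2:
        return None
    return (int(parts[0]), int(parts[1]))


def local_nvcc_release() -> Optional[Tuple[int, int]]:
    """Run `nvcc --version` if nvcc is on PATH and parse its release line."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)
```

In a notebook, comparing `parse_cuda_version(torch.version.cuda)` with `local_nvcc_release()` shows whether the installed binaries and the local toolkit agree on the CUDA release.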

from lectures.

andreaskoepf commented on July 18, 2024

fatal error: cuda_runtime.h: No such file or directory

The problem is that cuda_runtime.h wasn't found. Do you have the CUDA toolkit installed? On Ubuntu machines this header can normally be found at /usr/local/cuda/include/cuda_runtime.h (/usr/local/cuda is normally a symlink managed by update-alternatives).
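
The header check can also be done from inside the notebook. This is a rough sketch, assuming the toolkit root is either named by the CUDA_HOME or CUDA_PATH environment variable or lives at /usr/local/cuda. Both function names are hypothetical, and the candidate list is an assumption rather than a full account of how PyTorch locates the toolkit.

```python
import os
from pathlib import Path
from typing import Iterable, List, Optional


def find_cuda_runtime_header(roots: Iterable[str]) -> Optional[Path]:
    """Return the first <root>/include/cuda_runtime.h that exists, else None."""
    for root in roots:
        if not root:
            continue  # skip unset environment variables
        header = Path(root) / "include" / "cuda_runtime.h"
        if header.is_file():
            return header
    return None


def default_cuda_roots() -> List[str]:
    """Candidate toolkit roots (assumed): CUDA_HOME, CUDA_PATH, /usr/local/cuda."""
    return [
        os.environ.get("CUDA_HOME", ""),
        os.environ.get("CUDA_PATH", ""),
        "/usr/local/cuda",
    ]
```

Calling `find_cuda_runtime_header(default_cuda_roots())` returns the header path if one of the candidates contains it, and `None` otherwise.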

NamburiSrinath commented on July 18, 2024

Hi @andreaskoepf,

Sorry for the delay in response.

I can indeed see the file /usr/local/cuda/include/cuda_runtime.h (attached screenshot)

[Screenshot: Screen Shot 2024-02-24 at 10 34 49 PM]

Please let me know if you need additional details to debug!

CoffeeVampir3 commented on July 18, 2024

I'm having a similar issue in the same place:

---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
File ~/miniforge3/lib/python3.10/site-packages/torch/utils/cpp_extension.py:2096, in _run_ninja_build(build_directory, verbose, error_prefix)
   2095     stdout_fileno = 1
-> 2096     subprocess.run(
   2097         command,
   2098         stdout=stdout_fileno if verbose else subprocess.PIPE,
   2099         stderr=subprocess.STDOUT,
   2100         cwd=build_directory,
   2101         check=True,
   2102         env=env)
   2103 except subprocess.CalledProcessError as e:
   2104     # Python 2 and 3 compatible way of getting the error object.

File ~/miniforge3/lib/python3.10/subprocess.py:526, in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    525     if check and retcode:
--> 526         raise CalledProcessError(retcode, process.args,
    527                                  output=stdout, stderr=stderr)
    528 return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
Cell In[8], line 1
----> 1 module = load_cuda(cuda_src, cpp_src, ['rgb_to_grayscale'], verbose=True)

Cell In[4], line 2, in load_cuda(cuda_src, cpp_src, funcs, opt, verbose)
      1 def load_cuda(cuda_src, cpp_src, funcs, opt=False, verbose=False):
----> 2     return load_inline(cuda_sources=[cuda_src], cpp_sources=[cpp_src], functions=funcs,
      3                        extra_cuda_cflags=["-O2"] if opt else [], verbose=verbose, name="inline_ext")

File ~/miniforge3/lib/python3.10/site-packages/torch/utils/cpp_extension.py:1635, in load_inline(name, cpp_sources, cuda_sources, functions, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, with_pytorch_error_handling, keep_intermediates, use_pch)
   1631     _maybe_write(cuda_source_path, "\n".join(cuda_sources))
   1633     sources.append(cuda_source_path)
-> 1635 return _jit_compile(
   1636     name,
   1637     sources,
   1638     extra_cflags,
   1639     extra_cuda_cflags,
   1640     extra_ldflags,
   1641     extra_include_paths,
   1642     build_directory,
   1643     verbose,
   1644     with_cuda,
   1645     is_python_module,
   1646     is_standalone=False,
   1647     keep_intermediates=keep_intermediates)

File ~/miniforge3/lib/python3.10/site-packages/torch/utils/cpp_extension.py:1710, in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1706                 hipified_sources.add(hipify_result[s_abs].hipified_path if s_abs in hipify_result else s_abs)
   1708             sources = list(hipified_sources)
-> 1710         _write_ninja_file_and_build_library(
   1711             name=name,
   1712             sources=sources,
   1713             extra_cflags=extra_cflags or [],
   1714             extra_cuda_cflags=extra_cuda_cflags or [],
   1715             extra_ldflags=extra_ldflags or [],
   1716             extra_include_paths=extra_include_paths or [],
   1717             build_directory=build_directory,
   1718             verbose=verbose,
   1719             with_cuda=with_cuda,
   1720             is_standalone=is_standalone)
   1721 finally:
   1722     baton.release()

File ~/miniforge3/lib/python3.10/site-packages/torch/utils/cpp_extension.py:1823, in _write_ninja_file_and_build_library(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_standalone)
   1821 if verbose:
   1822     print(f'Building extension module {name}...', file=sys.stderr)
-> 1823 _run_ninja_build(
   1824     build_directory,
   1825     verbose,
   1826     error_prefix=f"Error building extension '{name}'")

File ~/miniforge3/lib/python3.10/site-packages/torch/utils/cpp_extension.py:2112, in _run_ninja_build(build_directory, verbose, error_prefix)
   2110 if hasattr(error, 'output') and error.output:  # type: ignore[union-attr]
   2111     message += f": {error.output.decode(*SUBPROCESS_DECODE_ARGS)}"  # type: ignore[union-attr]
-> 2112 raise RuntimeError(message) from e

RuntimeError: Error building extension 'inline_ext'

cuda_runtime.h is found, but the build just fails with no real indication of why. The most readable error looks like some sort of syntax issue, but that doesn't make much sense:

/home/blackroot/miniforge3/lib/python3.10/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:41:104: error: expected primary-expression before ‘>’ token
   41 | struct has_ivalue_to<T, guts::void_t<decltype(std::declval<IValue>().to<T>())>>
      |                                                                                                        ^
/home/blackroot/miniforge3/lib/python3.10/site-packages/torch/include/ATen/core/boxing/impl/boxing.h:41:107: error: expected primary-expression before ‘)’ token
   41 | struct has_ivalue_to<T, guts::void_t<decltype(std::declval<IValue>().to<T>())>>
      |                                                                                                           ^
ninja: build stopped: subcommand failed.

I haven't been able to hunt down a fix, so let me know if you've got any ideas. Cheers
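
Since the compiler diagnostics are easy to lose in a long ninja log, it may help to filter the captured build output down to just the error lines. This is a minimal sketch: `extract_compiler_errors` and `CompilerError` are hypothetical names, and the pattern assumes GCC-style `file:line:col: error: message` diagnostics like the ones quoted above (other compilers may format errors differently).

```python
import re
from typing import List, NamedTuple


class CompilerError(NamedTuple):
    path: str
    line: int
    column: int
    message: str


# Matches GCC-style diagnostics: <path>:<line>:<col>: [fatal ]error: <message>
_ERROR_RE = re.compile(
    r"^(?P<path>[^\s:][^:]*):(?P<line>\d+):(?P<col>\d+): "
    r"(?:fatal )?error: (?P<msg>.*)$"
)


def extract_compiler_errors(build_output: str) -> List[CompilerError]:
    """Return every GCC-style error diagnostic found in the build output."""
    errors = []
    for raw in build_output.splitlines():
        match = _ERROR_RE.match(raw.strip())
        if match:
            errors.append(
                CompilerError(
                    path=match["path"],
                    line=int(match["line"]),
                    column=int(match["col"]),
                    message=match["msg"],
                )
            )
    return errors
```

Judging from the traceback above, the captured ninja output is appended to the RuntimeError message when verbose is False, so `str(exc)` from a caught exception is one possible input for this helper.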

oli-clive-griffin commented on July 18, 2024

I had a similar issue. Running this code cell at the start, and possibly restarting the instance, worked for me:

!apt-get install ninja-build
!pip install wurlitzer

You might also want to try this afterwards if the above doesn't work:

!pip uninstall -y torch torchvision
!pip install torch torchvision
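
Before reinstalling anything, it can be useful to confirm which of these pieces are actually visible from the running kernel. A small sketch, assuming the tools are expected on PATH and the packages in the current Python environment; `check_build_prerequisites` is a hypothetical helper, and including nvcc in the list is an addition not mentioned in the commands above.

```python
import importlib.util
import shutil
from typing import Dict


def check_build_prerequisites() -> Dict[str, bool]:
    """Report which build tools and Python packages are visible to this kernel."""
    return {
        # Executables looked up on PATH
        "ninja": shutil.which("ninja") is not None,
        "nvcc": shutil.which("nvcc") is not None,
        # Python packages looked up without importing them
        "wurlitzer": importlib.util.find_spec("wurlitzer") is not None,
        "torch": importlib.util.find_spec("torch") is not None,
    }
```

Any entry that comes back `False` after the install cell has run suggests the kernel needs a restart or is using a different environment than the one pip installed into.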
