denzp / rust-ptx-linker
The missing puzzle piece for NVPTX experience with Rust
License: MIT License
I'm having trouble figuring out how to link with --arch=sm_xx flags when calling cargo build --target nvptx..., would appreciate some help.
My assumption is that it will be simpler to merge this project into Rust if it is licensed the exact same way (https://listed.to/@cmr/7767/the-great-relicensing).
It seems that you, @denzp, are the only person with creative contributions to this project. Would you be willing to add the Apache 2.0 license? If so, add the necessary changes, or, if you prefer that I open the PR, just state "I license past and future contributions under the dual MIT/Apache-2.0 license, allowing licensees to choose either at their option." and I will open it for you.
Hi, I would appreciate some documentation on the minimum CUDA install, especially for lightweight environments like GitHub Actions and Docker containers. Thanks.
I tried to compile some of the code from rust-inline-cuda-tutorial on Windows, and encountered this error:
error: failed to run custom build command for `chapter-1 v0.1.0 (file:///C:/Users/Brook/workspace/rust-inline-cuda-tutorial/chapter-1/host)`
process didn't exit successfully: `C:\Users\Brook\workspace\rust-inline-cuda-tutorial\target\release\build\chapter-1-f350df76ca927b39\build-script-build` (exit code: 1)
--- stderr
[PTX] Unable to build a PTX crate!
[PTX] Compiling proxy v0.0.0 (file:///C:/Users/Brook/AppData/Local/Temp/ptx-builder/chapter_1_kernel/2777747bd38bda2e)
[PTX] error: linking with `ptx-linker` failed: exit code: 101
[PTX] |
[PTX] = note: "ptx-linker" "-L" "C:\\Users\\Brook\\.xargo\\lib\\rustlib\\nvptx64-nvidia-cuda\\lib" "C:\\Users\\Brook\\AppData\\Local\\Temp\\ptx-builder\\chapter_1_kernel\\2777747bd38bda2e\\target\\nvptx64-nvidia-cuda\\release\\deps\\proxy.3rg6oty3sxxix4x5.rcgu.o" "-o" "C:\\Users\\Brook\\AppData\\Local\\Temp\\ptx-builder\\chapter_1_kernel\\2777747bd38bda2e\\target\\nvptx64-nvidia-cuda\\release\\deps\\proxy.ptx" "C:\\Users\\Brook\\AppData\\Local\\Temp\\ptx-builder\\chapter_1_kernel\\2777747bd38bda2e\\target\\nvptx64-nvidia-cuda\\release\\deps\\proxy.crate.metadata.rcgu.o" "-O1" "-L" "C:\\Users\\Brook\\AppData\\Local\\Temp\\ptx-builder\\chapter_1_kernel\\2777747bd38bda2e\\target\\nvptx64-nvidia-cuda\\release\\deps" "-L" "C:\\Users\\Brook\\AppData\\Local\\Temp\\ptx-builder\\chapter_1_kernel\\2777747bd38bda2e\\target\\release\\deps" "-L" "C:\\Users\\Brook\\.xargo\\lib\\rustlib\\nvptx64-nvidia-cuda\\lib" "-Bstatic" "--whole-archive" "C:\\Users\\Brook\\AppData\\Local\\Temp\\rustcD6eN9t\\libchapter_1_kernel-f74b1c8db75e0ae3.rlib" "--no-whole-archive" "--whole-archive" "C:\\Users\\Brook\\AppData\\Local\\Temp\\rustcD6eN9t\\libmath-86f691a12c8816dd.rlib" "--no-whole-archive" "--whole-archive" "C:\\Users\\Brook\\AppData\\Local\\Temp\\rustcD6eN9t\\libcty-77e4626992983622.rlib" "--no-whole-archive" "--whole-archive" "C:\\Users\\Brook\\AppData\\Local\\Temp\\rustcD6eN9t\\libnvptx_builtins-80d5f762aa9da22c.rlib" "--no-whole-archive" "--whole-archive" "C:\\Users\\Brook\\AppData\\Local\\Temp\\rustcD6eN9t\\libcore-e6982899dc4034c4.rlib" "--no-whole-archive" "C:\\Users\\Brook\\AppData\\Local\\Temp\\rustcD6eN9t\\libcompiler_builtins-663e743ee2409b7f.rlib" "-shared" "-Bdynamic"
[PTX] = note: Unable to find symbol 'LLVMContextCreate' in the LLVM shared lib
[PTX] thread 'main' panicked at 'explicit panic', C:\Users\Brook\AppData\Local\Temp\cargo-installaOfHDV\release\build\rustc-llvm-proxy-8b11c485c2cc7120\out/llvm_gen.rs:4:1
[PTX] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[PTX]
[PTX]
[PTX] error: aborting due to previous error
[PTX]
[PTX] error: Could not compile `proxy`.
[PTX]
[PTX] To learn more, run the command again with --verbose.
I think this is the important bit:
Unable to find symbol 'LLVMContextCreate' in the LLVM shared lib
After updating to 0.4, I now have a new error:
[PTX] Unable to build a PTX crate!
[PTX] Compiling proxy v0.0.0 (file:///C:/Users/Brook/AppData/Local/Temp/ptx-builder-0.4/chapter_1_kernel/2777747bd38bda2e)
[PTX] error[E0658]: The attribute `panic_handler` is currently unknown to the compiler and may have meaning added to it in the future (see issue #29642)
[PTX] --> src\lib.rs:8:1
[PTX] |
[PTX] 8 | #[panic_handler]
[PTX] | ^^^^^^^^^^^^^^^^
[PTX] |
[PTX] = help: add #![feature(custom_attribute)] to the crate attributes to enable
[PTX]
[PTX] error: aborting due to previous error
[PTX]
[PTX] For more information about this error, try `rustc --explain E0658`.
[PTX] error: Could not compile `proxy`.
[PTX]
[PTX] To learn more, run the command again with --verbose.
$ cargo --version
cargo 1.29.0-nightly (6a7672ef5 2018-08-14)
# nightly-x86_64-pc-windows-gnu
I added a call to core::intrinsics::fsin32 to the chapter 1 example, and used a syncthreads to ensure that it couldn't be optimized away. The linker produces this error when compiling the modified kernel:
[PTX] Compiling proxy v0.0.0 (file:///C:/Users/Brook/AppData/Local/Temp/ptx-builder-0.4/chapter_1_kernel/2777747bd38bda2e)
[PTX] error: linking with `ptx-linker` failed: exit code: 1
[PTX] = note: [INFO] Going to link 2 bitcode modules and 6 rlibs...
[PTX]
[PTX] [DEBUG] Linking bitcode: "C:\\Users\\Brook\\AppData\\Local\\Temp\\ptx-builder-0.4\\chapter_1_kernel\\2777747bd38bda2e\\target\\nvptx64-nvidia-cuda\\release\\deps\\proxy.3wp5o7r8sftxbtji.rcgu.o"
<-- Snip -->
[PTX] [DEBUG] - linking archive item: "compiler_builtins-54267958a4f42a84.it3wtu5gavdx124.rcgu.o"
[PTX] [INFO] Linking with Link Time Optimisation
[PTX] LLVM ERROR: Cannot select: 0x2f13b7a31f0: f32 = fsin 0x2f13b3b5508
[PTX] 0x2f13b3b5508: f32 = fp_round 0x2f13b7a2968, TargetConstant:i64<0>
[PTX] 0x2f13b7a2968: f64,ch = CopyFromReg 0x2f13bde5b68, Register:f64 %28
[PTX] 0x2f13b7a2b08: f64 = Register %28
[PTX] 0x2f13b3b5438: i64 = TargetConstant<0>
[PTX] In function: bilateral_filter
The PTX instruction set does define sin32 and cos32 instructions, so I would expect those to be selected, or at least a better error message to be provided.
With the latest nightly (2022-08-13, rust version 1.65.0-nightly (f22819bcc 2022-08-12)), linking with rust-ptx-linker fails with the following error:
| error: linking with `rust-ptx-linker` failed: exit status: 101
| |
| = note: "rust-ptx-linker" ... "-Olto" "--debug" "--arch" "sm_35"
| = note: [2022-08-14T00:18:48Z INFO ptx_linker::linker] Going to link 7 bitcode modules and 9 rlibs...
| [2022-08-14T00:18:51Z INFO ptx_linker::linker] Linking with Link Time Optimisation
| Unable to find symbol 'LLVMPassManagerBuilderPopulateLTOPassManager' in the LLVM shared lib
| thread 'main' panicked at 'explicit panic', /tmp/cargo-installObZiAh/release/build/rustc-llvm-proxy-52fe0bbe62a213d2/out/llvm_gen.rs:975:1
| stack backtrace:
| 0: 0x558c0b7e6070 - std::backtrace_rs::backtrace::libunwind::trace::hb54da3449afa138f
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
| 1: 0x558c0b7e6070 - std::backtrace_rs::backtrace::trace_unsynchronized::h3501044dcc17b219
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
| 2: 0x558c0b7e6070 - std::sys_common::backtrace::_print_fmt::h903e3aad495d83f3
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/sys_common/backtrace.rs:66:5
| 3: 0x558c0b7e6070 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h628d9cf86eac8446
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/sys_common/backtrace.rs:45:22
| 4: 0x558c0b80352e - core::fmt::write::hf460af9834904685
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/core/src/fmt/mod.rs:1202:17
| 5: 0x558c0b7e3105 - std::io::Write::write_fmt::h3ffaf1818c6f1cb2
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/io/mod.rs:1672:15
| 6: 0x558c0b7e77e3 - std::sys_common::backtrace::_print::h7b7a35e0b3e2febe
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/sys_common/backtrace.rs:48:5
| 7: 0x558c0b7e77e3 - std::sys_common::backtrace::print::h0f518fa6e6302d2b
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/sys_common/backtrace.rs:35:9
| 8: 0x558c0b7e77e3 - std::panicking::default_hook::{{closure}}::hb8275eea2a477c62
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/panicking.rs:295:22
| 9: 0x558c0b7e74cf - std::panicking::default_hook::h1ed1db7d3418fb1c
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/panicking.rs:314:9
| 10: 0x558c0b7e7e8a - std::panicking::rust_panic_with_hook::hd13d30b9ebb3cd8b
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/panicking.rs:698:17
| 11: 0x558c0b7538aa - std::panicking::begin_panic::{{closure}}::h355aa01bdfe2c65a
| 12: 0x558c0b75387c - std::sys_common::backtrace::__rust_end_short_backtrace::h97f7fe6dfdb0a5ac
| 13: 0x558c0b6f0f8a - std::panicking::begin_panic::h15ebca1fabb791cc
| 14: 0x558c0b753189 - LLVMPassManagerBuilderPopulateLTOPassManager
| 15: 0x558c0b748a6b - ptx_linker::linker::Linker::link::he68ac6ddd17dbd2f
| 16: 0x558c0b7469f4 - ptx_linker::linker_entrypoint::h74fa527f86d52f17
| 17: 0x558c0b6f67e2 - rust_ptx_linker::main::hc811f8d9c30f7ef2
| 18: 0x558c0b6f6993 - std::sys_common::backtrace::__rust_begin_short_backtrace::hdabc330dbee3660b
| 19: 0x558c0b6f5ab9 - std::rt::lang_start::{{closure}}::hb714d67067d8cb37
| 20: 0x558c0b7decef - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h89502326500375c5
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/core/src/ops/function.rs:280:13
| 21: 0x558c0b7decef - std::panicking::try::do_call::h1b45f654a0e2d8f9
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/panicking.rs:492:40
| 22: 0x558c0b7decef - std::panicking::try::h2f77f7488672a963
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/panicking.rs:456:19
| 23: 0x558c0b7decef - std::panic::catch_unwind::h09f8927754dd4218
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/panic.rs:137:14
| 24: 0x558c0b7decef - std::rt::lang_start_internal::{{closure}}::h210ce1ccfa691e3b
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/rt.rs:128:48
| 25: 0x558c0b7decef - std::panicking::try::do_call::h15f29a5a1050cfc3
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/panicking.rs:492:40
| 26: 0x558c0b7decef - std::panicking::try::h28fc945254a5f592
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/panicking.rs:456:19
| 27: 0x558c0b7decef - std::panic::catch_unwind::hdd41bdc703f9c455
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/panic.rs:137:14
| 28: 0x558c0b7decef - std::rt::lang_start_internal::hc9d560260fcc526c
| at /rustc/f22819bcce4abaff7d1246a56eec493418f9f4ee/library/std/src/rt.rs:128:20
| 29: 0x558c0b6f6832 - main
| 30: 0x7f518b686083 - __libc_start_main
| 31: 0x558c0b6f55ae - _start
| 32: 0x0 - <unknown>
|
| error: could not compile `single-source` due to 2 previous errors
[PTX] Failed to compile a PTX crate.
This project unfortunately seems to no longer be maintained. I'm committed to making ptx compilation work in Rust, and I'm working on the replacement for this crate in rust-lang/rust#117458 (comment)
The first PR for the embedded-linker is focused on getting the unit tests back up and running, not on making it available to users. It will unfortunately also break the argument parsing in the ptx-linker, in addition to the other things that are already broken. See this repo for a few fixes on top of this project that might keep it useful long enough for the embedded-linker to become available to users as well: rust-ptx-linker-maintenance
Lastly, I want to thank @denzp for creating this project and working on PTX and Rust. I would probably not have dared to start messing around with Rust+CUDA if it weren't for you (and of course also the other contributors working on Rust+CUDA).
It seems that panics are handled by infinite-looping. This makes sense in some situations, but on CUDA we can do better. CUDA does support printing output to stdout (though I haven't figured out how to do so in Rust yet, since the println! macro depends on std). Afterwards, we can terminate the kernel using inline assembly, like so: asm!("trap;"); This terminates the kernel with an (admittedly unclear) error message regarding an illegal instruction. This is a pretty common way to forcefully terminate a kernel (see Stack Overflow for example).
If I reference the syncthreads intrinsic, the linker produces an invalid PTX file. Relevant PTX output:
LBB0_3:
{ // callseq 0, 0
.reg .b32 temp_param_reg;
call.uni
llvm.cuda.syncthreads,
(
);
} // callseq 0
It looks as though it's trying to call llvm.cuda.syncthreads rather than emitting the appropriate PTX instruction.
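A failure like this can be caught before the PTX ever reaches the driver. The following is a minimal sketch, not part of ptx-linker: the function name `unlowered_llvm_symbols` is hypothetical, and the check is only a text heuristic that looks for leftover `llvm.*` symbols in the emitted assembly. A correctly lowered barrier would appear as a `bar.sync` instruction instead of a call.

```rust
/// Collects every distinct `llvm.*` symbol found in the non-comment part of
/// a PTX module. Any hit suggests an LLVM intrinsic that was emitted as an
/// ordinary call instead of being lowered to a PTX instruction.
/// (Hypothetical helper; a heuristic, not a PTX parser.)
fn unlowered_llvm_symbols(ptx: &str) -> Vec<String> {
    let mut found: Vec<String> = Vec::new();
    for line in ptx.lines() {
        // Ignore everything after a `//` comment marker.
        let code = line.split("//").next().unwrap_or("");
        // Split on anything that cannot be part of a PTX identifier.
        let is_ident = |c: char| c.is_ascii_alphanumeric() || c == '.' || c == '_' || c == '$';
        for token in code.split(|c: char| !is_ident(c)) {
            if token.starts_with("llvm.") && !found.iter().any(|f| f == token) {
                found.push(token.to_string());
            }
        }
    }
    found
}
```

Run against the output quoted above, the helper reports `llvm.cuda.syncthreads`; a test suite could fail the build whenever the returned list is non-empty.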
I've recently started working on nvptx support in Rust and have run into the problem of not being able to use ptx-linker in assembly tests.
The ptx-linker used to be active in rustc but was removed in f8f9a2869cce570c994d96afb82f4162b1b44cca. It seems to me that the issue is fixed (in denzp/rustc-llvm-proxy#7), and there is no reason not to add it back and enable the nvptx tests. Is this correct? If not, what can I do to help the situation? (Relevant discussion here to jog the memory: rust-lang/rust#59752)
Looking even further into the future: what is the ideal way to distribute ptx-linker? I imagine it would be better to somehow distribute it together with the nvptx64 toolchain in rustup. Is there anything that can help the situation right now, or is it basically waiting for more Rust PTX adoption?
LLVM headers submodule build problem:
https://ci.appveyor.com/project/denzp/rust-ptx-linker/build/1.0.10/job/xepww8hwhf7tvsak
Build times can be unacceptably long when the host system doesn't have the right LLVM version. Binary releases with trust can help to solve the issue.
cat ~/.cargo/config
[target.nvptx64-nvidia-cuda]
rustflags = ["-C", "link-arg=-arch=sm_xx",]
When calling a kernel loaded via the "cuModuleLoadData" API, we get the error CUDA_ERROR_INVALID_PTX for all sm_xx, including xx=30.
This is somewhat weird. Is there somewhere else I am supposed to tell rustc or the linker about the arch?
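One way to narrow this down is to check which architecture actually ended up in the generated file: every PTX module declares it in a `.target` directive near the top. The sketch below is a hypothetical helper (the name `ptx_target` is not from this project) that extracts that value so it can be compared with the `sm_xx` that was requested. Note also that the linker invocation quoted in another report on this page passes "--arch" "sm_35" as two arguments, whereas the config above uses -arch=sm_xx; whether that difference matters here is an open assumption.

```rust
/// Returns the architecture named by the first `.target` directive in a PTX
/// module, for example `sm_30` for a line reading `.target sm_30`.
/// (Hypothetical helper for debugging; not part of ptx-linker.)
fn ptx_target(ptx: &str) -> Option<&str> {
    ptx.lines()
        .map(str::trim)
        // Skip the comment header that precedes the directives.
        .filter(|line| !line.starts_with("//"))
        .find_map(|line| line.strip_prefix(".target"))
        // The directive may carry extra options after the architecture,
        // separated by commas; only the first entry is returned.
        .and_then(|rest| {
            rest.split(|c: char| c == ',' || c.is_whitespace())
                .find(|s| !s.is_empty())
        })
}
```

If the value read back does not match the requested architecture, the flag is probably not reaching the linker at all.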
Target: x86_64-unknown-linux-gnu
running 6 tests
test it_should_emit_correct_debug_ir ... ok
test it_should_emit_bc ... ok
test it_should_emit_correct_debug_asm ... ok
test it_should_emit_correct_release_asm ... ok
test it_should_emit_correct_release_ir ... ok
[1] 24212 segmentation fault (core dumped) target/debug/deps/linker-f8c0c5e7ec878963
Currently, it's being tested with already precompiled inputs: bitcode and rlibs. This can help to avoid regressions but doesn't take into account Rust's changes.
Related to issue found at denzp/rust-ptx-builder#6
Ideally, we would be able to validate the PTX assembly without actually using the CUDA runtime, because Travis CI obviously doesn't have suitable hardware.
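One possible approach, sketched here under the assumption that the CUDA toolkit (but no GPU) can be installed on the CI machine, is to assemble the generated file with `ptxas`, which works offline and rejects malformed PTX. The helper name `ptxas_command` and the file names in the usage note are hypothetical; the sketch only builds the command and leaves running it to the caller.

```rust
use std::path::Path;
use std::process::Command;

/// Builds a `ptxas` invocation that assembles `ptx` for `arch` and writes
/// the result to `out`. A non-zero exit status from the command would mean
/// the PTX was rejected. (Hypothetical helper; assumes `ptxas` is on PATH.)
fn ptxas_command(ptx: &Path, arch: &str, out: &Path) -> Command {
    let mut cmd = Command::new("ptxas");
    cmd.arg(format!("-arch={}", arch)) // target architecture, e.g. sm_35
        .arg("-o")
        .arg(out) // output file for the assembled code
        .arg(ptx); // input PTX file
    cmd
}
```

A test could call `ptxas_command(Path::new("proxy.ptx"), "sm_35", Path::new("proxy.cubin")).status()` and assert that the returned status reports success.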
If the user mistakenly attempts to compile kernel code for the host, an unclear LLVM error is printed:
LLVM ERROR: Cannot select: intrinsic %llvm.nvvm.read.ptx.sreg.nctaid.x
error: Could not compile `chapter-1-kernel`.
warning: build failed, waiting for other jobs to finish...
error: build failed
Perhaps there is some way to detect this and display a better error message.
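One option that might work is a check in the kernel crate's build script, since Cargo passes the target triple to build scripts in the `TARGET` environment variable. The following is a minimal sketch; the function name `ensure_nvptx_target` and the message wording are hypothetical, and it is an assumption that the build script runs before any dependency hits the LLVM error.

```rust
/// Returns an error message when `target` is not an NVPTX triple.
/// `target` is expected to be the value of Cargo's `TARGET` environment
/// variable as seen by a build script. (Hypothetical helper.)
fn ensure_nvptx_target(target: &str) -> Result<(), String> {
    if target.starts_with("nvptx") {
        Ok(())
    } else {
        Err(format!(
            "this crate contains GPU kernel code and must be built with \
             `--target nvptx64-nvidia-cuda`, but the current target is `{}`",
            target
        ))
    }
}

// Possible use from the kernel crate's build.rs:
//
// fn main() {
//     let target = std::env::var("TARGET").expect("TARGET is set by Cargo");
//     if let Err(message) = ensure_nvptx_target(&target) {
//         panic!("{}", message);
//     }
// }
```

An alternative without a build script would be a `compile_error!` guarded by `#[cfg(not(target_arch = "nvptx64"))]` at the top of the kernel crate, which should fail during compilation with a readable message rather than at LLVM instruction selection.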