
Comments (17)

littlewu2508 commented on July 17, 2024

> I see the same (LLVM ERROR: Unsupported calling convention for call) with LLVM main branch.

Hello, is there any progress on this?

from llvm-project.

littlewu2508 commented on July 17, 2024

> @littlewu2508 Which commit hash are you using to build the AMD fork of llvm-project and comgr? I still can't run examples like compile_minimal_test.exe without this error. Also, which ROCm device library version are you using that makes find_package(AMDDeviceLibs REQUIRED CONFIG) succeed?

As far as I remember, I was using ROCm version 5.1.3, for which the amd-llvm commit hash is 5cba46f and the comgr commit hash is 11321060a69dc41f8bed6f6d8f900ad4c146025b.

When I tried bisecting a month ago, I used ROCm-5.4.0 because its merge-base with vanilla llvm is near the llvm-15 release. In that case the amd-llvm commit hash is d6f0fe8. All of these amd-llvm builds let comgr-5.1.3 pass compile_minimal_test.

For AMDDeviceLibs I use ROCm version 5.1.3.


lamb-j commented on July 17, 2024

@littlewu2508 and @MathiasMagnus , thanks for looking into this, and sorry for letting it fall off my radar.

As far as I understand it, our main issue is that AMDGPULowerKernelCalls exists in the AMD LLVM branch, but not upstream (https://github.com/RadeonOpenCompute/llvm-project/blob/amd-stg-open/llvm/lib/Target/AMDGPU/AMDGPULowerKernelCalls.cpp). This causes the following issues:

  1. When Comgr is built against trunk, Comgr tests fail
  2. Compiling with trunk using the -triple amdgcn-amd-amdhsa -target-cpu gfx… options fails (https://github.com/littlewu2508/LLVMAMDGPUcodegenbug)

There appear to have been a couple of efforts to address this, but they weren’t completed:

I’m going to work on getting these efforts revived so we can finally resolve the issues you’re hitting


MathiasMagnus commented on July 17, 2024

@lamb-j @littlewu2508 Thank you for your input. Please note that I'm not an LLVM developer by any measure, so my struggles may seem noobish; my expertise lies at a higher level of the stack.

I published a pre-alpha, not yet functioning proof-of-concept project, spirv2bin. It uses the ROCm repo forks of @Mystro256 as mentioned here, because upstream ROCm libs don't compile against upstream LLVM. (This fact can be seen in the dependency helper project.) I've yet to hook it up to GitHub Actions and provide a CMakePresets.json that shows how to use the helper project, but regular devs of LLVM or comgr can likely build everything without utilities.

I fear I don't possess the know-how to fix my blocker. My current state of affairs is documented here. I'd gladly do some footwork above the compiler's level to hook SPIR-V consumption into comgr and/or the OpenCL Runtime, or at minimum provide it as a 3rd party OpenCL Layer, but I do need some help in sorting out my proof-of-concept. I suspect the error isn't on my end, but inside LLVM or comgr.


littlewu2508 commented on July 17, 2024

Update: 31377a7 has landed in rocm-6.1.x, so the test failures can be avoided easily by disabling comgr_nested_kernel_test.


littlewu2508 commented on July 17, 2024

Also worth mentioning: due to llvm's change https://reviews.llvm.org/D116011, comgr should be patched:

diff --git a/lib/comgr/src/comgr-compiler.cpp b/lib/comgr/src/comgr-compiler.cpp
index 465187e..181cb81 100644
--- a/lib/comgr/src/comgr-compiler.cpp
+++ b/lib/comgr/src/comgr-compiler.cpp
@@ -666,6 +666,7 @@ AMDGPUCompiler::executeInProcessDriver(ArrayRef<const char *> Args) {
     initializeCommandLineArgs(Argv);
     Argv.append(Arguments.begin(), Arguments.end());
     Argv.push_back(nullptr);
+    Argv.pop_back();

     // By default clang driver will ask CC1 to leak memory.
     auto *IT = find(Argv, StringRef("-disable-free"));

Otherwise the test fails with:

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid
zsh: abort      ./compile_minimal_test


littlewu2508 commented on July 17, 2024

I can confirm that Fedora and Debian are both encountering this problem. I'm now trying to package rocm-comgr against llvm-14 on Gentoo and find that the same issue exists.


lamb-j commented on July 17, 2024

Hi @littlewu2508, thanks for looking into this!

Regarding the second issue you mentioned (errors due to llvm's change https://reviews.llvm.org/D116011), can you verify which version of Comgr you're testing with and see if it includes a patch we added related to this (ROCm/ROCm-CompilerSupport@a75326c)? This should cover your "cc1" case, but if you're still experiencing errors with this patch we may need to revisit.

I'm still looking into the "LLVM ERROR: Unsupported calling convention for call" error you first mentioned, and will update here when I find out anything.


preda commented on July 17, 2024

I see the same (LLVM ERROR: Unsupported calling convention for call) with LLVM main branch.


littlewu2508 commented on July 17, 2024

> Hi @littlewu2508, thanks for looking into this!
>
> Regarding the second issue you mentioned (errors due to llvm's change https://reviews.llvm.org/D116011), can you verify which version of Comgr you're testing with and see if it includes a patch we added related to this (a75326c)? This should cover your "cc1" case, but if you're still experiencing errors with this patch we may need to revisit.

Previously I was using comgr-5.0.2. Now with 5.1.x and 5.2 the bug is gone. Thanks for pointing out that you already have the fix.


MathiasMagnus commented on July 17, 2024

I too am hitting this issue.

I'm working on a project that uses comgr as a library to process LLVM bitcode coming from elsewhere. Incoming LLVM bitcode may be from almost any LLVM version (that I choose when building that part of my program), and then I feed that bitcode to comgr and want to build/link/etc. it through comgr. When adding a device library and trying to link the resulting dataset, I too get LLVM ERROR: Unsupported calling convention for call. I built a debug version of LLVM to see what's happening and can confirm the same enum value (91) of calling convention, but I do not know which node it comes from or occurs on. (I'm a total LLVM newbie.)

Can anyone point me toward release versions of LLVM and comgr that are known to work together? Ideally as recent as reasonably possible.


littlewu2508 commented on July 17, 2024

> I too am hitting this issue.
>
> I'm working on a project that uses comgr as a library to process LLVM bitcode coming from elsewhere. Incoming LLVM bitcode may be of mostly any LLVM version (that I choose when building that part of my program), and then I feed that bitcode comgr and would want to use it to build/link/etc in terms of comgr. When adding a device library and trying to link the resulting dataset, I too get LLVM ERROR: Unsupported calling convention for call. I built a debug version of LLVM to see what's happening and can confirm the same enum value (91) of calling convention, but I do not know which node it comes or happens on. (I'm a total LLVM newbie.)
>
> Can anyone point me toward working release versions of LLVM and comgr that are known to work? Possibly as recent as reasonably possible.

The AMD forked LLVM works without issue: https://github.com/RadeonOpenCompute/llvm-project/

I also built a debug version but still cannot understand the reason. I tried to bisect it from the merge-base of vanilla llvm but there are too many build failures. Hope you can find some clue!


MathiasMagnus commented on July 17, 2024

@littlewu2508 Which commit hash are you using to build the AMD fork of llvm-project and comgr? I still can't run examples like compile_minimal_test.exe without this error. Also, which ROCm device library version are you using that makes find_package(AMDDeviceLibs REQUIRED CONFIG) succeed?


littlewu2508 commented on July 17, 2024

Confirmed that this issue still exists for llvm-16 (llvm commit dafebd5b5a08dde25f5f52f65cac54bd6ec0ecde) and comgr development branch: (comgr commit 4c092fc02e59a2783c94a23a89d2987a2f5a5239)


littlewu2508 commented on July 17, 2024

Some progress:

I managed to reproduce this using pure clang commands. The method is described in https://github.com/littlewu2508/LLVMAMDGPUcodegenbug. The gdb trace follows steps similar to those of compile_minimal_test, but with fewer of them, and ends at the Instruction I having Opcode 56 (CallConv) with SubclassData (365 >> 2 = 91, which is CallingConv::AMDGPU_KERNEL, I guess) that causes this issue.

Since I can also reproduce this without comgr, I will report it to the llvm team, too.


littlewu2508 commented on July 17, 2024

> I too am hitting this issue.
>
> I'm working on a project that uses comgr as a library to process LLVM bitcode coming from elsewhere. Incoming LLVM bitcode may be of mostly any LLVM version (that I choose when building that part of my program), and then I feed that bitcode comgr and would want to use it to build/link/etc in terms of comgr. When adding a device library and trying to link the resulting dataset, I too get LLVM ERROR: Unsupported calling convention for call. I built a debug version of LLVM to see what's happening and can confirm the same enum value (91) of calling convention, but I do not know which node it comes or happens on. (I'm a total LLVM newbie.)

It is now clear that this issue is not specific to comgr. It's because upstream llvm does not support handling calls to kernels.

Can you give an example of your bitcode and the source to generate them?


MathiasMagnus commented on July 17, 2024

@littlewu2508 I'll get back to you as soon as I can create a minimal repro. It's a dead simple OpenCL C SAXPY kernel, but I produce my bitcode using the SPIRV-LLVM bidirectional translator. First I compile the SAXPY kernel to SPIR-V (using Clang 16 for convenience, via clang.exe --target=spirv64 -fintegrated-objemitter .\OpenCL-Cpp-SAXPY.cl -o .\OpenCL-Cpp-SAXPY.spv; because it's SPIR-V, it should not really matter which LLVM version produced it). Then I want to turn it into LLVM IR and then into AMDGPU code using custom builds of LLVM, COMGR and DeviceLibs so that all versions match. I'm cleaning up my proof-of-concept code so that it's buildable by others.

