GCC 14.1 internal compiler errors? (pytorch issue, closed)

Geremia commented on June 15, 2024

GCC 14.1 internal compiler errors?

Comments (4)

shink commented on June 15, 2024

Thanks for your report! Could you please share your minimal reproducible example?

Geremia commented on June 15, 2024

@shink I wish I could isolate the issue better.
I get a similar Caffe2 / aten issue when compiling with GCC 14.1.0, too:

/usr/bin/cmake: symbol lookup error: /usr/lib64/libstdc++.so.6: undefined symbol: _ZNKSt7__cxx110messagesIcE7do_openERKNS_12basic_stringIcSt11char_traitsIcESaIcEEERKSt6locale, version GLIBCXX_3.4.21
make[2]: *** [caffe2/CMakeFiles/ATEN_CUDA_FILES_GEN_TARGET.dir/build.make:7118: aten/src/ATen/ops/bitwise_right_shift_cpu_dispatch.h] Error 127
make[2]: *** Deleting file 'aten/src/ATen/ops/bitwise_right_shift_cpu_dispatch.h'
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:1122: caffe2/CMakeFiles/ATEN_CUDA_FILES_GEN_TARGET.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

I'm not sure why it says GLIBCXX_3.4.21; my Libc version is 2.39.
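On the version string: `GLIBCXX_3.4.21` is a symbol-version tag of libstdc++ (the C++ standard library that ships with GCC), not a glibc version, so it is not expected to match glibc 2.39. Below is a minimal sketch for listing the `GLIBCXX_*` tags that a libstdc++ binary carries, roughly what `strings libstdc++.so.6 | grep GLIBCXX` would show. The function name `glibcxx_versions` and the sample bytes are hypothetical and only illustrate the idea.

```python
import re


def glibcxx_versions(blob: bytes) -> list[tuple[int, ...]]:
    """Return the sorted, de-duplicated GLIBCXX_x.y[.z] tags found in raw bytes."""
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)+)", blob))
    return sorted(tuple(int(part) for part in tag.decode().split(".")) for tag in tags)


# Hypothetical sample standing in for the contents of a libstdc++.so.6 file.
sample = b"\x00GLIBCXX_3.4\x00GLIBCXX_3.4.21\x00GLIBCXX_3.4.9\x00GLIBC_2.39\x00"
print(glibcxx_versions(sample))
```

To inspect a real library, the same function could be called on the file contents, for example `glibcxx_versions(open("/usr/lib64/libstdc++.so.6", "rb").read())`, using the path from the error message above.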

Updated versions from collect_env.py

Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A

OS: Slackware Linux  (x86_64)
GCC version: (GCC) 14.1.0
Clang version: 18.1.5
CMake version: version 3.29.3
Libc version: glibc-2.39

Python version: 3.11.9 (main, Apr  2 2024, 13:43:44) [GCC 13.2.0] (64-bit runtime)
Python platform: Linux-6.9.0-x86_64-AMD_Ryzen_Threadripper_2990WX_32-Core_Processor-with-glibc2.39
Is CUDA available: N/A
CUDA runtime version: 12.4.99
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: Quadro RTX 4000
Nvidia driver version: 550.54.14
cuDNN version: Probably one of the following:
/usr/share/cuda/lib64/libcudnn.so.9.1.1
/usr/share/cuda/lib64/libcudnn_adv.so.9.1.1
/usr/share/cuda/lib64/libcudnn_cnn.so.9.1.1
/usr/share/cuda/lib64/libcudnn_engines_precompiled.so.9.1.1
/usr/share/cuda/lib64/libcudnn_engines_runtime_compiled.so.9.1.1
/usr/share/cuda/lib64/libcudnn_graph.so.9.1.1
/usr/share/cuda/lib64/libcudnn_heuristic.so.9.1.1
/usr/share/cuda/lib64/libcudnn_ops.so.9.1.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A

CPU:
Architecture:                         x86_64
CPU op-mode(s):                       32-bit, 64-bit
Address sizes:                        43 bits physical, 48 bits virtual
Byte Order:                           Little Endian
CPU(s):                               64
On-line CPU(s) list:                  0-63
Vendor ID:                            AuthenticAMD
BIOS Vendor ID:                       Advanced Micro Devices, Inc.
Model name:                           AMD Ryzen Threadripper 2990WX 32-Core Processor
BIOS Model name:                      AMD Ryzen Threadripper 2990WX 32-Core Processor Unknown CPU @ 3.0GHz
BIOS CPU family:                      107
CPU family:                           23
Model:                                8
Thread(s) per core:                   2
Core(s) per socket:                   32
Socket(s):                            1
Stepping:                             2
Frequency boost:                      enabled
CPU(s) scaling MHz:                   74%
CPU max MHz:                          3000.0000
CPU min MHz:                          2200.0000
BogoMIPS:                             5999.96
Flags:                                fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca sev sev_es
Virtualization:                       AMD-V
L1d cache:                            1 MiB (32 instances)
L1i cache:                            2 MiB (32 instances)
L2 cache:                             16 MiB (32 instances)
L3 cache:                             64 MiB (8 instances)
NUMA node(s):                         4
NUMA node0 CPU(s):                    0-7,32-39
NUMA node1 CPU(s):                    16-23,48-55
NUMA node2 CPU(s):                    8-15,40-47
NUMA node3 CPU(s):                    24-31,56-63
Vulnerability Gather data sampling:   Not affected
Vulnerability Itlb multihit:          Not affected
Vulnerability L1tf:                   Not affected
Vulnerability Mds:                    Not affected
Vulnerability Meltdown:               Not affected
Vulnerability Mmio stale data:        Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed:               Mitigation; untrained return thunk; SMT vulnerable
Vulnerability Spec rstack overflow:   Mitigation; Safe RET
Vulnerability Spec store bypass:      Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:             Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:             Mitigation; Retpolines; IBPB conditional; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds:                  Not affected
Vulnerability Tsx async abort:        Not affected

Versions of relevant libraries:
[pip3] flake8==7.0.0
[pip3] numpy==1.26.3
[conda] Could not collect

Geremia commented on June 15, 2024

After I removed the -fPIC flag, the build got further, but then I hit the same kind of failure again:

Fatal glibc error: malloc.c:4376 (_int_malloc): assertion failed: (unsigned long) (size) >= (unsigned long) (nb)
during GIMPLE pass: dce
/tmp/SBo/pytorch-v2.3.0/aten/src/ATen/FunctionalInverses.cpp: In static member function ‘static at::Tensor at::functionalization::FunctionalInverses::_nested_get_values_inverse(const at::Tensor&, const at::Tensor&, at::functionalization::InverseReturnMode)’:
/tmp/SBo/pytorch-v2.3.0/aten/src/ATen/FunctionalInverses.cpp:315:8: internal compiler error: Aborted
  315 | Tensor FunctionalInverses::_nested_get_values_inverse(const Tensor& base, const Tensor& mutated_view, InverseReturnMode inverse_return_mode) {
      |        ^~~~~~~~~~~~~~~~~~
0x1fc8df8 internal_error(char const*, ...)
        ???:0
0x7feb1f696aab __pthread_kill
        ???:0
0x7feb1f642e11 __GI_raise
        ???:0
0x7feb1f62849e abort
        ???:0
0x7feb1f6292c9 __libc_message_impl.cold
        ???:0
0x7feb1f639e02 __libc_assert_fail
        ???:0
0x7feb1f6a3c84 _int_malloc
        ???:0
0x7feb1f6a3f51 _int_realloc
        ???:0
0x7feb1f6a51a5 __libc_realloc
        ???:0
0x2058b30 xrealloc
        ???:0
0xacce0e get_dominated_to_depth(cdi_direction, basic_block_def*, int)
        ???:0
0xacce9a get_all_dominated_blocks(cdi_direction, basic_block_def*)
        ???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

It seems to be a compiler bug.

There is a reported GCC bug related to compiling GridSamplerKernel.cpp. The workaround suggested there was -fno-strict-aliasing, but that didn't help in my case.

Geremia commented on June 15, 2024

I was encountering these compiler errors because of a hardware issue: my DRAM frequency was set too high.
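The symbol in the cmake error earlier in this thread fits that diagnosis: it looks like a single-bit corruption. In the Itanium C++ name mangling, each identifier is prefixed with its length, so the 8-character `messages` should appear as `8messages`, whereas the log shows `0messages`. ASCII `8` (0x38) and `0` (0x30) differ in exactly one bit. The sketch below checks this. The `expected` symbol is an assumption reconstructed from the mangling rule, not something taken from the log, and `bit_diff` is a hypothetical helper.

```python
def bit_diff(a: str, b: str) -> list[tuple[int, int]]:
    """Return (byte_index, xor_mask) for every byte where two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return [(i, x ^ y) for i, (x, y) in enumerate(zip(a.encode(), b.encode())) if x != y]


# Symbol exactly as it appears in the cmake error message.
logged = (
    "_ZNKSt7__cxx110messagesIcE7do_openERKNS_12basic_stringIc"
    "St11char_traitsIcESaIcEEERKSt6locale"
)
# Assumption: the intended symbol uses the length prefix 8 for "messages".
expected = logged.replace("0messages", "8messages")

diffs = bit_diff(logged, expected)
flipped_bits = sum(bin(mask).count("1") for _, mask in diffs)
print(diffs, flipped_bits)
```

A single differing byte with a one-bit XOR mask is what a memory bit flip would produce. One observation like this is only suggestive, though; a memory test is the more reliable way to confirm unstable RAM.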
