Comments (15)
I imagine the neon_dot build error is the same as tensorflow/tensorflow#48464, but what is failing in the neon_v8 microkernels? It is plausible that the Apple toolchain for ARMv7 doesn't support neon_dot because the only Apple SoCs that support it are ARM64-only, but IMO neon_v8 should still compile fine, as there are AArch32-compatible Apple SoCs with ARMv8.
from xnnpack.
The fix for NEONDOT microkernels is in #1404
from xnnpack.
@larryliu0820 The Bazel build uses
Lines 3971 to 3974 in 6e8c0ce
from xnnpack.
Good news is that
apple_aarch32_copts = [ "-mcpu=cyclone", "-mtune=generic", ],
fixes the issue for neonv8 microkernels.
Would you create a PR?
from xnnpack.
No, this code is in an AArch32-specific section.
from xnnpack.
Currently, there is no way to exclude neon_dot and neon_v8 microkernels from the build. I'm looking into a possible solution.
from xnnpack.
Do ARMv8 SoCs pass the CMakeLists.txt check IOS_ARCH MATCHES "^armv7"?
from xnnpack.
I'd expect so. "armv7" really means AArch32 with ARMv7 or higher instruction set.
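To make the regex in that check concrete, here is a minimal C sketch of what a pattern anchored as "^armv7" accepts: any architecture name that begins with armv7. The helper name ios_arch_matches_armv7 is hypothetical and is not part of XNNPACK or CMake; it only mirrors the prefix test.

```c
#include <string.h>

/* Hypothetical helper (not an XNNPACK or CMake API): mirrors the CMake
 * condition IOS_ARCH MATCHES "^armv7", i.e. "the name starts with armv7". */
int ios_arch_matches_armv7(const char *ios_arch) {
  /* Compare only the first 5 characters, the length of "armv7". */
  return strncmp(ios_arch, "armv7", 5) == 0;
}
```

Under this prefix test, "armv7" and "armv7s" both match, while "arm64" does not, which fits the reading that the check selects 32-bit targets rather than one exact instruction set revision.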
from xnnpack.
"armv7" really means AArch32 with ARMv7 or higher instruction set.
armv7 fails this check in arm_neon.h:
#if __ARM_ARCH >= 8 && defined(__ARM_FEATURE_DIRECTED_ROUNDING)
hence a lot of vrndz and vrndu symbols are not found
I imagine the neon_dot build error is the same as tensorflow/tensorflow#48464
The error signature looks similar. I've got:
XNNPACK/src/qs8-gemm/gen/4x16c4-minmax-neondot.c:102:18: error: assigning to 'int32x4_t' (vector of 4 'int32_t' values) from incompatible type 'int'
vacc0x0123 = vdotq_lane_s32(vacc0x0123, vb0123x0123, va0x01234567, 0);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
from xnnpack.
hence a lot of vrndz and vrndu symbols are not found
Do you actually get compilation errors about these symbols? XNNPACK sets "-march=armv8-a -mfpu=neon-fp-armv8" when compiling ARMv8 NEON microkernels, which should make Clang include the declarations of these intrinsics.
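One way to check whether the flags actually take effect is to compile a tiny probe with the same options and see which branch of the preprocessor condition is taken. The sketch below reuses the condition quoted from arm_neon.h earlier in this thread; the macro SKETCH_HAVE_DIRECTED_ROUNDING and the function directed_rounding_available are hypothetical names introduced only for illustration.

```c
/* Hypothetical probe: evaluates the same condition that the thread quotes
 * from arm_neon.h as guarding the directed-rounding intrinsics. */
#if defined(__ARM_ARCH) && (__ARM_ARCH >= 8) && \
    defined(__ARM_FEATURE_DIRECTED_ROUNDING)
#define SKETCH_HAVE_DIRECTED_ROUNDING 1
#else
#define SKETCH_HAVE_DIRECTED_ROUNDING 0
#endif

/* Returns 1 if this translation unit was compiled with settings under which
 * the guarded intrinsics would be declared, and 0 otherwise. */
int directed_rounding_available(void) {
  return SKETCH_HAVE_DIRECTED_ROUNDING;
}
```

If the probe reports 0 for the armv7 slice, the rounding intrinsics are presumably left undeclared there. In C, a call to an undeclared function is treated as returning int, which would be consistent with the "incompatible type 'int'" wording in the errors reported in this thread.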
from xnnpack.
Yes, the compiler errors look like this:
XNNPACK/src/f32-vrnd/gen/vrndu-neonv8-x4.c:31:23: error: initializing 'const float32x4_t' (vector of 4 'float32_t' values) with an expression of incompatible type 'int'
const float32x4_t vy0123 = vrndpq_f32(vx0123);
^ ~~~~~~~~~~~~~~~~~~
I've made sure the compiler flags are there.
from xnnpack.
For neondot I ran into these errors after pulling the latest commit:
stderr: Undefined symbols for architecture armv7:
"_xnn_qs8_igemm_minmax_ukernel_4x8c4__neondot", referenced from:
_init in libXNNPACKApple-2065260873.a(init.c.o)
"_xnn_qs8_igemm_minmax_ukernel_1x8c4__neondot", referenced from:
_init in libXNNPACKApple-2065260873.a(init.c.o)
"_xnn_qs8_gemm_minmax_ukernel_1x8c4__neondot", referenced from:
_init in libXNNPACKApple-2065260873.a(init.c.o)
"_xnn_qs8_gemm_minmax_ukernel_4x8c4__neondot", referenced from:
_init in libXNNPACKApple-2065260873.a(init.c.o)
I'm going to see if I can build with CMake.
from xnnpack.
Good news is that
apple_aarch32_copts = [
"-mcpu=cyclone",
"-mtune=generic",
],
fixes the issue for neonv8 microkernels.
from xnnpack.
For neondot I ran into these errors after pulling latest commit:
stderr: Undefined symbols for architecture armv7:
"_xnn_qs8_igemm_minmax_ukernel_4x8c4__neondot", referenced from:
_init in libXNNPACKApple-2065260873.a(init.c.o)
"_xnn_qs8_igemm_minmax_ukernel_1x8c4__neondot", referenced from:
_init in libXNNPACKApple-2065260873.a(init.c.o)
"_xnn_qs8_gemm_minmax_ukernel_1x8c4__neondot", referenced from:
_init in libXNNPACKApple-2065260873.a(init.c.o)
"_xnn_qs8_gemm_minmax_ukernel_4x8c4__neondot", referenced from:
_init in libXNNPACKApple-2065260873.a(init.c.o)
I'm going to see if I can build with cmake
Fix coming in #1405
from xnnpack.
if (!XNN_PLATFORM_IOS && cpuinfo_has_arm_neon_dot()) {
Wouldn't that disable iphoneos arm64?
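A small model can help reason about this question. The sketch below assumes, as another reply in this thread states, that the quoted condition lives in an AArch32-specific section, and it further assumes that the AArch64 path checks only the CPU feature. It is not XNNPACK's actual initialization code; the function name and parameters are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of the kernel selection being discussed.
 * in_aarch32_section: true when the code path is the 32-bit ARM one.
 * platform_ios:       stands in for XNN_PLATFORM_IOS.
 * cpu_has_neon_dot:   stands in for cpuinfo_has_arm_neon_dot(). */
bool sketch_use_neondot(bool in_aarch32_section, bool platform_ios,
                        bool cpu_has_neon_dot) {
  if (in_aarch32_section) {
    /* The condition quoted in the thread: skip NEON DOT kernels on iOS. */
    return !platform_ios && cpu_has_neon_dot;
  }
  /* Assumption: the 64-bit path does not carry the iOS exclusion. */
  return cpu_has_neon_dot;
}
```

Under these assumptions, an arm64 iPhone with dot-product support still selects the NEON DOT kernels, because the iOS exclusion is evaluated only on the 32-bit path.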
from xnnpack.