projg2 / cpuid2cpuflags
Tool to generate CPU_FLAGS_* for your CPU
License: GNU General Public License v2.0
cpuid2cpuflags -- CPU_FLAGS_* generator
(c) 2017-2024 Michał Górny
SPDX-License-Identifier: GPL-2.0-or-later

Usage
~~~~~

The program attempts to obtain the identification and capabilities of the currently used CPU, and print the matching set of CPU_FLAGS_* flags for Gentoo.

To use it, just run it:

    $ cpuid2cpuflags
    CPU_FLAGS_X86: 3dnow 3dnowext mmx mmxext sse sse2 sse3

There are no command-line options.

Please note that the program identifies the apparent CPU capabilities using available CPU calls or system interfaces, *not* the capabilities indicated by compiler flags.

The flag definitions match the flags described in Gentoo profiles/desc at the time of program release. If additional flags are introduced in the future, they will be added in a future program release.

The output format is compatible both with Portage (package.use) and Paludis (use.conf/options.conf). If you find it useful to generate/update it automatically, you can use a dedicated file:

    $ mkdir /etc/portage/package.use  # if not used yet
    $ echo "*/* $(cpuid2cpuflags)" > /etc/portage/package.use/00cpuflags

Building
~~~~~~~~

These are the steps necessary to build the ./cpuid2cpuflags program:

    $ autoreconf -vi
    $ ./configure
    $ make

Implementation details
~~~~~~~~~~~~~~~~~~~~~~

X86 (incl. x86-64)
------------------

On x86 platforms, cpuid2cpuflags issues the CPUID instruction to obtain processor capabilities. This should work reliably across different systems and kernels, unless the system somehow blocks this instruction. If this is the case, please report a bug.

ARM and AArch64
---------------

On ARM platforms, the userspace processes are not allowed to obtain processor information directly. Instead, the program is relying on kernel identification of the CPU provided via the system interfaces. Currently, only Linux is supported.

On Linux, two interfaces are used: uname() to identify the CPU family, and getauxval(AT_HWCAP*...) to obtain detailed feature flags.

The textual value obtained from uname (armv* or aarch64) is used to enable appropriate ARM version flags and some feature flags. It is also used to determine whether the kernel is 64- or 32-bit since that affects the interpretation of AT_HWCAP* flags.

Afterwards, the remaining feature flags are enabled based on either flag bits provided by AT_HWCAP*, or implicitly based on the subarchitecture (i.e. currently a number of features is always set on AArch64).

It should be noted that the program strongly depends on correct identification of the CPU in the kernel. If you find the results incorrect, please report a bug but I can't promise I'll be able to find a good workaround.
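The AArch64 half of the scheme described above can be sketched roughly as follows. This is only an illustration in Python, not the program's actual C code: the function name `aarch64_flags`, the output flag names, and the `IMPLIED_AARCH64` subset are assumptions made for the example. The bit positions are the HWCAP_* values from the Linux arm64 uapi header.

```python
# Sketch only: decode an AArch64 AT_HWCAP bitmask into flag names.
# Bit positions follow Linux's arch/arm64/include/uapi/asm/hwcap.h;
# the flag names and the IMPLIED_AARCH64 set are illustrative assumptions.
HWCAP_BITS = {
    1 << 3: "aes",    # HWCAP_AES
    1 << 5: "sha1",   # HWCAP_SHA1
    1 << 6: "sha2",   # HWCAP_SHA2
    1 << 7: "crc32",  # HWCAP_CRC32
}

# Features assumed to be present on every AArch64 CPU (hypothetical subset).
IMPLIED_AARCH64 = ["neon", "vfp", "v8"]


def aarch64_flags(machine, hwcap):
    """Return sorted flag names for a uname machine string and an HWCAP value."""
    if machine != "aarch64":
        raise ValueError(f"unsupported machine: {machine!r}")
    flags = set(IMPLIED_AARCH64)  # enabled implicitly by the subarchitecture
    for bit, name in HWCAP_BITS.items():
        if hwcap & bit:  # enabled by a kernel-provided feature bit
            flags.add(name)
    return sorted(flags)


if __name__ == "__main__":
    # 0xFF sets the eight lowest HWCAP bits, including the four decoded above.
    print("CPU_FLAGS_ARM:", " ".join(aarch64_flags("aarch64", 0xFF)))
```

On a real Linux system the two inputs would come from uname() and getauxval(AT_HWCAP), as the README describes; the sketch takes them as plain arguments so it can be run anywhere.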
https://github.com/fritzone/autocmake
https://cmake.org/cmake/help/v3.7/module/CPackDeb.html or CPackRPM
Tools like binary-gentoo on PyPI, for, say, a cheap web binhost, or a rented Graviton2 arm64 Docker host to run Gentoo builds, might be of some small use where one is stuck with RPM- or deb-based VPS servers as the de facto standard...
Or GitHub Actions etc. to build Gentoo tools and publish them via GitHub/LFS (@spreequalle/gentoo-binhost).
(resolve-march-native is Python-based; binary-gentoo also wants cpuid2cpuflags.)
Hopefully CMake's multi-arch abilities help add more arches... who knows.
It's a start anyway.
I don't know about Paludis, but Portage config files use the key=value format in make.conf and category/package flags in per-package configuration files, whereas the tool currently prints in the var: values format.
It is easy enough to edit by hand, but also easy enough to fix, either in the README or in the program itself. Maybe add an option to make it more script-friendly, with which it would print only the flags.
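Until the tool grows such an option, the conversion can be done by a small wrapper. The following is a rough sketch, not part of cpuid2cpuflags; all function names are made up for the example, and it assumes the tool keeps printing a single `VAR: values` line.

```python
# Sketch only: convert the tool's "VAR: values" line into other config syntaxes.
def parse_line(line):
    """Split 'CPU_FLAGS_X86: a b c' into ('CPU_FLAGS_X86', ['a', 'b', 'c'])."""
    var, sep, values = line.strip().partition(":")
    if not sep or not var.strip():
        raise ValueError(f"not a 'VAR: values' line: {line!r}")
    return var.strip(), values.split()


def to_make_conf(line):
    """Render as a make.conf assignment: VAR="a b c"."""
    var, flags = parse_line(line)
    joined = " ".join(flags)
    return f'{var}="{joined}"'


def to_package_use(line):
    """Render as a package.use entry applying to all packages."""
    var, flags = parse_line(line)
    joined = " ".join(flags)
    return f"*/* {var}: {joined}"


def flags_only(line):
    """Render just the flags, for use in scripts."""
    return " ".join(parse_line(line)[1])


if __name__ == "__main__":
    sample = "CPU_FLAGS_X86: aes avx f16c mmx"
    print(to_make_conf(sample))
    print(to_package_use(sample))
    print(flags_only(sample))
```

Keeping the parsing in one function means each output style is a one-line formatter, so adding another target format stays cheap.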
It has been suggested that we could have a tool that would take an -march= option and output the corresponding CPU_FLAGS_*. I'm thinking the cleanest approach would be to defer to GCC to expand CFLAGS into a set of -m options, and map them onto CPU_FLAGS_*. However, we first need to check whether all of the flags have corresponding -m options.
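As a rough sketch of that idea: GCC can list its target options together with an [enabled]/[disabled] state via `gcc -Q --help=target` plus the -march option in question, and that listing could be parsed and mapped. The mapping table below is a partial, hand-written assumption for illustration only, the function names are hypothetical, and the parser assumes one option per line followed by its state.

```python
import re

# Partial, illustrative mapping from GCC -m options to CPU_FLAGS_X86 names.
# Assumption: a real tool would need a complete and verified table.
M_OPTION_TO_FLAG = {
    "-maes": "aes",
    "-mavx": "avx",
    "-mavx2": "avx2",
    "-mfma": "fma3",
    "-mpclmul": "pclmul",
    "-mpopcnt": "popcnt",
    "-msse3": "sse3",
    "-mssse3": "ssse3",
}

# Matches lines such as "  -mavx2    [enabled]".
_ENABLED = re.compile(r"^\s*(-m[\w.-]+)\s+\[enabled\]\s*$")


def enabled_m_options(help_text):
    """Collect the -m options marked [enabled] in the option listing."""
    found = []
    for line in help_text.splitlines():
        match = _ENABLED.match(line)
        if match:
            found.append(match.group(1))
    return found


def cpu_flags_from_help(help_text):
    """Map enabled -m options onto flag names; unmapped options are skipped."""
    options = enabled_m_options(help_text)
    return sorted({M_OPTION_TO_FLAG[o] for o in options if o in M_OPTION_TO_FLAG})


if __name__ == "__main__":
    sample = (
        "  -maes                 [enabled]\n"
        "  -mavx2                [disabled]\n"
        "  -mpopcnt              [enabled]\n"
    )
    print("CPU_FLAGS_X86:", " ".join(cpu_flags_from_help(sample)))
```

Options that have no entry in the table are silently dropped here, which is exactly the gap the last sentence above warns about: any CPU_FLAGS_* value without a matching -m option could never be produced this way.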
Hi,
related to the discussion in [1] and [2], would it be possible to include rdrand detection? Currently, dev-libs/json-c and dev-haskell/cryptonite have an rdrand local USE flag [3], so it could be included in the cpu_flags_x86 USE_EXPAND. Should this be added in profiles/desc first, before including it here? (according to #1 (comment))
[1] gentoo/gentoo#15895
[2] https://bugs.gentoo.org/724354
[3] https://packages.gentoo.org/useflags/cpu-flags-x86-rdrand
cc @juippis
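For reference, the detection itself is a single bit test: RDRAND is reported in CPUID leaf 1, ECX bit 30. A minimal sketch of that check follows; the function name is hypothetical, and since Python cannot issue CPUID directly, the register value is taken as an argument.

```python
# Sketch only: test the RDRAND feature bit in a CPUID leaf 1 ECX value.
# CPUID.01H:ECX bit 30 is the documented RDRAND bit; obtaining the
# register value is outside the scope of this sketch.
RDRAND_BIT = 30


def has_rdrand(leaf1_ecx):
    """Return True if the RDRAND bit is set in the given ECX value."""
    return bool((leaf1_ecx >> RDRAND_BIT) & 1)


if __name__ == "__main__":
    print(has_rdrand(1 << 30))
    print(has_rdrand(0))
```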
I have a recent AMD Zen 4 CPU (AMD Ryzen 7 PRO 7840U), and it seems that e.g. the avx512_vnni flag is not detected. I have not cross-referenced the list below, so more flags may be missing.
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
CPU family: 25
Model: 116
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 1
CPU(s) scaling MHz: 50%
CPU max MHz: 6076,0000
CPU min MHz: 400,0000
BogoMIPS: 6590,60
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca flush_l1d
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 256 KiB (8 instances)
L1i: 256 KiB (8 instances)
L2: 8 MiB (8 instances)
L3: 16 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-15
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Not affected
Spec rstack overflow: Vulnerable: Safe RET, no microcode
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced / Automatic IBRS, IBPB conditional, STIBP always-on, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Not affected
# cpuid2cpuflags
CPU_FLAGS_X86: aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sse sse2 sse3 sse4_1 sse4_2 ssse3
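The cross-referencing could be automated along these lines. This is a sketch under stated assumptions: the rename table only covers a few kernel names that differ from the Gentoo ones, the function name is hypothetical, and the list of candidate flags has to be supplied by the caller (for instance taken from the descriptions in profiles/desc).

```python
# Sketch only: compare kernel flag names (as shown by lscpu) with tool output.
# Assumption: this rename table is partial; names that are identical in both
# vocabularies need no entry.
KERNEL_TO_GENTOO = {
    "pni": "sse3",
    "pclmulqdq": "pclmul",
    "fma": "fma3",
    "sha_ni": "sha",
}


def missing_flags(kernel_flags, reported, candidates):
    """Return candidate flags that the kernel lists but the tool did not report.

    kernel_flags: the space-separated "Flags:" value from lscpu
    reported:     the space-separated flags printed by cpuid2cpuflags
    candidates:   iterable of flag names considered valid CPU_FLAGS_X86 values
    """
    have = {KERNEL_TO_GENTOO.get(flag, flag) for flag in kernel_flags.split()}
    got = set(reported.split())
    return sorted((have & set(candidates)) - got)


if __name__ == "__main__":
    kernel = "pni fma avx2 avx512f avx512_vnni lm"
    tool = "sse3 fma3 avx2"
    known = ["sse3", "fma3", "avx2", "avx512f", "avx512_vnni"]
    print(missing_flags(kernel, tool, known))
```

Restricting the result to `candidates` keeps kernel-only names such as `lm` out of the report.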
# cpuid2cpuflags >> /etc/portage/make.conf
requires editing make.conf afterwards. IMHO it would be better to have something like
# cpuid2cpuflags
CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sse sse2 sse3 sse4_1 sse4_2 ssse3"
I've played with it on ARM today and it seems to work. Two changes were required: replacing getauxval(3), a non-standard glibc extension, with our native elf_aux_info(3) call, and handling a machine name of arm without any version (generation). The actual patch is here.
Seems to work fine, except that the GLIBC test in src/arm.c is counter-productive...
I'm running Gentoo Linux as a QEMU 9.0 guest on a Windows 10 host, with -cpu Westmere.
cpuid2cpuflags shows the following output:
CPU_FLAGS_X86: aes avx2 mmx mmxext pclmul popcnt sha sse sse2 sse3 sse4_1 sse4_2 ssse3 vpclmulqdq
but avx2, sha, and vpclmulqdq look wrong to me.
I do not have a real Westmere system to compare.
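One way to narrow this down would be to look at the raw CPUID leaf 7 (subleaf 0) registers that the guest sees and decode the three bits in question by hand. The sketch below only does the decoding; the function name is hypothetical, the register values must come from a raw CPUID dump taken inside the guest, and it makes no claim about where the unexpected flags originate.

```python
# Sketch only: decode selected CPUID leaf 7 (subleaf 0) feature bits.
# Bit positions are the documented ones for these three features.
LEAF7_EBX = {5: "avx2", 29: "sha"}
LEAF7_ECX = {10: "vpclmulqdq"}


def leaf7_flags(ebx, ecx):
    """Return the sorted names of the selected features set in EBX/ECX."""
    flags = [name for bit, name in LEAF7_EBX.items() if (ebx >> bit) & 1]
    flags += [name for bit, name in LEAF7_ECX.items() if (ecx >> bit) & 1]
    return sorted(flags)


if __name__ == "__main__":
    # Hypothetical register values, for demonstration only.
    print(leaf7_flags(1 << 5, 1 << 10))
```

If the decoded registers do not contain these bits while the tool still prints the flags, the problem is more likely in how the leaf is queried than in the guest CPU model.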
I suggest adjusting the format a little to enable simple copy and paste.
Output now:
CPU_FLAGS_X86: aes avx f16c mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3
What I suggest:
CPU_FLAGS_X86="aes avx f16c mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3"
Hi,
I checked this issue out and apparently the way grepwood implemented it is the easiest.
The only thing analogous to x86's cpuid instruction on PowerPC is copying the value of the Processor Version Register (PVR). This special register is read-only and accessible in supervisor mode only. Some kernels, such as Linux and FreeBSD, do the smart thing: they keep its value in memory and expose it to userspace applications. That makes reading /proc/self/auxv a fair enough method. The PVR is a 32-bit register: its upper 16 bits contain the version and its lower 16 bits the revision. More info: http://www.cebix.net/downloads/bebox/pem32b.pdf pages 78-79
Example: the PVR on the Cell Broadband Engine is 0x00700501.
From arch/powerpc/kernel/cputable.c:
    {   /* Cell Broadband Engine */
        .pvr_mask           = 0xffff0000,
        .pvr_value          = 0x00700000,
        .cpu_name           = "Cell Broadband Engine",
        .cpu_features       = CPU_FTRS_CELL,
        .cpu_user_features  = COMMON_USER_PPC64 |
            PPC_FEATURE_CELL | PPC_FEATURE_HAS_ALTIVEC_COMP |
            PPC_FEATURE_SMT,
        .mmu_features       = MMU_FTRS_CELL,
        .icache_bsize       = 128,
        .dcache_bsize       = 128,
        .num_pmcs           = 4,
        .pmc_type           = PPC_PMC_IBM,
        .oprofile_cpu_type  = "ppc64/cell-be",
        .oprofile_type      = PPC_OPROFILE_CELL,
        .platform           = "ppc-cell-be",
    },
We can see that the version clearly matches 0x0070. Checking for PPC_FEATURE_HAS_ALTIVEC is legitimate, as running grep PPC_FEATURE_HAS_ALTIVEC arch/powerpc in the kernel source tree shows: PPC_FEATURE_HAS_ALTIVEC and PPC_FEATURE_HAS_ALTIVEC_COMP are always equal.
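The arithmetic behind that match can be sketched in a few lines. The function names are made up for the example; the mask and value are the ones from the cputable.c entry quoted above.

```python
# Sketch only: split a PowerPC PVR value and match it against a
# cputable-style mask/value pair.
def split_pvr(pvr):
    """Return (version, revision): the upper and lower 16 bits of the PVR."""
    return (pvr >> 16) & 0xFFFF, pvr & 0xFFFF


def pvr_matches(pvr, pvr_mask, pvr_value):
    """Apply the mask, then compare with the expected value."""
    return (pvr & pvr_mask) == pvr_value


if __name__ == "__main__":
    version, revision = split_pvr(0x00700501)
    print(f"version=0x{version:04x} revision=0x{revision:04x}")
    print(pvr_matches(0x00700501, 0xFFFF0000, 0x00700000))
```

With a mask of 0xffff0000 the revision half is ignored, so every revision of the same processor version matches the entry.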
Any chance to get detection for rdrand?
On my VPS, cpuid2cpuflags does not list ssse3, even though the CPU should support it according to Wikipedia.
lscpu output:
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 40 bits physical, 48 bits virtual
CPU(s): 1
On-line CPU(s) list: 0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel Xeon Processor (Skylake, IBRS)
Stepping: 4
CPU MHz: 2100.000
BogoMIPS: 4200.00
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 4096K
L3 cache: 16384K
NUMA node0 CPU(s): 0
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single pti ssbd ibrs ibpb fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat