Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

sse - Do all CPUs which support AVX2 also support SSE4.2 and AVX?

I am planning to implement runtime detection of SIMD extensions. If I find that the processor supports AVX2, is it also guaranteed to support SSE4.2 and AVX?

Fight the dragon too long and you become the dragon yourself; gaze too long into the abyss and the abyss will gaze back…

1 Reply

Support for a more-recent Intel SIMD ISA extension implies support for previous SIMD ones.

AVX2 definitely implies AVX1.

I think AVX1 implies that all of the SSE/SSE2/SSE3/SSSE3/SSE4.1/SSE4.2 feature bits must also be set in CPUID. Even if that's not formally guaranteed, a lot of software makes this assumption, and a CPU that violated it would probably not be commercially viable for general use.

Note that popcnt has its own feature bit, so in theory you could have a CPU with AVX2 and SSE4.2 but not popcnt. Many things treat SSE4.2 as implying popcnt, though, so in practice the separate bit works the other way around: a CPU can advertise support for popcnt without SSE4.2.


In theory you could make a CPU (or virtual machine) with AVX that didn't accept the non-VEX legacy-SSE encoding of SSE4.2 instructions like pcmpistri, but I think you'd be violating Intel's guarantees about what the AVX feature bit implies. I'm not sure whether that's formally written down in a manual, but most software assumes it.

But AVX1 does imply support for the VEX encoding of all SSE4.2 and earlier SIMD instructions, e.g. vpcmpistri or vminss.

gcc -mavx2 definitely implies AVX1 and previous extensions, but will only emit code that uses the VEX encoding. It will define the __SSE4_2__ macro and so on, though, so gcc does treat AVX2 as implying earlier SSE extensions and popcnt, but not FMA, AES-NI or PCLMUL. Those are separate features even for GCC.

(In practice you should use gcc -march=native or gcc -march=znver1 or whatever to enable all the features your CPU has, and set tuning options for it. Using just -mavx2 -mfma leaves tuning settings at bad defaults, like splitting every possibly-unaligned 256-bit load/store into 128-bit halves.)

(Note that MSVC doesn't have as many SIMD ISA detection macros; it has one for AVX but not for all of the earlier SSE* extensions. MSVC's model is designed around the assumption that programs will do runtime CPU detection instead of being compiled for the local machine. That said, MSVC does now have AVX and AVX2 options to use those as baselines.)


Note that AVX512 kind of breaks the tradition. AVX512F implies support for AVX2 and everything before it, but beyond that, AVX512DQ doesn't come "before" or "after" AVX512ER, for example. You can (in theory) have either, both, or neither. (In practice, Skylake-X/Cannon Lake/etc. have only a bit of overlap with Xeon Phi (Knights Landing / Knights Mill), beyond AVX512F.) https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512

