Bug Report: Wrong SystemStats::hasAVX2() result with clang-cl / 64-bit

Just stumbled upon this while investigating why a dynamically dispatched (SSE/AVX/AVX2) routine runs slower on Windows than on macOS on the exact same machine.

With JUCE 8.0.12 compiled with clang-cl and running under Windows on a 2018 Mac mini (i7-8700B), SystemStats::hasAVX2() returns false, even though the CPU supports AVX2.
As far as I understand, this is due to a stale ECX value left over from the previous cpuid call. Here’s the relevant bit from juce_SystemStats_windows.cpp (CPUInformation::initialise(), #if JUCE_CLANG branch):

callCPUID (info, 0x80000001);
hasFMA4  = (info[2] & (1 << 16)) != 0;

// At this point ECX holds some result, so we're not gonna get the expected
// EAX=7, ECX=0 subleaf from the next call unless we clear ECX, e.g.:
// info[2] = 0;
callCPUID (info, 7);

Presumably this affects all of hasAVX512...() as well.
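
For illustration, here is a minimal sketch of a query that always sets the subleaf explicitly, so a leftover ECX value cannot leak into the leaf 7 call. This is not JUCE's code: queryCpuid(), avx2BitSet() and detectAVX2() are hypothetical names, and I'm assuming __cpuidex from <intrin.h> for MSVC/clang-cl and __cpuid_count from <cpuid.h> for GCC/Clang. The only hardware fact it relies on is that AVX2 is reported in CPUID.(EAX=7, ECX=0):EBX bit 5.

```cpp
#include <array>
#include <cstdint>

#if defined(_MSC_VER)
 #include <intrin.h>   // __cpuidex (MSVC and clang-cl)
#elif defined(__x86_64__) || defined(__i386__)
 #include <cpuid.h>    // __cpuid_count (GCC and Clang)
#endif

// Register order: EAX, EBX, ECX, EDX
using CpuidRegs = std::array<std::uint32_t, 4>;

// Hypothetical helper: runs cpuid with BOTH the leaf (EAX) and the
// subleaf (ECX) set explicitly. Returns all zeros on non-x86 targets.
inline CpuidRegs queryCpuid (std::uint32_t leaf, std::uint32_t subleaf)
{
    CpuidRegs regs { 0u, 0u, 0u, 0u };

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int raw[4] = { 0, 0, 0, 0 };
    __cpuidex (raw, static_cast<int> (leaf), static_cast<int> (subleaf));

    for (std::size_t i = 0; i < regs.size(); ++i)
        regs[i] = static_cast<std::uint32_t> (raw[i]);
#elif ! defined(_MSC_VER) && (defined(__x86_64__) || defined(__i386__))
    unsigned int a = 0, b = 0, c = 0, d = 0;
    __cpuid_count (leaf, subleaf, a, b, c, d);
    regs = { a, b, c, d };
#else
    (void) leaf;
    (void) subleaf;
#endif

    return regs;
}

// AVX2 is reported in EBX bit 5 of leaf 7, subleaf 0.
inline bool avx2BitSet (const CpuidRegs& regs)
{
    return (regs[1] & (1u << 5)) != 0;
}

// Hypothetical detector: checks only the CPU feature bit, nothing else.
inline bool detectAVX2()
{
    // Leaf 0 returns the highest supported basic leaf in EAX.
    const auto maxBasicLeaf = queryCpuid (0, 0)[0];

    if (maxBasicLeaf < 7)
        return false;

    return avx2BitSet (queryCpuid (7, 0));
}
```

Calling detectAVX2() on the machine above should report the AVX2 bit no matter which leaf was queried beforehand, because the subleaf is passed as an argument instead of being inherited from the previous call.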


Assuming JUCE has a good reason to use the inline-asm callCPUID() rather than an intrinsic like __cpuid() for this compiler/platform combo, here’s a PR that should make the former behave like the latter (i.e. clear ECX before executing cpuid):

Thank you for reporting. Fix added here: Windows Clang: Fix a bug querying CPU information · juce-framework/JUCE@28706b9 · GitHub
