Potential speedup for MKL operations on AMD Ryzen/Threadripper/Epyc

Hi all, thought I’d share this in case anyone is using newer AMD hardware. Apparently Intel’s MKL library runs a CPU vendor check and, by default, does not use its AVX2 code path on AMD processors, falling back to a slower one instead. For a detailed example of the speedup in some workflows, see here.

On Linux, the workaround is quite simple: set the environment variable below before launching the program. Note that Intel may disable this debug setting in the future, but for now it appears to give real boosts in some applications.

export MKL_DEBUG_CPU_TYPE=5
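
If you share a shell profile between Intel and AMD machines, a minimal sketch like the one below only sets the variable when the CPU reports itself as AMD. It assumes a Linux-style /proc/cpuinfo with a vendor_id line, and the function name is made up for illustration.

```shell
# Hypothetical helper (name is an assumption): enable the MKL debug
# setting only on AMD CPUs.
# Assumes Linux, where /proc/cpuinfo lists "vendor_id : AuthenticAMD".
# An alternative cpuinfo path can be passed as the first argument.
enable_mkl_amd_workaround() {
    if grep -q 'AuthenticAMD' "${1:-/proc/cpuinfo}" 2>/dev/null; then
        export MKL_DEBUG_CPU_TYPE=5
    fi
}

enable_mkl_amd_workaround
```

Anything started from that shell afterwards inherits the variable; on a non-AMD machine the function does nothing.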

Cheers,

-Greg

Hi @gdodge,

This is interesting. Have you seen any improvements while using this environment variable with cryoSPARC? We have a 2nd gen Threadripper compute node here at the office; we’ll try this out and report here if we notice anything different! We’re also planning on building a Threadripper 3 machine, so we’ll keep this post updated.

Hey @stephan

Are there any particular jobs that are MKL heavy? I haven’t done any benchmarks myself, but I’d be happy to toggle the flag on and off and report back! I’m using a Threadripper 2950X here.
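
For that kind of on/off comparison, a rough sketch along these lines might do. The function names are made up, and the command you pass in is whatever MKL-linked job you want to time (./my_mkl_job in the usage comment is only a placeholder, not a real cryoSPARC command).

```shell
# Hypothetical A/B helper: run the same command with the flag unset and
# with it set to 5, printing wall-clock seconds for each run.
# Usage (placeholder command): compare_mkl_flag ./my_mkl_job --some-args

# Run a command and print how long it took, with a label.
run_timed() {
    label="$1"; shift
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo "$label: $((end - start)) s"
}

# Each run happens in a subshell so the calling shell's environment
# is left untouched.
compare_mkl_flag() {
    ( unset MKL_DEBUG_CPU_TYPE; run_timed "flag unset" "$@" )
    ( export MKL_DEBUG_CPU_TYPE=5; run_timed "flag set to 5" "$@" )
}
```

The timing is whole seconds only, so this is only meaningful for jobs that run for a while, and it is worth repeating each run a few times since the second one may benefit from warm disk caches.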

-Greg

I’ve tested this with cisTEM, which relies heavily on MKL and has ICC-compiled binaries. Setting the variable resulted in a ~2x speed-up.

Nice, glad it’s proven to be useful! It’s a shame these new chips are getting handicapped via software like this; they really are fabulous otherwise.

@gdodge Actually this has been an issue for more than a decade. After following up on widespread suspicion at the time, software optimization expert Agner Fog discovered ICC’s discriminatory CPU dispatching. He even used a feature of VIA CPUs to alter the CPUID vendor string and provide a smoking gun.

After a legal battle, Intel was forced to advertise ICC as providing the best performance only on Intel chips, and not on all CPUs as claimed previously. You can read about the saga (and many technical details) on his blog.