Over the Golden Week holiday, I had a local refinement crash with a NaN error on 600 particles. This extraction stack had previously gone through local refinement, local CTF refinement and multiple rounds of global CTF refinement without error, but a final local refinement run crashed with a NaN error. I cleared and restarted the run, and it has now completed successfully.
The system is a Threadripper Pro with ECC RDIMM memory and Quadro A6000 (Ampere) cards. There are no cryoSPARC, GPU or memory-related errors in dmesg or other system logs.
Has anyone had a similar experience?