CryoSPARC job error (numpy.linalg.LinAlgError: Singular matrix)

On CryoSPARC v5.0.1, we are seeing several Homogeneous Refinement jobs fail with numpy.linalg.LinAlgError: Singular matrix.

Traceback (most recent call last):
  File "cli/run.py", line 105, in cli.run.run_job
  File "cli/run.py", line 210, in cli.run.run_job_function
  File "compute/jobs/refine/run.py", line 604, in compute.jobs.refine.run.run_homo_refine
  File "compute/jobs/refine/run.py", line 605, in compute.jobs.refine.run.run_homo_refine
  File "compute/jobs/ctf_refinement/run.py", line 436, in compute.jobs.ctf_refinement.run.full_ctf_refine
  File "/opt/cryosparc/cryosparc_worker/.pixi/envs/worker/lib/python3.12/site-packages/numpy/linalg/linalg.py", line 409, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/cryosparc/cryosparc_worker/.pixi/envs/worker/lib/python3.12/site-packages/numpy/linalg/linalg.py", line 112, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix
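For context on what the exception itself means: numpy.linalg.solve raises it when the coefficient matrix it is given has no inverse. The sketch below is not CryoSPARC code; the helper try_solve and the 2x2 matrix are made up for illustration, and it only reproduces the same NumPy error on a deliberately rank-deficient system.

```python
import numpy as np


def try_solve(a, b):
    """Return x solving a @ x = b, or None if NumPy reports `a` as singular."""
    try:
        return np.linalg.solve(a, b)
    except np.linalg.LinAlgError:
        return None


# Hypothetical system: the second row is exactly twice the first,
# so the matrix has rank 1 and cannot be inverted.
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
rhs = np.array([1.0, 2.0])

solution = try_solve(singular, rhs)       # None, because solve() raised
rank = np.linalg.matrix_rank(singular)    # rank below the matrix size
well_posed = try_solve(np.eye(2), rhs)    # identity matrix solves fine
```

Which matrix full_ctf_refine passes to solve, and why it would be singular only on some runs, is not visible from the traceback.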

All of the failing jobs are downstream of an Import Beam Shift job (importing beam shift information from EPU) followed by an Exposure Group Utilities job set up with ‘cluster&split’ and these options: correspond particles to exposures and enforce consistency of exposure group IDs, 57 clusters, k-means clustering, split outputs by exposure groups.
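One thing that could be checked here, as an untested guess rather than a confirmed cause, is whether any of the 57 exposure groups ended up with very few particles. A minimal sketch of such a count is below. The function name, the threshold, and the example IDs are all made up; the real IDs would have to be read from the particle dataset (assumed here to be the ctf/exp_group_id field).

```python
import numpy as np


def small_exposure_groups(group_ids, min_particles):
    """Map exposure group ID -> particle count for groups below min_particles."""
    ids, counts = np.unique(np.asarray(group_ids), return_counts=True)
    return {int(i): int(c) for i, c in zip(ids, counts) if c < min_particles}


# Made-up per-particle exposure group IDs, for illustration only.
example_ids = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2]

# The threshold of 3 is arbitrary; pick one that suits the dataset.
sparse = small_exposure_groups(example_ids, min_particles=3)
print(sparse)
```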

I believe only the particles output by the “Exposure Group Utilities” job are included in the downstream jobs. The failing Homogeneous Refinement jobs occasionally complete successfully when rerun, so the crash appears to be intermittent rather than reliably reproducible.

There appear to be past reports of a similar crash (“numpy.linalg.LinAlgError: Singular matrix”), but I don’t know whether those were resolved directly or whether people changed how they were running the jobs.

Any suggestions here for what we should try to avoid this issue?

@larsonmattr Did you test this workflow, including the subsequent Homogeneous Refinement (now failing in v5.0.1), before the upgrade to v5, and find it to be working in v4?

@wtempel We have not typically used the ‘Import Beam Shift’ job, and I don’t have a comparison from before v5.0.1, so I cannot say whether this is a new issue or an old one. The failures all appear to have occurred in a single workspace, in Homogeneous Refinement jobs run after that Import Beam Shift job.