CryoSPARC job error (numpy.linalg.LinAlgError: Singular matrix)

On CryoSPARC v5.0.1, we are seeing several Homogeneous Refinement jobs fail with numpy.linalg.LinAlgError: Singular matrix.

```
Traceback (most recent call last):
  File "cli/run.py", line 105, in cli.run.run_job
  File "cli/run.py", line 210, in cli.run.run_job_function
  File "compute/jobs/refine/run.py", line 604, in compute.jobs.refine.run.run_homo_refine
  File "compute/jobs/refine/run.py", line 605, in compute.jobs.refine.run.run_homo_refine
  File "compute/jobs/ctf_refinement/run.py", line 436, in compute.jobs.ctf_refinement.run.full_ctf_refine
  File "/opt/cryosparc/cryosparc_worker/.pixi/envs/worker/lib/python3.12/site-packages/numpy/linalg/linalg.py", line 409, in solve
    r = gufunc(a, b, signature=signature, extobj=extobj)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/cryosparc/cryosparc_worker/.pixi/envs/worker/lib/python3.12/site-packages/numpy/linalg/linalg.py", line 112, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix
```

The failing jobs all come after importing beam shift information from EPU with an Import Beam Shift job, followed by an Exposure Group Utilities job set up with 'cluster&split': correspond particles to exposures and enforce consistency of exposure group IDs, 57 clusters, k-means clustering, split outputs by exposure groups.

I believe only the particles output by the Exposure Group Utilities job are included in the downstream jobs. These Homogeneous Refinement jobs can occasionally be rerun and complete successfully, so the crash with the Python error seems random, or at least not always reproducible.

It looks like there have been past reports of a similar crash ("numpy.linalg.LinAlgError: Singular matrix"), but I don't know whether those were directly resolved or whether people simply changed how they ran the jobs.

Any suggestions for what we should try to avoid this issue?

@larsonmattr Had you tested this workflow, including the subsequent Homogeneous Refinement (now failing in v5.0.1), before the upgrade to v5 and found it to be working in v4?

@wtempel We have not typically used the 'Import Beam Shift' job, and I don't have a comparison from before v5.0.1, so I cannot say whether this is a new issue or an old one. The failures all occurred in a single workspace, in Homogeneous Refinement jobs, but only after use of the Import Beam Shift job.

Hi there @larsonmattr,

We suspect that this may be due to the presence of an exposure group with a small number of particles. A sufficient number of particles per exposure group is typically required for aberration fitting to succeed. When using Exposure Group Utilities in cluster&split mode (clustering on beam shift), it is possible that one or more of these clusters will end up with few micrographs and, by extension, few particles. It is difficult to give a general estimate of how many particles are sufficient for aberration fitting per group, but as a ballpark, I would personally hope to have at least something like ~5k particles per group.
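
To see why an under-populated group produces this exact error, here is a toy illustration (not CryoSPARC's actual fitting code): suppose each exposure group fits a 2-parameter model to its particles by solving the least-squares normal equations. A group with a single particle cannot constrain two parameters, so the normal-equations matrix is rank-deficient and `np.linalg.solve` raises the same error seen in the traceback.

```python
import numpy as np

# Hypothetical sketch: one exposure group with a single particle,
# fitting a 2-parameter model (slope + offset) via normal equations.

x = np.array([2.0])                        # one particle in a tiny group
A = np.column_stack([x, np.ones_like(x)])  # 1x2 design matrix
b = np.array([3.0])                        # one observation

lhs = A.T @ A   # 2x2 matrix, but only rank 1 -> exactly singular
rhs = A.T @ b

try:
    np.linalg.solve(lhs, rhs)
except np.linalg.LinAlgError as err:
    print(err)   # prints: Singular matrix
```

With more particles than fit parameters in every group, `lhs` becomes full-rank and the solve succeeds.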

This could be resolved in two ways:

  1. Re-run the cluster&split mode with fewer clusters (accepting that the separation into beam-shift groups will be coarser)
  2. Remove exposure groups with few particles from the dataset. This should be possible via the Exposure Group Utilities and/or Manually Curate Exposures jobs
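
To see which groups are affected before re-running anything, one could count particles per exposure group. A rough numpy sketch (the group IDs below are fabricated for illustration; in a real dataset they would come from the particles' exposure group field, typically `ctf/exp_group_id` in the particles .cs file):

```python
import numpy as np

# Fabricated per-particle exposure group IDs: two healthy groups and
# two deliberately under-populated ones, standing in for real data.
rng = np.random.default_rng(1)
group_ids = rng.choice([0, 1, 2, 3], size=20000,
                       p=[0.55, 0.40, 0.045, 0.005])

min_particles = 5000   # ballpark threshold from the advice above
groups, counts = np.unique(group_ids, return_counts=True)

for g, c in zip(groups, counts):
    status = "OK" if c >= min_particles else "TOO SMALL -> remove or merge"
    print(f"exposure group {g}: {c} particles  {status}")

small = groups[counts < min_particles]
print("groups to drop:", small)
```

Groups flagged here are the candidates for option 2 above.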

If you need specific guidance re: the last tip, please let us know and we can give more detailed instructions to remove these exposure groups from your dataset.

Best,
Michael

@mmclean ,

Thanks much for this reply and the additional suggestions. It’s good to know what we should target for the number of particles in an Exposure Group for aberration fitting.

I tried this workflow using a "Manually Curate Exposures" job with a threshold on picked particles of [100, 50000], which rejected 301 exposures and dropped the 3 exposure groups with the fewest particles (~53, 96, and 99 particles in these smallest groups). Afterwards, I repeated Global CTF Refinement, Local CTF Refinement, and a final Homogeneous Refinement. These jobs all completed without issue; the final Homogeneous Refinement was where the "numpy.linalg.LinAlgError: Singular matrix" previously occurred. I think this is a solution for our issue.
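
For reference, the thresholding step amounts to something like the following (a toy numpy sketch with fabricated per-exposure counts, not the actual Manually Curate Exposures job):

```python
import numpy as np

# Fabricated picked-particle counts per exposure, for illustration only.
rng = np.random.default_rng(2)
picked_counts = rng.integers(0, 600, size=1000)

# Keep only exposures whose picked-particle count falls within [lo, hi],
# mirroring the threshold described above.
lo, hi = 100, 50000
keep = (picked_counts >= lo) & (picked_counts <= hi)

print(f"kept {keep.sum()} exposures, rejected {(~keep).sum()}")
```

Exposures outside the range (and with them the smallest exposure groups) are dropped from the downstream particle set.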

We will also need to experiment with using fewer clusters to see whether that helps or hurts our refinements.

Thanks again and have a good weekend!


Awesome, I'm glad this worked for your case 🙂 No problem, and have a good weekend as well!
Michael