numpy.linalg.LinAlgError: Array must not contain infs or NaNs from 3D Variability (v3.1.0)

Hi All,

I searched for the following error but found no solution, so I am posting it as a separate topic. Any suggestions? Thanks.

[CPU: 14.90 GB] Start iteration 6 of 20
[CPU: 14.90 GB] batch 3567 of 3567
[CPU: 14.89 GB] Done. Solving...
[CPU: 15.83 GB] diagnostic: min-ev nan
[CPU: 15.40 GB] Traceback (most recent call last):
  File "cryosparc_worker/cryosparc_compute/run.py", line 84, in cryosparc_compute.run.main
  File "cryosparc_worker/cryosparc_compute/jobs/var3D/run.py", line 524, in cryosparc_compute.jobs.var3D.run.run
  File "cryosparc_worker/cryosparc_compute/jobs/var3D/run.py", line 436, in cryosparc_compute.jobs.var3D.run.run.M_step
  File "<__array_function__ internals>", line 6, in eigvals
  File "/data/donghua/cryosparc/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/numpy/linalg/linalg.py", line 1063, in eigvals
    _assert_finite(a)
  File "/data/donghua/cryosparc/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/numpy/linalg/linalg.py", line 209, in _assert_finite
    raise LinAlgError("Array must not contain infs or NaNs")
numpy.linalg.LinAlgError: Array must not contain infs or NaNs
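For reference, the exception itself only means that the matrix handed to numpy's eigvals contained non-finite values, which is also what the "min-ev nan" diagnostic hints at. A minimal sketch (plain numpy, not CryoSPARC code) that reproduces the same behaviour:

import numpy as np

A = np.eye(3)
A[0, 0] = np.nan            # a single NaN (or inf) anywhere in the matrix is enough

try:
    np.linalg.eigvals(A)    # numpy checks finiteness before computing eigenvalues
except np.linalg.LinAlgError as e:
    print(e)                # "Array must not contain infs or NaNs"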

Dear @donghuachen, can you please see if this post helps: 3DVar must not contain infs or NaN

@spunjani, Thanks. I am re-running the same job with another GPU on the same computer.

@spunjani, I re-ran the same job on another GPU of the same computer and it succeeded.


Hi,
I am receiving the exact same error on iteration 6 of 20. @donghuachen said they were able to use a different GPU and it worked. I tried this as well but was still met with the same problem. What would cause this error? I have run many 3D Variability jobs and never had this issue before.

Thanks,
Karl

@kherbine Please can you post the output of this command (replacing P99 and J123 with the job’s actual project and job IDs):

cryosparcm cli "get_job('P99', 'J123', 'job_type', 'version', 'params_spec', 'input_slot_groups')"
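If it is more convenient, roughly the same information can be pulled from a Python session with cryosparc-tools. This is a sketch under assumptions: the credentials below are placeholders, and the attribute names follow the library's documented usage, so double-check them against your installed version.

from cryosparc.tools import CryoSPARC

# Placeholder connection details: replace with your instance's values.
cs = CryoSPARC(license="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
               host="localhost", base_port=39000,
               email="user@example.com", password="password")

job = cs.find_job("P99", "J123")   # same project/job IDs as above
doc = job.doc                      # raw job document
print(doc["job_type"], doc["version"])
print(doc["params_spec"])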

Hi @wtempel,

I actually came across another thread and increased Lambda from 0.01 to 0.02, which fixed my issue; the job ran to completion. However, here is the output:

{'_id': '667b28894222c50cf6d95794', 'input_slot_groups': [{'connections': [{'group_name': 'particles', 'job_uid': 'J913', 'slots': [{'group_name': 'particles', 'job_uid': 'J913', 'result_name': 'blob', 'result_type': 'particle.blob', 'slot_name': 'blob', 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J913', 'result_name': 'ctf', 'result_type': 'particle.ctf', 'slot_name': 'ctf', 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J913', 'result_name': 'alignments3D', 'result_type': 'particle.alignments3D', 'slot_name': 'alignments3D', 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J913', 'result_name': 'alignments2D', 'result_type': 'particle.alignments2D', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J913', 'result_name': 'location', 'result_type': 'particle.location', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J913', 'result_name': 'pick_stats', 'result_type': 'particle.pick_stats', 'slot_name': None, 'version': 'F'}]}], 'count_max': inf, 'count_min': 1, 'description': 'Particle stacks to use. Multiple stacks will be concatenated.', 'name': 'particles', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'blob', 'optional': False, 'title': 'Particle data blobs', 'type': 'particle.blob'}, {'description': '', 'name': 'ctf', 'optional': False, 'title': 'Particle ctf parameters', 'type': 'particle.ctf'}, {'description': '', 'name': 'alignments3D', 'optional': False, 'title': 'Particle alignments3D parameters', 'type': 'particle.alignments3D'}], 'title': 'Particle stacks', 'type': 'particle'}, {'connections': [{'group_name': 'mask', 'job_uid': 'J917', 'slots': [{'group_name': 'mask', 'job_uid': 'J917', 'result_name': 'mask', 'result_type': 'volume.blob', 'slot_name': 'mask', 'version': 'F'}]}], 'count_max': 1, 'count_min': 1, 'description': '', 'name': 'mask', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'mask', 'optional': False, 'title': 'Mask raw data', 'type': 'volume.blob'}], 'title': 'Mask', 'type': 'mask'}], 'job_type': 'var_3D', 'params_spec': {'compute_use_ssd': {'value': False}, 'var_K': {'value': 5}, 'var_filter_res': {'value': 6}, 'var_lambda': {'value': 0.02}}, 'project_uid': 'P21', 'uid': 'J918', 'version': 'v4.5.1'}
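As a rough intuition for why bumping Lambda from 0.01 to 0.02 can help (a generic toy sketch of diagonal loading, not CryoSPARC's actual M-step): when the system being solved is nearly singular, intermediate values can blow up to inf/NaN before they ever reach eigvals, and a larger regularizer keeps the smallest eigenvalue bounded away from zero.

import numpy as np

# Toy example only: an exactly singular 2x2 "covariance".
C = np.array([[1.0, 1.0],
              [1.0, 1.0]])

for lam in (0.0, 0.02):
    C_reg = C + lam * np.eye(2)        # diagonal loading, like a larger Lambda
    scaled = C / np.linalg.det(C_reg)  # det = 0 when lam = 0, so entries become inf
    print(lam, np.isfinite(scaled).all(), np.linalg.eigvalsh(C_reg).min())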

Thanks,
Karl


I encountered this problem just now; I got the error at the first iteration. In the job, I had chosen “Input” for the “Per-particle scale” option. Reverting to the default “Optimal” resolved the problem.
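For anyone who hits this with Per-particle scale set to Input, one quick sanity check is to load the connected particles' .cs file with numpy and look for zero, negative, or non-finite scale values. This is a sketch under assumptions: the path is hypothetical and the field name "alignments3D/alpha" may differ between versions, so print dtype.names to see what your file actually contains.

import numpy as np

# .cs files are numpy structured arrays, so np.load should read them directly.
particles = np.load("path/to/particles.cs")   # hypothetical path
print(particles.dtype.names)                  # locate the real scale field

scale = particles["alignments3D/alpha"]       # assumed per-particle scale field
print("non-finite scales:", np.count_nonzero(~np.isfinite(scale)))
print("zero or negative scales:", np.count_nonzero(scale <= 0))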

If you have not overridden the failed job with the per-particle scale setting, please can you post the outputs of these commands:

csprojectid=P99 # replace with actual project ID
csjobid=J199 # replace with id of failed job
cryosparcm joblog $csprojectid $csjobid | tail -n 40
cryosparcm eventlog $csprojectid $csjobid | tail -n 40
cryosparcm cli "get_job('$csprojectid', '$csjobid', 'job_type', 'version', 'instance_information', 'status',  'params_spec', 'errors_run')"