Hi,
We have been running our jobs on v3 for about two weeks, and we have noticed an accumulation of errors. 3D jobs in particular (ab-initio, refinements, 3D variability) die at seemingly random time points and not reproducibly. A few examples:
One ab-initio job:
Traceback (most recent call last):
File “cryosparc_worker/cryosparc_compute/run.py”, line 84, in cryosparc_compute.run.main
File “cryosparc_worker/cryosparc_compute/jobs/abinit/run.py”, line 304, in cryosparc_compute.jobs.abinit.run.run_homo_abinit
File “cryosparc_worker/cryosparc_compute/engine/engine.py”, line 1119, in cryosparc_compute.engine.engine.process
File “cryosparc_worker/cryosparc_compute/engine/engine.py”, line 1120, in cryosparc_compute.engine.engine.process
File “cryosparc_worker/cryosparc_compute/engine/engine.py”, line 1078, in cryosparc_compute.engine.engine.process.work
File “cryosparc_worker/cryosparc_compute/engine/engine.py”, line 389, in cryosparc_compute.engine.engine.EngineThread.find_and_set_best_pose_shift
File “<__array_function__ internals>”, line 6, in unravel_index
ValueError: index 1063733622 is out of bounds for array with size 960
An ab-initio job with the same inputs, which failed at a different iteration:
Traceback (most recent call last):
File “cryosparc_worker/cryosparc_compute/run.py”, line 84, in cryosparc_compute.run.main
File “cryosparc_worker/cryosparc_compute/jobs/abinit/run.py”, line 222, in cryosparc_compute.jobs.abinit.run.run_homo_abinit
File “/users/svc_cryosparc/software/regular/cryosparc2_worker/cryosparc_compute/sigproc.py”, line 428, in align_density
assert n.all(n.isfinite(M))
AssertionError
A 3D variability job:
Traceback (most recent call last):
File "cryosparc_worker/cryosparc_compute/run.py", line 84, in cryosparc_compute.run.main
File "cryosparc_worker/cryosparc_compute/jobs/var3D/run.py", line 524, in cryosparc_compute.jobs.var3D.run.run
File "cryosparc_worker/cryosparc_compute/jobs/var3D/run.py", line 436, in cryosparc_compute.jobs.var3D.run.run.M_step
File "<__array_function__ internals>", line 6, in eigvals
File "/users/svc_cryosparc/software/regular/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/numpy/linalg/linalg.py", line 1063, in eigvals
_assert_finite(a)
File "/users/svc_cryosparc/software/regular/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/numpy/linalg/linalg.py", line 209, in _assert_finite
raise LinAlgError("Array must not contain infs or NaNs")
numpy.linalg.LinAlgError: Array must not contain infs or NaNs
Another ab-initio run in a different project:
Traceback (most recent call last):
File “cryosparc_worker/cryosparc_compute/run.py”, line 84, in cryosparc_compute.run.main
File “cryosparc_worker/cryosparc_compute/jobs/abinit/run.py”, line 304, in cryosparc_compute.jobs.abinit.run.run_homo_abinit
File “cryosparc_worker/cryosparc_compute/engine/engine.py”, line 1119, in cryosparc_compute.engine.engine.process
File “cryosparc_worker/cryosparc_compute/engine/engine.py”, line 1120, in cryosparc_compute.engine.engine.process
File “cryosparc_worker/cryosparc_compute/engine/engine.py”, line 1078, in cryosparc_compute.engine.engine.process.work
File “cryosparc_worker/cryosparc_compute/engine/engine.py”, line 389, in cryosparc_compute.engine.engine.EngineThread.find_and_set_best_pose_shift
File “<__array_function__ internals>”, line 6, in unravel_index
ValueError: index -1085655912 is out of bounds for array with size 8064
Since the update, these problems have occurred across multiple, very different datasets, and ab-initio jobs seem to be especially vulnerable. Old jobs that previously ran to completion now hit the same errors when I simply clone and rerun them.
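For what it's worth, both kinds of error message can be triggered in plain NumPy by feeding the same calls an out-of-range flat index or a non-finite matrix. The sketch below is not cryoSPARC code: the helper names `unravel_error` and `eigvals_error` are made up for illustration, and the shape `(960,)` is only an assumption based on the "size 960" in the first traceback. It might at least suggest that corrupted or non-finite values are reaching these calls.

```python
import numpy as np


def unravel_error(flat_index, shape):
    """Return NumPy's error message for an out-of-range flat index, or None."""
    try:
        np.unravel_index(flat_index, shape)
    except ValueError as exc:
        return str(exc)
    return None


def eigvals_error(matrix):
    """Return NumPy's error message for a non-finite matrix, or None."""
    try:
        np.linalg.eigvals(matrix)
    except np.linalg.LinAlgError as exc:
        return str(exc)
    return None


# Index value copied from the first traceback; the shape is an assumption.
print(unravel_error(1063733622, (960,)))

# A matrix with a single NaN entry, standing in for a corrupted intermediate result.
bad = np.eye(3)
bad[0, 0] = np.nan
print(eigvals_error(bad))
```

With valid inputs (an in-range index, a finite matrix) both helpers return `None`, so the errors appear to depend on the values reaching these calls rather than on the calls themselves.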
Any idea what the reason could be?
Best,
David