Hello,
I recently updated our CryoSPARC instance to v3.1. Since then, every job fails on our worker nodes with the errors below.
Here is what I see from the command line on the master:
cryosparcm joblog j17 p36
Traceback (most recent call last):
File "/opt/cryosparc-v2/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/opt/cryosparc-v2/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/opt/cryosparc-v2/cryosparc2_master/cryosparc_compute/client.py", line 86, in <module>
print(eval("cli."+command))
File "<string>", line 1, in <module>
File "/opt/cryosparc-v2/cryosparc2_master/cryosparc_compute/client.py", line 59, in func
assert False, res['error']
AssertionError: {'code': 500, 'data': None, 'message': "OtherError: argument of type 'NoneType' is not iterable", 'name': 'OtherError'}
In the UI, the job log shows a different error:
Loading raw movie data from J8/imported/14sep05c_00024sq_00003hl_00002es.frames.tif …
Done in 2.79s
Loading gain data from J8/imported/norm-amibox05-0.mrc …
Done in 0.04s
Processing …
[CPU: 1.25 GB] Traceback (most recent call last):
File "cryosparc2_compute/jobs/runcommon.py", line 1685, in run_with_except_hook
run_old(*args, **kw)
File "/cryosparc/worker/cryosparc2_worker/deps/anaconda/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "cryosparc2_compute/jobs/pipeline.py", line 165, in thread_work
work = processor.process(item)
File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 157, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 160, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 161, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.motionworker.process
File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/patchmotion.py", line 77, in cryosparc2_compute.jobs.motioncorrection.patchmotion.unbend_motion_correction
File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/patchmotion.py", line 303, in cryosparc2_compute.jobs.motioncorrection.patchmotion.unbend_motion_correction
File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/patchmotion.py", line 297, in cryosparc2_compute.jobs.motioncorrection.patchmotion.unbend_motion_correction.do_align_rigid
File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/patchmotion.py", line 272, in cryosparc2_compute.jobs.motioncorrection.patchmotion.unbend_motion_correction.do_align_rigid.val_and_deriv_gpu
File "/cryosparc/worker/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/pycuda/gpuarray.py", line 551, in fill
value, self.gpudata, self.mem_size)
File "/cryosparc/worker/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/pycuda/driver.py", line 495, in function_prepared_async_call
func._set_block_shape(*block)
LogicError: cuFuncSetBlockShape failed: invalid resource handle
[CPU: 166.4 MB] Outputting partial results now…
[CPU: 166.4 MB] Traceback (most recent call last):
File "cryosparc2_worker/cryosparc2_compute/run.py", line 85, in cryosparc2_compute.run.main
File "cryosparc2_worker/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 359, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi
AssertionError: Child process with PID 5471 has terminated unexpectedly!
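In case it helps with triage, here is a minimal sketch (my own helper, not part of CryoSPARC) that pulls the Python interpreter version out of the file paths in a traceback. The two sample lines are taken from the master and worker tracebacks in this post; the function name `python_versions` is hypothetical.

```python
import re


def python_versions(traceback_text):
    """Return the sorted, de-duplicated pythonX.Y versions found in file paths."""
    return sorted(set(re.findall(r"python(\d+\.\d+)", traceback_text)))


# One representative line from each traceback in this post.
master_line = (
    'File "/opt/cryosparc-v2/cryosparc2_master/deps/anaconda/envs/'
    'cryosparc_master_env/lib/python3.7/runpy.py", line 193, in _run_module_as_main'
)
worker_line = (
    'File "/cryosparc/worker/cryosparc2_worker/deps/anaconda/lib/'
    'python2.7/threading.py", line 754, in run'
)

print("master:", python_versions(master_line))
print("worker:", python_versions(worker_line))
```

On these two lines it reports 3.7 for the master and 2.7 for the worker, so the two sides may not be running the same release. I have not confirmed whether that is related to the failure.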
After the installation, I made sure all the dependencies were in order by running:
cryosparcm/cryosparcw forcedeps
The master and the worker nodes are using CUDA 11. The worker node in question has 4x RTX 2080 Ti GPUs with the NVIDIA 460.xx driver.
I’d really appreciate any help on this.