Hello everyone,
I am queuing a series of heterogeneous refinement jobs on a machine with four GPUs. The first three jobs ran fine, but the fourth failed with:
Traceback (most recent call last):
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 104, in func
with make_json_request(self, "/api", data=data, _stacklevel=4) as request:
File "/home/cryosparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 225, in make_request
raise CommandError(error_reason, url=url, code=code, data=resdata)
cryosparc_tools.cryosparc.errors.CommandError: *** (http://localhost:39002/api, code 500) Timeout Error
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "cryosparc_master/cryosparc_compute/run.py", line 95, in cryosparc_master.cryosparc_compute.run.main
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/jobs/utilities/run_cache_particles.py", line 31, in run
particles.read_blobs(proj_dir_abs, do_cache=True)
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/particles.py", line 120, in read_blobs
u_blob_paths = cache_run(u_rel_paths)
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/jobs/cache.py", line 112, in download_and_return_cache_paths
rc.cli.cache_sync_in_use(worker_hostname, rc._project_uid, rc._job_uid) # ignore self
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 107, in func
raise CommandError(
cryosparc_tools.cryosparc.errors.CommandError: *** (http://localhost:39002, code 500) Encounted error from JSONRPC function "cache_sync_in_use" with params ('localhost', 'P8', 'J2570')
This happened after ten minutes of running.
Since the error message points to “cache_sync_in_use”, I ran a “Cache Particles on SSD” job with the same particles, and it failed in the same way:
Traceback (most recent call last):
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 104, in func
with make_json_request(self, "/api", data=data, _stacklevel=4) as request:
File "/home/cryosparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/contextlib.py", line 113, in __enter__
return next(self.gen)
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 225, in make_request
raise CommandError(error_reason, url=url, code=code, data=resdata)
cryosparc_tools.cryosparc.errors.CommandError: *** (http://localhost:39002/api, code 500) Timeout Error
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "cryosparc_master/cryosparc_compute/run.py", line 95, in cryosparc_master.cryosparc_compute.run.main
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/jobs/utilities/run_cache_particles.py", line 31, in run
particles.read_blobs(proj_dir_abs, do_cache=True)
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/particles.py", line 120, in read_blobs
u_blob_paths = cache_run(u_rel_paths)
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_compute/jobs/cache.py", line 112, in download_and_return_cache_paths
rc.cli.cache_sync_in_use(worker_hostname, rc._project_uid, rc._job_uid) # ignore self
File "/home/cryosparc/cryosparc/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 107, in func
raise CommandError(
cryosparc_tools.cryosparc.errors.CommandError: *** (http://localhost:39002, code 500) Encounted error from JSONRPC function "cache_sync_in_use" with params ('localhost', 'P8', 'J2570')
I would like to understand what is going on. Any help would be appreciated.