Hi @hsnyder
Some of them were the same error; I'll paste it below. Unfortunately, I had deleted most of the failed jobs to declutter my workspace. (That was a mistake on my part: I assumed the error would go away if I restarted the job, and I didn't realize it would be important to keep the failed jobs around to figure out the cause if it turned out to be a recurring problem, as it did.)
[CPU: 27.75 GB]
Traceback (most recent call last):
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2294, in run_with_except_hook
    run_old(*args, **kw)
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 134, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 135, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 1080, in cryosparc_master.cryosparc_compute.engine.engine.process.work
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 129, in cryosparc_master.cryosparc_compute.engine.engine.EngineThread.load_image_data_gpu
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/particles.py", line 34, in get_original_real_data
    data = self.blob.view()
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/blobio/mrc.py", line 145, in view
    return self.get()
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/blobio/mrc.py", line 140, in get
    _, data, total_time = prefetch.synchronous_native_read(self.fname, idx_start = self.page, idx_limit = self.page+1)
  File "cryosparc_master/cryosparc_compute/blobio/prefetch.py", line 82, in cryosparc_master.cryosparc_compute.blobio.prefetch.synchronous_native_read
OSError:
IO request details:
  Error ocurred (Invalid argument) at line 680 in mrc_readmic (1)
  The requested frame/particle cannot be accessed. The file may be corrupt, or there may be a mismatch between the file and its associated metadata (i.e. cryosparc .cs file).
  filename: /run/nvme/job_21979108/data/instance_puhti-login14.bullx:39401/links/P1-J20-1718041228/e56454fc75c697eae6dba45f6e92b2ba7caaf880.mrc
  filetype: 0
  header_only: 0
  idx_start: 0
  idx_limit: 1
  eer_upsampfactor: 2
  eer_numfractions: 40
  num_threads: 6
  buffer: (nil)
  buffer_sz: 0
  nx, ny, nz: 0 0 0
  dtype: 0
  total_time: -1.000000
  io_time: 0.000000
[CPU: 2.92 GB]
Finalizing Job...
[CPU: 22.35 GB]
Traceback (most recent call last):
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2294, in run_with_except_hook
    run_old(*args, **kw)
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 134, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 135, in cryosparc_master.cryosparc_compute.gpu.gpucore.GPUThread.run
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 1080, in cryosparc_master.cryosparc_compute.engine.engine.process.work
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 129, in cryosparc_master.cryosparc_compute.engine.engine.EngineThread.load_image_data_gpu
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/particles.py", line 34, in get_original_real_data
    data = self.blob.view()
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/blobio/mrc.py", line 145, in view
    return self.get()
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/blobio/mrc.py", line 140, in get
    _, data, total_time = prefetch.synchronous_native_read(self.fname, idx_start = self.page, idx_limit = self.page+1)
  File "cryosparc_master/cryosparc_compute/blobio/prefetch.py", line 82, in cryosparc_master.cryosparc_compute.blobio.prefetch.synchronous_native_read
OSError:
IO request details:
  Error ocurred (Invalid argument) at line 680 in mrc_readmic (1)
  The requested frame/particle cannot be accessed. The file may be corrupt, or there may be a mismatch between the file and its associated metadata (i.e. cryosparc .cs file).
  filename: /run/nvme/job_21979108/data/instance_puhti-login14.bullx:39401/links/P1-J20-1718041228/e56454fc75c697eae6dba45f6e92b2ba7caaf880.mrc
  filetype: 0
  header_only: 0
  idx_start: 80
  idx_limit: 81
  eer_upsampfactor: 2
  eer_numfractions: 40
  num_threads: 6
  buffer: (nil)
  buffer_sz: 0
  nx, ny, nz: 0 0 0
  dtype: 0
  total_time: -1.000000
  io_time: 0.000000
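Since the message suggests a mismatch between the file and its .cs metadata, this is the kind of check I was thinking of running on the original (uncached) stack — just a rough sketch, assuming cryosparc-tools is installed; both paths below are placeholders, not my real project files:

```python
# Rough sketch: compare the number of particle images in an .mrc stack
# against the highest index the .cs metadata expects from that file.
# Requires cryosparc-tools; both paths below are placeholders.
import struct
from cryosparc.dataset import Dataset

def mrc_num_sections(path):
    # The first three int32 words of an MRC header are nx, ny, nz;
    # for a particle stack, nz is the number of images in the file.
    with open(path, "rb") as f:
        nx, ny, nz = struct.unpack("<3i", f.read(12))
    return nz

particles = Dataset.load("J20/extracted_particles.cs")  # placeholder
rel_path = "J20/extract/some_stack.mrc"                 # placeholder

mask = particles["blob/path"] == rel_path
if mask.any():
    max_idx = int(particles["blob/idx"][mask].max())
    n_imgs = mrc_num_sections(rel_path)
    print(f".cs references indices up to {max_idx}; stack holds {n_imgs} images")
    if max_idx >= n_imgs:
        print("Mismatch: metadata points past the end of the file")
```

If the metadata checks out against the original files, I suppose that would point at the cached copy on the node's NVMe instead.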
And one job had an error regarding the SSD cache. I don't know what changed with that job to cause it, since the lane setup is the same, and it didn't occur again, so I'm not sure it's relevant, but I'll paste it below anyway. I'm also not sure how I would sort out such an issue if it does recur, short of disabling SSD caching.
[CPU: 21.25 GB]
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 115, in cryosparc_master.cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/class2D/run_streaming.py", line 191, in cryosparc_master.cryosparc_compute.jobs.class2D.run_streaming.run_class_2D_streaming
  File "cryosparc_master/cryosparc_compute/jobs/class2D/run_streaming.py", line 355, in cryosparc_master.cryosparc_compute.jobs.class2D.run_streaming.prepare_particles
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/particles.py", line 120, in read_blobs
    u_blob_paths = cache_run(u_rel_paths)
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/jobs/cache_v2.py", line 821, in run
    return run_with_executor(rel_sources, executor)
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/jobs/cache_v2.py", line 859, in run_with_executor
    state = drive.allocate(sources, active_run_ids=info["active_run_ids"])
  File "/projappl/project_2000889/usrappl/mustafan/cryoSPARC/cryosparc_worker/cryosparc_compute/jobs/cache_v2.py", line 582, in allocate
    raise RuntimeError(
RuntimeError: SSD cache needs additional 3.44 TiB but drive has 1.73 TiB free. CryoSPARC can only free a maximum of 0.00 B. This may indicate that programs other than CryoSPARC are using the SSD. Please remove any files on the SSD that are outside the /run/nvme/job_21992923/data/instance_puhti-login14.bullx:39401 folder or disable SSD cache for this job.
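If this does come back, the first thing I'd probably check is whether anything outside the CryoSPARC cache folder is actually eating the scratch drive, as the message suggests. A rough stdlib-only sketch, with the mount point and instance folder copied from the error above (these would need adjusting for whatever node/job the error appears on):

```python
# Rough sketch: measure how much of the scratch SSD is used by files
# outside the CryoSPARC cache folder named in the error message.
import os
import shutil

SCRATCH = "/run/nvme/job_21992923"  # mount point, from the error above
CACHE = os.path.join(SCRATCH, "data/instance_puhti-login14.bullx:39401")

def tree_size(root):
    # Total size of all regular files under root, skipping anything unreadable.
    total = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass
    return total

usage = shutil.disk_usage(SCRATCH)
cache_bytes = tree_size(CACHE)
gib = 1024 ** 3
print(f"scratch used:   {usage.used / gib:.2f} GiB of {usage.total / gib:.2f} GiB")
print(f"cache folder:   {cache_bytes / gib:.2f} GiB")
print(f"outside cache:  {(usage.used - cache_bytes) / gib:.2f} GiB")
```

If "outside cache" came out near zero, I'd take that to mean the drive was simply too small for the requested cache rather than being shared with another program.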