Hello,
Some of our CryoSPARC users have reported that jobs sometimes fail when the particle cache is enabled. The job log contains this for the caching step itself:
--------------------------------------------------------------
SSD cache ACTIVE at /scratch/cryosparc_cache/instance_computer:39001 (10 GB reserve)
---------------------------------------------------
- Disk use - Amount - Cache use - Amount -
---------------------------------------------------
- Total - 5.82 TiB - Hits - 0.00 B -
- Usable - 5.81 TiB - Misses - 2.05 TiB -
- Used - 5.78 TiB - Acquired - 2.05 TiB -
- Free - 36.82 MiB - Required - 0.00 B -
---------------------------------------------------
Progress: [▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇] 13750/13750 (100%)
Transferred:
018445773618901923716_SomeThing_13543301_Data_13530043_10_20260114_174927_fractions_patch_aligned_doseweighted_particles.mrc
(108.00 MiB)
Threads: 2
Avg speed: 690.83 MiB/s
Remaining: 0h 00m 00s (0.00 B)
Elapsed: 0h 52m 18s
Active jobs: P40-J106
SSD cache complete for 13750 file(s)
--------------------------------------------------------------
But then the job fails with:
Traceback (most recent call last):
File "cli/run.py", line 105, in cli.run.run_job
File "cli/run.py", line 210, in cli.run.run_job_function
File "compute/jobs/class2D/run.py", line 255, in compute.jobs.class2D.run.run_class_2D
File "/computer/cryosparc/cryosparc_worker/compute/particles.py", line 56, in get_prepared_fspace_data
return fourier.resample_fspace(fourier.fft(self.get_prepared_real_data()), self.dataset.N)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/computer/cryosparc/cryosparc_worker/compute/particles.py", line 49, in get_prepared_real_data
self.dataset.prepare_real_window * (self.get_original_real_data())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/computer/cryosparc/cryosparc_worker/compute/particles.py", line 38, in get_original_real_data
data = self.blob.view()
^^^^^^^^^^^^^^^^
File "/computer/cryosparc/cryosparc_worker/compute/blobio/mrc.py", line 173, in view
return self.get()
^^^^^^^^^^
File "/computer/cryosparc/cryosparc_worker/compute/blobio/mrc.py", line 164, in get
x, y, z, dtype, total_time, io_time, data = ioengine.sync_file_read(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/computer/cryosparc/cryosparc_worker/core/ioengine/cmdbuf.py", line 187, in sync_file_read
return await_async_file_read(iocb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/computer/cryosparc/cryosparc_worker/core/ioengine/cmdbuf.py", line 159, in await_async_file_read
iocb.wait()
File "/computer/cryosparc/cryosparc_worker/core/ioengine/cmdbuf.py", line 104, in wait
raise IOError("\n\n".join(errs))
OSError: I/O error, mrc_readmic (1) line 1023: Invalid argument
The requested frame/particle cannot be accessed. The file may be corrupt, or there may be a mismatch between the file and its associated metadata (i.e. cryosparc .cs file).
I/O request details:
filename:
/scratch/cryosparc_cache/instance_computer:39001/links/P40-J106-1772625386/8e269673a7f71cb0673c5a296e5492bfdf82b16e.mrc
data type: 0x10
frames: [75:76]
eer upsample factor: 2
eer number of fractions: 40
The implicated symlink, /scratch/cryosparc_cache/instance_computer:39001/links/P40-J106-1772625386/8e269673a7f71cb0673c5a296e5492bfdf82b16e.mrc, points at a file in the store-v2 tree that has a non-zero size but occupies 0 blocks on disk, i.e. a sparse file containing no data (note the “0” at the start of the ls -ls output):
ls -ls 8e269673a7f71cb0673c5a296e5492bfdf82b16e
0 -rw-r--r-- 1 cryosparc cryosparc 168100864 Mar 5 17:02 8e269673a7f71cb0673c5a296e5492bfdf82b16e
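In case it is useful for checking whether other cached files are affected, here is a minimal Python sketch that walks a directory tree and reports regular files with a non-zero size but zero allocated blocks, which is the same condition the ls -ls output above shows. The function name find_unallocated_files and the command-line handling are my own illustration and not part of CryoSPARC. It assumes a Linux system where stat reports st_blocks.

```python
import os
import stat
import sys


def find_unallocated_files(root):
    """Yield (path, size) for regular files under root that report a
    non-zero size but occupy zero blocks on disk."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks (e.g. under links/) are not followed
                st = os.lstat(path)
            except OSError:
                # File vanished or is unreadable; skip it
                continue
            if not stat.S_ISREG(st.st_mode):
                continue
            # st_blocks == 0 with st_size > 0 means no data blocks are allocated
            if st.st_size > 0 and st.st_blocks == 0:
                yield path, st.st_size


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: pass the cache directory as the first argument
    for path, size in find_unallocated_files(sys.argv[1]):
        print(f"{size:>14d}  {path}")
```

Run against the cache directory (for example /scratch/cryosparc_cache/instance_computer:39001), this would list every store file in the same state as the one above. Files that are only partly sparse are not reported, since the check only looks for exactly zero allocated blocks.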
The cache is on an XFS filesystem, backed by a ~6 TB software RAID0 array of two NVMe devices. The filesystem is used only by this CryoSPARC instance, and the instance consists of a single computer.
Can someone help, please? We’re running CryoSPARC 5.0.1 on Rocky Linux 8.10.
Thanks,
Mark