4.2.1 ab initio error - "Could not open file. Make sure the path exists."

Hi there, we encountered the following error when running ab initio jobs recently:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/abinit/run.py", line 309, in cryosparc_compute.jobs.abinit.run.run_homo_abinit
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 1142, in cryosparc_compute.engine.engine.process
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 1143, in cryosparc_compute.engine.engine.process
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 1028, in cryosparc_compute.engine.engine.process.work
  File "cryosparc_master/cryosparc_compute/engine/engine.py", line 87, in cryosparc_compute.engine.engine.EngineThread.load_image_data_gpu
  File "/mnt/net/software/cryosparc/cryosparc2_worker/cryosparc_compute/particles.py", line 33, in get_original_real_data
    return self.blob.view().copy()
  File "/mnt/net/software/cryosparc/cryosparc2_worker/cryosparc_compute/blobio/mrc.py", line 127, in view
    return self.get()
  File "/mnt/net/software/cryosparc/cryosparc2_worker/cryosparc_compute/blobio/mrc.py", line 122, in get
    _, data, total_time = prefetch.synchronous_native_read(self.fname, idx_start = self.page, idx_limit = self.page+1)
  File "cryosparc_master/cryosparc_compute/blobio/prefetch.py", line 68, in cryosparc_compute.blobio.prefetch.synchronous_native_read
RuntimeError: Error ocurred (No such file or directory) at line 548 in fopen

Could not open file. Make sure the path exists.

IO request details:
filename:    /scratch/cryosparc/instance_cryosparc.ipd:39001/projects/P296/J40/extract/007413350361346145234_FoilHole_30519973_Data_30519837_30519839_20230430_112008_particles.mrc
filetype:    0
header_only: 0
idx_start:   301
idx_limit:   302
eer_upsampfactor: 2
eer_numfractions: 40
num_threads: 6
buffer:      (nil)
nx, ny, nz:  0 0 0
dtype:       0
total_time:  -1.000000
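For context, the "No such file or directory" in the traceback is a plain ENOENT from the C-level fopen inside CryoSPARC's native reader: the cache directory exists, but the particle stack file it expects inside is missing. A minimal sketch of the same condition (the paths below are temporary stand-ins, not the real cache):

```shell
# Stand-in for the per-job cache directory (.../J40/extract/): the
# directory itself exists, but the expected particle stack file does not.
cache_dir=$(mktemp -d)
ls -ld "$cache_dir"                       # directory is present
cat "$cache_dir/particles.mrc" 2>/dev/null \
  || echo "Could not open file. Make sure the path exists."
```

So the question is not whether the path is reachable, but why the cached .mrc files are absent from a directory the job itself created.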

Our IT department looked into this, and this is what they have replied with:

This job ran on gpu3, which has 800GB+ of free space in /scratch/cryosparc. So, not sure what problem cryosparc had here.

The path "/scratch/cryosparc/instance_cryosparc.ipd:39001/projects/P296/J40/extract" exists and was created when the job started, but it is empty. So the message about being unable to open "/scratch/cryosparc/instance_cryosparc.ipd:39001/projects/P296/J40/extract/007413350361346145234_FoilHole_30519973_Data_30519837_30519839_20230430_112008_particles.mrc" is accurate. I guess cryosparc expects to find the images in that folder, but didn't copy them? Since it's a new folder, it's not left over from another job that should have had images which got cleaned up in the meantime.

Not sure what’s going wrong here.

sjob -v 44153541

JobID: 44153541
Submitted: 2023-05-04 10:02:32
Started: 2023-05-04 10:02:32
Elapsed: 00:39:47
Job Name: cryosparc_P296_J45
User: thuddy (ipd)
Partition: gpu
Work Dir: /projects/em/thuddy/CS-webs/J45
Req Mem: 8.0GB
Req CPUs: 4
Req Nodes: 1
Req Resrc: billing=4,cpu=4,gres/gpu:titan=1,gres/gpu=1,mem=8388608K,node=1
Node(s): gpu3
State: COMPLETED
CPU Used: 00:26:43 => 16.8% efficiency
Mem Used: 1.1GB => 14.3% efficiency

root@gpu3:~# df -h /scratch/cryosparc/
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 916G 63G 807G 8% /scratch

root@gpu3:/projects/em/thuddy/CS-webs/J45# ls -la '/scratch/cryosparc/instance_cryosparc.ipd:39001/projects/P296/J40/extract/'
total 44
drwxrwxr-x 2 cryosparc baker 36864 May 4 10:42 .
drwxrwxr-x 3 cryosparc baker 4096 May 4 10:03 ..

root@gpu3:~# stat '/scratch/cryosparc/instance_cryosparc.ipd:39001/projects/P296/J40/extract'/
File: /scratch/cryosparc/instance_cryosparc.ipd:39001/projects/P296/J40/extract/
Size: 36864 Blocks: 80 IO Block: 4096 directory
Device: 801h/2049d Inode: 40371113 Links: 2
Access: (0775/drwxrwxr-x) Uid: ( 504/cryosparc) Gid: ( 3000/ baker)
Access: 2023-05-04 11:23:48.134280753 -0700
Modify: 2023-05-04 10:42:55.342261755 -0700
Change: 2023-05-04 10:42:55.342261755 -0700
Birth: 2023-05-04 10:03:21.550243369 -0700

Has anyone seen something like this?

Thank you!

Particle caching adds a layer of complexity here.
Could you please post the output of these commands (on gpu3):

stat -f /scratch/cryosparc/
ls -al /nonscratch/path/to/P296/J40/extract/

We have the exact same issue, and it occurs all the time. The longer a job runs, the more likely this error is to happen. I would be happy to provide any further information, and it would be great if you could look into this. Right now, we are forced to run longer jobs without caching to SSD.

This is the error message:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/class3D/run.py", line 619, in cryosparc_compute.jobs.class3D.run.run_class_3D
  File "cryosparc_master/cryosparc_compute/jobs/class3D/run.py", line 1132, in cryosparc_compute.jobs.class3D.run.class3D_engine_run
  File "cryosparc_master/cryosparc_compute/jobs/class3D/run.py", line 1153, in cryosparc_compute.jobs.class3D.run.class3D_engine_run
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 459, in cryosparc_compute.engine.newengine.EngineThread.read_image_data
  File "/users/svc_cryosparc/cryosparc_worker/cryosparc_compute/particles.py", line 33, in get_original_real_data
    return self.blob.view().copy()
  File "/users/svc_cryosparc/cryosparc_worker/cryosparc_compute/blobio/mrc.py", line 127, in view
    return self.get()
  File "/users/svc_cryosparc/cryosparc_worker/cryosparc_compute/blobio/mrc.py", line 122, in get
    _, data, total_time = prefetch.synchronous_native_read(self.fname, idx_start = self.page, idx_limit = self.page+1)
  File "cryosparc_master/cryosparc_compute/blobio/prefetch.py", line 68, in cryosparc_compute.blobio.prefetch.synchronous_native_read
RuntimeError: Error ocurred (No such file or directory) at line 548 in fopen

Could not open file. Make sure the path exists.

IO request details:
filename: /scratch-cbe/users/svc_cryosparc/instance_imp-cryosparc-1.vbc.ac.at:39001/projects/P2/J502/extract/000948477979965485614_FoilHole_7194065_Data_7181921_7181923_20210908_064626_fractions_patch_aligned_doseweighted_particles.mrc
filetype: 0
header_only: 0
idx_start: 26
idx_limit: 27
eer_upsampfactor: 2
eer_numfractions: 40
num_threads: 6
buffer: (nil)
nx, ny, nz: 0 0 0
dtype: 0
total_time: -1.000000
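Since the failures get more likely the longer a job runs, one way to narrow this down is to log the cache directory's file count over time: if the count drops while the job is still running, something is evicting cached particles mid-job. A sketch (the cache path below is a placeholder; point CACHE_DIR at the real instance cache directory and run this periodically, e.g. every minute, for the lifetime of the job):

```shell
# Placeholder cache directory; set CACHE_DIR to the real instance cache
# path (e.g. /scratch-cbe/users/svc_cryosparc/instance_.../extract/).
cache_dir=${CACHE_DIR:-$(mktemp -d)}
# Print a timestamped count of regular files in the cache directory.
# A drop in this count while the job is still running indicates eviction.
printf '%s  %s files\n' "$(date +%H:%M:%S)" \
  "$(find "$cache_dir" -maxdepth 1 -type f | wc -l)"
```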

@DerLorenz Could you please post the output of these commands:

stat -f /scratch-cbe/users/svc_cryosparc/
ls -l /nonscratch/path/to/P2/J502/extract/000948477979965485614_FoilHole_7194065_Data_7181921_7181923_20210908_064626_fractions_patch_aligned_doseweighted_particles.mrc

after replacing /nonscratch/path/to/P2/ with the actual path to the relevant project directory.
Is /scratch-cbe/users/svc_cryosparc/ storage shared between multiple worker nodes?


Thanks for the quick reply.

stat -f /scratch-cbe/users/svc_cryosparc/
File: "/scratch-cbe/users/svc_cryosparc/"
ID: 0 Namelen: 255 Type: fhgfs
Block size: 524288 Fundamental block size: 524288
Blocks: Total: 512546133 Free: 177192290 Available: 177192290
Inodes: Total: 0 Free: 0

ls -l /path/to/P305/J502/000948477979965485614_FoilHole_7194065_Data_7181921_7181923_20210908_064626_fractions_patch_aligned_doseweighted_particles.mrc

ls: cannot access /path/to/P305/J502/000948477979965485614_FoilHole_7194065_Data_7181921_7181923_20210908_064626_fractions_patch_aligned_doseweighted_particles.mrc: No such file or directory

The project number does not match because I re-imported this project from an older instance. Also, before running the failed job, I ran the re-cache particles job.

Yes /scratch-cbe/users/svc_cryosparc/ is shared by multiple worker nodes.

Cache sharing between multiple scheduler targets is not currently supported. For details and a possible workaround, please see Cached files of running jobs deleted - #7 by wtempel
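Until shared caches are supported, one common arrangement (a sketch only; see the linked post and the CryoSPARC guide for the actual procedure at your site) is to give each worker its own node-local SSD cache path when connecting it, instead of a path on the shared filesystem. The hostnames, port, and path below are placeholders:

```shell
# Placeholders: substitute your master/worker hostnames, base port, and
# a node-local SSD path. Run from the worker installation directory.
bin/cryosparcw connect \
  --worker worker1.example.org \
  --master master.example.org \
  --port 39000 \
  --ssdpath /local-ssd/cryosparc_cache \
  --update
```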
