Long job fails due to fopen error

Hello all,

We have been having a problem with long jobs failing with errors like the one below:

This typically happens for 3D classification jobs that take days to run; at some point the jobs fail. Initially we thought the error occurred because the cache was full and the particle stacks were replaced before the jobs finished. But when we checked, the file that CryoSPARC could not open was still present. Could anyone make some suggestions regarding this issue?

Thanks in advance!

Best,
Hui

@Hui Please can you post

  • the traceback as text
  • the CryoSPARC version and patch
  • whether /fs/pool/pool-briggs-scratch/cryosparc is shared between multiple CryoSPARC workers
  • user name and user id of the Linux account that runs CryoSPARC processes
  • the outputs of these commands, run on the worker under the account that runs CryoSPARC processes
    stat -f /fs/pool/pool-briggs-scratch/cryosparc
    stat <full path of particles.mrc file>
    
  • the output of
    ps -eouser,cmd | grep supervisord
    
    on the CryoSPARC master

Thanks for the reply! Our admin is out of town. I’ll ask him to provide the info once he’s back.

Here comes the traceback and the other data.

best

Florian

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/class3D/run.py", line 747, in cryosparc_compute.jobs.class3D.run.run_class_3D
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2877, in cryosparc_compute.engine.newengine.get_initial_noise_estimate
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 2896, in cryosparc_compute.engine.newengine.get_initial_noise_estimate
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 459, in cryosparc_compute.engine.newengine.EngineThread.read_image_data
  File "/fs/gpfs41/lv07/fileset02/home/b_baumei/cryosparcuser/csV4.2.1/cryosparc_worker_hpcl900x/cryosparc_compute/particles.py", line 33, in get_original_real_data
    return self.blob.view().copy()
  File "/fs/gpfs41/lv07/fileset02/home/b_baumei/cryosparcuser/csV4.2.1/cryosparc_worker_hpcl900x/cryosparc_compute/blobio/mrc.py", line 127, in view
    return self.get()
  File "/fs/gpfs41/lv07/fileset02/home/b_baumei/cryosparcuser/csV4.2.1/cryosparc_worker_hpcl900x/cryosparc_compute/blobio/mrc.py", line 122, in get
    _, data, total_time = prefetch.synchronous_native_read(self.fname, idx_start = self.page, idx_limit = self.page+1)
  File "cryosparc_master/cryosparc_compute/blobio/prefetch.py", line 68, in cryosparc_compute.blobio.prefetch.synchronous_native_read
RuntimeError: Error ocurred (Permission denied) at line 548 in fopen

Could not open file. Make sure the path exists.

IO request details:
filename: /fs/pool/pool-briggs-scratch/cryosparc/instance_brcryosparc:38001/imports/fs/pool/pool-briggs/hguo/influenza/M1_in_vitro/cryoEM/M1_PR8_after_ni_column_2023_04_07/cryosparc_import/J714_relion_subtract_Nterm_centered/Particles/subtracted_rank4_opticsgroup1.mrcs
filetype: 0
header_only: 0
idx_start: 90351
idx_limit: 90352
eer_upsampfactor: 2
eer_numfractions: 40
num_threads: 6
buffer: (nil)
nx, ny, nz: 0 0 0
dtype: 0
total_time: -1.000000

Version: Currently running version: v4.2.1

/fs/pool/pool-briggs-scratch/cryosparc is shared between workers.

user: cryosparcuser; echo $UID 5404

stat -f /fs/pool/pool-briggs-scratch/cryosparc
File: "/fs/pool/pool-briggs-scratch/cryosparc"
ID: ef0009d00000010 Namelen: 255 Type: gpfs
Block size: 4194304 Fundamental block size: 4194304
Blocks: Total: 102236160 Free: 71966643 Available: 71966643
Inodes: Total: 2048576512 Free: 2038344224

stat /fs/pool/pool-briggs-scratch/cryosparc/instance_brcryosparc:38001/imports/fs/pool/pool-briggs/hguo/influenza/M1_in_vitro/cryoEM/M1_PR8_after_ni_column_2023_04_07/cryosparc_import/J714_relion_subtract_Nterm_centered/Particles/subtracted_rank4_opticsgroup1.mrcs
File: /fs/pool/pool-briggs-scratch/cryosparc/instance_brcryosparc:38001/imports/fs/pool/pool-briggs/hguo/influenza/M1_in_vitro/cryoEM/M1_PR8_after_ni_column_2023_04_07/cryosparc_import/J714_relion_subtract_Nterm_centered/Particles/subtracted_rank4_opticsgroup1.mrcs
Size: 34665137152 Blocks: 67705360 IO Block: 4194304 regular file
Device: 3eh/62d Inode: 145646291 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 5404/cryosparcuser) Gid: ( 250/b_baumei)
Access: 2023-08-30 15:34:10.558133837 +0200
Modify: 2023-08-19 11:44:57.331866676 +0200
Change: 2023-08-19 11:44:57.331866676 +0200
Birth: -
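
The stat output above covers only the file itself, which is mode 0644 and owned by cryosparcuser. A "Permission denied" from fopen can also be raised when any parent directory in the path lacks search (execute) permission for the calling user. Below is a minimal Python sketch, not a CryoSPARC tool, that walks from the filesystem root down to the file and reports the mode, owner and access result for each component. The function name check_path_components is hypothetical, and the result only reflects permissions at the moment it is run under the current account.

```python
import os
import stat


def check_path_components(path):
    """Return one record per path component, ordered from the root to the file.

    Directories are checked for search (execute) permission, the final
    non-directory component for read permission, using os.access for the
    account that runs this script.
    """
    path = os.path.abspath(path)

    # Collect the path and all of its ancestors, ending at the root.
    components = []
    current = path
    while True:
        components.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent

    records = []
    for component in reversed(components):
        try:
            st = os.stat(component)
        except OSError as exc:
            # stat itself can fail, e.g. when a parent is not searchable.
            records.append({"path": component, "mode": None, "uid": None,
                            "gid": None, "ok": False, "error": exc.strerror})
            continue
        needed = os.X_OK if stat.S_ISDIR(st.st_mode) else os.R_OK
        records.append({"path": component,
                        "mode": stat.filemode(st.st_mode),
                        "uid": st.st_uid,
                        "gid": st.st_gid,
                        "ok": os.access(component, needed),
                        "error": None})
    return records


if __name__ == "__main__":
    import sys
    for rec in check_path_components(sys.argv[1]):
        print(rec["ok"], rec["mode"], rec["uid"], rec["gid"], rec["path"], rec["error"] or "")
```

To be meaningful, this would be run on the worker node under the same account that runs CryoSPARC jobs, with the full path of the .mrcs file as the only argument. Any line starting with False points at a component worth inspecting.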

ps -eouser,cmd | grep supervisord
cryospa+ grep --color=auto supervisord
cryospa+ python /fs/home/cryosparcuser/csV4.2.1/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/bin/supervisord -c /fs/home/cryosparcuser/csV4.2.1/cryosparc_master/supervisord.conf
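
Because the jobs run for a long time before failing and the file looks readable afterwards, it may be worth checking whether the open fails only intermittently. The following is a rough Python sketch of such a probe, under the assumption that repeatedly opening the file and reading one byte is a fair stand-in for what the job does. The name probe_open and its parameters are made up for illustration and are not part of CryoSPARC.

```python
import errno
import time


def probe_open(path, attempts=3, delay_s=0.0):
    """Try to open and read one byte from `path` several times.

    Returns a list of (attempt_index, error_name) tuples, where error_name
    is None on success or a symbolic errno name such as 'EACCES'.
    """
    results = []
    for i in range(attempts):
        try:
            with open(path, "rb") as fh:
                fh.read(1)
            results.append((i, None))
        except OSError as exc:
            # Map the numeric errno to its symbolic name where possible.
            results.append((i, errno.errorcode.get(exc.errno, str(exc.errno))))
        if delay_s and i < attempts - 1:
            time.sleep(delay_s)
    return results


if __name__ == "__main__":
    import sys
    # Example: probe once a minute for an hour and print only the failures.
    for attempt, err in probe_open(sys.argv[1], attempts=60, delay_s=60.0):
        if err is not None:
            print(f"attempt {attempt}: {err}")
```

If this prints an occasional EACCES on the worker while the permissions shown by stat stay unchanged, that would suggest a transient problem on the shared filesystem rather than a misconfigured mode or owner.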