Failed Exposures from mounted drive in live session

After updating to v4.6.0, importing exposures from a mounted drive (PATH1) in a CryoSPARC Live session fails with the following log:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/jobs/rtp_workers/run.py", line 269, in cryosparc_master.cryosparc_compute.jobs.rtp_workers.run.symlink_path
FileExistsError: [Errno 17] File exists: 'PATH1/FoilHole_10105489_Data_10086807_44_20240910_130434_EER.eer' -> 'PATH2/S1/import_movies/FoilHole_10105489_Data_10086807_44_20240910_130434_EER.eer'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/jobs/rtp_workers/run.py", line 381, in cryosparc_master.cryosparc_compute.jobs.rtp_workers.run.rtp_worker
  File "cryosparc_master/cryosparc_compute/jobs/rtp_workers/run.py", line 450, in cryosparc_master.cryosparc_compute.jobs.rtp_workers.run.process_movie
  File "cryosparc_master/cryosparc_compute/jobs/rtp_workers/run.py", line 479, in cryosparc_master.cryosparc_compute.jobs.rtp_workers.run.do_check
  File "cryosparc_master/cryosparc_compute/jobs/rtp_workers/run.py", line 272, in cryosparc_master.cryosparc_compute.jobs.rtp_workers.run.symlink_path
Exception: Failed to create symbolic link /PATH2/S1/import_movies/FoilHole_10105489_Data_10086807_44_20240910_130434_EER.eer
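For context, the inner error is Python's os.symlink() refusing to create a link at a path that already exists in import_movies (for example a leftover link from an earlier or interrupted import attempt). Below is a minimal, hypothetical sketch of the failing call with an idempotent guard around it; this is only an illustration of the error condition, not cryoSPARC's actual symlink_path() implementation:

    import os

    def ensure_symlink(source_path: str, link_path: str) -> None:
        """Create link_path -> source_path, tolerating an identical pre-existing link.

        Hypothetical helper for illustration only.
        """
        try:
            os.symlink(source_path, link_path)
        except FileExistsError:
            # If an identical link is already in place, treat it as success;
            # otherwise re-raise so the caller can surface the conflict.
            if os.path.islink(link_path) and os.readlink(link_path) == source_path:
                return
            raise

    # Example call (paths are placeholders matching the redacted PATH1/PATH2 above):
    # ensure_symlink("PATH1/FoilHole_..._EER.eer",
    #                "/PATH2/S1/import_movies/FoilHole_..._EER.eer")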

Anybody with the same issue?

Welcome to the forum @schaefer-jh. Could you please

  • post the outputs of these commands
    stat -f /PATH2/S1/import_movies/
    ls /PATH2/S1/import_movies/*eer | tail -n10
    
  • let us know if you successfully performed CryoSPARC Live processing on this instance prior to the update and using the same project directory storage
  • post the CryoSPARC version from which you updated
  • post a history of disruptions/crashes of the CryoSPARC instance, if any
  • post the output of the command
    cryosparcm cli "get_scheduler_targets()"
    and a screenshot of the Configuration panel of the CryoSPARC Live session showing the Preprocessing Lane and Number of Preprocessing GPU Workers
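If it is easier to gather these details programmatically, below is a rough Python equivalent of the two filesystem checks (standard library only; the path is the redacted placeholder from the log, and the filesystem type itself is still easiest to read from stat -f):

    import glob
    import os

    # Redacted placeholder path from the log above
    target = "/PATH2/S1/import_movies/"

    # Filesystem statistics, similar to `stat -f` (block size, totals, free space)
    vfs = os.statvfs(target)
    print("block size:", vfs.f_bsize,
          "blocks total:", vfs.f_blocks,
          "blocks free:", vfs.f_bavail,
          "inodes free:", vfs.f_favail)

    # Last ten EER files by name, similar to `ls ... | tail -n10`
    for path in sorted(glob.glob(os.path.join(target, "*eer")))[-10:]:
        print(path)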

Thanks @wtempel for helping:

  1. Command output:
    $ stat -f /PATH2/S1/import_movies/
    File: "/PATH2/S1/import_movies/"
    ID: 0 Namelen: 255 Type: nfs
    Block size: 65536 Fundamental block size: 65536
    Blocks: Total: 5851873792 Free: 985420500 Available: 985420500
    Inodes: Total: 14980802112 Free: 14955078965

$ ls /PATH2/S1/import_movies/*eer | tail -n10
/PATH2/S1/import_movies/FoilHole_10105481_Data_10086807_40_20240910_125351_EER.eer
/PATH2/S1/import_movies/FoilHole_10105482_Data_10086807_55_20240910_131334_EER.eer
/PATH2/S1/import_movies/FoilHole_10105483_Data_10086807_48_20240910_131339_EER.eer
/PATH2/S1/import_movies/FoilHole_10105484_Data_10086807_51_20240910_131344_EER.eer
/PATH2/S1/import_movies/FoilHole_10105485_Data_10086807_41_20240910_130414_EER.eer
/PATH2/S1/import_movies/FoilHole_10105486_Data_10086807_31_20240910_130419_EER.eer
/PATH2/S1/import_movies/FoilHole_10105487_Data_10086807_26_20240910_130424_EER.eer
/PATH2/S1/import_movies/FoilHole_10105488_Data_10086807_32_20240910_130429_EER.eer
/PATH2/S1/import_movies/FoilHole_10105489_Data_10086807_44_20240910_130434_EER.eer
/PATH2/S1/import_movies/FoilHole_10105490_Data_10086807_42_20240910_125826_EER.eer

  2. Yes, live processing worked before the update using the same configuration.
  3. Updated from v4.5.3 to v4.6.0.
  4. No crashes or similar disruptions.
  5. $ cryosparcm cli 'get_scheduler_targets()'
     [{'cache_path': '/scratch', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'custom_var_names': [], 'custom_vars': {}, 'desc': None, 'hostname': 'cryosparc', 'lane': 'cryosparc', 'name': 'cryosparc', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': "sinfo --format='%.8N %.6D %.10P %.6T %.14C %.5c %.6z %.7m %.7G %.9d %20E'", 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch  {{ script_path_abs }}', 'script_tpl': '#!/bin/bash\n#SBATCH --job-name {{ project_uid }}_{{ job_uid }}\n#SBATCH --partition=cryosparc\n#SBATCH --output={{ job_log_path_abs }}\n#SBATCH --error={{ job_log_path_abs }}\n#SBATCH --nodes=1\n#SBATCH --mem={{ (ram_gb*1000)|int }}M\n#SBATCH --ntasks-per-node=1\n#SBATCH --cpus-per-task={{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH --gres-flags=enforce-binding\n##SBATCH --exclusive\n\nsrun {{ run_cmd }}\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'cryosparc', 'tpl_vars': ['command', 'run_cmd', 'num_gpu', 'project_uid', 'ram_gb', 'job_uid', 'num_cpu', 'job_log_path_abs', 'cluster_job_id'], 'type': 'cluster', 'worker_bin_path': '/home_local/hpc/cryosparc2/cryosparc2_worker/bin/cryosparcw'}, {'cache_path': '/scratch', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'custom_var_names': [], 'custom_vars': {}, 'desc': None, 'hostname': 'cryosparc1', 'lane': 'cryosparc1', 'name': 'cryosparc1', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': "sinfo --format='%.8N %.6D %.10P %.6T %.14C %.5c %.6z %.7m %.7G %.9d %20E'", 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch  {{ script_path_abs }}', 'script_tpl': '#!/bin/bash\n#SBATCH --job-name {{ project_uid }}_{{ job_uid }}\n#SBATCH --partition=cryosparc1\n#SBATCH --output={{ job_log_path_abs }}\n#SBATCH --error={{ job_log_path_abs }}\n#SBATCH --nodes=1\n#SBATCH --mem={{ (ram_gb*1000)|int }}M\n#SBATCH --ntasks-per-node=1\n#SBATCH --cpus-per-task={{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH --gres-flags=enforce-binding\n##SBATCH --exclusive\n\nsrun {{ run_cmd }}\n\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'cryosparc1', 'tpl_vars': ['command', 'run_cmd', 'num_gpu', 'project_uid', 'ram_gb', 'job_uid', 'num_cpu', 'job_log_path_abs', 'cluster_job_id'], 'type': 'cluster', 'worker_bin_path': '/home_local/hpc/cryosparc2/cryosparc2_worker/bin/cryosparcw'}, {'cache_path': '/scratch', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'custom_var_names': [], 'custom_vars': {}, 'desc': None, 'hostname': 'cryosparc2', 'lane': 'cryosparc2', 'name': 'cryosparc2', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': "sinfo --format='%.8N %.6D %.10P %.6T %.14C %.5c %.6z %.7m %.7G %.9d %20E'", 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch  {{ script_path_abs }}', 'script_tpl': '#!/bin/bash\n#SBATCH --job-name {{ project_uid }}_{{ job_uid }}\n#SBATCH --partition=cryosparc2\n#SBATCH --output={{ job_log_path_abs }}\n#SBATCH --error={{ job_log_path_abs }}\n#SBATCH --nodes=1\n#SBATCH --mem={{ (ram_gb*1000)|int }}M\n#SBATCH --ntasks-per-node=1\n#SBATCH --cpus-per-task={{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH --gres-flags=enforce-binding\n##SBATCH --exclusive\n\nsrun {{ run_cmd }}\n\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'cryosparc2', 'tpl_vars': ['command', 'run_cmd', 'num_gpu', 'project_uid', 'ram_gb', 'job_uid', 'num_cpu', 'job_log_path_abs', 'cluster_job_id'], 'type': 'cluster', 
'worker_bin_path': '/home_local/hpc/cryosparc2/cryosparc2_worker/bin/cryosparcw'}, {'cache_path': '/scratch', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'custom_var_names': [], 'custom_vars': {}, 'desc': None, 'hostname': 'cryosparc3', 'lane': 'cryosparc3', 'name': 'cryosparc3', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': "sinfo --format='%.8N %.6D %.10P %.6T %.14C %.5c %.6z %.7m %.7G %.9d %20E'", 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch  {{ script_path_abs }}', 'script_tpl': '#!/bin/bash\n#SBATCH --job-name {{ project_uid }}_{{ job_uid }}\n#SBATCH --partition=cryosparc3\n#SBATCH --output={{ job_log_path_abs }}\n#SBATCH --error={{ job_log_path_abs }}\n#SBATCH --nodes=1\n#SBATCH --mem={{ (ram_gb*1000)|int }}M\n#SBATCH --ntasks-per-node=1\n#SBATCH --cpus-per-task={{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH --gres-flags=enforce-binding\n##SBATCH --exclusive\n\nsrun {{ run_cmd }}\n\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'cryosparc3', 'tpl_vars': ['command', 'run_cmd', 'num_gpu', 'project_uid', 'ram_gb', 'job_uid', 'num_cpu', 'job_log_path_abs', 'cluster_job_id'], 'type': 'cluster', 'worker_bin_path': '/home_local/hpc/cryosparc2/cryosparc2_worker/bin/cryosparcw'}, {'cache_path': '/scratch', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'custom_var_names': [], 'desc': None, 'hostname': 'cryosparc4', 'lane': 'cryosparc4', 'name': 'cryosparc4', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': "sinfo --format='%.8N %.6D %.10P %.6T %.14C %.5c %.6z %.7m %.7G %.9d %20E'", 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qsub_cmd_tpl': 'sbatch  {{ script_path_abs }}', 'script_tpl': '#!/bin/bash\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH --partition=cryosparc4\n#SBATCH --output={{ job_log_path_abs }}\n#SBATCH --error={{ job_log_path_abs }}\n#SBATCH --nodes=1\n#SBATCH --mem={{ (ram_gb*1000)|int }}M\n#SBATCH --ntasks-per-node=1\n#SBATCH --cpus-per-task={{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH --gres-flags=enforce-binding\n##SBATCH --exclusive\n\nsrun {{ run_cmd }}\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'cryosparc4', 'tpl_vars': ['command', 'run_cmd', 'num_gpu', 'project_uid', 'ram_gb', 'job_uid', 'num_cpu', 'job_log_path_abs', 'cluster_job_id'], 'type': 'cluster', 'worker_bin_path': '/home_local/hpc/cryosparc2/cryosparc2_worker/bin/cryosparcw'}]
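As an aside for readers skimming the dump: each script_tpl above is a Jinja-style template whose last non-empty command line is srun {{ run_cmd }}. A minimal sketch of rendering a trimmed copy of such a template (assuming the jinja2 package is available, and using made-up values) shows the shape of the sbatch script that gets submitted:

    from jinja2 import Template

    # Trimmed copy of one lane's script_tpl; the values below are made up
    # purely to illustrate how the placeholders are filled in.
    script_tpl = (
        "#!/bin/bash\n"
        "#SBATCH --job-name {{ project_uid }}_{{ job_uid }}\n"
        "#SBATCH --mem={{ (ram_gb*1000)|int }}M\n"
        "#SBATCH --cpus-per-task={{ num_cpu }}\n"
        "#SBATCH --gres=gpu:{{ num_gpu }}\n"
        "\n"
        "srun {{ run_cmd }}\n"
    )

    print(Template(script_tpl).render(
        project_uid="P1", job_uid="J42", ram_gb=16, num_cpu=4, num_gpu=1,
        run_cmd="/path/to/cryosparcw run ...",  # placeholder command
    ))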
    

Hope this helps.

Thanks @schaefer-jh .
I wonder whether the problem would persist if the last (non-empty) line of the cluster lane's script template were modified from the current

srun {{ run_cmd }}

to simply

{{ run_cmd }}
You may want to create a test lane with that change by specifying a unique "name": inside the cluster_info.json file that is used with the
cryosparcm cluster connect command (guide). As a starting point for your edits, you may write out the current configuration by running the command
cryosparcm cluster dump name-of-your-existing-lane (guide).
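For a concrete starting point, here is a minimal sketch of preparing such a test lane. It assumes that cryosparcm cluster dump has written cluster_info.json and cluster_script.sh into the current directory and that the edited copies will then be registered with cryosparcm cluster connect from that same directory; the test-lane name used here is made up:

    import json
    from pathlib import Path

    # Give the copied lane a unique name/title so it does not replace the
    # existing lane; the keys mirror those visible in get_scheduler_targets().
    info_path = Path("cluster_info.json")
    info = json.loads(info_path.read_text())
    info["name"] = "cryosparc1-test"    # hypothetical test-lane name
    info["title"] = "cryosparc1-test"
    info_path.write_text(json.dumps(info, indent=2))

    # Drop the srun prefix from the final command line of the script template,
    # leaving everything else unchanged.
    script_path = Path("cluster_script.sh")
    script_path.write_text(
        script_path.read_text().replace("srun {{ run_cmd }}", "{{ run_cmd }}")
    )

    # Then register the test lane from the same directory:
    #   cryosparcm cluster connect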

The issue turned out to be a faulty fiber-optic connection and has been resolved. Thanks for your support, @wtempel.
