This is a new system and a fresh installation. We have run cryoSPARC on other systems in the past, though, so we are familiar with the installation and configuration.
Here is the output from the job log command. The last traceback repeats many times but is truncated here to save space.
================= CRYOSPARCW ======= 2020-09-16 11:13:54.823956 =========
Project P2 Job J3
Master vision.structbio.pitt.edu Port 39002
===========================================================================
========= monitor process now starting main process
MAINPROCESS PID 698874
========= monitor process now waiting for main process
MAIN PID 698874
motioncorrection.run_patch cryosparc2_compute.jobs.jobregister
***************************************************************
Running job on hostname %s vision
Allocated Resources : {u'lane': u'vision', u'target': {u'lane': u'vision', u'qdel_cmd_tpl': u'scancel {{ cluster_job_id }}', u'name': u'vision', u'title': u'vision', u'hostname': u'vision', u'qstat_cmd_tpl': u'squeue -j {{ cluster_job_id }}', u'worker_bin_path': u'/opt/cryoem/cryosparc/cryosparc2_worker/bin/cryosparcw', u'qinfo_cmd_tpl': u'sinfo', u'qsub_cmd_tpl': u'sbatch {{ script_path_abs }}', u'cache_path': u'/local', u'cache_quota_mb': None, u'script_tpl': u'#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }} - the complete command string to run the job\n## {{ num_cpu }} - the number of CPUs needed\n## {{ num_gpu }} - the number of GPUs needed. \n## Note: the code will use this many GPUs starting from dev id 0\n## the cluster scheduler or this script have the responsibility\n## of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n## using the correct cluster-allocated GPUs.\n## {{ ram_gb }} - the amount of RAM needed in GB\n## {{ job_dir_abs }} - absolute path to the job directory\n## {{ project_dir_abs }} - absolute path to the project dir\n## {{ job_log_path_abs }} - absolute path to the log file for the job\n## {{ worker_bin_path }} - absolute path to the cryosparc worker command\n## {{ run_args }} - arguments to be passed to cryosparcw run\n## {{ project_uid }} - uid of the project\n## {{ job_uid }} - uid of the job\n## {{ job_creator }} - name of the user that created the job (may contain spaces)\n## {{ cryosparc_username }} - cryosparc username of the user that created the job (usually an email)\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH -p defq\n#SBATCH --mem={{ (ram_gb*1000)|int }}MB \n#SBATCH -o {{ job_dir_abs }}/out.txt\n#SBATCH -e {{ job_dir_abs }}/err.txt\n\navailable_devs=""\nfor devidx in $(seq 0 15);\ndo\n if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then\n if [[ -z "$available_devs" ]] ; then\n available_devs=$devidx\n else\n available_devs=$available_devs,$devidx\n fi\n fi\ndone\nexport CUDA_VISIBLE_DEVICES=$available_devs\n\n{{ run_cmd }}\n\n', u'cache_reserve_mb': 10000, u'type': u'cluster', u'send_cmd_tpl': u'{{ command }}', u'desc': None}, u'license': True, u'hostname': u'vision', u'slots': {u'GPU': [0, 1], u'RAM': [0, 1, 2, 3], u'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]}, u'fixed': {u'SSD': False}, u'lane_type': u'vision', u'licenses_acquired': 2}
**** handle exception rc
set status to failed
Traceback (most recent call last):
File "cryosparc2_worker/cryosparc2_compute/run.py", line 85, in cryosparc2_compute.run.main
File "cryosparc2_master/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 52, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/os.py", line 157, in makedirs
mkdir(name, mode)
OSError: [Errno 17] File exists: '/tank/conwaylab/conway/cryosparc/2018-03-30_CChen-PTang-F34_PolF3MmCcC230np115kx/P2/J3/thumbnails'
========= main process now complete.
========= monitor process now complete.
tail: /tank/conwaylab/conway/cryosparc/2018-03-30_CChen-PTang-F34_PolF3MmCcC230np115kx/P2/J3/job.log: file truncated
================= CRYOSPARCW ======= 2020-09-16 11:14:18.004313 =========
Project P2 Job J3
Master vision.structbio.pitt.edu Port 39002
===========================================================================
========= monitor process now starting main process
MAINPROCESS PID 699436
========= monitor process now waiting for main process
MAIN PID 699436
motioncorrection.run_patch cryosparc2_compute.jobs.jobregister
***************************************************************
Running job on hostname %s vision
Allocated Resources : {u'lane': u'vision', u'target': {u'lane': u'vision', u'qdel_cmd_tpl': u'scancel {{ cluster_job_id }}', u'name': u'vision', u'title': u'vision', u'hostname': u'vision', u'qstat_cmd_tpl': u'squeue -j {{ cluster_job_id }}', u'worker_bin_path': u'/opt/cryoem/cryosparc/cryosparc2_worker/bin/cryosparcw', u'qinfo_cmd_tpl': u'sinfo', u'qsub_cmd_tpl': u'sbatch {{ script_path_abs }}', u'cache_path': u'/local', u'cache_quota_mb': None, u'script_tpl': u'#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }} - the complete command string to run the job\n## {{ num_cpu }} - the number of CPUs needed\n## {{ num_gpu }} - the number of GPUs needed. \n## Note: the code will use this many GPUs starting from dev id 0\n## the cluster scheduler or this script have the responsibility\n## of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n## using the correct cluster-allocated GPUs.\n## {{ ram_gb }} - the amount of RAM needed in GB\n## {{ job_dir_abs }} - absolute path to the job directory\n## {{ project_dir_abs }} - absolute path to the project dir\n## {{ job_log_path_abs }} - absolute path to the log file for the job\n## {{ worker_bin_path }} - absolute path to the cryosparc worker command\n## {{ run_args }} - arguments to be passed to cryosparcw run\n## {{ project_uid }} - uid of the project\n## {{ job_uid }} - uid of the job\n## {{ job_creator }} - name of the user that created the job (may contain spaces)\n## {{ cryosparc_username }} - cryosparc username of the user that created the job (usually an email)\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH -p defq\n#SBATCH --mem={{ (ram_gb*1000)|int }}MB \n#SBATCH -o {{ job_dir_abs }}/out.txt\n#SBATCH -e {{ job_dir_abs }}/err.txt\n\navailable_devs=""\nfor devidx in $(seq 0 15);\ndo\n if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then\n if [[ -z "$available_devs" ]] ; then\n available_devs=$devidx\n else\n available_devs=$available_devs,$devidx\n fi\n fi\ndone\nexport CUDA_VISIBLE_DEVICES=$available_devs\n\n{{ run_cmd }}\n\n', u'cache_reserve_mb': 10000, u'type': u'cluster', u'send_cmd_tpl': u'{{ command }}', u'desc': None}, u'license': True, u'hostname': u'vision', u'slots': {u'GPU': [0, 1], u'RAM': [0, 1, 2, 3], u'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]}, u'fixed': {u'SSD': False}, u'lane_type': u'vision', u'licenses_acquired': 2}
Process Process-1:2:
Traceback (most recent call last):
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
self.run()
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "cryosparc2_compute/jobs/pipeline.py", line 155, in process_work_simple
process_setup(proc_idx) # do any setup you want on a per-process basis
File "cryosparc2_master/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 80, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.process_setup
File "cryosparc2_compute/engine/__init__.py", line 8, in <module>
from engine import *
File "cryosparc2_worker/cryosparc2_compute/engine/engine.py", line 4, in init cryosparc2_compute.engine.engine
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/pycuda/driver.py", line 62, in <module>
from pycuda._driver import * # noqa
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
Process Process-1:1:
Traceback (most recent call last):
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/multiprocessing/process.py", line 267, in _bootstrap
self.run()
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "cryosparc2_compute/jobs/pipeline.py", line 155, in process_work_simple
process_setup(proc_idx) # do any setup you want on a per-process basis
File "cryosparc2_master/cryosparc2_compute/jobs/motioncorrection/run_patch.py", line 80, in cryosparc2_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.process_setup
File "cryosparc2_compute/engine/__init__.py", line 8, in <module>
from engine import *
File "cryosparc2_worker/cryosparc2_compute/engine/engine.py", line 4, in init cryosparc2_compute.engine.engine
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/pycuda/driver.py", line 62, in <module>
from pycuda._driver import * # noqa
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
Traceback (most recent call last):
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
send(obj)
IOError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
send(obj)
IOError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
send(obj)
IOError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
send(obj)
IOError: [Errno 32] Broken pipe
Traceback (most recent call last):
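For context, the first run's failure looks harmless on its own: Python 2.7's os.makedirs raises [Errno 17] when the target directory already exists, and the J3/thumbnails directory was presumably left over from an earlier attempt. A minimal illustration with a hypothetical path (not the real project directory):

import os, errno

path = '/tmp/example_job/thumbnails'  # hypothetical path, for illustration only
try:
    os.makedirs(path)
    os.makedirs(path)  # second call fails: the directory now exists
except OSError as e:
    if e.errno == errno.EEXIST:
        # same [Errno 17] File exists as in the first traceback above
        print('directory already exists: %s' % path)
    else:
        raise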
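The failure that actually repeats is in the second run: both worker processes die importing pycuda because libcuda.so.1 cannot be found. That library is shipped by the NVIDIA driver, not the CUDA toolkit. As a rough sanity check (just a sketch, to be run with the worker's own Python on the node where the job lands), something like this mirrors what pycuda attempts at import time:

import ctypes

# libcuda.so.1 comes from the NVIDIA driver; if the dynamic loader
# cannot resolve it here, the pycuda import will fail the same way.
try:
    ctypes.CDLL('libcuda.so.1')
    print('libcuda.so.1 found and loadable')
except OSError as exc:
    print('libcuda.so.1 not loadable: %s' % exc)

If that also fails, it would suggest the driver is not installed on the worker node, or that its library directory is not on the loader path for the cryoSPARC worker environment.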