I am also facing the same problem, and I am new to the computational biology field. I am not good with Linux and its commands. Can you explain in detail how I can solve this problem and what causes it?
Welcome to the forum @Lalit123.
To avoid any confusion, please can you paste, as text, any error messages that you have encountered, and tell us where they appeared (which logs, which parts of the user interface) and when (for example, during submission of a job of which type).
Does the job log (accessible via Metadata|Log) include any useful information that could indicate the cause of the termination? A gentle reminder: Please paste error messages and warnings as text so that this discussion can be found by interested forum visitors with a text search.
When I checked, I found a runtime error ("no CUDA-capable device is detected") and broken pipe errors. Here are the error messages from my job log.
================= CRYOSPARCW ======= 2023-10-17 19:34:31.426748 =========
Project P15 Job J5
Master darwin Port 39002
===========================================================================
========= monitor process now starting main process at 2023-10-17 19:34:31.426816
MAINPROCESS PID 175965
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "cryosparc_master/cryosparc_compute/run.py", line 184, in cryosparc_compute.run.run
File "/home/radha/cryosparc/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2179, in get_gpu_info
cudrv.init()
pycuda._driver.RuntimeError: cuInit failed: no CUDA-capable device is detected
MAIN PID 175965
motioncorrection.run_patch cryosparc_compute.jobs.jobregister
***************************************************************
Running job on hostname %s darwin
Allocated Resources : {'fixed': {'SSD': False}, 'hostname': 'darwin', 'lane': 'default', 'lane_type': 'node', 'license': True, 'licenses_acquired': 1, 'slots': {'CPU': [0, 1, 2, 3, 4, 5], 'GPU': [0], 'RAM': [0, 1]}, 'target': {'cache_path': '/mnt/ssd1/cryosparc_cache', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 8513716224, 'name': 'NVIDIA GeForce GTX 1070'}, {'id': 1, 'mem': 8514109440, 'name': 'NVIDIA GeForce GTX 1070'}], 'hostname': 'darwin', 'lane': 'default', 'monitor_port': None, 'name': 'darwin', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5], 'GPU': [0, 1], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7]}, 'ssh_str': 'radha@darwin', 'title': 'Worker node darwin', 'type': 'node', 'worker_bin_path': '/home/radha/cryosparc/cryosparc_worker/bin/cryosparcw'}}
Process Process-1:1:
Traceback (most recent call last):
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/radha/cryosparc/cryosparc_worker/cryosparc_compute/jobs/pipeline.py", line 200, in process_work_simple
process_setup(proc_idx) # do any setup you want on a per-process basis
File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 83, in cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.process_setup
File "cryosparc_master/cryosparc_compute/engine/cuda_core.py", line 29, in cryosparc_compute.engine.cuda_core.initialize
pycuda._driver.RuntimeError: cuInit failed: no CUDA-capable device is detected
Traceback (most recent call last):
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/queues.py", line 245, in _feed
send_bytes(obj)
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
self._send(header + buf)
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/queues.py", line 245, in _feed
send_bytes(obj)
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
self._send(header + buf)
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/queues.py", line 245, in _feed
send_bytes(obj)
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
self._send(header + buf)
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/queues.py", line 245, in _feed
send_bytes(obj)
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 411, in _send_bytes
self._send(header + buf)
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Traceback (most recent call last):
File "/home/radha/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/queues.py", line 245, in _feed
send_bytes(obj)
Please can you post the output of these commands in a new shell:
eval $(/home/radha/cryosparc/cryosparc_worker/bin/cryosparcw env)
nvidia-smi --query-gpu=index,name,driver_version --format=csv
nvcc -V
which nvcc
cryosparcw gpulist
echo $CUDA_VISIBLE_DEVICES
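If it is easier, the same checks could be collected into a single text file and pasted here. This is only a minimal sketch, assuming bash and assuming that the cryosparcw path matches the worker_bin_path shown in the job log; the file name gpu_diagnostics.txt and the run_check helper are illustrative names, not part of CryoSPARC.

```shell
#!/usr/bin/env bash
# Sketch: gather the GPU diagnostics requested above into one text file.
# Assumptions: bash; CRYOSPARCW matches worker_bin_path from the job log;
# OUT and run_check are illustrative names chosen for this example.
CRYOSPARCW=/home/radha/cryosparc/cryosparc_worker/bin/cryosparcw
OUT=gpu_diagnostics.txt

# Load the worker environment only if cryosparcw exists at that path.
if [ -x "$CRYOSPARCW" ]; then
    eval "$("$CRYOSPARCW" env)"
fi

# Print the command line, then its output; report a missing command
# instead of stopping, so the remaining checks still run.
run_check() {
    echo "### $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "command not found: $1"
    fi
    echo
}

{
    run_check nvidia-smi --query-gpu=index,name,driver_version --format=csv
    run_check nvcc -V
    run_check which nvcc
    run_check cryosparcw gpulist
    echo "### CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES-<unset>}"
} > "$OUT"

echo "Wrote $OUT"
```

Saving this as a script, running it with bash in a new shell, and pasting the contents of gpu_diagnostics.txt as text should give the same information as running the commands one by one.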