Here is the output. I changed our server address to “Server ID”:
================= CRYOSPARCW ======= 2021-02-15 09:01:38.798445 =========
Project P1 Job J7
Master server ID
========= monitor process now starting main process
MAINPROCESS PID 2417
========= monitor process now waiting for main process
MAIN PID 2417
motioncorrection.run_patch cryosparc_compute.jobs.jobregister
Running job on hostname %s Server ID
Allocated Resources : {‘fixed’: {‘SSD’: False}, ‘hostname’: ‘server ID’ ‘lane’: ‘default’, ‘lane_type’: ‘default’, ‘license’: True, ‘licenses_acquired’: 1, ‘slots’: {‘CPU’: [0, 1, 2, 3, 4, 5], ‘GPU’: [0], ‘RAM’: [0, 1]}, ‘target’: {‘cache_path’: ‘/scratch/cryosparc_cache’, ‘cache_quota_mb’: None, ‘cache_reserve_mb’: 10000, ‘desc’: None, ‘gpus’: [{‘id’: 0, ‘mem’: 8513978368, ‘name’: ‘GeForce GTX 1080’}, {‘id’: 1, ‘mem’: 8513978368, ‘name’: ‘GeForce GTX 1080’}, {‘id’: 2, ‘mem’: 8510701568, ‘name’: ‘GeForce GTX 1080’}], ‘hostname’: Server ID’, ‘lane’: ‘default’, ‘monitor_port’: None, ‘name’: Server ID, ‘resource_fixed’: {‘SSD’: True}, ‘resource_slots’: {‘CPU’: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], ‘GPU’: [0, 1, 2], ‘RAM’: [0, 1, 2, 3]}, ‘ssh_str’: ‘cryosparcuser@ServerID’, ‘title’: 'Worker node ServerID, ‘type’: ‘node’, ‘worker_bin_path’: ‘/devel/cryosparc/cryosparc_worker/bin/cryosparcw’}}
/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cublas.py:284: UserWarning: creating CUBLAS context to get version number
warnings.warn(‘creating CUBLAS context to get version number’)
Process Process-1:1:
Traceback (most recent call last):
File “/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cublas.py”, line 280, in _get_cublas_version
utils.get_soname(cublas_path)).groups()
AttributeError: ‘NoneType’ object has no attribute ‘groups’
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File “/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/multiprocessing/process.py”, line 297, in _bootstrap
self.run()
File “/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/multiprocessing/process.py”, line 99, in run
self._target(*self._args, **self._kwargs)
File “/devel/cryosparc/cryosparc_worker/cryosparc_compute/jobs/pipeline.py”, line 154, in process_work_simple
process_setup(proc_idx) # do any setup you want on a per-process basis
File “cryosparc_worker/cryosparc_compute/jobs/motioncorrection/run_patch.py”, line 81, in cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi.process_setup
File “/devel/cryosparc/cryosparc_worker/cryosparc_compute/engine/init.py”, line 8, in
from .engine import * # noqa
File “cryosparc_worker/cryosparc_compute/engine/engine.py”, line 11, in init cryosparc_compute.engine.engine
File “cryosparc_worker/cryosparc_compute/engine/gfourier.py”, line 6, in init cryosparc_compute.engine.gfourier
File “/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/fft.py”, line 20, in
from . import misc
File “/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/misc.py”, line 25, in
from . import cublas
File “/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cublas.py”, line 292, in
_cublas_version = int(_get_cublas_version())
File “/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cublas.py”, line 285, in _get_cublas_version
h = cublasCreate()
File “/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cublas.py”, line 203, in cublasCreate
cublasCheckStatus(status)
File “/devel/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/skcuda/cublas.py”, line 179, in cublasCheckStatus
raise e
skcuda.cublas.cublasNotInitialized
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
**** handle exception rc
set status to failed
========= main process now complete.
========= monitor process now complete.
Waiting for data… (interrupt to abort)
Is this a CUDA issue? If so, what should we do to fix it?
Thank you!!