Hi @team,
I have run into an unusual error during local motion correction that, interestingly, only appears after the job has been running for 4 hours.
On this particular machine, only CUDA 11.5 is installed at the moment, and I realized that even the latest cryoSPARC does not yet support CUDA versions newer than 11.2. This may well be the cause of my error, and the fix may be to install CUDA 11.2 and configure cryoSPARC to use it, but I decided to ask here first: do you think it could be something else? Otherwise, I will need to contact the admin of this machine.
(By the way, most jobs that require a GPU run without problems using CUDA 11.5.)
Please have a look at the following.
Current cryoSPARC version:
v3.3.1+220215
Error appearing on the overview window of the job:
[CPU: 353.8 MB] Traceback (most recent call last):
  File "cryosparc_worker/cryosparc_compute/run.py", line 85, in cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_local.py", line 378, in cryosparc_compute.jobs.motioncorrection.run_local.run_local_motion_correction_multi
  File "/home/angr5008/Software/cryosparc/cryosparc_worker/cryosparc_compute/dataset.py", line 588, in subset_mask
    assert len(mask) == len(self)
AssertionError
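To check that I am reading the traceback correctly: the failing line only compares the length of a mask with the length of the dataset, so it does not look like a CUDA call in itself. Here is a minimal sketch of that kind of check. `ToyDataset` and the micrograph names are hypothetical stand-ins I made up for illustration, not cryoSPARC's actual `dataset.py`.

```python
class ToyDataset:
    """Hypothetical stand-in, NOT cryoSPARC's Dataset class."""

    def __init__(self, rows):
        self.rows = list(rows)

    def __len__(self):
        return len(self.rows)

    def subset_mask(self, mask):
        # Same comparison as the assert shown in the traceback:
        # the mask needs exactly one entry per row.
        assert len(mask) == len(self)
        return ToyDataset(r for r, keep in zip(self.rows, mask) if keep)


movies = ToyDataset(["mic_001", "mic_002", "mic_003"])

# Lengths match, so the subset is returned.
kept = movies.subset_mask([True, False, True])
print(kept.rows)

# The mask is one entry short, so the assert fails.
try:
    movies.subset_mask([True, False])
    mismatch_raised = False
except AssertionError:
    mismatch_raised = True
    print("AssertionError: mask and dataset lengths differ")
```

If that reading is right, the job would fail whenever the mask has a different number of entries than the dataset, whatever the reason for the mismatch.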
The end of job log file:
========= sending heartbeat
========= sending heartbeat
HOST ALLOCATION FUNCTION: using cudrv.pagelocked_empty
HOST ALLOCATION FUNCTION: using cudrv.pagelocked_empty
**** handle exception rc
set status to failed
========= main process now complete.
========= monitor process now complete.
OS version:
Linux R2D2 5.13.0-30-generic #33~20.04.1-Ubuntu SMP Mon Feb 7 14:25:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Memory at the time:
              total        used        free      shared  buff/cache   available
Mem:      263786216     6802516     3504520       21948   253479180   254689732
Swap:     134217724     2825984   131391740
nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0
nvidia-smi:
Tue Feb 22 09:48:48 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:03:00.0  On |                  N/A |
| 27%   34C    P8    19W / 250W |    411MiB /  7979MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  On   | 00000000:21:00.0 Off |                  N/A |
| 30%   41C    P8    24W / 350W |      6MiB / 24268MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  On   | 00000000:4B:00.0 Off |                  N/A |
| 27%   26C    P8     8W / 250W |      6MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2140      G   /usr/lib/xorg/Xorg                188MiB |
|    0   N/A  N/A      2423      G   /usr/bin/gnome-shell               34MiB |
|    0   N/A  N/A      2982      G   ...AAAAAAAAA= --shared-files       16MiB |
|    0   N/A  N/A      6493      G   /usr/lib/firefox/firefox          167MiB |
|    1   N/A  N/A      2140      G   /usr/lib/xorg/Xorg                  4MiB |
|    2   N/A  N/A      2140      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
Thanks in advance for your opinion!
André