Update to v4.4 in WSL2 breaks CryoSPARC CUDA

Hi!

I am running CryoSPARC in WSL2 Ubuntu 22.04. CryoSPARC ran well in WSL2 up to version 4.3.1, but updating to v4.4.0 or later results in jobs failing with a CUDA error. The same problem occurs with a fresh CryoSPARC v4.4.1 install.

I understand that WSL2 is not specifically supported, but I would appreciate any tips on how the issue can be solved!

When running extensive validation, the Patch Motion Correction and Patch CTF jobs run correctly, but the Blob Picker job then fails with an error. The traceback and the workstation configuration are below.

Thanks!

Genis
##########################################################################

[CPU: 417.4 MB Avail: 61.43 GB]
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 95, in cryosparc_master.cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/template_picker_gpu/run.py", line 55, in cryosparc_master.cryosparc_compute.jobs.template_picker_gpu.run.run
  File "cryosparc_master/cryosparc_compute/jobs/template_picker_gpu/run.py", line 93, in cryosparc_master.cryosparc_compute.jobs.template_picker_gpu.run.do_pick
  File "cryosparc_master/cryosparc_compute/jobs/template_picker_gpu/run.py", line 341, in cryosparc_master.cryosparc_compute.jobs.template_picker_gpu.run.do_pick
  File "/home/genis/cryosparc/cryosparc_worker/cryosparc_compute/skcuda_internal/fft.py", line 112, in __init__
    self.handle = gpufft.gpufft_get_plan(
RuntimeError: cuda failure (driver API): cuCtxGetDevice(&device)
→ [unknown error code]

########################################################################
Type: single workstation
Version: v4.4.1

eval $(/home/genis/cryosparc/cryosparc_worker/bin/cryosparcw env)
env | grep PATH

CRYOSPARC_PATH=/home/genis/cryosparc/cryosparc_worker/bin
PYTHONPATH=/home/genis/cryosparc/cryosparc_worker
NUMBA_CUDA_INCLUDE_PATH=/home/genis/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/include
PATH=/home/genis/cryosparc/cryosparc_worker/bin:/home/genis/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin:/home/genis/cryosparc/cryosparc_worker/deps/anaconda/condabin:/home/genis/cryosparc/cryosparc_master/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.0/bin:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.0/libnvvp:/mnt/c/Program Files/AdoptOpenJDK/jdk-11.0.10.9-hotspot/bin:/mnt/c/Anaconda:/mnt/c/Program Files/IMOD/bin:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0:/mnt/c/WINDOWS/System32/OpenSSH:/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/mnt/c/Program Files/Git/cmd:/mnt/c/bin:/mnt/c/msys64/usr/bin:/mnt/c/Program Files/dotnet:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0:/mnt/c/WINDOWS/System32/OpenSSH:/mnt/c/Program Files/Tailscale:/mnt/c/Program Files/MATLAB/R2023a/bin:/mnt/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/DAL:/mnt/c/Program Files/Intel/Intel(R) Management Engine Components/DAL:/mnt/c/Program Files/PowerShell/7:/mnt/c/Users/Genis Valentin/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/Genis Valentin/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/Genis Valentin/.dotnet/tools:/mnt/c/Users/Genis Valentin/AppData/Local/Programs/oh-my-posh/bin:/mnt/c/Program Files/Azure Data Studio/bin:/mnt/c/Users/Genis Valentin/AppData/Roaming/Warp:/snap/bin

nvcc --version
Command 'nvcc' not found, but can be installed with:
sudo apt install nvidia-cuda-toolkit

python -c "import pycuda.driver; print(pycuda.driver.get_version())"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'pycuda'

which python
/home/genis/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/python

/sbin/ldconfig -p | grep -i cuda
libicudata.so.70 (libc6,x86-64) => /lib/x86_64-linux-gnu/libicudata.so.70
libcudadebugger.so.1 (libc6,x86-64) => /usr/lib/wsl/lib/libcudadebugger.so.1
libcuda.so.1 (libc6,x86-64) => /usr/lib/wsl/lib/libcuda.so.1

uname -a
Linux DESKTOP-A45DKHA 5.15.133.1-microsoft-standard-WSL2 #1 SMP Thu Oct 5 21:02:42 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

free -g
total used free shared buff/cache available
Mem: 62 25 2 0 34 36
Swap: 16 3 12

nvidia-smi
Thu Dec 7 13:03:06 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.01 Driver Version: 546.01 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Quadro RTX 5000 On | 00000000:4F:00.0 Off | Off |
| 33% 29C P8 15W / 230W | 0MiB / 16384MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 Quadro RTX 5000 On | 00000000:91:00.0 Off | Off |
| 33% 30C P8 8W / 230W | 902MiB / 16384MiB | 9% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 14 G /Xwayland N/A |
| 0 N/A N/A 32 G /Xwayland N/A |
| 1 N/A N/A 14 G /Xwayland N/A |
| 1 N/A N/A 32 G /Xwayland N/A |
+---------------------------------------------------------------------------------------+


Hi Genval, I have encountered the same problem. I’m also running CryoSPARC on WSL2. After I updated from v4.3 to v4.4, jobs like NU-Refinement failed with the same error, “RuntimeError: cuda failure (driver API): cuCtxGetDevice(&device)”. Somehow, 3D Flex refinement still works, which is really weird.

Hi Genval, I have the same problem on Ubuntu 22.04 LTS on WSL2 (Windows 11).
Updating to CryoSPARC version >= v4.4.0 results in jobs failing due to a CUDA error. The same problem occurs with a fresh CryoSPARC v.4.4.1 install.

Error when running the Blob picker job

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 95, in cryosparc_master.cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/template_picker_gpu/run.py", line 55, in cryosparc_master.cryosparc_compute.jobs.template_picker_gpu.run.run
  File "cryosparc_master/cryosparc_compute/jobs/template_picker_gpu/run.py", line 93, in cryosparc_master.cryosparc_compute.jobs.template_picker_gpu.run.do_pick
  File "cryosparc_master/cryosparc_compute/jobs/template_picker_gpu/run.py", line 341, in cryosparc_master.cryosparc_compute.jobs.template_picker_gpu.run.do_pick
  File "/home/genis/cryosparc/cryosparc_worker/cryosparc_compute/skcuda_internal/fft.py", line 112, in __init__
    self.handle = gpufft.gpufft_get_plan(
RuntimeError: cuda failure (driver API): cuCtxGetDevice(&device)
→ [unknown error code]

Dear CryoSPARC team, could you consider bundling a build of CUDA 11.8 that is compatible with the WSL2 CUDA metapackage, or allowing us, as in previous versions (CryoSPARC v4.3), to use our separately installed WSL2 NVIDIA cuda-toolkit metapackage (I currently use 12.3.1)? More and more of us are using the WSL2 hybrid system, which gives excellent results with many image processing packages. Thanks in advance for helping us solve this problem!

@genval @Zdai @elarquet Please can you explain your respective use cases of running CryoSPARC on WSL:

  • CryoSPARC instance type, such as
    • single workstation (master, worker processes on same host) or
    • connected workers
  • number and type(s) of GPUs for use with CryoSPARC
  • reason for preference for running Linux under WSL2 rather than “directly”

Hi Wtempel, I’m running CryoSPARC on a single workstation with one NVIDIA RTX 4070 GPU. I chose WSL2 because Windows has better software for my other needs, such as AI tools and Photoshop, and WSL2 is more convenient than installing two operating systems on the same PC. In fact, it previously worked just as well as a native Linux system for me, for both CryoSPARC and MD.

Hi @wtempel ! Here is my use case:

  • Single workstation
  • 2 x Nvidia Quadro RTX 5000
  • Reasons to use Windows:
    • I do use WARP a lot, which is only available in Windows
    • The ability to run cryosparc in a dedicated virtual machine, so that it is easier to maintain. WSL virtual machines are easy to set up and allow direct GPU usage.
    • I have not noticed any performance issues when running within WSL compared to a dedicated Ubuntu

Interesting, I might do some benchmarking with CryoSPARC on WSL2, then, as my tests with Blender showed the difference between bare-metal Linux and WSL2 Linux to be pretty much a wash, but tests with RELION showed a performance drop of anything from 20-50% in WSL2 (and in one case a 3D refinement took four times longer, but that was caused by browser GPU acceleration demanding too much GPU time). It would be very interesting if CryoSPARC doesn’t have that (at least the versions that work!)

I think the reason the bundled CUDA in CryoSPARC 4.4 doesn’t work is that it’s the Linux native package. I was curious enough to try the Linux native CUDA .run file in WSL2 at one point, and while CUDA installed, it wouldn’t see the GPU. I’m pretty sure NVIDIA have done some stuff in the background to get CUDA to see through the hypervisor without needing to fully pass through the GPU (which would make it unusable in Windows and leave you with a blank screen).
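One way to test this idea without any CryoSPARC code is to call the CUDA driver API directly. Below is a minimal Python sketch, assuming the WSL2 driver library is the `libcuda.so.1` that `ldconfig` reports under /usr/lib/wsl/lib earlier in this thread. The function name `check_cuda_driver` is hypothetical, and the sketch only calls `cuInit`, `cuDeviceGetCount` and `cuDriverGetVersion`; it does not reproduce the `cuCtxGetDevice` call from the traceback.

```python
import ctypes


def check_cuda_driver(lib_name="libcuda.so.1"):
    """Return (ok, message) saying whether the CUDA driver API initialises.

    lib_name is an assumption: on WSL2 the driver library is expected to be
    resolved from /usr/lib/wsl/lib by the dynamic linker.
    """
    try:
        cuda = ctypes.CDLL(lib_name)
    except OSError as exc:
        return False, f"could not load {lib_name}: {exc}"

    # cuInit(0) must succeed before any other driver API call.
    rc = cuda.cuInit(0)
    if rc != 0:
        return False, f"cuInit failed with CUresult {rc}"

    # Ask the driver how many devices it can see.
    count = ctypes.c_int(0)
    rc = cuda.cuDeviceGetCount(ctypes.byref(count))
    if rc != 0:
        return False, f"cuDeviceGetCount failed with CUresult {rc}"

    # Driver API version is encoded as 1000 * major + 10 * minor.
    version = ctypes.c_int(0)
    rc = cuda.cuDriverGetVersion(ctypes.byref(version))
    if rc != 0:
        return False, f"cuDriverGetVersion failed with CUresult {rc}"

    return True, f"driver API version {version.value}, {count.value} device(s) visible"


if __name__ == "__main__":
    ok, message = check_cuda_driver()
    print("OK" if ok else "FAILED", "-", message)
```

If this reports the GPUs when run with the worker environment's Python while CryoSPARC jobs still fail, that would suggest the driver itself is reachable from WSL2 and the problem lies elsewhere, for example in how the bundled toolkit libraries interact with it. That conclusion is a guess, not something confirmed in this thread.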


Hi @wtempel !

Here is my mobile single-workstation configuration:

DELL 7680 Mobile Workstation

  • Processor : Intel® Core™ i9-13900H, vPro® Enterprise (24MB Cache, 14 Cores, 20 Threads, 2.6-5.4 GHz Turbo, 45W)
  • Windows 11 Pro with Ubuntu 22.04.3 LTS (Jammy Jellyfish) on WSL2
  • Graphic Card : NVIDIA® GeForce RTX™ 4090, 16GB GDDR6
    (Cuda toolkit and Cuda Developer Tools version 12.3.2_545.23.08)
  • Memory : 64 GB: 2 x 32 GB, LPDDR5, 6000 MT/s
  • Storage : 1 TB, M.2 2280, Gen 4 PCIe NVMe, SSD, Class 40 (for system)
    4 TB, WD_BLACK SN850X NVMe™ SSD, PCIe® Gen4 16 Gt/s
    (NTFS format for Windows data)
    4 TB, WD_BLACK SN850X NVMe™ SSD, PCIe® Gen4 16 Gt/s
    (Ext4 format for Ubuntu/WSL2 data)

Reasons to use Windows:

  • Academic teaching
  • Image processing on small projects.

Ubuntu under WSL2 works perfectly without loss of performance in the following cases:

Tested image processing packages: RELION 4 and 5, Scipion 3.1 / Xmipp 3.23.11, EMAN 2.99.47, CryoSPARC 4.3 → 4.3.1, Phenix 1.21-5207, CCP4, CCP-EM Doppio

Works well with graphical programs like Coot 0.9.8.92, PyMOL 2.4, UCSF Chimera 1.17.3, UCSF ChimeraX 1.7 (the ChimeraX browser doesn’t work)

I also have not noticed any performance issues when running within WSL2 compared to a dedicated Ubuntu 22.04 LTS (single workstation)

Thanks @Zdai @elarquet @genval @rbs_sci for your feedback

@elarquet What was the URL for the specific installation file you used and how did you confirm its function on WSL2?

You can find all the information here:
CUDA Toolkit 12.3 Update 2 Downloads | NVIDIA Developer

@elarquet Is this the same file that one could also use to install the toolkit on a non-WSL2 Linux system?

No, I do not think so! If you look at the local Debian package, you will see that it is specific to the WSL system:

wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/12.3.2/local_installers/cuda-repo-wsl-ubuntu-12-3-local_12.3.2-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-12-3-local_12.3.2-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-3-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-3

You can find more information in § 3.9 (WSL) of the
CUDA Installation Guide for Linux (nvidia.com)

Hi @wtempel , I’m wondering if there is any way to redirect the CUDA path of cryoSPARC v4.4 from bundled CUDA 11.8 to the CUDA environment of WSL2. I feel that this could be a possible solution.
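Before trying any redirection, it may be worth confirming which driver library the dynamic linker resolves. A minimal sketch, assuming `ldconfig -p` output in the format posted at the top of this thread; the helper name `find_libcuda` is made up for illustration and is not part of CryoSPARC:

```python
# Sample lines copied from the `/sbin/ldconfig -p | grep -i cuda` output
# posted earlier in this thread.
SAMPLE = """\
libicudata.so.70 (libc6,x86-64) => /lib/x86_64-linux-gnu/libicudata.so.70
libcudadebugger.so.1 (libc6,x86-64) => /usr/lib/wsl/lib/libcudadebugger.so.1
libcuda.so.1 (libc6,x86-64) => /usr/lib/wsl/lib/libcuda.so.1
"""


def find_libcuda(ldconfig_output):
    """Return the paths of libcuda.so* entries found in `ldconfig -p` output."""
    paths = []
    for line in ldconfig_output.splitlines():
        name, sep, path = line.partition("=>")
        if not sep:
            continue  # header or malformed line without a "=>" mapping
        fields = name.split()
        # Match the driver library only, not libcudadebugger or libicudata.
        if fields and fields[0].startswith("libcuda.so"):
            paths.append(path.strip())
    return paths


if __name__ == "__main__":
    for path in find_libcuda(SAMPLE):
        where = "WSL2 driver directory" if path.startswith("/usr/lib/wsl/lib/") else "other location"
        print(f"{path} ({where})")
```

On WSL2 the expected result is a path under /usr/lib/wsl/lib, as in the sample. This only shows which driver library would be picked up; it does not by itself change which toolkit libraries CryoSPARC loads.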

Hi @wtempel !
I am getting exactly the same error running the latest v4.5.3 on WSL2. Everything runs smoothly until “Blob_Picker” fails.
I don’t see any updates since January. I wonder whether this issue was resolved?