Ubuntu 22.04 install problems (nvidia-driver-515, conda libstdc++.so.6)

While trying to install cryoSPARC standalone on a single desktop running the still-unsupported Ubuntu 22.04 LTS, I have encountered three problems:

  1. The recent cryosparc_worker package ships with two worker directories: cryosparc2_worker and cryosparc_worker. The cryosparc2_worker directory contains only a version file, so one probably should not blindly configure it as the worker directory during the master install. However, I have checked that creating a symlink to cryosparc_worker seems to be OK.
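For reference, a minimal sketch of that symlink workaround (assuming the default /opt/cryosparc2 install prefix and that everything runs as the cryosparc user; adjust to your layout):

mv /opt/cryosparc2/cryosparc2_worker /opt/cryosparc2/cryosparc2_worker.bak
ln -s /opt/cryosparc2/cryosparc_worker /opt/cryosparc2/cryosparc2_worker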

  2. The preferred nvidia driver (with nvidia-utils and nvidia-smi) and CUDA toolkit installation process is via the Ubuntu repositories:

sudo ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
sudo update-initramfs -u
reboot
nvidia-smi >/dev/null 2>&1; echo $?
sudo apt install nvidia-cuda-toolkit

This installs the recommended 515 driver and utils, and then CUDA toolkit 11.5. But the toolkit install removes the nvidia-utils package together with the nvidia-smi command! Reinstalling the utils, in turn, removes the toolkit!
Also, the Ubuntu nvidia-cuda-toolkit package spreads the CUDA libraries and binaries into /usr/bin, /usr/lib/x86_64-linux-gnu and so on; they are not placed in any /usr/local/cuda directory. Thus, during the master/worker installation one cannot give a single cuda-path parameter, unless one manually creates a fake directory (i.e. /usr/local/cuda) and symlinks the CUDA dirs into it (bin, lib64, include, share); the exact commands are in UPDATE2 below.

An alternative might be to install the driver and toolkit from the NVIDIA repository, but this is officially not recommended by Ubuntu.

  3. If we ignore the missing nvidia-smi program and proceed with the installation, master and worker install seemingly OK. But during the worker connect attempt, one gets:

> ImportError: /opt/cryosparc2/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/pycuda/_driver.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZSt28__throw_bad_array_new_lengthv

which might suggest either a bad CUDA install, or that python 3.7 compiled _driver.cpython-37m-x86_64-linux-gnu.so badly against libstdc++.so.6. The second option is more plausible, as the libstdc++.so.6.0.26 provided with the conda environment does not have the _ZSt28__throw_bad_array_new_lengthv symbol, while my system library does. So for some reason the python module inside the conda environment was compiled against the system library (libstdc++.so.6.0.30) instead of the conda one.

root@cryoem02:/opt# objdump -T /lib/x86_64-linux-gnu/libstdc++.so.6  | grep throw_bad_array
00000000000a52eb g    DF .text  0000000000000035  GLIBCXX_3.4.29 _ZSt28__throw_bad_array_new_lengthv
00000000000a26ac g    DF .text  0000000000000035  CXXABI_1.3.8 __cxa_throw_bad_array_new_length
00000000000a2590 g    DF .text  0000000000000033  CXXABI_1.3.8 __cxa_throw_bad_array_length
root@cryoem02:/opt# objdump -T /opt/cryosparc2/cryosparc_worker/deps/anaconda/lib/libstdc++.so.6  | grep throw_bad_array
00000000000a8e1d g    DF .text  000000000000002f  CXXABI_1.3.8 __cxa_throw_bad_array_new_length
00000000000a8d90 g    DF .text  000000000000002f  CXXABI_1.3.8 __cxa_throw_bad_array_length
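One way to double-check which libstdc++ the pycuda module actually resolves at runtime is ldd (the path below is just my worker install location):

ldd /opt/cryosparc2/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/pycuda/_driver.cpython-37m-x86_64-linux-gnu.so | grep libstdc++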

Unfortunately, the makeshift fix of replacing the library in the conda env (with the newer one) is a no-go, as the python module still crashes with the same message. The only solution I have found is switching to conda-forge, but I have no idea how to implement it inside cryosparc (https://github.com/conda/conda/issues/10757). Attempts to update the cryosparc_worker conda, or to activate the cryosparc_worker_env conda environment and update conda there manually, have failed so far (conda asks for init).
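For completeness, this is roughly the kind of thing I tried (a sketch only; sourcing the bundled anaconda's conda.sh is a way around the init complaint, assuming it sits at the standard path, and libstdcxx-ng is the conda-forge package that ships a newer libstdc++; I could not get this working cleanly inside cryosparc):

source /opt/cryosparc2/cryosparc_worker/deps/anaconda/etc/profile.d/conda.sh
conda activate cryosparc_worker_env
conda install -c conda-forge libstdcxx-ng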

Any hints on how to get this to work with Ubuntu 22.04?

UPDATE: Problem 2 looked like the toolkit depending on libnvidia-compute-510 instead of 515, so I have reinstalled the nvidia driver and utils, downgrading to the 510 version. Now I have nvidia-smi and nvcc. But problem 3 still persists!
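In case it helps anyone, roughly what the downgrade looked like (a sketch; the 510 package names are as they appear in the jammy repositories, check ubuntu-drivers list on your machine):

sudo apt purge nvidia-driver-515
sudo apt autoremove
sudo apt install nvidia-driver-510 nvidia-utils-510 nvidia-cuda-toolkit
sudo update-initramfs -u
sudo reboot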

lscpu && free -g && uname -a && nvidia-smi

Does not show any problems:

root@cryoem02:~# lscpu && free -g && uname -a && nvidia-smi
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         46 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  28
  On-line CPU(s) list:   0-27
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Core(TM) i9-10940X CPU @ 3.30GHz
    CPU family:          6
    Model:               85
    Thread(s) per core:  2
    Core(s) per socket:  14
    Socket(s):           1
    Stepping:            7
    CPU max MHz:         4800,0000
    CPU min MHz:         1200,0000
    BogoMIPS:            6599.98
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art
                          arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca ss
                         e4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single ssbd mba ibrs ibp
                         b stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx sma
                         p clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts avx512
                         _vnni md_clear flush_l1d arch_capabilities
Virtualization features:
  Virtualization:        VT-x
Caches (sum of all):
  L1d:                   448 KiB (14 instances)
  L1i:                   448 KiB (14 instances)
  L2:                    14 MiB (14 instances)
  L3:                    19,3 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-27
Vulnerabilities:
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT vulnerable
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
  Srbds:                 Not affected
  Tsx async abort:       Mitigation; TSX disabled
               total        used        free      shared  buff/cache   available
Mem:             125           1         122           0           1         122
Swap:              7           0           7
Linux cryoem02 5.15.0-43-generic #46-Ubuntu SMP Tue Jul 12 10:30:17 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Fri Aug 19 03:09:10 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.85.02    Driver Version: 510.85.02    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A6000    Off  | 00000000:17:00.0 Off |                  Off |
| 30%   40C    P8     7W / 300W |     16MiB / 49140MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A6000    Off  | 00000000:65:00.0 Off |                  Off |
| 30%   37C    P8     9W / 300W |      5MiB / 49140MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1476      G   /usr/lib/xorg/Xorg                  9MiB |
|    0   N/A  N/A      1778      G   /usr/bin/gnome-shell                5MiB |
|    1   N/A  N/A      1476      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

UPDATE2: an ugly solution to the libstdc++ problem was suggested in https://discuss.cryosparc.com/t/worker-connect-does-not-work-during-installation/7862.

A slightly reversed installation order (fake cuda dir first, then the worker manually, then the lib swap, and then the full master install):

# mkdir /usr/local/cuda
# cd /usr/local/cuda
# ln -s /usr/lib/nvidia-cuda-toolkit/bin bin
# ln -s /usr/include include
# ln -s /usr/lib/x86_64-linux-gnu lib64
# ln -s /usr/share share
# su - cryosparc
$ export LICENSE_ID="XXXX"
$ curl -L https://get.cryosparc.com/download/master-latest/$LICENSE_ID > cryosparc2_master.tar.gz
$ curl -L https://get.cryosparc.com/download/worker-latest/$LICENSE_ID > cryosparc2_worker.tar.gz
$ tar -xf cryosparc2_master.tar.gz
$ tar -xf cryosparc2_worker.tar.gz
$ rm -rf /opt/cryosparc2/cryosparc2_worker
$ cd cryosparc_worker
$ ./install.sh --license $LICENSE_ID --cudapath /usr/local/cuda --standalone
$ cd /opt/cryosparc2/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib
$ mv libstdc++.so.6.0.28 libstdc++.so.6.0.28.old
$ ln -s /lib/x86_64-linux-gnu/libstdc++.so.6.0.30 libstdc++.so.6.0.28
$ cd /opt/cryosparc2/cryosparc_master
$ ./install.sh --standalone --license $LICENSE_ID --worker_path /opt/cryosparc2/cryosparc_worker --cudapath /usr/local/cuda  ...

I’m sure one of the developers will reply, but AFAIK,

  1. The cryosparc2_worker directory is a compatibility relic of the update mechanism from cryoSPARC 2. I think it's mentioned in the install guide?

  2. I wouldn’t recommend using the Ubuntu-provided version of CUDA - as you say, it installs everything into places which are a bit troublesome. Either download and install the .run file for 64-bit Ubuntu 22.04, or use the NVIDIA repositories (the difficulty with the repositories option is that you’ll have multi-gigabyte updates occasionally, and fairly frequent driver updates; normally they’re painless, but occasionally dpkg will get stuck and manual intervention will be needed to fix driver package conflicts).

  3. You seem to be using the “ugly” solution, with even more complicated modifications?

Now that 22.04.1 is out, I’ll soon begin the process of moving the 18.04 boxes (and 20.04 boxes, if everything goes smoothly) to it, so I guess it wouldn’t hurt to post a full breakdown of getting everything set up on here…

@1. I guess it’s safe to delete it in a fresh install.

@2. Ubuntu gurus have suggested that installing the driver from the Ubuntu repositories is OK and even preferred, as it ships with some kernel-required components. The CUDA libraries, on the other hand, are suggested to be installed via the .run file. During that installation, one should decline the driver installation offers and install only the toolkit. This way we can have nvidia-smi, nvcc and the whole CUDA package in the well-known /usr/local/cuda directory, as sketched below.
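A sketch of that toolkit-only .run install (the filename is a placeholder for whichever CUDA 11.x runfile you download from NVIDIA; running it non-interactively with --toolkit skips the bundled driver):

sudo sh cuda_<version>_linux.run --silent --toolkit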

@3. I have shown my variant of the installation, for those who still wish to go with a pure repositories install, with driver 510 and a compatible toolkit. And YES, it’s ugly, especially to maintain. For the longer run I guess copying the library instead of linking it would be preferable, as a system libstdc++ update can delete the link target when it installs libstdc++.so.6.0.31. Alternatively, one could link the system libstdc++.so (which is itself a link to the current libstdc++) to the cryosparc_worker_env libstdc++.so.6.0.28.
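For what it’s worth, the copy variant would look something like this (paths and library versions are the ones from my box; a sketch, not a long-term tested recipe):

cd /opt/cryosparc2/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib
mv libstdc++.so.6.0.28 libstdc++.so.6.0.28.old
cp /lib/x86_64-linux-gnu/libstdc++.so.6.0.30 libstdc++.so.6.0.28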

Hi Silvan,
I have an RTX 3090 card, and I have installed CUDA 12 with the NVIDIA 525 driver on Ubuntu 22.04. Unfortunately, the worker does not connect; it reports a pycuda install failure. Do you have any suggestions?

CUDA 12 is not supported by cryoSPARC.

Yeah, I figured it out.