3D Flex Training error (library not found)

Hi,
I am testing the new 3D Flex tool. I followed the tutorial up to the 3D Flex Training job, where I ran into this error:

Any ideas on what is going on?
Thanks for your help :smiley:
Cheers,
Samara

Hi @Smona. To make it easier to search the forum (and find topics like Pytorch installation failure (libnccl.so.2) :smiley:), please post error messages as text instead of screenshots.
Does the link help resolve the ImportError you observed?


Thanks for your reply. It actually gave me another error. Is there any way to fix this?

[CPU:  418.0 MB  Avail: 117.86 GB]
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 83, in cryosparc_compute.run.main
  File "/home/cryspc/Cryosparc/cryosparc_worker/cryosparc_compute/jobs/jobregister.py", line 442, in get_run_function
    runmod = importlib.import_module(".."+modname, __name__)
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 1174, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "cryosparc_master/cryosparc_compute/jobs/flex_refine/run_train.py", line 12, in init cryosparc_compute.jobs.flex_refine.run_train
  File "cryosparc_master/cryosparc_compute/jobs/flex_refine/flexmod.py", line 19, in init cryosparc_compute.jobs.flex_refine.flexmod
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/__init__.py", line 1465, in <module>
    from . import _meta_registrations
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_meta_registrations.py", line 7, in <module>
    from torch._decomp import _add_op_to_registry, global_decomposition_table, meta_table
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_decomp/__init__.py", line 169, in <module>
    import torch._decomp.decompositions
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_decomp/decompositions.py", line 10, in <module>
    import torch._prims as prims
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_prims/__init__.py", line 33, in <module>
    from torch._subclasses.fake_tensor import FakeTensor, FakeTensorMode
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_subclasses/__init__.py", line 3, in <module>
    from torch._subclasses.fake_tensor import (
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_subclasses/fake_tensor.py", line 13, in <module>
    from torch._guards import Source
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_guards.py", line 78, in <module>
    class ShapeGuard(NamedTuple):
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_guards.py", line 79, in ShapeGuard
    expr: sympy.Expr
NameError: name 'sympy' is not defined


To help us understand the current state of your 3DFlex dependency installation, could you please

  • describe any installation or reconfiguration steps you took after observing ImportError: libnccl.so.2...
  • post the output of the following commands
csw=/home/cryspc/Cryosparc/cryosparc_worker/bin/cryosparcw
$csw call /usr/bin/env | grep PATH
$csw call python -c "import torch, pycuda.driver; print(f'pycuda version? {pycuda.driver.get_version()}\nTorch version {torch.__version__}\nTorch CUDA available? {torch.cuda.is_available()}')"
$csw call which nvcc
nvidia-smi
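
Because the Python one-liner above imports torch and pycuda.driver in a single statement, a failure in one import hides the status of the other. A minimal sketch of a per-module probe follows. The function name probe_imports is hypothetical and not part of CryoSPARC, and the result is only meaningful when the script runs under the worker environment's interpreter (for example through the cryosparcw call python form used above).

```python
import importlib


def probe_imports(module_names):
    """Try to import each module on its own and report the outcome.

    Returns a dict mapping each module name to "ok" or to a short error string.
    """
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # e.g. ImportError, NameError, OSError
            results[name] = f"{type(exc).__name__}: {exc}"
        else:
            results[name] = "ok"
    return results


if __name__ == "__main__":
    # Module names taken from the one-liner in this thread, plus sympy.
    for name, status in probe_imports(["sympy", "torch", "pycuda.driver"]).items():
        print(f"{name}: {status}")
```

Each module is reported separately, so a broken torch import no longer prevents you from seeing whether pycuda loads.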

Hi @wtempel here are the outputs of the commands you requested.
Thanks for your help!

(base) cryspc@quartet:~$ csw=/home/cryspc/Cryosparc/cryosparc_worker/bin/cryosparcw
(base) cryspc@quartet:~$ $csw call /usr/bin/env | grep PATH
SB_ORIG_PATH=/usr/local/cuda-11.4/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
CRYOSPARC_PATH=/home/cryspc/Cryosparc/cryosparc_worker/bin
SB_ORIG_CLASSPATH=
SB_ORIG_LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64/stubs:/usr/local/cuda-11.4/lib64
PYTHONPATH=/home/cryspc/Cryosparc/cryosparc_worker
SB_ORIG_DYLD_LIBRARY_PATH=
CRYOSPARC_CUDA_PATH=/usr/local/cuda-11.4
LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:/home/cryspc/Cryosparc/cryosparc_worker/deps/external/cudnn/lib:/usr/local/cuda-11.4/lib64/stubs
SB_ORIG_PYTHONPATH=
SB_ORIG_MANPATH=
PATH=/usr/local/cuda-11.4/bin:/home/cryspc/Cryosparc/cryosparc_worker/bin:/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin:/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/condabin:/home/cryspc/anaconda3/bin:/home/cryspc/anaconda3/condabin:/home/cryspc/Cryosparc/cryosparc_master/bin:/software/cryosparc/bin:/programs/x86_64-linux/system/sbgrid_bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/programs/share/bin:/programs/share/sbgrid/bin:/programs/x86_64-linux/sbgrid_installer/latest
(base) cryspc@quartet:~$ $csw call python -c "import torch, pycuda.driver; print(f'pycuda version? {pycuda.driver.get_version()}\nTorch version {torch.__version__}\nTorch CUDA available? {torch.cuda.is_available()}')"
No sympy found
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/__init__.py", line 1465, in <module>
    from . import _meta_registrations
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_meta_registrations.py", line 7, in <module>
    from torch._decomp import _add_op_to_registry, global_decomposition_table, meta_table
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_decomp/__init__.py", line 169, in <module>
    import torch._decomp.decompositions
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_decomp/decompositions.py", line 10, in <module>
    import torch._prims as prims
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_prims/__init__.py", line 33, in <module>
    from torch._subclasses.fake_tensor import FakeTensor, FakeTensorMode
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_subclasses/__init__.py", line 3, in <module>
    from torch._subclasses.fake_tensor import (
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_subclasses/fake_tensor.py", line 13, in <module>
    from torch._guards import Source
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_guards.py", line 78, in <module>
    class ShapeGuard(NamedTuple):
  File "/home/cryspc/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/torch/_guards.py", line 79, in ShapeGuard
    expr: sympy.Expr
NameError: name 'sympy' is not defined
(base) cryspc@quartet:~$ $csw call which nvcc
/usr/local/cuda-11.4/bin/nvcc
(base) cryspc@quartet:~$ nvidia-smi
Failed to initialize NVML: Driver/library version mismatch
(base) cryspc@quartet:~$ 
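
The "No sympy found" line printed just before the traceback suggests that sympy cannot be found in the worker environment, which would explain the later NameError. One hedged way to confirm this without importing torch is to ask Python's import machinery whether it can locate the package. The helper name missing_modules is hypothetical, and the check only tells you whether a top-level module can be located, not whether it imports cleanly.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that this interpreter cannot locate."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["sympy", "torch", "pycuda"])
    print("missing:", ", ".join(missing) if missing else "none")
```

As with the commands above, this needs to run under the worker environment's Python to say anything about the CryoSPARC installation.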

May I suggest:

  1. to check whether the nvidia-smi error shown above (Failed to initialize NVML: Driver/library version mismatch) persists after a system reboot.
  2. to resolve the driver/library version mismatch if it persists. Details depend on your OS (you may want to post the output of uname -a) and on how the current nvidia driver was installed. Ensure that the installed nvidia driver is at least version 450. I’d try v515 or even v530.
  3. then, to confirm that nvidia-smi is working. The preceding steps require root privileges. The following steps must be performed under the non-root Linux account that “owns” the CryoSPARC installation.
  4. then, on the GPU computer, to run
    /home/cryspc/Cryosparc/cryosparc_worker/bin/cryosparcw forcedeps
    
    Proceed to the next step if there are no errors. Otherwise, please post any errors here.
  5. then, to run
    /home/cryspc/Cryosparc/cryosparc_worker/bin/cryosparcw install-3dflex 2>&1 | tee /tmp/install_3dflex.log
    
  6. then, to confirm there are no errors when you run
    /home/cryspc/Cryosparc/cryosparc_worker/bin/cryosparcw call python -c "import torch, pycuda.driver; print(f'pycuda version? {pycuda.driver.get_version()}\nTorch version {torch.__version__}\nTorch CUDA available? {torch.cuda.is_available()}')"
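
Regarding steps 1 to 3: a "Driver/library version mismatch" from nvidia-smi generally means that the loaded kernel module and the user-space NVIDIA libraries come from different driver releases, which often happens when driver packages are updated while the old module is still loaded. The sketch below, under stated assumptions, parses the kernel module version from the text of /proc/driver/nvidia/version, extracts library versions from file names such as libnvidia-ml.so.515.65.01, and checks the result against the minimum of 450 mentioned above. All function names are hypothetical, and the library directory in the usage part is an assumption (a Debian/Ubuntu-style layout) that may not match your system.

```python
import glob
import os
import re

VERSION_RE = re.compile(r"\d+\.\d+(?:\.\d+)?")
LIB_RE = re.compile(r"libnvidia-ml\.so\.(\d+\.\d+(?:\.\d+)?)")


def kernel_module_version(proc_text):
    """Extract the driver version from the contents of /proc/driver/nvidia/version."""
    for line in proc_text.splitlines():
        if line.startswith("NVRM version:"):
            match = VERSION_RE.search(line)
            if match:
                return match.group(0)
    return None


def library_versions(paths):
    """Collect versions from file names like libnvidia-ml.so.515.65.01."""
    versions = set()
    for path in paths:
        match = LIB_RE.fullmatch(os.path.basename(path))
        if match:
            versions.add(match.group(1))
    return versions


def diagnose(proc_text, lib_paths, minimum_major=450):
    """Summarize whether kernel module and library versions agree."""
    kernel = kernel_module_version(proc_text)
    libs = library_versions(lib_paths)
    if kernel is None:
        return "kernel module version not found"
    if libs and kernel not in libs:
        return f"mismatch: kernel module {kernel}, libraries {sorted(libs)}"
    if int(kernel.split(".")[0]) < minimum_major:
        return f"driver {kernel} is older than {minimum_major}"
    return f"ok: driver {kernel}"


if __name__ == "__main__":
    proc_path = "/proc/driver/nvidia/version"
    # Assumed library location; adjust the pattern for your distribution.
    lib_glob = "/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.*"
    if os.path.exists(proc_path):
        with open(proc_path) as handle:
            print(diagnose(handle.read(), glob.glob(lib_glob)))
    else:
        print(f"{proc_path} not found; is the nvidia kernel module loaded?")
```

This is only a quick consistency check. A working nvidia-smi, as in step 3, remains the real test.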
    