4.2.1 - worker update fails: error pycuda

Greetings,

I’m once again trying to update our cluster to version 4 from stable 3.4. The master process is fine, but installing the worker process results in the error below.

I’d prefer to complete the upgrade, as we’re eager to use the new features in v4. Can you offer any ideas or a solution to the error?

-------------------------------
 ------------------------------------------------------------------------
  Preparing to install all pip packages...
  ------------------------------------------------------------------------
DEPRECATION: --no-binary currently disables reading from the cache of locally built wheels. In the future --no-binary will not influence the wheel cache. pip 23.1 will enforce this behaviour change. A possible replacement is to use the --no-cache-dir option. You can use the flag --use-feature=no-binary-enable-wheel-cache to test the upcoming behaviour. Discussion can be found at https://github.com/pypa/pip/issues/11453
Processing ./deps_bundle/python/python_packages/pip_packages/pycuda-2020.1.tar.gz
  Preparing metadata (setup.py) ... done
Installing collected packages: pycuda
  DEPRECATION: pycuda is being installed using the legacy 'setup.py install' method, because the '--no-binary' option was enabled for it and this currently disables local wheel building for projects that don't have a 'pyproject.toml' file. pip 23.1 will enforce this behaviour change. A possible replacement is to enable the '--use-pep517' option. Discussion can be found at https://github.com/pypa/pip/issues/11451
  Running setup.py install for pycuda ... error
  error: subprocess-exited-with-error

  × Running setup.py install for pycuda did not run successfully.
  │ exit code: 1
  ╰─> [6718 lines of output]
      *************************************************************
      *** I have detected that you have not run configure.py.
      *************************************************************
      *** Additionally, no global config files were found.
      *** I will go ahead with the default configuration.
      *** In all likelihood, this will not work out.
      ***
      *** See README_SETUP.txt for more information.
      ***
      *** If the build does fail, just re-run configure.py with the
      *** correct arguments, and then retry. Good luck!
      *************************************************************
      *** HIT Ctrl-C NOW IF THIS IS NOT WHAT YOU WANT
      *************************************************************
      Continuing in 10 seconds...

**** then hundreds of lines of various messages ending with:

     gcc -pthread -B /opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/compiler_compat -Wno-unused-result -Wsign-compare -fwrapv -Wall -O3 -DNDEBUG -fPIC -DBOOST_ALL_NO_LIB=1 -DBOOST_THREAD_BUILD_DLL=1 -DBOOST_MULTI_INDEX_DISABLE_SERIALIZATION=1 -DBOOST_PYTHON_SOURCE=1 -Dboost=pycudaboost -DBOOST_THREAD_DONT_USE_CHRONO=1 -DPYGPU_PACKAGE=pycuda -DPYGPU_PYCUDA=1 -DHAVE_CURAND=1 -Isrc/cpp -Ibpl-subset/bpl_subset -I/cm/shared/apps/cuda11.8/toolkit/11.8.0/include -I/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/site-packages/numpy/core/include -I/opt/cryoem/cryosparc/cryosparc2_worker/deps/anaconda/envs/cryosparc_worker_env/include/python3.8 -c src/cpp/cuda.cpp -o build/temp.linux-x86_64-cpython-38/src/cpp/cuda.o
      In file included from src/cpp/cuda.cpp:4:
      src/cpp/cuda.hpp:23:10: fatal error: cudaProfiler.h: No such file or directory
       #include <cudaProfiler.h>
                ^~~~~~~~~~~~~~~~
      compilation terminated.
      error: command '/cm/local/apps/gcc/8.2.0/bin/gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> pycuda

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
check_install_deps.sh: 66: ERROR: installing python failed.

What are the outputs of

ls -l /cm/shared/apps/cuda11.8/toolkit/11.8.0/
ls -l /cm/shared/apps/cuda11.8/toolkit/11.8.0/include/

Somewhat related: we recently became aware of performance issues on CryoSPARC instances configured with CUDA-11.8 that may be avoided by configuring CUDA-11.7 instead.

Greetings.

Here is the output requested:

[root@vision ~]# ls -l /cm/shared/apps/cuda11.8/toolkit/11.8.0/
total 236
drwxr-xr-x 3 root root  4096 Oct 24 12:41 bin
drwxr-xr-x 3 root root    16 Oct 24 12:41 C
drwxr-xr-x 2 root root  4096 Oct 24 12:41 compat
drwxr-xr-x 5 root root  4096 Oct 24 12:41 compute-sanitizer
-rw-r--r-- 1 root root 80660 Oct 18 10:06 CUDA_Toolkit_Release_Notes.txt
-rw-r--r-- 1 root root   160 Oct 18 10:06 DOCS
drwxr-xr-x 3 root root    25 Oct 24 12:41 etc
-rw-r--r-- 1 root root 61498 Oct 18 10:06 EULA.txt
drwxr-xr-x 5 root root    84 Oct 24 12:41 extras
drwxr-xr-x 4 root root    78 Oct 24 12:41 gds
lrwxrwxrwx 1 root root    28 Oct 24 12:41 include -> targets/x86_64-linux/include
lrwxrwxrwx 1 root root    24 Oct 24 12:41 lib64 -> targets/x86_64-linux/lib
-rw-r--r-- 1 root root 61498 Oct 18 10:06 LICENSE
drwxr-xr-x 3 root root    17 Oct 24 12:41 man
drwxr-xr-x 3 root root    20 Oct 24 12:41 nvml
drwxr-xr-x 7 root root    80 Oct 24 12:41 nvvm
-rw-r--r-- 1 root root   524 Oct 18 10:06 README
drwxr-xr-x 5 root root    41 Oct 24 12:41 share
drwxr-xr-x 2 root root  4096 Oct 24 12:41 src
drwxr-xr-x 3 root root    25 Oct 24 12:41 targets
drwxr-xr-x 2 root root    42 Oct 24 12:44 tools
-r--r--r-- 1 root root  2929 Oct 18 10:07 version.json
[root@vision ~]# ls -l /cm/shared/apps/cuda11.8/toolkit/11.8.0/include/
total 13292
-rw-r--r--  1 root root    3150 Oct 18 10:06 builtin_types.h
-rw-r--r--  1 root root   22595 Oct 18 10:06 channel_descriptor.h
drwxr-xr-x  2 root root     127 Oct 24 12:41 CL
-rw-r--r--  1 root root    3410 Oct 18 10:06 common_functions.h
drwxr-xr-x  3 root root      69 Oct 24 12:41 cooperative_groups
-rw-r--r--  1 root root   66376 Oct 18 10:06 cooperative_groups.h
drwxr-xr-x  2 root root    4096 Oct 24 12:41 crt
drwxr-xr-x 11 root root    4096 Oct 24 12:41 cub
-rw-r--r--  1 root root  220681 Oct 18 10:06 cublas_api.h
-rw-r--r--  1 root root   41246 Oct 18 10:06 cublas.h
-rw-r--r--  1 root root   79035 Oct 18 10:06 cublasLt.h
-rw-r--r--  1 root root    8811 Oct 18 10:06 cublas_v2.h
-rw-r--r--  1 root root   37380 Oct 18 10:06 cublasXt.h
-rw-r--r--  1 root root   12186 Oct 18 10:06 cuComplex.h
drwxr-xr-x  3 root root     106 Oct 24 12:41 cuda
-rw-r--r--  1 root root    7600 Oct 18 10:06 cuda_awbarrier.h
-rw-r--r--  1 root root   12227 Oct 18 10:06 cuda_awbarrier_helpers.h
-rw-r--r--  1 root root    3993 Oct 18 10:06 cuda_awbarrier_primitives.h
-rw-r--r--  1 root root  139379 Oct 18 10:06 cuda_bf16.h
-rw-r--r--  1 root root  101353 Oct 18 10:06 cuda_bf16.hpp
-rw-r--r--  1 root root   16574 Oct 18 10:06 cuda_device_runtime_api.h
-rw-r--r--  1 root root   39544 Oct 18 10:06 cudaEGL.h
-rw-r--r--  1 root root   37109 Oct 18 10:06 cuda_egl_interop.h
-rw-r--r--  1 root root    5645 Oct 18 10:06 cudaEGLTypedefs.h
-rw-r--r--  1 root root  132563 Oct 18 10:06 cuda_fp16.h
-rw-r--r--  1 root root   93580 Oct 18 10:06 cuda_fp16.hpp
-rw-r--r--  1 root root   13358 Oct 18 10:06 cuda_fp8.h
-rw-r--r--  1 root root   56491 Oct 18 10:06 cuda_fp8.hpp
-rw-r--r--  1 root root   22401 Oct 18 10:06 cudaGL.h
-rw-r--r--  1 root root   18961 Oct 18 10:06 cuda_gl_interop.h
-rw-r--r--  1 root root    6576 Oct 18 10:06 cudaGLTypedefs.h
-rw-r--r--  1 root root  840786 Oct 18 10:06 cuda.h
-rw-r--r--  1 root root    4105 Oct 18 10:06 cudalibxt.h
-rw-r--r--  1 root root   67179 Oct 18 10:06 cuda_occupancy.h
-rw-r--r--  1 root root    8130 Oct 18 10:06 cuda_pipeline.h
-rw-r--r--  1 root root   13828 Oct 18 10:06 cuda_pipeline_helpers.h
-rw-r--r--  1 root root    8675 Oct 18 10:06 cuda_pipeline_primitives.h
-rw-r--r--  1 root root    3297 Oct 18 10:06 cudaProfilerTypedefs.h
-rw-r--r--  1 root root    2717 Oct 18 10:06 cudart_platform.h
-rw-r--r--  1 root root  569590 Oct 18 10:06 cuda_runtime_api.h
-rw-r--r--  1 root root  113013 Oct 18 10:06 cuda_runtime.h
-rw-r--r--  1 root root    4093 Oct 18 10:06 cuda_stdint.h
-rw-r--r--  1 root root    4276 Oct 18 10:06 cuda_surface_types.h
-rw-r--r--  1 root root    4781 Oct 18 10:06 cuda_texture_types.h
-rw-r--r--  1 root root   93945 Oct 18 10:06 cudaTypedefs.h
-rw-r--r--  1 root root   12694 Oct 18 10:06 cudaVDPAU.h
-rw-r--r--  1 root root    7631 Oct 18 10:06 cuda_vdpau_interop.h
-rw-r--r--  1 root root    4144 Oct 18 10:06 cudaVDPAUTypedefs.h
-rw-r--r--  1 root root   12570 Oct 18 10:06 cufft.h
-rw-r--r--  1 root root   19412 Oct 18 10:06 cufftw.h
-rw-r--r--  1 root root   11463 Oct 18 10:06 cufftXt.h
-rw-r--r--  1 root root   22187 Oct 18 10:06 cufile.h
-rw-r--r--  1 root root  303452 Oct 18 10:06 cupti_activity.h
-rw-r--r--  1 root root   26602 Oct 18 10:06 cupti_callbacks.h
-rw-r--r--  1 root root    5264 Oct 18 10:06 cupti_checkpoint.h
-rw-r--r--  1 root root   66916 Oct 18 10:06 cupti_driver_cbid.h
-rw-r--r--  1 root root   52639 Oct 18 10:06 cupti_events.h
-rw-r--r--  1 root root    4697 Oct 18 10:06 cupti.h
-rw-r--r--  1 root root   32148 Oct 18 10:06 cupti_metrics.h
-rw-r--r--  1 root root    5912 Oct 18 10:06 cupti_nvtx_cbid.h
-rw-r--r--  1 root root   30910 Oct 18 10:06 cupti_pcsampling.h
-rw-r--r--  1 root root   13060 Oct 18 10:06 cupti_pcsampling_util.h
-rw-r--r--  1 root root   31487 Oct 18 10:06 cupti_profiler_target.h
-rw-r--r--  1 root root   12026 Oct 18 10:06 cupti_result.h
-rw-r--r--  1 root root   43104 Oct 18 10:06 cupti_runtime_cbid.h
-rw-r--r--  1 root root    1263 Oct 18 10:06 cupti_target.h
-rw-r--r--  1 root root    4317 Oct 18 10:06 cupti_version.h
-rw-r--r--  1 root root   10883 Oct 18 10:06 curand_discrete2.h
-rw-r--r--  1 root root    3486 Oct 18 10:06 curand_discrete.h
-rw-r--r--  1 root root    3717 Oct 18 10:06 curand_globals.h
-rw-r--r--  1 root root   43965 Oct 18 10:06 curand.h
-rw-r--r--  1 root root   52714 Oct 18 10:06 curand_kernel.h
-rw-r--r--  1 root root   28142 Oct 18 10:06 curand_lognormal.h
-rw-r--r--  1 root root  170296 Oct 18 10:06 curand_mrg32k3a.h
-rw-r--r--  1 root root  276889 Oct 18 10:06 curand_mtgp32dc_p_11213.h
-rw-r--r--  1 root root    7845 Oct 18 10:06 curand_mtgp32.h
-rw-r--r--  1 root root   18266 Oct 18 10:06 curand_mtgp32_host.h
-rw-r--r--  1 root root   13710 Oct 18 10:06 curand_mtgp32_kernel.h
-rw-r--r--  1 root root   26926 Oct 18 10:06 curand_normal.h
-rw-r--r--  1 root root    4649 Oct 18 10:06 curand_normal_static.h
-rw-r--r--  1 root root    7146 Oct 18 10:06 curand_philox4x32_x.h
-rw-r--r--  1 root root   25409 Oct 18 10:06 curand_poisson.h
-rw-r--r--  1 root root 1392393 Oct 18 10:06 curand_precalc.h
-rw-r--r--  1 root root   17472 Oct 18 10:06 curand_uniform.h
-rw-r--r--  1 root root    8825 Oct 18 10:06 cusolver_common.h
-rw-r--r--  1 root root  147406 Oct 18 10:06 cusolverDn.h
-rw-r--r--  1 root root   11549 Oct 18 10:06 cusolverMg.h
-rw-r--r--  1 root root   14292 Oct 18 10:06 cusolverRf.h
-rw-r--r--  1 root root   32561 Oct 18 10:06 cusolverSp.h
-rw-r--r--  1 root root   37495 Oct 18 10:06 cusolverSp_LOWLEVEL_PREVIEW.h
-rw-r--r--  1 root root  334453 Oct 18 10:06 cusparse.h
-rw-r--r--  1 root root    2587 Oct 18 10:06 cusparse_v2.h
-rw-r--r--  1 root root   11359 Oct 18 10:06 device_atomic_functions.h
-rw-r--r--  1 root root    8149 Oct 18 10:06 device_atomic_functions.hpp
-rw-r--r--  1 root root    3452 Oct 18 10:06 device_double_functions.h
-rw-r--r--  1 root root    3410 Oct 18 10:06 device_functions.h
-rw-r--r--  1 root root    3846 Oct 18 10:06 device_launch_parameters.h
-rw-r--r--  1 root root    3588 Oct 18 10:06 device_types.h
-rw-r--r--  1 root root    4625 Oct 18 10:06 driver_functions.h
-rw-r--r--  1 root root  141075 Oct 18 10:06 driver_types.h
-rw-r--r--  1 root root    1809 Oct 18 10:06 fatbinary_section.h
-rw-r--r--  1 root root    2250 Oct 18 10:06 generated_cuda_gl_interop_meta.h
-rw-r--r--  1 root root    3115 Oct 18 10:06 generated_cudaGL_meta.h
-rw-r--r--  1 root root   77263 Oct 18 10:06 generated_cuda_meta.h
-rw-r--r--  1 root root    1695 Oct 18 10:06 generated_cudart_removed_meta.h
-rw-r--r--  1 root root   64771 Oct 18 10:06 generated_cuda_runtime_api_meta.h
-rw-r--r--  1 root root    1367 Oct 18 10:06 generated_cuda_vdpau_interop_meta.h
-rw-r--r--  1 root root    1453 Oct 18 10:06 generated_cudaVDPAU_meta.h
-rw-r--r--  1 root root    7513 Oct 18 10:06 generated_nvtx_meta.h
-rw-r--r--  1 root root    3380 Oct 18 10:06 host_config.h
-rw-r--r--  1 root root    3386 Oct 18 10:06 host_defines.h
-rw-r--r--  1 root root    4766 Oct 18 10:06 library_types.h
-rw-r--r--  1 root root    7608 Oct 18 10:06 math_constants.h
-rw-r--r--  1 root root    3398 Oct 18 10:06 math_functions.h
-rw-r--r--  1 root root    2932 Oct 18 10:06 mma.h
-rw-r--r--  1 root root    7328 Oct 18 10:06 nppcore.h
-rw-r--r--  1 root root   30498 Oct 18 10:06 nppdefs.h
-rw-r--r--  1 root root    3204 Oct 18 10:06 npp.h
-rw-r--r--  1 root root 1006080 Oct 18 10:06 nppi_arithmetic_and_logical_operations.h
-rw-r--r--  1 root root  599888 Oct 18 10:06 nppi_color_conversion.h
-rw-r--r--  1 root root  329090 Oct 18 10:06 nppi_data_exchange_and_initialization.h
-rw-r--r--  1 root root 1030240 Oct 18 10:06 nppi_filtering_functions.h
-rw-r--r--  1 root root  332924 Oct 18 10:06 nppi_geometry_transforms.h
-rw-r--r--  1 root root    3425 Oct 18 10:06 nppi.h
-rw-r--r--  1 root root    5943 Oct 18 10:06 nppi_linear_transforms.h
-rw-r--r--  1 root root  131084 Oct 18 10:06 nppi_morphological_operations.h
-rw-r--r--  1 root root 1026876 Oct 18 10:06 nppi_statistics_functions.h
-rw-r--r--  1 root root   13574 Oct 18 10:06 nppi_support_functions.h
-rw-r--r--  1 root root  214915 Oct 18 10:06 nppi_threshold_and_compare_operations.h
-rw-r--r--  1 root root  249037 Oct 18 10:06 npps_arithmetic_and_logical_operations.h
-rw-r--r--  1 root root   54275 Oct 18 10:06 npps_conversion_functions.h
-rw-r--r--  1 root root    3718 Oct 18 10:06 npps_filtering_functions.h
-rw-r--r--  1 root root    3251 Oct 18 10:06 npps.h
-rw-r--r--  1 root root   21085 Oct 18 10:06 npps_initialization.h
-rw-r--r--  1 root root  279481 Oct 18 10:06 npps_statistics_functions.h
-rw-r--r--  1 root root    7875 Oct 18 10:06 npps_support_functions.h
drwxr-xr-x  3 root root      32 Oct 24 12:41 nv
-rw-r--r--  1 root root   23341 Oct 18 10:06 nvblas.h
-rw-r--r--  1 root root    1912 Oct 18 10:06 nv_decode.h
-rw-r--r--  1 root root    2975 Oct 18 10:06 nvfunctional
-rw-r--r--  1 root root   33444 Oct 18 10:06 nvjpeg.h
-rw-r--r--  1 root root  486176 Oct 18 10:06 nvml.h
-rw-r--r--  1 root root   10422 Oct 18 10:06 nvperf_common.h
-rw-r--r--  1 root root    8299 Oct 18 10:06 nvperf_cuda_host.h
-rw-r--r--  1 root root   63707 Oct 18 10:06 nvperf_host.h
-rw-r--r--  1 root root   21414 Oct 18 10:06 nvperf_target.h
-rw-r--r--  1 root root   11859 Oct 18 10:06 nvPTXCompiler.h
-rw-r--r--  1 root root   30224 Oct 18 10:06 nvrtc.h
-rw-r--r--  1 root root    6009 Oct 18 10:06 nvToolsExtCuda.h
-rw-r--r--  1 root root    5192 Oct 18 10:06 nvToolsExtCudaRt.h
-rw-r--r--  1 root root   53680 Oct 18 10:06 nvToolsExt.h
-rw-r--r--  1 root root    8360 Oct 18 10:06 nvToolsExtOpenCL.h
-rw-r--r--  1 root root   14562 Oct 18 10:06 nvToolsExtSync.h
drwxr-xr-x  3 root root     138 Oct 24 12:41 nvtx3
drwxr-xr-x  2 root root      28 Oct 24 12:41 Openacc
drwxr-xr-x  2 root root      45 Oct 24 12:41 Openmp
-rw-r--r--  1 root root    4342 Oct 18 10:06 sm_20_atomic_functions.h
-rw-r--r--  1 root root    3929 Oct 18 10:06 sm_20_atomic_functions.hpp
-rw-r--r--  1 root root   50660 Oct 18 10:06 sm_20_intrinsics.h
-rw-r--r--  1 root root    7694 Oct 18 10:06 sm_20_intrinsics.hpp
-rw-r--r--  1 root root   15845 Oct 18 10:06 sm_30_intrinsics.h
-rw-r--r--  1 root root   24480 Oct 18 10:06 sm_30_intrinsics.hpp
-rw-r--r--  1 root root    6540 Oct 18 10:06 sm_32_atomic_functions.h
-rw-r--r--  1 root root    5377 Oct 18 10:06 sm_32_atomic_functions.hpp
-rw-r--r--  1 root root   33197 Oct 18 10:06 sm_32_intrinsics.h
-rw-r--r--  1 root root   70577 Oct 18 10:06 sm_32_intrinsics.hpp
-rw-r--r--  1 root root    2909 Oct 18 10:06 sm_35_atomic_functions.h
-rw-r--r--  1 root root    2952 Oct 18 10:06 sm_35_intrinsics.h
-rw-r--r--  1 root root   20606 Oct 18 10:06 sm_60_atomic_functions.h
-rw-r--r--  1 root root   15057 Oct 18 10:06 sm_60_atomic_functions.hpp
-rw-r--r--  1 root root    5991 Oct 18 10:06 sm_61_intrinsics.h
-rw-r--r--  1 root root    6748 Oct 18 10:06 sm_61_intrinsics.hpp
-rw-r--r--  1 root root   19773 Oct 18 10:06 surface_functions.h
-rw-r--r--  1 root root   11930 Oct 18 10:06 surface_indirect_functions.h
-rw-r--r--  1 root root    4653 Oct 18 10:06 surface_types.h
-rw-r--r--  1 root root   32714 Oct 18 10:06 texture_fetch_functions.h
-rw-r--r--  1 root root   23039 Oct 18 10:06 texture_indirect_functions.h
-rw-r--r--  1 root root    9058 Oct 18 10:06 texture_types.h
drwxr-xr-x  9 root root    4096 Oct 24 12:41 thrust
-rw-r--r--  1 root root    7847 Oct 18 10:06 vector_functions.h
-rw-r--r--  1 root root   10060 Oct 18 10:06 vector_functions.hpp
-rw-r--r--  1 root root   13206 Oct 18 10:06 vector_types.h

Interesting note about 11.8. Is there a fix coming for this, or should we consider switching to an earlier version?

Interesting. This listing is missing:

cuda_profiler_api.h
cudaProfiler.h

and possibly other files that are present in a runfile-based installation (performed without root privileges) that I looked at.

If your installation is, directly or indirectly, package manager-based, could a “profiler” package be missing (similar to this discussion)?

We are working on a fix for the 11.8 issue. Because we have not yet determined a release date, I recommend using an earlier toolkit version.

If you meet the requirements shown here, you could try

/cryosparcw install-3dflex

which currently would (among other things) download and install version 11.7 of the toolkit.
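
For anyone who hits the same build error, here is a minimal sketch of a check for the two headers named above, to run before retrying the worker install. The function name check_cuda_headers is hypothetical (it is not part of CryoSPARC or pycuda), and the list covers only the two files mentioned in this thread; other headers may also be needed.

```shell
#!/bin/sh
# Hypothetical helper, not part of CryoSPARC or pycuda.
# Reports whether the two profiler headers discussed in this thread
# exist in a CUDA toolkit include directory.
# Returns 0 if both are present, 1 if at least one is missing.
check_cuda_headers() {
    include_dir="$1"
    missing=0
    for header in cudaProfiler.h cuda_profiler_api.h; do
        if [ -f "$include_dir/$header" ]; then
            echo "found:   $header"
        else
            echo "MISSING: $header"
            missing=1
        fi
    done
    return "$missing"
}

# Example call, using the include path from the build log above:
# check_cuda_headers /cm/shared/apps/cuda11.8/toolkit/11.8.0/include
```

A non-zero return status suggests the toolkit installation is incomplete for building pycuda, in which case a different toolkit path (or a repaired installation) is worth trying first.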

Hmm, indeed they are missing. I will correct this problem, but I’m also going to roll back to CUDA 11.2 based on your other advice. It is already installed, and both cuda_profiler_api.h and cudaProfiler.h are present, as indicated.

This appears to have built the worker process correctly. I will ask users to test and report back.
