@orangeboomerang Please can you also post the top of this file
cryosparcm joblog P1 J3 | head -n 40
Hi, see below.
<groupname>@gpu136:~$ source cryosparc_v4.6.2.sh
(<placeholder_path>) <groupname>@gpu136:~$ cryosparcm joblog P1 J3 | head -n 40
Traceback (most recent call last):
File "<placeholder_path>/cryosparc_worker/deps/anaconda/bin/conda", line 12, in <module>
from conda.cli import main
ModuleNotFoundError: No module named 'conda'
Warning: Could not activate conda environment; this indicates that a cryoSPARC installation is either incomplete or in progress
================= CRYOSPARCW ======= 2025-01-17 13:47:17.930484 =========
Project P1 Job J3
Master <hostname>.local <placeholder_port>
===========================================================================
MAIN PROCESS PID 3210401
========= now starting main process at 2025-01-17 13:47:17.930969
ctf_estimation.run cryosparc_compute.jobs.jobregister
<placeholder_path>/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/numba/core/config.py:194: UserWarning: CUDA Python bindings requested (the environment variable NUMBA_CUDA_USE_NVIDIA_BINDING is set), but they are not importable: No module named 'cuda'.
warnings.warn(msg)
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "cryosparc_master/cryosparc_compute/run.py", line 212, in cryosparc_master.cryosparc_compute.run.run
File "<placeholder_path>/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2441, in get_instance_information
instance_information["driver_version"] = ".".join(map(str, get_driver_version()))
File "<placeholder_path>/cryosparc_worker/cryosparc_compute/get_gpu_info.py", line 13, in get_driver_version
from cryosparc_compute.gpu.driver import get_version
File "<placeholder_path>/cryosparc_worker/cryosparc_compute/gpu/driver.py", line 12, in <module>
from cuda import cuda, cudart, nvrtc
ModuleNotFoundError: No module named 'cuda'
MONITOR PROCESS PID 3210403
***************************************************************
Transparent hugepages setting: [always] madvise never
Running job on hostname %s slurmcluster
Allocated Resources : {'fixed': {'SSD': False}, 'hostname': 'slurmcluster', 'lane': 'slurmcluster', 'lane_type': 'cluster', 'license': True, 'licenses_acquired': 1, 'slots': {'CPU': [0, 1], 'GPU': [0], 'RAM': [0]}, 'target': {'cache_path': '/ssdpool/<groupname>/v4.4', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'custom_var_names': [], 'custom_vars': {}, 'desc': None, 'hostname': 'slurmcluster', 'lane': 'slurmcluster', 'name': 'slurmcluster', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': 'sinfo', 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch {{ script_path_abs }}', 'script_tpl': '#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }} - the complete command string to run the job\n## {{ num_cpu }} - the number of CPUs needed\n## {{ num_gpu }} - the number of GPUs needed. \n## Note: the code will use this many GPUs starting from dev id 0\n## the cluster scheduler or this script have the responsibility\n## of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n## using the correct cluster-allocated GPUs.\n## {{ ram_gb }} - the amount of RAM needed in GB\n## {{ job_dir_abs }} - absolute path to the job directory\n## {{ project_dir_abs }} - absolute path to the project dir\n## {{ job_log_path_abs }} - absolute path to the log file for the job\n## {{ worker_bin_path }} - absolute path to the cryosparc worker command\n## {{ run_args }} - arguments to be passed to cryosparcw run\n## {{ project_uid }} - uid of the project\n## {{ job_uid }} - uid of the job\n## {{ job_creator }} - name of the user that created the job (may contain spaces)\n## {{ cryosparc_username }} - cryosparc username of the user that created the job (usually an email)\n## {{ job_type }} - CryoSPARC job type\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH --ntasks=2\n#SBATCH --partition=gpu\n#SBATCH --mem=40000MB\n#SBATCH -o <placeholder_path>/cryosparc_slurm_outputs/output_{{ project_uid }}_{{ job_uid }}.txt\n#SBATCH -e <placeholder_path>/cryosparc_slurm_outputs/error_{{ project_uid }}_{{ job_uid }}.txt\n#SBATCH --exclude=gpu280,gpu279,gpu278,gpu281,gpu139,gpu227,gpu228,gpu138,gpu150,gpu148,gpu145\n#SBATCH --time=240:00:00\n#SBATCH --constraint=bookworm # debian12\n\necho $available_devs\necho $CUDA_HOME\necho "$(hostname)"\necho $SLURM_TMPDIR\n\n/usr/bin/nvidia-smi\n\nmodule list\n\nexport CRYOSPARC_SSD_PATH="${SLURM_TMPDIR}"\n\n{{ run_cmd }}\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'slurmcluster', 'tpl_vars': ['cluster_job_id', 'ram_gb', 'project_dir_abs', 'run_args', 'run_cmd', 'command', 'job_log_path_abs', 'worker_bin_path', 'cryosparc_username', 'job_dir_abs', 'num_cpu', 'project_uid', 'job_uid', 'num_gpu', 'job_creator', 'job_type'], 'type': 'cluster', 'worker_bin_path': '<placeholder_path>/cryosparc_worker/bin/cryosparcw'}}
Process Process-1:
Traceback (most recent call last):
File "<placeholder_path>/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "<placeholder_path>/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "<placeholder_path>/cryosparc_worker/cryosparc_compute/jobs/pipeline.py", line 199, in process_work_simple
process_setup(proc_idx) # do any setup you want on a per-process basis
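As an aside (a sketch, not part of the original post): the "No module named 'cuda'" traceback above can be checked outside of a job by importing the bindings with the worker environment's own Python. The env path below assumes the default cryosparc_worker_env location; adjust the placeholder prefix to the actual install.

# Sketch: verify the cuda-python bindings import in the worker environment.
# Assumes the default env location under deps/anaconda/envs/cryosparc_worker_env.
WORKER_PY=<placeholder_path>/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/python
"$WORKER_PY" -c "from cuda import cuda, cudart, nvrtc; print('cuda-python bindings OK')"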
Thanks @orangeboomerang . Please can you confirm that the worker_bin_path (in your cluster script template) points to the intended cryosparc_worker/ installation, and post the target of the shebang path in that installation's deps/anaconda/bin/conda script?
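A minimal way to check this (a sketch; the install prefix is a placeholder, not taken from the thread):

# Print the shebang line of the worker's conda entry point; it should name
# a Python interpreter that actually exists inside this cryosparc_worker tree.
head -n 1 <placeholder_path>/cryosparc_worker/deps/anaconda/bin/conda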
Thanks @wtempel , we resolved the issue and will update soon with more information. Briefly, my understanding is that old installation files in the same location may have been interfering.
It seems there was interference from files left over from previous installation attempts (when the install path was too long). That older installation has been removed, and the user (@orangeboomerang here) has confirmed that the current installation is working.
Accordingly, this issue can be closed.
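For context, one plausible reading (not stated explicitly above): Linux truncates shebang lines beyond roughly 127 characters on older kernels (256 on newer ones), so an overly long install prefix can leave deps/anaconda/bin/conda pointing at a truncated or wrong interpreter, which would produce exactly the "No module named 'conda'" warning seen in the job log. A quick sketch to check for that:

# Sketch: warn if the conda entry point's shebang exceeds the common kernel limit.
CONDA_BIN=<placeholder_path>/cryosparc_worker/deps/anaconda/bin/conda   # placeholder prefix
SHEBANG=$(head -n 1 "$CONDA_BIN")
echo "shebang: $SHEBANG (${#SHEBANG} characters)"
[ "${#SHEBANG}" -gt 127 ] && echo "WARNING: shebang longer than 127 characters; may be truncated by the kernel"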