@orangeboomerang Please can you also post the top of this file
cryosparcm joblog P1 J3 | head -n 40
Hi, see below.
<groupname>@gpu136:~$ source cryosparc_v4.6.2.sh
(<placeholder_path>) <groupname>@gpu136:~$ cryosparcm joblog P1 J3 | head -n 40
Traceback (most recent call last):
File "<placeholder_path>/cryosparc_worker/deps/anaconda/bin/conda", line 12, in <module>
from conda.cli import main
ModuleNotFoundError: No module named 'conda'
Warning: Could not activate conda environment; this indicates that a cryoSPARC installation is either incomplete or in progress
================= CRYOSPARCW ======= 2025-01-17 13:47:17.930484 =========
Project P1 Job J3
Master <hostname>.local <placeholder_port>
===========================================================================
MAIN PROCESS PID 3210401
========= now starting main process at 2025-01-17 13:47:17.930969
ctf_estimation.run cryosparc_compute.jobs.jobregister
<placeholder_path>/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/site-packages/numba/core/config.py:194: UserWarning: CUDA Python bindings requested (the environment variable NUMBA_CUDA_USE_NVIDIA_BINDING is set), but they are not importable: No module named 'cuda'.
warnings.warn(msg)
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "cryosparc_master/cryosparc_compute/run.py", line 212, in cryosparc_master.cryosparc_compute.run.run
File "<placeholder_path>/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 2441, in get_instance_information
instance_information["driver_version"] = ".".join(map(str, get_driver_version()))
File "<placeholder_path>/cryosparc_worker/cryosparc_compute/get_gpu_info.py", line 13, in get_driver_version
from cryosparc_compute.gpu.driver import get_version
File "<placeholder_path>/cryosparc_worker/cryosparc_compute/gpu/driver.py", line 12, in <module>
from cuda import cuda, cudart, nvrtc
ModuleNotFoundError: No module named 'cuda'
MONITOR PROCESS PID 3210403
***************************************************************
Transparent hugepages setting: [always] madvise never
Running job on hostname %s slurmcluster
Allocated Resources : {'fixed': {'SSD': False}, 'hostname': 'slurmcluster', 'lane': 'slurmcluster', 'lane_type': 'cluster', 'license': True, 'licenses_acquired': 1, 'slots': {'CPU': [0, 1], 'GPU': [0], 'RAM': [0]}, 'target': {'cache_path': '/ssdpool/<groupname>/v4.4', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'custom_var_names': [], 'custom_vars': {}, 'desc': None, 'hostname': 'slurmcluster', 'lane': 'slurmcluster', 'name': 'slurmcluster', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': 'sinfo', 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch {{ script_path_abs }}', 'script_tpl': '#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }} - the complete command string to run the job\n## {{ num_cpu }} - the number of CPUs needed\n## {{ num_gpu }} - the number of GPUs needed. \n## Note: the code will use this many GPUs starting from dev id 0\n## the cluster scheduler or this script have the responsibility\n## of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n## using the correct cluster-allocated GPUs.\n## {{ ram_gb }} - the amount of RAM needed in GB\n## {{ job_dir_abs }} - absolute path to the job directory\n## {{ project_dir_abs }} - absolute path to the project dir\n## {{ job_log_path_abs }} - absolute path to the log file for the job\n## {{ worker_bin_path }} - absolute path to the cryosparc worker command\n## {{ run_args }} - arguments to be passed to cryosparcw run\n## {{ project_uid }} - uid of the project\n## {{ job_uid }} - uid of the job\n## {{ job_creator }} - name of the user that created the job (may contain spaces)\n## {{ cryosparc_username }} - cryosparc username of the user that created the job (usually an email)\n## {{ job_type }} - CryoSPARC job type\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH --ntasks=2\n#SBATCH --partition=gpu\n#SBATCH --mem=40000MB\n#SBATCH -o <placeholder_path>/cryosparc_slurm_outputs/output_{{ project_uid }}_{{ job_uid }}.txt\n#SBATCH -e <placeholder_path>/cryosparc_slurm_outputs/error_{{ project_uid }}_{{ job_uid }}.txt\n#SBATCH --exclude=gpu280,gpu279,gpu278,gpu281,gpu139,gpu227,gpu228,gpu138,gpu150,gpu148,gpu145\n#SBATCH --time=240:00:00\n#SBATCH --constraint=bookworm # debian12\n\necho $available_devs\necho $CUDA_HOME\necho "$(hostname)"\necho $SLURM_TMPDIR\n\n/usr/bin/nvidia-smi\n\nmodule list\n\nexport CRYOSPARC_SSD_PATH="${SLURM_TMPDIR}"\n\n{{ run_cmd }}\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'slurmcluster', 'tpl_vars': ['cluster_job_id', 'ram_gb', 'project_dir_abs', 'run_args', 'run_cmd', 'command', 'job_log_path_abs', 'worker_bin_path', 'cryosparc_username', 'job_dir_abs', 'num_cpu', 'project_uid', 'job_uid', 'num_gpu', 'job_creator', 'job_type'], 'type': 'cluster', 'worker_bin_path': '<placeholder_path>/cryosparc_worker/bin/cryosparcw'}}
Process Process-1:
Traceback (most recent call last):
File "<placeholder_path>/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "<placeholder_path>/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "<placeholder_path>/cryosparc_worker/cryosparc_compute/jobs/pipeline.py", line 199, in process_work_simple
process_setup(proc_idx) # do any setup you want on a per-process basis
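As an aside (a sketch, not part of the original post): the "No module named 'cuda'" traceback above can be checked outside of a job by importing the bindings with the worker environment's own Python. The env path below assumes the default cryosparc_worker_env location; adjust the placeholder prefix to the actual install.

# Sketch: verify the cuda-python bindings import in the worker environment.
# Assumes the default env location under deps/anaconda/envs/cryosparc_worker_env.
WORKER_PY=<placeholder_path>/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/bin/python
"$WORKER_PY" -c "from cuda import cuda, cudart, nvrtc; print('cuda-python bindings OK')"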
Thanks @orangeboomerang . Please can you confirm that the worker_bin_path (in your cluster script template) points to the intended cryosparc_worker/ installation, and post the target of the shebang path in that installation's deps/anaconda/bin/conda script?
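A minimal way to check this (a sketch; the install prefix is a placeholder, not taken from the thread):

# Print the shebang line of the worker's conda entry point; it should name
# a Python interpreter that actually exists inside this cryosparc_worker tree.
head -n 1 <placeholder_path>/cryosparc_worker/deps/anaconda/bin/conda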
Thanks @wtempel , we resolved the issue and will update soon with more information. Briefly, my understanding is that old installation files in the same location may have been interfering.
It seems there was interference from files left over from previous installation attempts (when the install path was too long). That older installation has been removed, and the user (@orangeboomerang here) has confirmed that the current installation is working.
Accordingly, this issue can be closed.
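For context, one plausible reading (not stated explicitly above): Linux truncates shebang lines beyond roughly 127 characters on older kernels (256 on newer ones), so an overly long install prefix can leave deps/anaconda/bin/conda pointing at a truncated or wrong interpreter, which would produce exactly the "No module named 'conda'" warning seen in the job log. A quick sketch to check for that:

# Sketch: warn if the conda entry point's shebang exceeds the common kernel limit.
CONDA_BIN=<placeholder_path>/cryosparc_worker/deps/anaconda/bin/conda   # placeholder prefix
SHEBANG=$(head -n 1 "$CONDA_BIN")
echo "shebang: $SHEBANG (${#SHEBANG} characters)"
[ "${#SHEBANG}" -gt 127 ] && echo "WARNING: shebang longer than 127 characters; may be truncated by the kernel"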