Hi,
Thank you for the release of cryoSPARC v2. I've been looking forward to this release very much.
However, I'm having a small problem installing the cluster setup.
I have come as far as installing the master and the cluster worker, but when I try to connect the master to the cluster with cluster_info.json and cluster_script.sh, it fails with:
On master:
-bash-4.2$ cryosparcm cluster connect
Traceback (most recent call last):
  File "<string>", line 5, in <module>
  File "/opt/bioxray/programs/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/json/__init__.py", line 291, in load
    **kw)
  File "/opt/bioxray/programs/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/opt/bioxray/programs/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/bioxray/programs/cryosparc2/cryosparc2_master/deps/anaconda/lib/python2.7/json/decoder.py", line 380, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 5 column 5 (char 142)
This is on CentOS 7.4 and the master node is non-GPU.
My cluster files look like this:
-bash-4.2$ cat cluster_info.json
{
    "name" : "EMCC",
    "worker_bin_path" : "/opt/bioxray/programs/cryosparc2/cryosparc2_worker/bin/cryosparcw",
    "cache_path" : ""
    "send_cmd_tpl" : "ssh loginnode {{ command }}",
    "qsub_cmd_tpl" : "sbatch {{ script_path_abs }}",
    "qstat_cmd_tpl" : "squeue -j {{ cluster_job_id }}",
    "qdel_cmd_tpl" : "scancel {{ cluster_job_id }}",
    "qinfo_cmd_tpl" : "sinfo",
    "transfer_cmd_tpl" : "scp {{ src_path }} loginnode:{{ dest_path }}"
}
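In case it is useful, my understanding is that the JSON syntax of this file can also be checked on its own with Python's built-in json.tool module (just a generic sanity check on my part, not a cryoSPARC command; run from the directory holding the file):

python -m json.tool cluster_info.json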
-bash-4.2$ cat cluster_script.sh
#!/usr/bin/env bash
#### cryoSPARC cluster submission script template for SLURM
## Available variables:
## {{ run_cmd }} - the complete command string to run the job
## {{ num_cpu }} - the number of CPUs needed
## {{ num_gpu }} - the number of GPUs needed.
## Note: the code will use this many GPUs starting from dev id 0
## the cluster scheduler or this script have the responsibility
## of setting CUDA_VISIBLE_DEVICES so that the job code ends up
## using the correct cluster-allocated GPUs.
## {{ ram_gb }} - the amount of RAM needed in GB
## {{ job_dir_abs }} - absolute path to the job directory
## {{ project_dir_abs }} - absolute path to the project dir
## {{ job_log_path_abs }} - absolute path to the log file for the job
## {{ worker_bin_path }} - absolute path to the cryosparc worker command
## {{ run_args }} - arguments to be passed to cryosparcw run
## {{ project_uid }} - uid of the project
## {{ job_uid }} - uid of the job
##
## What follows is a simple SLURM script:
#SBATCH --job-name cryosparc2_{{ project_uid }}_{{ job_uid }}
#SBATCH -n {{ num_cpu }}
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH -p gpu
#SBATCH --mem={{ (ram_gb*1000)|int }}MB
#SBATCH -o {{ job_dir_abs }}
#SBATCH -e {{ job_dir_abs }}
available_devs=""
for devidx in $(seq 0 15);
do
    if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then
        if [[ -z "$available_devs" ]] ; then
            available_devs=$devidx
        else
            available_devs=$available_devs,$devidx
        fi
    fi
done
export CUDA_VISIBLE_DEVICES=$available_devs
{{ run_cmd }}
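For context, this is roughly how I read the templates being rendered for a hypothetical 2-GPU job (all values below are made up by me just to illustrate; the project/job uids, paths and RAM are placeholders):

#SBATCH --job-name cryosparc2_P1_J42
#SBATCH -n 2
#SBATCH --gres=gpu:2
#SBATCH -p gpu
#SBATCH --mem=24000MB
#SBATCH -o /data/cryosparc/P1/J42
#SBATCH -e /data/cryosparc/P1/J42

and I assume the master would then submit the rendered script through send_cmd_tpl and qsub_cmd_tpl as something like:

ssh loginnode sbatch /data/cryosparc/P1/J42/<rendered cluster_script.sh>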
What am I doing wrong?
Cheers,
Jesper