We were having trouble starting cryosparcm with Slurm on the cluster, so instead we used salloc to get onto an allocated node for GPU access. For that to work we had to change CRYOSPARC_MASTER_HOSTNAME via an environment variable in config.sh; please see below.
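The change is roughly of the following form (a sketch only; exactly how the hostname is picked up from the Slurm allocation may differ on other clusters):

```
# cryosparc_master/config.sh
# Point the master at whichever node salloc placed us on,
# rather than a hard-coded login-node hostname.
export CRYOSPARC_MASTER_HOSTNAME=$(hostname -f)
```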
This starts cryosparcm fine and we were able to access the user interface with:
ssh -N -L localhost:39000:localhost:39000 -J myusername@my.jump.hostname myusername@target.hostname
After pasting localhost:39000 into Chrome, we were able to access the user interface fine. However, the Resource Manager shows my jump hostname instead of the target hostname, and errors show up when we try to run any jobs.
Welcome to the forum @parag.
Can you recall the precise install.sh command(s) that was/were used during cryoSPARC installation and on which host(s) they were executed?
Are you attempting to simulate a “single workstation” (combined master/worker) instance?
I think we were attempting to simulate a single workstation.
The host registered during installation is our login node (written as myusername@my.jump.hostname above). Should we try to install on the compute node instead?
If I understand correctly, you are trying to install a cryoSPARC instance that will run as a combined master/worker on a host that is unknown at installation time. In that case, you could:
1. Remove the scheduler target my.jump.hostname:
   cryosparcm cli "remove_scheduler_target_node('my.jump.hostname')"
   (confirm the hostname in Resource Manager : Instance Information)
2. "Connect" 127.0.0.1 as a worker:
   /path/to/cryosparc_worker/bin/cryosparcw connect --master 127.0.0.1 --worker 127.0.0.1 [..]
   (see the guide for important additional options; a combined sketch follows below)
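Put together, the reconfiguration might look something like this (a sketch only: the worker path and SSD cache location are placeholders, and the connect options you need depend on your setup, so please check the guide):

```
# On the allocated node, with cryosparcm already running:

# 1. Drop the stale login-node target.
cryosparcm cli "remove_scheduler_target_node('my.jump.hostname')"

# 2. Register the local host as the worker.
#    --ssdpath is illustrative; use a cache path valid on every node,
#    or --nossd if no SSD cache is available.
/path/to/cryosparc_worker/bin/cryosparcw connect \
    --master 127.0.0.1 \
    --worker 127.0.0.1 \
    --ssdpath /scratch/cryosparc_cache
```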
This setup requires that certain configurations are consistent across all cluster nodes where you wish to run that "single workstation" instance:
- the settings for the 127.0.0.1 target in cryosparcm cli "get_scheduler_targets()" need to be applicable regardless of which specific cluster node is allocated
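For example, you can dump the registered target and check that every field would still make sense on any node the allocation might land on:

```
# Run on the node where cryosparc_master is running.
# The entry for 127.0.0.1 should only reference paths, SSD cache
# locations and GPU configurations that exist on every cluster node
# you might be allocated.
cryosparcm cli "get_scheduler_targets()"
```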