Queueing jobs on specific GPUs

From the description it sounds like this behavior was most definitely intended, but maybe it would be possible to make the scheduler override optional?

As a workaround (albeit an inelegant one), have you considered setting up separate worker lanes for each GPU?

You can get around the issue of hostname duplication with ssh_config host aliases. There are oddities in how cache-locks are (not) honoured under such conditions, e.g. when jobs sent to separate lanes happen to cache the same input data to ssdpath. However, if this is a common occurrence in your workflow, you can specify a unique ssdpath for each worker lane as well.
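For illustration, a minimal sketch of such aliases (hostname, user and alias names hypothetical), so that each alias can then be connected as its own worker lane via --worker and --sshstr:

# append client-side host aliases for the one physical worker
# (in the ssh config of the user the master connects as, e.g. cryosparcuser)
cat >> ~/.ssh/config <<'EOF'
Host worker1-gpu0 worker1-gpu1
    HostName worker1.mydomain.com
    User cryosparcuser
EOF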

Alternatively, if you only ever have one cryoSPARC user sharing the workstation with RELION jobs, you can have a single worker lane for which the gpu configuration can be updated when required.

@lazaret

In case you wondered, that update could be achieved with
cryosparcw connect <...> and the applicable --update and --gpus <id or comma-separated id list> options.
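For example (hostnames, port and GPU indices hypothetical), re-running connect on the worker with --update might look like:

# restrict the existing worker entry to GPUs 0 and 2
cryosparcw connect --master master.mydomain.com --port 61000 \
    --worker worker1.mydomain.com --update --gpus 0,2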

That would surely be widely appreciated, I would say. I am surprised that it still works the way it does.

-André

Hi @team

More than a year has passed, and I would like to ask whether there is any easy solution in place for standalone workstations, as discussed above… I understand that lane configuration works well for clusters.

Thanks!
André

If you are principally interested in queueing jobs to specific GPUs, a cluster resource manager may be the way to go, even on a single workstation.

2 Likes

Thank you for confirming it @wtempel

Could you confirm whether that requires a fresh installation of CryoSPARC? If there is another way, do let me know.

I do not think so. The challenging part would be to configure the cluster manager and job template(s) to fit your needs. You could then use a command like
cryosparcm cluster connect from your existing CryoSPARC installation to update the lane and target configurations stored in the CryoSPARC database.
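For instance (a sketch, assuming a SLURM cluster and that your CryoSPARC version ships the example templates), on the master:

# in an empty working directory
cryosparcm cluster example slurm   # writes example cluster_info.json and cluster_script.sh, if available in your version
# edit cluster_info.json and cluster_script.sh to match your partitions, paths, etc.
cryosparcm cluster connect         # registers or updates the lane from the files in the current directory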

1 Like

I would also appreciate a fix for this - it is strange that by default “run on specific GPU” overrides the scheduler.

This might sometimes be useful - e.g. using a single GPU with lots of VRAM to run multiple jobs on the same GPU - but more often, I would like to avoid running on a particular GPU on my workstation because I am using it for something else - AlphaFold calculations, RELION, whatever. I know I could use a cluster resource manager, but this is a lot of overhead for occasional use cases on a standalone workstation.

Would it be possible to update the default behavior to “queue on a specific GPU”, rather than “run on a specific GPU”? Perhaps with an “override scheduler” checkbox for those who want it?

As it is accessed from the “Queue Job” menu, this would also make more sense to new users, I think.

2 Likes

See also this thread.

Seems like a popular request (especially on systems with mixed GPUs, for situations where something will run OK on 48GB but crash on 24GB…)

1 Like

@olibclarke @rbs_sci We have noted the feature request. At the moment, even on a single (multi-GPU) workstation, we are not aware of a method, other than an external workload manager with proper resource isolation, that would reliably queue a mix of CryoSPARC and non-CryoSPARC workloads.

2 Likes

Thanks @wtempel - but I think the request is rather simpler - not asking for a smart, comprehensive queuing system.

It is just to be able to queue to a specific GPU (as opposed to running the job regardless of whether other CS jobs are already running on the same GPU).

This would allow for avoiding a specific GPU (if I know I am running something on there outside CS), as well as targeting a specific GPU (so e.g. if I have a mix of GPUs I can submit a big box refinement to the GPU with the most VRAM).

Does that make sense?

1 Like

Agree, Oli.

I’m not asking (and I don’t think Oli is either) for a system-wide scheduler - just for the CryoSPARC scheduler to have the logic not to run a job on a GPU which is already running a GPU job (assigned manually to that GPU by the user, from CryoSPARC).

2 Likes

I see I am not the only one a bit upset with this old default behavior. Glad to see this topic revived! :wink:

Thanks @olibclarke and @rbs_sci for underlining the need so well, and thank you @wtempel for following up on the topic. I hope that what seems to be a simple request to change the default behavior is not too difficult to implement.

Happy November!
André

This may be possible with the caveat that the solution proposed below will not be aware of

  • non-CryoSPARC workloads
  • CryoSPARC workloads from another CryoSPARC instance, if the computer serves as a worker for multiple CryoSPARC instances.

Suppose the hypothetical scenario where a two-GPU worker was originally connected and linked to the default scheduler lane with the following command, run on the relevant worker node, say worker1.mydomain.com, by cryosparcuser:

cryosparcw connect --master master.mydomain.com --port 61000 \
    --worker $(hostname -f) --ssdpath /disks/scratch1

where:

  • master.mydomain.com corresponded to the value of the CRYOSPARC_MASTER_HOSTNAME variable defined inside cryosparc_master/config.sh
  • 61000 corresponded to the value of the CRYOSPARC_BASE_PORT variable inside cryosparc_master/config.sh
  • /disks/scratch1 is the dedicated CryoSPARC scratch device on the worker node

This would have created a scheduler target with "hostname": "worker1.mydomain.com" and "lane": "default".
Then one could instead run these commands on worker1.mydomain.com:

cryosparcw connect --master master.mydomain.com --port 61000 \
    --worker worker1-gpu0 --ssdpath /disks/scratch1 \
    --newlane --lane cryoemtest1-gpu0 \
    --sshstr cryosparcuser@worker1.mydomain.com --gpus 0

cryosparcw connect --master master.mydomain.com --port 61000 \
    --worker worker1-gpu1 --ssdpath /disks/scratch1 \
    --newlane --lane cryoemtest1-gpu1 \
    --sshstr cryosparcuser@worker1.mydomain.com --gpus 1

and this command on master.mydomain.com:

cryosparcm cli "remove_scheduler_target_node('worker1.mydomain.com')"

One should also remove any scheduler lane that may have ended up empty after removal of a scheduler node, using the remove_scheduler_lane() cli function.
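For example (lane name hypothetical), assuming the function takes the lane name as its single argument, analogous to remove_scheduler_target_node() above:

cryosparcm cli "remove_scheduler_lane('default')"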
When connecting gpu-specific targets:

  • refer to the output of the command cryosparcw gpulist (rather than nvidia-smi) for the appropriate gpu indices; see the example after this list
  • ensure that each gpu is connected at most once
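For instance, run on the worker node from the cryosparc_worker directory:

cryosparcw gpulist    # prints the available GPUs with the indices cryoSPARC uses for --gpus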

Right - thank you for the workaround - but to be clear the request is specific to how the GUI works.

“Queue on specific GPU” implies queuing, which is not currently what happens - the job is just submitted, regardless of what other cryoSPARC jobs are already running.

The normal cryosparc scheduler is aware of other jobs and does do this, so it is not completely clear to me why the “queue on specific GPU” scheduler does not have this capacity.

2 Likes

Thanks @olibclarke @AndreGraca @rbs_sci @leetleyang @lazaret @DanielAsarnow for chiming in on this topic!

Just to confirm, the current intended behaviour of the “Run on specific GPU” tab in the Queue dialog is to run the job immediately on the specific GPU (i.e. to override any checks by the scheduler other than that inputs are ready).
The reason this is the current behaviour is not simply that it is set as the UI’s default: currently, the CryoSPARC scheduler internally does not have the ability to “queue on a specific GPU”. The scheduler’s internal logic is based on lanes, and we have not yet implemented a mode that can schedule at a finer granularity (e.g. a single GPU).
This is similar to the fact that if you had a lane with multiple nodes, there is no support in the scheduler currently for queueing to just one of the nodes - a job has to be queued to the lane, or else skip resource checks in the scheduler altogether.

This is why we can’t yet change the UI behaviour. We’re definitely aware of and tracking this feature request though, and having the feedback from you all is very helpful!

4 Likes

Is there a way of spoofing multiple lanes on a single node?

So, for example, master/worker1/worker2 on a single system, where (pseudo-)worker1 could be assigned, say, 50% of the cores, 50% of the RAM and all of the 24GB GPUs (e.g. 8 total), while (pseudo-)worker2 could be assigned 50% of the cores, 50% of the RAM and all of the 48GB GPUs (e.g. 2 total). Because of the autodetection during install, I think that might be trickier in practice than it sounds, though?

1 Like

Last I looked into it, it was possible to spoof multiple occurrences of the same workstation/node by way of unique hostname aliases in ssh_config. Each can be assigned non-overlapping GPUs at time of connection to avoid the most obvious conflict, but there didn’t seem to be a way to effect similar CPU/RAM accounting—the cryosparcw script auto-detected everything onboard without an obvious avenue for user control. Thinking it through, however, if the ratio of resources happens to be appropriately provisioned for most use cases, then outside of RBMC, this may not be a problem in practice?

Worth mentioning that I never got as far as testing cache-handling—cryoSPARC will treat both instances as unique resources, which could pose a problem under certain conditions.

All of this was experimental and unsanctioned, of course.

Cheers,
Yang

2 Likes

Thanks, @leetleyang :smiley:

I’d say “I’ll give it a go too and report back” but I’ve just done a big update run on our main processing servers and I’m not about to take one down again given the queue of things to run right now.

Cache collisions shouldn’t be a huge issue since each worker can be assigned a different directory (if using the same mount point) or even given an SSD each. Or just run with --nossd if running an all-SSD system…

Might be possible to do it just via /etc/hosts as well, rather than anything exotic with ssh config*.
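Something like this, perhaps (address and alias names hypothetical), on the master so each alias resolves to the same physical box:

# append alias hostnames for the worker's address to /etc/hosts
echo "192.168.1.50   worker1-gpu24g worker1-gpu48g" | sudo tee -a /etc/hosts

Each alias could then be connected as its own target with non-overlapping --gpus, as in the earlier connect examples.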

Hm. I have another system sitting on my desk which needs its big update as well; I could temporarily play musical chairs with another system’s GPUs to experiment… OK, I think I’ll try that next chance I get.

* edit: Using dummy NICs if necessary should prevent any weirdness. OK, definitely going to try this.

2 Likes