4.4.1: Only a single GPU out of 4 is available to select in "run on specific GPU" in Queue

Hi,

After upgrading to 4.4.0 and applying the patch, and also after a fresh install of 4.4.1, only a single GPU out of 4 can be selected in the Queue dialog for “Patch CTF” and other jobs, although all four GPUs are listed. Grateful for any solution!
The installation correctly detects all four GPUs; the log is attached:

/home/dima/cryosparc/cryosparc_worker/bin/cryosparcw connect --master argyrin --worker argyrin --port 61000 --update --gpus 0,1,2,3
 ---------------------------------------------------------------
  CRYOSPARC CONNECT --------------------------------------------
 ---------------------------------------------------------------
  Attempting to register worker argyrin to command argyrin:61002
  Connecting as unix user dima
  Will register using ssh string: dima@argyrin
  If this is incorrect, you should re-run this command with the flag --sshstr <ssh string> 
 ---------------------------------------------------------------
  Connected to master.
 ---------------------------------------------------------------
  Current connected workers:
    argyrin
 ---------------------------------------------------------------
  Worker will be registered with 128 CPUs.
 ---------------------------------------------------------------
  Updating target argyrin
  Current configuration:
               cache_path :  /media/dima/scratch/cryosparc_cache
           cache_quota_mb :  None
         cache_reserve_mb :  10000
                     desc :  None
                     gpus :  [{'id': 0, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}, {'id': 1, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}, {'id': 2, 'mem': 25385631744, 'name': 'Quadro RTX 6000'}, {'id': 3, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}]
                 hostname :  argyrin
                     lane :  default
             monitor_port :  None
                     name :  argyrin
           resource_fixed :  {'SSD': True}
           resource_slots :  {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}
                  ssh_str :  dima@argyrin
                    title :  Worker node argyrin
                     type :  node
          worker_bin_path :  /home/dima/cryosparc/cryosparc_worker/bin/cryosparcw
 ---------------------------------------------------------------
  Autodetecting available GPUs...
  Detected 4 CUDA devices.

   id           pci-bus  name
   ---------------------------------------------------------------
       0                 1  Quadro RTX 6000                                                                
       1               129  Quadro RTX 6000                                                                
       2               130  Quadro RTX 6000                                                                
       3               193  Quadro RTX 6000                                                                
   ---------------------------------------------------------------
   Devices specified: 0, 1, 2, 3
   Devices 0, 1, 2, 3 will be enabled now.
   This can be changed later using --update
 ---------------------------------------------------------------
  Updating.. 
  Done. 
 ---------------------------------------------------------------
  Final configuration for argyrin
               cache_path :  /media/dima/scratch/cryosparc_cache
           cache_quota_mb :  None
         cache_reserve_mb :  10000
                     desc :  None
                     gpus :  [{'id': 0, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}, {'id': 1, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}, {'id': 2, 'mem': 25385631744, 'name': 'Quadro RTX 6000'}, {'id': 3, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}]
                 hostname :  argyrin
                     lane :  default
             monitor_port :  None
                     name :  argyrin
           resource_fixed :  {'SSD': True}
           resource_slots :  {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}
                  ssh_str :  dima@argyrin
                    title :  Worker node argyrin
                     type :  node
          worker_bin_path :  /home/dima/cryosparc/cryosparc_worker/bin/cryosparcw

Welcome to the forum @dima.
Please can you post additional details:

  1. the value you specified for the Patch CTF job’s Number of GPUs to parallelize parameter
  2. a screenshot of the dialog where you are unable to select more than a single GPU
  3. output of the command
    cryosparcm cli "get_scheduler_targets()"
    

cryosparcm cli "get_scheduler_targets()"
[{'cache_path': '/media/dima/scratch/cryosparc_cache', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}, {'id': 1, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}, {'id': 2, 'mem': 25385631744, 'name': 'Quadro RTX 6000'}, {'id': 3, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}], 'hostname': 'argyrin', 'lane': 'default', 'monitor_port': None, 'name': 'argyrin', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127], 'GPU': [0, 1, 2, 3], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}, 'ssh_str': 'dima@argyrin', 'title': 'Worker node argyrin', 'type': 'node', 'worker_bin_path': '/home/dima/cryosparc/cryosparc_worker/bin/cryosparcw'}]
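
For anyone reading along: the output of get_scheduler_targets() is printed as a Python literal, so a quick way to check that the scheduler really has all four GPUs registered is to parse it and compare the detected GPU ids against the GPU resource slots. The following is only a minimal sketch, assuming the output was copied as text. `summarize_gpus` is a hypothetical helper name (not part of CryoSPARC), and the sample string is abbreviated from the output above to the keys the helper reads.

```python
import ast

# Forum rendering often turns plain quotes into typographic ones; map them back
# so the pasted text is a valid Python literal again.
_QUOTE_FIX = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
})


def summarize_gpus(targets_text):
    """Hypothetical helper: summarize GPUs per scheduler target.

    targets_text is the text printed by `cryosparcm cli "get_scheduler_targets()"`.
    Returns {target name: {"detected": [...], "slots": [...], "consistent": bool}}.
    """
    targets = ast.literal_eval(targets_text.translate(_QUOTE_FIX))
    summary = {}
    for target in targets:
        # GPU ids reported in the 'gpus' list of the target
        detected = sorted(gpu["id"] for gpu in target.get("gpus", []))
        # GPU ids the scheduler may assign, from 'resource_slots'
        slots = sorted(target.get("resource_slots", {}).get("GPU", []))
        summary[target["name"]] = {
            "detected": detected,
            "slots": slots,
            "consistent": detected == slots,
        }
    return summary


# Abbreviated sample (only the keys used above), taken from the output in this thread.
sample = (
    "[{'name': 'argyrin', "
    "'gpus': [{'id': 0, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}, "
    "{'id': 1, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}, "
    "{'id': 2, 'mem': 25385631744, 'name': 'Quadro RTX 6000'}, "
    "{'id': 3, 'mem': 25388515328, 'name': 'Quadro RTX 6000'}], "
    "'resource_slots': {'GPU': [0, 1, 2, 3]}}]"
)

print(summarize_gpus(sample))
```

If "detected" and "slots" both list all four ids, as they do here, the worker registration is likely fine and the problem is more likely on the job parameter or user interface side.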

2. [screenshot of the Queue dialog; image not preserved]

Thanks @dima. What about item 1 (the value you specified for the Patch CTF job’s Number of GPUs to parallelize parameter)?

Do you set the 4 GPUs in the Compute settings by typing “4”, or do you use the small arrows on the input field to increment it to 4?
I’ve seen a bug (in other input fields, but perhaps it also occurs here) where values set with the arrows are not saved, whereas typing the value works.

Can confirm: typing “4” works and processes run on four GPUs; changing the value to 4 with the “up” arrow doesn’t work. Thank you for the help!

@dima Does this mean that

  • the original issue, in the first post of this topic, has been resolved
  • you can select multiple GPUs for Patch CTF jobs as long as you type a number > 1 into the Number of GPUs to parallelize field?

[Added later:]
A colleague let me know that a Number of GPUs to parallelize field outlined in gray rather than green indicates that the value entered has not yet been “registered”, and that registration can be effected by clicking outside the input field.

Issue resolved. Thank you!

I think there is a general problem with input fields in Firefox: they don’t register a change if you set the value using the arrows. If you use the arrows, it is not enough to click somewhere outside the field; you first have to click inside the field (after setting the value with the arrows) and then click outside.

Thanks @boggild, we’ve noted this issue and will work on a fix!

- Suhail