I am a bit confused about the difference between the local motion correction and local motion correction (multi-GPU) job types. Both accept the number of GPUs as an input. It seems like the non-multi job type correctly reserves multiple GPUs when the number of GPUs is set to more than 1 but only uses one of them (at least, only one shows usage in nvidia-smi), whereas the multi-GPU job type requests and actually uses multiple GPUs. Despite this, both seem to take approximately the same amount of time when the same number of GPUs is requested. What is the actual difference under the hood between these job types? And why does the non-multi job type appear, by resource monitoring, not to be using multiple GPUs while still taking an amount of time consistent with using them?
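
For anyone who wants to reproduce the observation, here is a minimal sketch of sampling per-GPU utilization repeatedly over a period of time instead of reading a single nvidia-smi snapshot. It assumes nvidia-smi is on the PATH and reports numeric values for every queried field. The helper names (`parse_gpu_csv`, `sample_gpus`, `busy_gpus`, `watch`) and the 5% threshold are hypothetical choices for illustration, not part of the job types themselves.

```python
import subprocess
import time

# Standard nvidia-smi query: one CSV row per GPU, no header, no unit suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse rows like '0, 97, 10240' into dicts (assumes numeric fields)."""
    rows = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        index, util, mem = (field.strip() for field in line.split(","))
        rows.append({"index": int(index), "util_pct": int(util), "mem_mib": int(mem)})
    return rows


def sample_gpus():
    """Take one snapshot of all GPUs by calling nvidia-smi."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


def busy_gpus(samples, util_threshold=5):
    """Return sorted indices of GPUs that exceeded the threshold in any sample."""
    busy = set()
    for rows in samples:
        for row in rows:
            if row["util_pct"] > util_threshold:
                busy.add(row["index"])
    return sorted(busy)


def watch(duration_s=60, interval_s=1.0):
    """Collect snapshots for duration_s seconds, one every interval_s seconds."""
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        samples.append(sample_gpus())
        time.sleep(interval_s)
    return samples
```

Calling `busy_gpus(watch(duration_s=120))` while each job type is running would list every GPU that showed activity at any point in the window, which makes it easier to compare the two job types than watching a single reading.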