I recently started running cryoSPARC on a small SLURM-managed GPU cluster. If I launch multiple jobs on the same particle set and they get scheduled on different nodes, only one of them copies the particle stacks to cache at a time. The other jobs sit idle while each waits its turn to cache, all holding their GPU slots the entire time.
My understanding is that this is the intended behavior. Since simultaneous file reads are generally safe, and parallel I/O can be substantially faster on the right filesystems, could we have an option to release the read lock during caching? Even better would be an option to set a maximum number of simultaneous reads.
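To illustrate what I mean by a capped number of simultaneous reads: conceptually it is just a counting semaphore in place of an exclusive lock. This is a hypothetical sketch, not cryoSPARC's actual locking code; `MAX_SIMULTANEOUS_READS` and `cache_particles` are names I made up for the example.

```python
import threading
import time

# Hypothetical: allow up to N jobs to read the particle stacks at once,
# instead of serializing all caching behind one exclusive lock.
MAX_SIMULTANEOUS_READS = 2
read_slots = threading.BoundedSemaphore(MAX_SIMULTANEOUS_READS)

def cache_particles(job_id, finished):
    # Each job waits for a free read slot rather than the single lock,
    # so up to MAX_SIMULTANEOUS_READS jobs copy stacks in parallel
    # while the rest block here instead of holding an exclusive lock.
    with read_slots:
        time.sleep(0.01)  # stand-in for copying the particle stacks
        finished.append(job_id)

finished = []
threads = [threading.Thread(target=cache_particles, args=(i, finished))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(finished))  # [0, 1, 2, 3, 4]
```

With `MAX_SIMULTANEOUS_READS = 1` this degrades to the current exclusive behavior, so the cap could default to 1 and be raised only on filesystems that handle parallel reads well.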