Is it possible to add a variable for the estimated cache space required by a job (say, input particle stack size + 10%) that can be passed to cluster submission scripts?
Thanks! It could be pretty simple: the total size of the particle stacks called by a job + 10%, for instance, would work well.
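For illustration, a minimal sketch of that estimate in Python; the function name and inputs are hypothetical, not part of any existing CryoSPARC API:

```python
import os

def estimated_cache_bytes(particle_files, overhead=0.10):
    """Estimate a job's cache requirement: total on-disk size of the
    particle stacks it calls, plus a fixed fractional overhead
    (10% here, per the suggestion above). Hypothetical helper."""
    total = sum(os.path.getsize(f) for f in particle_files)
    return int(total * (1.0 + overhead))
```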
We have cluster nodes with large caches and several GPUs each, but it's not hard to fill the caches when a few jobs run concurrently. Previously the scheduler was limited to just one job per node for this reason, which was a huge waste of GPUs. Now we've opened it up, but cache space collisions are happening fairly often. We can set a generous cache requirement as a workaround, but that is inefficient in terms of GPU utilization. So a simple, moderate over-estimate of the cache requirement would really help.
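As a sketch of how such a variable might be consumed, assuming SLURM as the scheduler: sbatch's `--tmp` option requests a minimum amount of node-local scratch, so jobs with overlapping cache needs won't be co-scheduled onto a node that can't hold both. The wrapper itself is hypothetical:

```python
import subprocess

def submit_with_cache_request(script_path, cache_gb):
    """Submit a job script while asking SLURM for at least `cache_gb`
    of node-local scratch via --tmp. Sketch only; the wrapper and its
    arguments are assumptions, not existing tooling."""
    subprocess.run(["sbatch", f"--tmp={cache_gb}G", script_path], check=True)
```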
It would also be great if particles were re-stacked on the cache (which is what Relion does): then only the particles a job actually uses are copied, and the cache requirement is smaller.
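A rough sketch of what re-stacking to the cache could look like, using the mrcfile library; this illustrates the idea, not how CryoSPARC's cache currently behaves:

```python
import mrcfile
import numpy as np

def restack_to_cache(src_path, keep_indices, dst_path):
    """Copy only the particle images a job actually uses from a large
    source stack into a compact stack on the cache drive, so the cache
    holds just the used subset. Illustrative only."""
    with mrcfile.mmap(src_path, mode="r") as src:
        # Memory-mapped read: only the selected slices are touched.
        subset = np.asarray(src.data[keep_indices])
    with mrcfile.new(dst_path, overwrite=True) as dst:
        dst.set_data(subset)
```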