Hi,
I work with CryoSPARC users to run their jobs through our HPC cluster. We have a high-performance parallel filesystem (Panasas), but for some jobs we still see significant speed-ups when CryoSPARC uses a high-performance local NVMe scratch disk. What I think would be beneficial is for all cluster jobs in a project and/or workspace to be submitted to the same cluster node (i.e., reusing the same local scratch after the initial SSD cache copy). In my cluster submit script, I have a parameter that could be used to place a job on a particular node. Is there any way that local cluster parameters (for a project or workspace) could override the global cluster parameters, so that we persistently place jobs on the node where the project/workspace data has already been synced?
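For context, here is roughly what I'm imagining in the submit template. This is only a sketch: `{{ custom_nodelist }}` is a hypothetical custom variable I would add to our `cluster_script.sh`, not a built-in CryoSPARC one, and I'm assuming the template is rendered with Jinja2 as usual so the conditional works:

```bash
#!/usr/bin/env bash
## Sketch of a Slurm cluster_script.sh template.
## {{ custom_nodelist }} is a hypothetical per-project custom variable.
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
#SBATCH --cpus-per-task={{ num_cpu }}
#SBATCH --gres=gpu:{{ num_gpu }}
#SBATCH --mem={{ (ram_gb*1000)|int }}M
#SBATCH --output={{ job_log_path_abs }}
{%- if custom_nodelist %}
## Pin the job to the node that already holds this project's SSD cache
#SBATCH --nodelist={{ custom_nodelist }}
{%- endif %}

{{ run_cmd }}
```

If something like that could be set at the project/workspace level rather than globally, the first job would land anywhere and every later job could be pinned to the same node.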
In the cluster situation, does CryoSPARC itself keep track of which node the data has been synced to? I'm using Slurm, and I can imagine a Slurm prolog script that could handle some of this, given the project database and job information.
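To make the prolog idea concrete, here's the kind of thing I had in mind. Again, just a sketch: the shared cache-map directory and the `cryosparc_<PROJECT>_<JOB>` job-name convention are my own assumptions (carried over from the submit template above), not anything CryoSPARC itself provides:

```bash
#!/usr/bin/env bash
## Sketch of a Slurm Prolog that records which node holds a project's cache.
CACHE_MAP=/shared/cryosparc/cache_map          # assumed shared directory

## The slurmd Prolog environment is minimal, so fetch the job name via scontrol
JOB_NAME=$(scontrol -o show job "$SLURM_JOB_ID" |
           sed -n 's/.*JobName=\([^ ]*\).*/\1/p')

if [[ "$JOB_NAME" == cryosparc_* ]]; then
    PROJECT=$(echo "$JOB_NAME" | cut -d_ -f2)  # e.g. P12
    NODE=${SLURMD_NODENAME:-$(hostname -s)}
    mkdir -p "$CACHE_MAP"
    ## Record where this project's cache now lives; a wrapper around job
    ## submission could read this back to fill in --nodelist.
    echo "$NODE" > "$CACHE_MAP/$PROJECT"
fi
```

That would cover the bookkeeping on the Slurm side; the part I can't see how to do is getting CryoSPARC to consult that record at submission time.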
Has anyone else found a more efficient way to use local SSD caches in a cluster?
Thanks,
Jeff