Cache handling (cluster setting)


Correct me if I’m wrong, but cryoSPARC appears to take a binary approach to caching.

  • Yes; cache particle image stacks to scratch; if space is insufficient, wait (indefinitely) for space to become available.
  • No; do not cache particle image stacks.

With a cluster setting in mind, have you considered refactoring to provide the option of partial caching, as much as space allows, and reading the remaining data over network?
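
To make the idea concrete, here is a minimal Python sketch of the kind of partial caching I have in mind. It is not how cryoSPARC or RELION actually implement caching; the function names (`plan_partial_cache`, `cache_what_fits`) and the `reserve_bytes` safety margin are hypothetical, and it assumes the particle stacks are plain files that are read in a known order.

```python
import os
import shutil
from pathlib import Path


def plan_partial_cache(stack_sizes, free_bytes, reserve_bytes=0):
    """Split stacks into (cached, remote) lists given the free scratch space.

    stack_sizes: list of (path, size_in_bytes) pairs, in read order.
    reserve_bytes: hypothetical safety margin to leave free on scratch.
    """
    budget = max(free_bytes - reserve_bytes, 0)
    cached, remote = [], []
    for path, size in stack_sizes:
        if size <= budget:
            # This stack still fits, so it goes to scratch.
            cached.append(path)
            budget -= size
        else:
            # No room left for this one: read it over the network instead.
            remote.append(path)
    return cached, remote


def cache_what_fits(stack_paths, scratch_dir, reserve_bytes=0):
    """Copy as many stacks as fit into scratch_dir.

    Returns a dict mapping each original path to the path the job should
    actually read from (scratch copy if cached, original path otherwise).
    Assumes file names are unique; a real implementation would need to
    handle name collisions and concurrent jobs sharing the same scratch.
    """
    scratch = Path(scratch_dir)
    scratch.mkdir(parents=True, exist_ok=True)
    sizes = [(p, os.path.getsize(p)) for p in stack_paths]
    free = shutil.disk_usage(scratch).free
    cached, remote = plan_partial_cache(sizes, free, reserve_bytes)

    resolved = {p: p for p in remote}
    for p in cached:
        dest = scratch / Path(p).name
        shutil.copy2(p, dest)
        resolved[p] = str(dest)
    return resolved
```

The point of the sketch is only that the job never has to wait for scratch to free up: it caches what fits at submission time and falls back to network reads for the rest.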

RELION does something similar, which lets us speed up jobs whose particle sets are too big to fit in scratch. Caching some of the data is better than caching none at all.

It has also spared us from playing the cache lottery when sharing a node with other particularly heavy submissions. Yes, the job will take longer, but depending on the state of the cluster, sometimes that can be better than i) waiting for the multi-day job(s) sharing the node to clear off, or ii) resubmitting and joining the end of the queue.


Hi @leetleyang,

Thanks for the feedback! We’ll try to incorporate this when we work on the caching system next.