SSD Cache lifetime suggestion

Hi all!
I have a basic question about the variable for the SSD cache lifetime (CRYOSPARC_SSD_CACHE_LIFETIME_DAYS).
In our instance we submit jobs to an HPC cluster where the SSD is local to each compute node, and different jobs usually end up scheduled on different nodes.
For our use case we don't need to keep data in the cache for long, but copying the files there still improves job run time by a lot.

Now, my question is: what is the behaviour of the cache if I set CRYOSPARC_SSD_CACHE_LIFETIME_DAYS=0?
Will the data in the cache be removed when the job is finished? Or will the data be removed as soon as another job needs the cache?

Thanks in advance for your help!

As of CryoSPARC v4.3.1, a minimum value of 1 is enforced for the CRYOSPARC_SSD_CACHE_LIFETIME_DAYS variable. Therefore, setting the variable to 0 would not, in and of itself, cause the cached files to be removed immediately upon completion of the job.
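For reference, this variable is typically set in the worker's configuration file (the path below assumes a default install; adjust for yours):

```shell
# cryosparc_worker/config.sh (location assumed; adjust for your install)
# Cached files become eligible for removal after this many days.
# As of v4.3.1, values below 1 are not honoured.
export CRYOSPARC_SSD_CACHE_LIFETIME_DAYS=1
```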

Thanks for your reply!

Then my question becomes: is there a way to disable the cache lifetime entirely, so that there is no lock on the data stored in the cache once no job is using it?

You could create a cluster-job-specific scratch directory and remove it after the cluster job's completion through your cluster's workload manager.
For the Slurm case, you could try the following (I have not tested this myself):

  1. define a job-specific scratch path (using $SLURM_JOB_ID), create it, and chown it to $SLURM_JOB_USER as part of the cluster job Prolog
  2. export that scratch path to the cluster job’s environment as part of the TaskProlog, as described in this example
  3. in the cluster script, above the {{ run_cmd }} statement, assign the scratch path to the CRYOSPARC_SSD_PATH variable
  4. remove the slurm job-specific scratch directory as part of the job Epilog
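The Prolog/TaskProlog/Epilog steps above might be sketched like this (untested; the scratch root /mnt/nvme/scratch and the function-per-script layout are assumptions — in a real setup each function body would be its own script referenced from slurm.conf):

```shell
#!/bin/bash
# Sketch of Slurm Prolog / TaskProlog / Epilog hooks for a job-specific
# CryoSPARC scratch directory. SCRATCH_ROOT is an assumed scratch location.
SCRATCH_ROOT="${SCRATCH_ROOT:-/mnt/nvme/scratch}"

# prolog.sh -- run by slurmd (as root) on the node before the job starts
prolog() {
    local scratch="${SCRATCH_ROOT}/${SLURM_JOB_ID}"
    mkdir -p "$scratch"
    # hand the directory over to the job's user
    chown "$SLURM_JOB_USER" "$scratch"
}

# task_prolog.sh -- run as the job user; any "export NAME=value" line it
# prints is injected into the job's environment (Slurm TaskProlog mechanism)
task_prolog() {
    echo "export CRYOSPARC_SSD_PATH=${SCRATCH_ROOT}/${SLURM_JOB_ID}"
}

# epilog.sh -- run by slurmd (as root) after the job completes
epilog() {
    rm -rf "${SCRATCH_ROOT:?}/${SLURM_JOB_ID}"
}
```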

Does this work?

Thanks for your suggestion!
In the end I set the CRYOSPARC_SSD_PATH variable to a temporary directory that is created and deleted by the Slurm Prolog and Epilog.
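For anyone finding this later, the relevant part of the cluster script template might look roughly like this (a sketch; the scratch root is an assumption, and the {{ ... }} placeholders are filled in by CryoSPARC when it submits the job):

```shell
#!/usr/bin/env bash
#SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}

# Point CryoSPARC's SSD cache at the per-job scratch directory that the
# Slurm Prolog created (/mnt/nvme/scratch is an assumed scratch root):
export CRYOSPARC_SSD_PATH="/mnt/nvme/scratch/${SLURM_JOB_ID}"

{{ run_cmd }}
```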

Thank you very much for your help :slight_smile:
Have a nice day!