Hi @stephan,
Thank you for your reply.
Yes, I mean the absolute path to the raw data that a job processes.
I am in a situation where, in our shared HPC environment with PBS, we have been advised to tar all the input files, copy the single tarred archive (rather than thousands of smaller files) to the SSD cache on the compute node, and untar it there before running any processing tasks directly against the files on the local SSD, instead of working on our shared Lustre filesystem. So I was wondering whether there is any way to do this within our PBS job submission scripts. If I could reference a variable holding the input data's absolute path, I could script those staging steps for better file I/O performance.
Previously, I ran ab-initio reconstruction and 3D refinement directly on our shared Lustre storage, and processing times were roughly 3x longer or worse. After switching to the SSD cache on the compute nodes, performance is comparable to CryoSPARC's AWS benchmarks. However, when loading the input files into the SSD cache, it seems to copy the .mrc files one by one.
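For reference, the stage-in step I have in mind could be sketched roughly like this. The paths here are throwaway placeholders so the snippet runs anywhere; in a real PBS job, INPUT_DIR would be the absolute path to the raw data on Lustre, and SSD would be the node-local cache (a site-specific variable such as $PBS_JOBFS or $TMPDIR, depending on how your cluster is configured -- best to confirm with your HPC admins):

```shell
#!/bin/bash
# Minimal sketch of staging input data to a node-local SSD cache.
# Uses mktemp placeholders so it is self-contained; substitute your
# real Lustre path and your site's node-local scratch variable.
set -euo pipefail

INPUT_DIR=$(mktemp -d)   # stands in for the Lustre-side raw data directory
SSD=$(mktemp -d)         # stands in for the node-local SSD cache

# Mock up a few small input files (in practice: thousands of .mrc files).
for i in 1 2 3; do echo "frame $i" > "$INPUT_DIR/frame_$i.mrc"; done

# Stage in as a single stream: tar on the source side, untar on the SSD
# side. This turns many small metadata-heavy copies into one large
# sequential transfer, which is what the Lustre admins are asking for.
tar -cf - -C "$INPUT_DIR" . | tar -xf - -C "$SSD"

ls "$SSD"   # files are now on the local cache, ready for processing
```

After processing, results would be tarred and copied back to Lustre the same way, in reverse.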
Regards,
qitsweauca