Hi all,
My apologies in advance if this is redundant, but I haven’t been able to find a definitive answer to the questions below.
I’m assisting a colleague with setting up a cryoSPARC Master/Worker environment with eventual Slurm integration. We have a VM for the Master and compute nodes for the Workers, with multiple shared filesystems (NFS, Lustre, etc.) available.
My colleague initially installed everything (master, worker, database) in a shared directory on our Lustre storage and ran the Master process and database from that storage. We then had some Lustre issues that caused the filesystem to drop out from underneath the DB/Master processes, and the VM had to be rebooted to recover.
(This is where I jumped in to help go over what was actually installed where.) We also encountered possible DB corruption that prevented the DB from starting, which fortunately resolved with the reboot.
So my questions are:
- Can/should the Master process directory and/or database reside on the local HDD of the Master node rather than on the shared filesystem? From a sysadmin perspective this is how I feel it should be, but I couldn’t find a workflow diagram of the internal communication showing what is accessed from the Worker vs the Master.
- If the answer above is Yes, how is the project/shared storage configured/defined so that both Master and Worker can access it?
Thanks in advance!