Disk read speed - how to optimise?

Hi all,

I am getting a new GPU box set up and would like the hardware tuned optimally for cryoSPARC Live. We store the movies on a NetApp fileserver in the same rack as the GPU box, and I’d like to optimise the read speed across to the GPU box for preprocessing.

My understanding is that the movies are read into RAM (GPU RAM?) one at a time, motion corrected, and the averages written out to the project directory?

Is there a way that I can get a readout of transfer speeds for these steps so I can optimise?
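
In case it helps to be concrete, the first number I’m after is the raw sequential-read throughput from the NetApp mount. A minimal sketch of how I’d time it myself (the mount path and file name are placeholders, not our real ones):

```python
import os
import time

# Placeholder path: point this at one movie on the NetApp mount.
MOVIE = "/mnt/netapp/movies/example_movie.tiff"
BLOCK = 4 * 1024 * 1024  # read in 4 MiB chunks

fd = os.open(MOVIE, os.O_RDONLY)
try:
    # Drop any cached pages first so we time the wire/disk, not RAM.
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    size = os.fstat(fd).st_size
    start = time.perf_counter()
    while os.read(fd, BLOCK):
        pass
    elapsed = time.perf_counter() - start
finally:
    os.close(fd)

print(f"read {size / 1e6:.0f} MB at {size / elapsed / 1e6:.1f} MB/s")
```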

Hi @simonbrown,

It’s probably too late, but we added an SSD array (RAID 0) to the ‘GPU box’ and configured it as the ssd_cache. cryoSPARC then takes the data from your NetApp and copies it to the SSD array.
All further analysis on the project runs off the SSD array, with the results copied back to your NetApp.
I believe :wink:.
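
If you want to measure the NetApp → SSD copy speed yourself, something like this rough sketch is enough (the paths are placeholders for your mount and cache directory):

```python
import os
import shutil
import time

# Placeholders: one movie on the NetApp mount, and the ssd_cache directory.
SRC = "/mnt/netapp/movies/example_movie.tiff"
DST = "/scratch/cryosparc_cache/example_movie.tiff"

start = time.perf_counter()
shutil.copyfile(SRC, DST)  # roughly what the cache step does per file
elapsed = time.perf_counter() - start

size_mb = os.path.getsize(DST) / 1e6
print(f"NetApp -> SSD cache: {size_mb / elapsed:.1f} MB/s ({size_mb:.0f} MB)")
```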

Best,

Juan

Hi @jucastil,

Thanks for that. Sounds similar to our setup. The GPU box is SSD only. We have

  • 4 TB NVMe SSD (for the cryoSPARC cache)
  • 16 TB SATA SSD (for projects)

My understanding is that major I/O will be:

  1. Preprocessing: read movies from the NetApp and write averages to the project directory.
  2. Reconstruction, before the job: read particle stacks from the SATA SSD and write them to the NVMe SSD.
  3. Reconstruction, during the job: read/write on the NVMe SSD.

I can see the throughput for step 2 in the job logs, and I assume that step 3 will be fast as it is NVMe-only (easy enough to sanity-check, see the sketch below). But step 1 could do with some tuning, and that is what I am hoping to achieve.
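
For the step 3 sanity check, a quick write/read timing on the NVMe drive should confirm it; a rough sketch (the scratch path is a placeholder):

```python
import os
import time

SCRATCH = "/nvme_cache/throughput_test.bin"  # placeholder NVMe path
BLOCK = 4 * 1024 * 1024
N_BLOCKS = 1024  # 4 GiB total
buf = os.urandom(BLOCK)

# Timed write, fsync'd so we measure the device rather than the page cache.
fd = os.open(SCRATCH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
start = time.perf_counter()
for _ in range(N_BLOCKS):
    os.write(fd, buf)
os.fsync(fd)
write_s = time.perf_counter() - start
os.close(fd)

# Timed read back, evicting the (now clean) cached pages first.
fd = os.open(SCRATCH, os.O_RDONLY)
os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
start = time.perf_counter()
while os.read(fd, BLOCK):
    pass
read_s = time.perf_counter() - start
os.close(fd)
os.remove(SCRATCH)

total_mb = BLOCK * N_BLOCKS / 1e6
print(f"NVMe write: {total_mb / write_s:.0f} MB/s, read: {total_mb / read_s:.0f} MB/s")
```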

Thanks!

Hi @simonbrown,

This is not really a cryoSPARC topic but more a general computing one :wink:.
We abandoned the NetApp a long time ago because it was not fast enough.
Now we have GPFS storage. The cryoSPARC stand-alone nodes are GPFS nodes connected to the GPFS storage through 10 Gb/s or 40 Gb/s InfiniBand. If you don’t have a GPFS cluster, you could simply try connecting the node with a 10 Gb/s link. Anyway, for us the real speed improvement (4x, they said, but I don’t have numbers myself) came when we added the SSD array.
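
To check what the link between the boxes really delivers before blaming the disks, iperf is the usual tool; a self-contained alternative is a bare TCP throughput test like this rough sketch (the port is a placeholder, run the receiver on one box and the sender on the other):

```python
import socket
import sys
import time

PORT = 5201              # placeholder port (iperf3's default)
BLOCK = 4 * 1024 * 1024
TOTAL = 8 * 1024 ** 3    # send 8 GiB, enough to ride out TCP ramp-up

def receiver():
    with socket.create_server(("", PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            n, start = 0, time.perf_counter()
            while chunk := conn.recv(BLOCK):
                n += len(chunk)
            rate = n / (time.perf_counter() - start) / 1e6
            print(f"received {n / 1e9:.1f} GB at {rate:.0f} MB/s")

def sender(host):
    buf = b"\0" * BLOCK
    with socket.create_connection((host, PORT)) as conn:
        sent = 0
        while sent < TOTAL:
            conn.sendall(buf)
            sent += len(buf)

# Usage: "python net_test.py recv" on one box,
#        "python net_test.py <receiver-host>" on the other.
if sys.argv[1] == "recv":
    receiver()
else:
    sender(sys.argv[1])
```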

Hope this comment helps!
Best,
Juan