I wonder if anyone in the group has tested whether an InfiniBand interconnect (100 Gb/s or more) in a small cluster would improve cryoSPARC performance. We are deciding whether to buy an InfiniBand switch. Thanks for sharing.
Could you provide some more specs of your cluster/environment?
Typically, higher networking bandwidth only helps in a few parts of processing:
- Reading and writing raw movie data on the network disk during preprocessing
- Caching particle stacks onto the local SSD at the start of reconstruction (2D/3D) jobs
Once a reconstruction job (2D classification, 3D refinement, etc.) is in progress, it will not be using the network to access files, since all particles will already be on the local SSD. If you’ve turned off local SSD caching, then network performance matters more. However, beyond a 10G network, if you are using spinning disks as your file system, you will probably not see any performance gain from a 100G network. If you have all-flash storage, or a large cluster with many simultaneous users/jobs, then a 100G network may become worth it.
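A quick back-of-envelope calculation makes the point above concrete. All the throughput figures and the stack size below are illustrative assumptions, not measurements from any particular system:

```python
# Where is the bottleneck when caching a particle stack to local SSD?
# Effective throughput is the slower of the network link and the
# backing storage. All numbers below are assumed, for illustration.

def effective_mb_s(link_gbps, storage_mb_s):
    """Effective transfer rate in MB/s: min of link line rate
    (ignoring protocol overhead) and storage throughput."""
    link_mb_s = link_gbps * 1000 / 8  # e.g. 10 Gb/s -> 1250 MB/s
    return min(link_mb_s, storage_mb_s)

stack_gb = 2000  # assumed 2 TB particle stack
scenarios = [
    ("HDD RAID (~1.5 GB/s)", 1500),
    ("all-flash (~10 GB/s)", 10000),
]
for link_gbps in (10, 100):
    for label, storage_mb_s in scenarios:
        rate = effective_mb_s(link_gbps, storage_mb_s)
        hours = stack_gb * 1000 / rate / 3600
        print(f"{link_gbps}G link, {label}: {rate:.0f} MB/s "
              f"-> {hours:.2f} h to cache {stack_gb} GB")
```

With spinning disks the 10G-to-100G upgrade barely changes the cache time (the array, not the link, is the limit); with all-flash storage the 100G link cuts it by roughly 8x.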
Hi, yes, we are evaluating a 500 TB storage system with eight large-RAM, 4-GPU nodes for data processing. The data will be stored on the storage system and shared with the nodes during processing. For fast processing, we are planning a shared flash SSD RAID (50 TB total), and would also like to use a local SSD in each compute node as scratch space. I have heard about different configurations for Relion and would like to know whether our plan would work well for cryoSPARC. A fast network (IPoIB) could be very helpful. An alternative is DDN or another distributed parallel storage system. Thanks. Qiu-Xing
We specced out our recent build based on a similar thought process. We ended up with 6x RTX 2080 Ti, 4 TB of NVMe, and 16 TB of SATA SSDs. Our raw data are stored on a 100 TB NetApp storage server. Both the file server and the GPU server are equipped with 100G networking, but as apunjani mentioned, the spinning disks are the bottleneck.
Our idea was to store the active projects on the SATA SSDs, cache on the NVMe, and keep movies on the file server. That way we get quick cache loads from the project files once preprocessing is complete.
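For anyone setting up a similar tier, the per-node cache location is set when you register the worker with the master. A sketch, assuming hypothetical hostnames and mount points (check `cryosparcw connect --help` on your version for the exact options):

```shell
# Point a GPU node's cryoSPARC particle cache at its local NVMe scratch.
# Hostnames, port, and paths below are placeholders for illustration.
bin/cryosparcw connect \
    --worker gpu01.example.edu \
    --master master.example.edu \
    --port 39000 \
    --ssdpath /nvme/cryosparc_cache \
    --update   # update an existing worker registration in place
```

Project directories can then live on the SATA tier while `--ssdpath` keeps the hot particle cache on NVMe.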