GPU Utilization

August · March 1, 2023, 4:51am

Are there any benchmarks for how much GPU vs. CPU a worker should use for certain job types?

I am running a heterogeneous refinement job with about 3 million particles on a Tesla T4 and another job on a Tesla V100, and I notice that neither card is being fully utilized.

nvidia-smi reports only brief spikes to 100% every 5-10 seconds; the GPU stays at 0% utilization most of the time. Is this normal? Is the job mostly CPU-bound? The CS Python job is indeed using a lot of CPU time.

Would I be better off pairing a cheaper video card with a better processor?

wtempel · March 1, 2023, 5:17pm

Fluctuations of GPU utilization are expected as the job passes through various computational routines. Not all of these routines use the GPU.
Low utilization of both CPU and GPU may indicate an IO bottleneck, which can be mitigated by particle caching to an SSD (on the same host as the GPU) and high-performance bulk storage.
To verify the appropriate performance of your own CryoSPARC instance, you can run through a benchmark workflow for a small dataset. For comparison, here are our timings for a recent run on our own test instance:

(The heterogeneous refinement job used an A100-PCIE-40GB GPU; particle caching was enabled.)
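
To put numbers on the bursty utilization pattern described in this thread, one option is to sample nvidia-smi over a minute or so and summarize the result, rather than watching the instantaneous value. The following is only a rough sketch, not part of CryoSPARC: the function names, the 10% "busy" threshold, and the sampling interval are all assumptions chosen for illustration. It relies on the `--query-gpu=utilization.gpu --format=csv,noheader,nounits` options of nvidia-smi and assumes the driver reports a plain integer percentage per GPU.

```python
import subprocess
import time


def parse_utilization(output: str) -> list[int]:
    """Parse nvidia-smi CSV output: one integer percentage per GPU, one per line."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def query_gpu_utilization() -> list[int]:
    """Return the current utilization (percent) of every visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_utilization(result.stdout)


def summarize(samples: list[int], busy_threshold: int = 10) -> dict:
    """Summarize utilization samples.

    busy_threshold is an arbitrary cutoff (assumption): a sample at or above
    it counts as "GPU busy".
    """
    if not samples:
        raise ValueError("no samples collected")
    busy = sum(1 for s in samples if s >= busy_threshold)
    return {
        "mean": sum(samples) / len(samples),
        "peak": max(samples),
        "busy_fraction": busy / len(samples),
    }


def sample_gpu(gpu_index: int = 0, duration_s: float = 60.0,
               interval_s: float = 0.5) -> dict:
    """Poll one GPU for duration_s seconds and return a summary."""
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        samples.append(query_gpu_utilization()[gpu_index])
        time.sleep(interval_s)
    return summarize(samples)
```

Calling `sample_gpu(gpu_index=0, duration_s=60)` while the job is running returns the mean, the peak, and the fraction of samples in which the GPU was busy. A low busy fraction alongside low CPU usage would be consistent with the IO bottleneck mentioned above, while a low busy fraction with high CPU usage suggests the job is spending its time in CPU routines.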