I just wonder if somebody has experience with how well the NVIDIA A30 GPU works with cryoSPARC. With HBM2, it has very good memory bandwidth, which is a major factor in GPU performance according to the cryoSPARC documentation. However, its theoretical FP16 and FP32 performance is far below that of other common alternatives, e.g., the RTX A5000. Could somebody please share their experience with the A30?
| Spec | A30 | RTX A5000 |
|---|---|---|
| Memory bandwidth (GB/s) | 933.1 | 768.0 |
| FP16 (TFLOPS) | 10.3 | 27.8 |
| FP32 (TFLOPS) | 10.3 | 27.8 |
| FP64 (TFLOPS) | 5.16 | 0.68 |
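As a rough way to reason about the bandwidth-vs-FLOPS trade-off above, here is a minimal roofline-style sketch (my own back-of-the-envelope calculation, not anything from cryoSPARC). It computes each GPU's "ridge point", the arithmetic intensity (FLOP/byte) below which a kernel is memory-bandwidth-bound rather than compute-bound, using the spec numbers from the table:

```python
# Hedged sketch: roofline ridge points from the published spec numbers above.
specs = {
    "A30":   {"bw_gbs": 933.1, "fp32_tflops": 10.3},
    "A5000": {"bw_gbs": 768.0, "fp32_tflops": 27.8},
}

def ridge_point(gpu):
    """Arithmetic intensity (FLOP/byte) at which the GPU shifts from
    memory-bound to compute-bound: peak FLOPS / peak bandwidth."""
    s = specs[gpu]
    return (s["fp32_tflops"] * 1e12) / (s["bw_gbs"] * 1e9)

for gpu in specs:
    print(f"{gpu}: memory-bound below ~{ridge_point(gpu):.1f} FLOP/byte")
```

This gives roughly 11 FLOP/byte for the A30 and 36 FLOP/byte for the A5000: kernels doing few arithmetic operations per byte moved would favor the A30's bandwidth, while compute-dense kernels would favor the A5000's FLOPS. Which regime cryoSPARC's kernels fall into is exactly the question here.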
Edit: I just noticed that the tensor cores on the A30 are faster than those on the A5000. Can cryoSPARC utilize a GPU's tensor cores?
Thanks a lot in advance! Any thoughts on the subject are appreciated!
Thank you, Olaf, for your question.
I just wanted to second this question, and extend it to A100 GPUs as well. Any experience the community can share will be much appreciated.
I’ve noticed zero difference in refinement speeds for large datasets between A40s and A100s (40 GB), with all other variables held constant. I would be curious whether anyone else has data on this. My suspicion is that faster GPU memory is pointless when your dataset exceeds total GPU memory capacity and the filesystem becomes the bottleneck.
I have, however, noticed large speedups with the A100 (> 20% faster) in certain neural network workloads where the entire model can be held in memory.
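If a filesystem bottleneck is suspected, one quick sanity check is to measure sequential read throughput on the storage holding the particle stacks. This is a minimal sketch of my own (not a cryoSPARC tool); note that reading a freshly written file measures page-cache-warm speed, so real cold-cache numbers from spinning disks or network filesystems will be lower:

```python
import os
import tempfile
import time

def sequential_read_gbs(path, chunk_mb=64):
    """Time a sequential read of `path` and return throughput in GB/s."""
    chunk = chunk_mb * 1024 * 1024
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1e9

# Example: write a 128 MB scratch file and measure read speed.
# (Page cache will inflate this; drop caches or use a much larger
# file for a realistic cold-read number.)
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(128 * 1024 * 1024))
    scratch = f.name
try:
    print(f"{sequential_read_gbs(scratch):.2f} GB/s")
finally:
    os.remove(scratch)
```

If the measured throughput is far below the GPU's memory bandwidth, storage rather than the GPU is likely the limiting factor for out-of-core refinements.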
Thank you for the helpful info!
Thanks a bunch!
Indeed a very interesting observation. (We’re right at the point of deciding on the best specifications within our budget for a GPU server purchase, so any other experience will be super helpful.)