GPUs for an HPC Cluster Node

Hi there,

I am currently in the process of speccing a GPU node (4 to 8 GPUs), and I was wondering if the CryoSPARC folks have any recommendations on GPUs.

Currently, the L4 Ada and RTX 4500 Ada look like good choices. I am leaning towards the 4500 in an 8x format. Appreciate any pointers!

Best,

Abhiram

L4s have a TDP of 72W. The RTX 4500 Ada has a TDP of 210W. Even if all else were equal (and the 4500 Ada also has a few more shader units), you should see better performance from the 4500 Ada.

The L4 is about equivalent to a 3070Ti or 5060Ti 16GB (in terms of performance) while the 4500 Ada is roughly a 3090.

For me, it would all depend on price, power budget and where the system will be living (having a loud box on your desk is not fun)…
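To put the power budget in numbers, here is a minimal sketch using only the two TDP figures quoted above. It assumes every card runs at its full TDP and ignores CPUs, drives, fans and PSU losses, so treat it as a rough GPU-only ceiling, not a measurement. The function name is just for illustration.

```python
# TDP values are the ones quoted in this thread; the GPU counts are the
# 4x and 8x formats being considered.
TDP_WATTS = {"L4": 72, "RTX 4500 Ada": 210}


def gpu_power_watts(num_gpus: int, tdp_watts: float) -> float:
    """GPU-only power draw, assuming every card runs at its TDP."""
    return num_gpus * tdp_watts


for name, tdp in TDP_WATTS.items():
    for count in (4, 8):
        print(f"{count} x {name}: {gpu_power_watts(count, tdp):.0f} W")
```

So an 8x RTX 4500 Ada box needs roughly 1.1 kW more for the GPUs alone than an 8x L4 box.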


That makes sense. Appreciate the help!

I just checked with a vendor, and it looks like the supply chain decided it for us: I am able to source the A5000 in a 4x/8x format. Since this system will be housed in a data center, I should be okay with noise and power.

More GPUs are always better, as is more GPU memory.
But local scratch disk performance is the limiting factor, not the GPUs.

As rbs_sci mentioned, the RTX 4500 Ada and L4 are not the latest generation. Unfortunately, CryoSPARC is not yet ready for the Blackwell generation (RTX 50 series) of NVIDIA GPUs, but I would expect support soon.

Your system builder will likely not support “blower type” gaming cards like the RTX 5090, even in a “2-slot” form factor (NVIDIA GeForce RTX 5090 Graphics Cards). This is because servers are designed for linear airflow, and the position of the power connector makes it harder to connect those gaming cards. So the Ada cards may be your only choice. I would not get the Ampere generation at this time: the A5000 is at the performance level of an RTX 2080 Ti and 2 generations behind. Ada has twice the performance of Ampere, so the L4 or RTX 4500 Ada is much better. The RTX PRO 6000 Blackwell is the latest server GPU, if the vendor can source them (which will be difficult), but it doesn’t work with CryoSPARC yet. You could consider a “mixed” configuration of Ada and Blackwell GPUs.

You need a system with enough PCIe lanes to feed those GPUs with data, which favors AMD processors (likely EPYC for your server). For 8 slots, they will probably only support PCIe 4.0, but that’s okay. Get at least 512GB RAM, and enough NVMe drives for scratch (e.g. a minimum of four 8TB drives with Linux md RAID), or a PCIe NVMe RAID0 card (e.g. HighPoint https://www.highpoint-tech.com/gen5), for fast data access from scratch. Depending on the task, your GPUs will likely outperform even the bandwidth of a PCIe 5.0 NVMe RAID.
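A back-of-the-envelope way to check the lane count and the scratch-versus-GPU bandwidth gap is sketched below. The lane widths are assumptions (x16 per GPU, x4 per NVMe drive, x16 for a network card), and the numbers are theoretical link ceilings from the PCIe signalling rates, not what a CryoSPARC job actually pulls. Function names are just for illustration.

```python
# Raw per-lane signalling rate in GT/s for PCIe generations 3 to 5.
PCIE_GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}


def pcie_link_gb_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction link bandwidth in GB/s (128b/130b encoding)."""
    return PCIE_GT_PER_S[gen] * (128 / 130) / 8 * lanes


def lanes_needed(num_gpus: int, lanes_per_gpu: int = 16, num_nvme: int = 4,
                 lanes_per_nvme: int = 4, nic_lanes: int = 16) -> int:
    """Total PCIe lanes if every device gets its full (assumed) link width."""
    return num_gpus * lanes_per_gpu + num_nvme * lanes_per_nvme + nic_lanes


# Eight GPUs on PCIe 4.0 x16 versus a four-drive PCIe 5.0 x4 RAID0 scratch.
gpu_side = 8 * pcie_link_gb_s(gen=4, lanes=16)
scratch_side = 4 * pcie_link_gb_s(gen=5, lanes=4)

print(f"Lanes needed for 8 GPUs + 4 NVMe + NIC: {lanes_needed(8)}")
print(f"Aggregate GPU link ceiling:     {gpu_side:.0f} GB/s")
print(f"Aggregate scratch link ceiling: {scratch_side:.0f} GB/s")
```

Under these assumptions the GPU links can, on paper, take in several times what even a PCIe 5.0 NVMe stripe can deliver, so the scratch side runs out first.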

If your mass storage is in another box, make sure the machine has plenty of networking bandwidth (multiple aggregated 10Gb/s fiber ports, or faster).
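For a rough feel of what that means in practice, the sketch below estimates how long it takes to pull a dataset from mass storage onto the node. The dataset size and the 80% link efficiency are made-up assumptions, and it assumes traffic spreads evenly across the aggregated links, which a single copy stream often will not do.

```python
def transfer_time_hours(dataset_tb: float, num_links: int,
                        link_gbit_s: float = 10.0,
                        efficiency: float = 0.8) -> float:
    """Hours to move dataset_tb terabytes over num_links aggregated links."""
    total_bits = dataset_tb * 1e12 * 8                      # TB -> bits
    usable_bits_per_s = num_links * link_gbit_s * 1e9 * efficiency
    return total_bits / usable_bits_per_s / 3600


# Hypothetical 10 TB dataset over 1, 2 and 4 aggregated 10Gb/s ports.
for links in (1, 2, 4):
    print(f"{links} x 10Gb/s: {transfer_time_hours(10, links):.1f} h")
```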

Your datacenter won’t like it, but for a quarter of the cost of your server you can build something like this https://x.com/HiCryoEM/status/1763712323104235892.
