Apologies in advance for what will probably be quite a long post.
First, while this might sound obvious, decide on a budget, as it will determine your options for everything else. If it is tight, a 9950X will do quite well (you might even be able to squeeze two full systems in, if careful), but if there is lots of leeway, a Threadripper Pro might serve better (or even a dual-socket Epyc rack system).
We have no solid benchmarks (at least that I am aware of) which people have shared yet for the 5090s with CryoSPARC and/or RELION. They perform well in most tasks, but have odd weaknesses. In benchmarks I’ve seen, they can be up to twice as fast as a 4090 in compute loads, but also draw more peak and average power. Sometimes they show barely any improvement. From a quick skim, GPU Passmark has the 5090 losing compute tests to the 4090, but it is unclear whether the 4090 score is overclocked or not. The increased power draw may be a concern.
My primary issue with using other results as a guide for performance in CryoSPARC/RELION is that they give a poor estimate: my lab’s most recently commissioned system exhibits higher temperatures with a RELION run optimised to fill RAM and maximise both CPU and GPU load than it does when running mprime, GPU benchmarks and I/O benchmarks in parallel. Cryo-EM image processing is a good stress test!
I would get Founders Edition cards, or Quadros if possible and budget allows, as third-party cards frequently have larger coolers (which are not always quieter!) and occasionally corners are cut with respect to VRMs (although I really hope no-one is doing this any more with cards that pull 600W!). @Mark-A-Nakasone is right to suggest looking at Ada Quadros, too, although when I could I still bought Ampere cards as they were 40% cheaper…
…
I would not recommend any modern consumer Intel CPU. They draw more power, are less performant and have only recently come off the back of a degradation scandal which Intel basically ignored and denied until there was no other choice. That doesn’t mean AMD are spotless (thinking of early AM5 boards cooking CPUs due to bad settings in the UEFI) but at least AMD didn’t deny it for well over a year. I’m also wary of heterogeneous cores for compute loads, but that’s a topic for another day…
I also don’t recommend closed-loop liquid coolers for systems which will be running 24/7 and spend time unattended. Perhaps I’ve just been unlucky, but for CLCs I’m currently batting a 20% failure rate (either pump failure or leaks), so I really would recommend air cooling. As @Das says, a Noctua D15 (or equivalent) is going to be a pretty good option. Also look at the Thermalright PA120SE; we now have four workstations running with those and they are excellent for the price (here they are 30% of the cost of a Noctua for almost the same performance).
Get an overspecced PSU (at least 1600W if going dual 5090s) which comes with reliable, high-quality 12VHPWR cables. Skimping here is folly, even if you have to cut back somewhere else. Dual PSUs are also an option depending on case.
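To put rough numbers on the PSU point, here is a minimal back-of-envelope sketch. The 600W per GPU comes from the figure mentioned above; the 230W CPU figure and the 100W allowance for drives, fans and motherboard are my own assumptions, so swap in whatever your parts actually draw.

```python
# Rough PSU headroom estimate. All wattages are illustrative assumptions,
# not measurements: 600 W per GPU (figure quoted in this post), 230 W CPU
# and 100 W for everything else (both assumed).

def peak_draw_watts(gpu_count, gpu_watts=600, cpu_watts=230, other_watts=100):
    """Sum of assumed worst-case component draw in watts."""
    return gpu_count * gpu_watts + cpu_watts + other_watts

def psu_load_fraction(psu_watts, gpu_count, **component_watts):
    """Fraction of the PSU rating used at the assumed peak draw."""
    return peak_draw_watts(gpu_count, **component_watts) / psu_watts

for n in (1, 2):
    draw = peak_draw_watts(n)
    load = psu_load_fraction(1600, n)
    print(f"{n}x GPU: ~{draw} W assumed peak, {load:.0%} of a 1600 W PSU")
```

With these assumed figures a dual-GPU build sits very close to the rating of a 1600W unit at peak, which is why I say "at least" and why the CPU eco mode is worth considering.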
You don’t need a large boot drive, but 1-2TB NVMe is not so expensive now and will have room for the space-inefficient horror which is Anaconda virtual environments. A pair of 4TB NVMes for RAID scratch would be a good idea. If using RELION and large stacks, be aware that RELION currently converts its particle stack to 32-bit MRC before loading to scratch, so a 16-bit stack will double in size - just a little surprise to be aware of. Choose HDDs as you see fit. I like Toshiba Enterprise drives, others like WD or Seagate. Buy spares.
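As a quick illustration of the scratch doubling, here is a small sketch that estimates the raw size of a particle stack at 16-bit and 32-bit. The particle count and box size are made-up example values, and header overhead is ignored.

```python
# Estimate raw particle stack size on scratch. Header overhead is ignored;
# the particle count and box size below are hypothetical example values.

def stack_bytes(n_particles, box_px, bytes_per_pixel):
    """Raw pixel data size of a square-box particle stack, in bytes."""
    return n_particles * box_px * box_px * bytes_per_pixel

n_particles = 1_000_000   # assumed example
box_px = 512              # assumed example

size_16 = stack_bytes(n_particles, box_px, 2)  # 16-bit on disk
size_32 = stack_bytes(n_particles, box_px, 4)  # 32-bit copy on scratch

print(f"16-bit stack: {size_16 / 1e12:.2f} TB")
print(f"32-bit copy on scratch: {size_32 / 1e12:.2f} TB")
```

So a stack that looks comfortably small on your storage can take a surprisingly large bite out of a pair of 4TB scratch drives.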
RAM quantity will depend completely on what platform you decide on. AM5 (9950X) supports up to 192GB of RAM (I’m running this personally) but the memory controller will downclock it to 3600MT/s for stability, and I’ve been unable to get mine stable at anything above that, although I admit I haven’t put as much time into it as I would have done 10-15 years ago. 128GB will run at 5600MT/s. Outside of synthetic benchmarks, I’ve not noticed an appreciable difference - at least in RELION/CryoSPARC the many other overhead factors mitigate some of the bandwidth loss between 96GB @ 5600MT/s and 192GB @ 3600MT/s, and the ability to handle larger volumes is more valuable to me, but not valuable enough to multiply the price of the motherboard by three, CPU by six and RAM by at least two.
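For a sense of scale on the bandwidth loss, theoretical peak memory bandwidth is roughly transfer rate × bus width × channels. The sketch below assumes a dual-channel platform with 64 bits (8 bytes) per channel; real-world throughput will be lower than these peaks.

```python
# Theoretical peak DRAM bandwidth: MT/s * bytes per transfer * channels.
# Assumes dual channel with 8 bytes (64 bits) per channel per transfer;
# these are upper bounds, not measured throughput.

def peak_bandwidth_gbs(mega_transfers_per_s, channels=2, bytes_per_transfer=8):
    """Theoretical peak bandwidth in GB/s (decimal gigabytes)."""
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000

fast = peak_bandwidth_gbs(5600)
slow = peak_bandwidth_gbs(3600)

print(f"5600 MT/s: {fast:.1f} GB/s peak")
print(f"3600 MT/s: {slow:.1f} GB/s peak")
print(f"3600 MT/s retains {slow / fast:.0%} of the 5600 MT/s peak")
```

On paper that is a loss of about a third of the peak bandwidth, which sounds dramatic but, as I say above, has not shown up as an appreciable difference in my actual processing jobs.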
In the past I’ve wholeheartedly recommended Fractal Design cases, although the Define 7 (XL) made several design decisions which I dislike… if wanting to load up on HDDs, be prepared to faff around moving and removing bits of the internals. If wanting to stuff the case full, I would avoid the 7 series cases. Airflow takes a hit if you exceed 2x HDDs in the PSU bay (for the 7) or 4x HDDs in the PSU bay (for the 7 XL). I eventually gave up as cable routing in the “HDD optimised” layout was driving me up the wall. The fans which come with it are quiet but aren’t great. Get some ML140 Pros if budget allows. For our next workstation systems, I will be investigating the Silverstone Seta H2 and Alta D1, particularly the latter.
10GbE NIC is a good idea; if your IT supports it, perhaps a dual-port card for teaming (especially if you get network storage which can support it) although with just one or two cards, CryoSPARC will not stream data fast enough to saturate 10GbE.
For a “budget” (with how much this will cost, that word makes me wince) workstation, I’d look at something like this:
AMD 9950X (can set 105W eco mode for significant heat reduction with minimal performance loss)
Thermalright PA120SE or Noctua D15
Asus ProArt X870E Creator (has 10GbE)
128GB or 192GB DDR5
1x 2TB NVMe (root)
2x 4TB NVMe (scratch, ZFS stripe)
4x 20TB HDD (RAIDZ1) (+2 spares)
1600W PSU (maybe Seasonic PRIME?)
1x or 2x 5090 Founders Edition
Fractal Design 7 XL (+ ML140 Pro fans) (yes, despite what I said above - I have no experience of the Silverstone cases I mentioned so will not recommend yet)
With a bit more budget, I’d switch out the AM5 parts for a Threadripper Pro 7995WX or 7975WX, 512GB RAM and relevant motherboard.
Definitely look toward large-scale NAS storage as well. Cryo-EM data eats those petabytes faster than you might expect. 
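For the local storage in the parts list above, a rough capacity sketch may help when deciding how soon you will need that NAS. It uses the simple rules that a stripe gives the sum of its drives and RAIDZ1 loses one drive's worth to parity; ZFS metadata, padding and reserved space are ignored, so real usable space will be somewhat lower.

```python
# Nominal capacity of the pools in the parts list. Ignores ZFS metadata,
# padding and reserved space, so treat the results as upper bounds.

TB = 10**12   # decimal terabyte, as printed on drive labels
TIB = 2**40   # binary tebibyte, as reported by many tools

def stripe_tb(drives, size_tb):
    """Nominal capacity of a plain stripe (no redundancy), in TB."""
    return drives * size_tb

def raidz_tb(drives, size_tb, parity=1):
    """Nominal capacity of a RAIDZ vdev with the given parity level, in TB."""
    if drives <= parity:
        raise ValueError("need more drives than parity level")
    return (drives - parity) * size_tb

def tb_to_tib(tb):
    """Convert decimal TB to binary TiB."""
    return tb * TB / TIB

scratch = stripe_tb(2, 4)   # 2x 4TB NVMe stripe
bulk = raidz_tb(4, 20)      # 4x 20TB HDD in RAIDZ1

print(f"scratch: {scratch} TB nominal ({tb_to_tib(scratch):.2f} TiB)")
print(f"bulk:    {bulk} TB nominal ({tb_to_tib(bulk):.2f} TiB)")
```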