I am currently using a specific nominal pixel size from our data collection setup, but I want to confirm whether it is accurate.
In your experience, what are the most common approaches to validate the actual pixel size?
Do you typically compare the cryo-EM map with a previously solved high-resolution X-ray or cryo-EM structure (PDB) of the same protein, adjusting the voxel size until the correlation is maximized?
Are there alternative or more direct methods within CryoSPARC (or other software) for pixel size validation, without relying on an existing atomic model?
Any advice, workflows, or example protocols would be greatly appreciated.
When the pixel size is incorrect, the refined Cs value will diverge from the microscope’s nominal value (typically ~2.7 mm). You can correct it using:
Although as you get to ever higher resolutions, you’ll find that 0.5% is still enough to cause headaches sometimes. It’s usually close enough that you only need a minor adjustment (using the Validation (FSC) tool or RELION postprocessing), and it won’t impact your reported resolution enough to matter (a difference of 1.39 Å vs 1.38 Å, for example…). Still, getting it as accurate as possible can help your Q-score a little…
I usually just toss in a grid of apoferritin semi-regularly and collect data at all the mags I care about, then process quickly and compare them to previous high resolution results for any deviations. So far, except after the 2024 New Year earthquake, variation has always been minor (<0.1%).
If the pixel size is a long way off, though (the calibrated Angpix of one mag was way off when our Krios was commissioned), reprocessing is essential.
Do you typically compare the cryo-EM map with a previously solved high-resolution X-ray or cryo-EM structure (PDB) of the same protein, adjusting the voxel size until the correlation is maximized?
This is the way we would typically do it - either using a crystal structure of a fragment, or a small amount of data collected on a known standard, e.g. apoferritin.
In terms of how to do this, I used to do it manually using fitmap in Chimera, but now there is a convenient tool in the nightly builds of Phenix, called phenix.magref, which will perform the same procedure automatically and very quickly. There is also a Phaser tool embedded in phenix that can do the same thing. Both are discussed in this thread on the phenixbb: [phenixbb] Re: Refine pixel size for EM map? - phenixbb - phenix-online.org
Interestingly (though not really surprising given the training set), fitting AlphaFold models of individual domains and maximizing the correlation seems to give essentially identical results to a crystal structure, so that may be worth considering in a pinch.
Refining Cs is also a valid approach, assuming you have sufficient resolution, and MagCalEM is also useful.
There are several good points made in the magCalEM paper. All three methods (gold diffraction, X-ray standards, Cs refinement) can converge on an accurate value, but only if they are done carefully. IMO Cs refinement using high-resolution datasets is the most straightforward, followed by the magCalEM approach. However, how easy magCalEM is depends on the actual magnification. It’s great if the pixel size is ~0.8-1.0 Å or so, but if it’s larger you have to use aliasing, and if it’s smaller I have trouble positioning the diffraction peaks consistently because they have significant spread, so different runs give different pixel sizes.
Fitting crystal structures only agreed after averaging the results from 10 structures, presumably due to detector distance error. I have found there is also a bias depending on whether you work “up” or “down” in pixel size, due to the local optimization used for map fitting in Chimera.
The value from Cs refinement can be validated internally by reprocessing with the corrected pixel size: the refined Cs should then be very, very close to the manufacturer-provided Cs. To get the value, I usually open up the .cs file in Python and average the Cs across the exposure groups. One could also average across particles to give more weight to groups with more particles.
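The averaging described above could look roughly like the sketch below. It assumes the particle .cs file loads with `np.load` as a NumPy structured array and that the refined Cs and exposure group are stored in fields named `ctf/cs_mm` and `ctf/exp_group_id`; check those field names against your own file, and note that the file path in the usage comment is hypothetical.

```python
import numpy as np

def mean_refined_cs(particles, cs_field="ctf/cs_mm", group_field="ctf/exp_group_id"):
    """Return (mean Cs over exposure groups, particle-weighted mean Cs), in mm.

    The field names are assumptions about the .cs layout, not a documented API.
    """
    cs = np.asarray(particles[cs_field], dtype=float).ravel()
    groups = np.asarray(particles[group_field]).ravel()

    # Map each particle to its exposure group and count particles per group.
    _, inverse, counts = np.unique(groups, return_inverse=True, return_counts=True)

    # Mean Cs within each group (all particles in a group normally share one value).
    per_group_cs = np.bincount(inverse, weights=cs) / counts

    group_mean = float(per_group_cs.mean())                          # every group counts equally
    weighted_mean = float(np.average(per_group_cs, weights=counts))  # larger groups count more
    return group_mean, weighted_mean

# Usage (hypothetical path):
# particles = np.load("J123_particles.cs")
# print(mean_refined_cs(particles))
```

The two numbers should be nearly identical when the groups are similar in size; a large gap between them suggests that a few small exposure groups have poorly determined Cs values.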
I agree that Cs is the way to go if you have a high-res dataset, but if you don’t, fitting a crystal structure will work in a pinch and has the advantage of not requiring any reprocessing (just a map). That is useful if you want to calibrate existing maps already in the EMDB for structural comparison.
In practice I definitely haven’t found it necessary to average 10 crystal structures - I get pretty consistent results no matter which structure I use (even AlphaFold models seem to give almost identical results). phenix.magref seems to work pretty well and fast as an alternative to doing it manually in Chimera.
I did not analyse this thoroughly, I just noticed that in the pool of 10 DPS structures in that paper, most PDB codes start with 1 (so they are quite old) and have pretty bad validation scores in the PDB. It makes sense that they blame the detector distance for the variations. In X-ray crystallography we start in reciprocal space, with distances that are measured in cm, which is easy to do accurately nowadays, and the wavelength is determined with more precision than you need. Goniometers and detectors have also evolved to give more reliable, fast and stable data collections. Radiation damage is almost not a problem these days. Basically I agree with @olibclarke that you should not need more than one crystal structure, provided that it is relatively recent. If the validation scores are good, I’d expect the structure to be good for calibration of the pixel size, too. There is no way the bonds between atoms can inflate or shrink globally. (I am not saying that all the old structures are bad, though!)
To add to this - they don’t specify whether they used the oligomer or the monomer for fitting, which for cage structures like DPS could be important - I would favor using the monomer, to avoid any influence of the crystal contacts, cryo conditions etc on the assembly as a whole.
The magCalEM paper from MRC-LMB is a great reference. I often save the HexAuFoils for this reason: gold diffraction is easier to work with from HexAuFoil than from UltrAuFoil.
I am skeptical that the OEM-provided Cs would be wrong by 200 microns; I think it’s still pixel size error. It’s extremely common to provide users with only an approximate pixel size based on a single-point gold diffraction estimate, and those estimates are often systematically (slightly) too large.
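To see how a Cs deviation maps onto a pixel size error, here is a minimal sketch. It assumes the spherical aberration phase term scales as Cs·k⁴ and that defocus is fitted freely, so a pixel size error shows up only in the refined Cs; under that assumption the corrected pixel size is the nominal one times (nominal Cs / refined Cs)^(1/4). The function name and arguments are hypothetical, and this is a back-of-the-envelope relation rather than any package's documented procedure.

```python
def pixel_size_from_cs(nominal_apix, nominal_cs_mm, refined_cs_mm):
    """Estimate the corrected pixel size (Å/pixel) from a refined Cs value.

    Assumes the Cs phase term scales as Cs * k**4, so a refined Cs that is
    larger than the nominal Cs implies the nominal pixel size was too large.
    """
    if nominal_apix <= 0 or nominal_cs_mm <= 0 or refined_cs_mm <= 0:
        raise ValueError("pixel size and Cs values must be positive")
    # Fourth root because spatial frequency enters the Cs term to the 4th power.
    return nominal_apix * (nominal_cs_mm / refined_cs_mm) ** 0.25

# Example: nominal Cs of 2.7 mm, refined Cs 200 microns higher.
# corrected = pixel_size_from_cs(0.83, 2.7, 2.9)
```

Because of the fourth root, even a 200 micron shift in refined Cs corresponds to a pixel size change of only a couple of percent, which is why a modest calibration error is a plausible explanation. The relation is not meaningful when the nominal Cs is close to zero, as on a Cs-corrected microscope.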
Usually I’ve seen a non-zero value like 0.001 used for cryo-EM with a Cs (image) corrector.