"How do you usually validate the correct pixel size in cryo-EM datasets?"

PKB · August 13, 2025, 12:30pm

I am currently using a specific nominal pixel size from our data collection setup, but I want to confirm whether it is accurate.

In your experience, what are the most common approaches to validate the actual pixel size?

Do you typically compare the cryo-EM map with a previously solved high-resolution X-ray or cryo-EM structure (PDB) of the same protein, adjusting the voxel size until the correlation is maximized?
Are there alternative or more direct methods within CryoSPARC (or other software) for pixel size validation, without relying on an existing atomic model?

Any advice, workflows, or example protocols would be greatly appreciated.

Andrea · August 13, 2025, 1:17pm

Refining Pixel Size — Two Approaches

Methods based on spherical aberration (Cs) can be used to refine pixel size — see the RELION documentation:
https://relion.readthedocs.io/en/release-3.1/Reference/PixelSizeIssues.html

When the pixel size is incorrect, the refined Cs value will diverge from the microscope’s nominal value (typically ~2.7 mm). You can correct it using:

Real pixel size = Nominal pixel size × (True Cs / Apparent Cs)^(1/4)

This is essentially an iterative procedure:

Calculate the new “real” pixel size from the formula.
Re-apply CTF correction to the micrographs.
Re-extract particles.
Re-refine the model.
Re-refine Cs.

If Cs is still off, repeat the process.
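
As a rough illustration of the formula above, here is a minimal Python sketch. The function name and the example numbers are made up for illustration; only the formula itself comes from the post.

```python
def corrected_pixel_size(nominal_apix, true_cs_mm, apparent_cs_mm):
    """Apply real = nominal * (true Cs / apparent Cs) ** (1/4)."""
    if nominal_apix <= 0 or true_cs_mm <= 0 or apparent_cs_mm <= 0:
        raise ValueError("pixel size and Cs values must be positive")
    return nominal_apix * (true_cs_mm / apparent_cs_mm) ** 0.25


# Hypothetical numbers, for illustration only.
nominal = 0.830   # Å/pixel, nominal value from the data collection setup
true_cs = 2.7     # mm, manufacturer value
apparent = 2.9    # mm, value returned by Cs refinement

new_apix = corrected_pixel_size(nominal, true_cs, apparent)
print(f"corrected pixel size: {new_apix:.4f} Å/pixel "
      f"({100 * (new_apix / nominal - 1):+.2f}% vs nominal)")
```

Because of the fourth root, a fairly large change in apparent Cs maps to a small change in pixel size, which is why the refined Cs is a sensitive readout.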

Reprocessing with the correct pixel size can improve your map, so there is at least some payoff for the work.

For higher-resolution models, I use another approach that seems to converge more quickly (perhaps in one step):

Refine the model with secondary structure restraints (e.g. in Phenix), which enforce geometry but not distances.
Identify helices in the model (e.g. using DSSP at PDB-REDO).
Measure the median O→N distance from residue i to i+4 within helices with a script.

From high-resolution crystallographic structures in the PDB, the correct distance is 2.915 ± 0.096 Å (PDB-REDO data). So you can:

Calculate the ratio of the expected value to your measured value.
Multiply your nominal pixel size by this ratio to get the corrected pixel size.
Reprocess your data with the corrected pixel size.

In practice, this often converges in a single iteration, plus or minus rounding errors.
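
The measuring script could look something like the sketch below. It is not the script used in the post: the input layout (a dict of O and N coordinates per residue, plus helix ranges taken from DSSP) and the function names are assumptions chosen to keep the example self-contained. The direction of the correction (nominal × expected / measured) is my reading: if the nominal pixel size is too large, every distance in the model comes out too large by the same factor.

```python
import math
import statistics

EXPECTED_O_N = 2.915  # Å, helical O(i)→N(i+4) distance quoted above


def median_helix_o_n(atoms, helices, offset=4):
    """Median O(i)→N(i+offset) distance over all helices.

    atoms:   {residue_number: {"O": (x, y, z), "N": (x, y, z)}} for one chain
    helices: list of (first_residue, last_residue) ranges, e.g. from DSSP
    """
    distances = []
    for first, last in helices:
        for i in range(first, last - offset + 1):
            acceptor = atoms.get(i)
            donor = atoms.get(i + offset)
            if acceptor is None or donor is None:
                continue  # skip residues missing from the model
            distances.append(math.dist(acceptor["O"], donor["N"]))
    if not distances:
        raise ValueError("no O→N pairs found in the given helices")
    return statistics.median(distances)


def rescale_pixel_size(nominal_apix, measured, expected=EXPECTED_O_N):
    # Assumed direction: distances measured too long mean the nominal
    # pixel size is too large, so scale by expected / measured.
    return nominal_apix * expected / measured


# Toy coordinates (not a real helix): every O(i)→N(i+4) distance is 3.0 Å.
toy_atoms = {i: {"O": (0.0, 0.0, 1.5 * i), "N": (0.0, 0.0, 1.5 * i - 3.0)}
             for i in range(1, 11)}
measured = median_helix_o_n(toy_atoms, helices=[(1, 10)])
print(f"median O→N: {measured:.3f} Å")
print(f"corrected pixel size: {rescale_pixel_size(1.0, measured):.4f} Å/pixel")
```

For a real model you would fill `atoms` from the refined PDB/mmCIF file with whatever parser you prefer, one chain at a time.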

rbs_sci · August 13, 2025, 2:40pm

MagCalEM from Chris Russo is useful as well.

Although as you get to ever higher resolutions, you’ll find that 0.5% is still enough to cause headaches sometimes. Still, it’s usually close enough that you only need a minor adjustment (using the Validation (FSC) tool or RELION postprocessing), and it won’t impact your reported resolution enough to matter (a difference of 1.39 Å versus 1.38 Å, for example). Getting it as accurate as possible can still help your Q-score a little.
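
For a sense of scale: a resolution in Å is a length in pixels times the pixel size, so it rescales linearly with the pixel size. A small sketch with made-up numbers (the 0.5% correction is an example, not a measurement, and the function name is hypothetical):

```python
def rescale_resolution(resolution_A, old_apix, new_apix):
    """Rescale a resolution estimate after changing only the pixel size."""
    return resolution_A * new_apix / old_apix


old_apix = 1.000              # Å/pixel, hypothetical nominal value
new_apix = old_apix * 0.995   # hypothetical 0.5% smaller calibrated value
print(f"{rescale_resolution(1.39, old_apix, new_apix):.3f} Å")
```

This only covers the trivial rescaling of the number; it says nothing about the gain from reprocessing with the corrected pixel size.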

I usually just toss in a grid of apoferritin semi-regularly and collect data at all the mags I care about, then process quickly and compare them to previous high resolution results for any deviations. So far, except after the 2024 New Year earthquake, variation has always been minor (<0.1%).

If it is off by a long way (the calibrated Angpix of one mag when our Krios was commissioned was way off), reprocessing is essential, though.

olibclarke · August 13, 2025, 2:56pm

Do you typically compare the cryo-EM map with a previously solved high-resolution X-ray or cryo-EM structure (PDB) of the same protein, adjusting the voxel size until the correlation is maximized?

This is the way we would typically do it - either using a crystal structure of a fragment, or a small amount of data collected on a known standard, e.g. apoferritin.

In terms of how to do this, I used to do it manually using fitmap in Chimera, but now there is a convenient tool in the nightly builds of Phenix, called phenix.magref, which will perform the same procedure automatically and very quickly. There is also a Phaser tool embedded in Phenix that can do the same thing. Both are discussed in this thread on the phenixbb: [phenixbb] Re: Refine pixel size for EM map? - phenixbb - phenix-online.org
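
If you want to script the manual version, the idea of "adjust the voxel size until the correlation is maximized" can be sketched as a plain grid scan. The `score` callable below is a placeholder for whatever rescales the map to a trial pixel size, refits the model and returns the correlation; it is hypothetical and is not the API of Chimera, phenix.magref or Phaser. The ±3% range and 61 steps are arbitrary choices too.

```python
def scan_pixel_size(score, nominal_apix, half_range=0.03, steps=61):
    """Evaluate score(apix) on a symmetric grid around the nominal value.

    Returns (best_apix, best_score). Scanning a fixed grid, rather than
    walking up or down from a start value, does not depend on the
    direction of approach.
    """
    if steps < 2:
        raise ValueError("need at least two grid points")
    candidates = [
        nominal_apix * (1 - half_range + 2 * half_range * k / (steps - 1))
        for k in range(steps)
    ]
    best_score, best_apix = max((score(a), a) for a in candidates)
    return best_apix, best_score


# Toy stand-in for a real map-model correlation, peaked at 0.84 Å/pixel.
def toy_score(apix):
    return -(apix - 0.84) ** 2


best_apix, _ = scan_pixel_size(toy_score, nominal_apix=0.83)
print(f"best pixel size on the grid: {best_apix:.4f} Å/pixel")
```

The result is only as fine as the grid spacing, so a second, narrower scan around the best value is an easy refinement.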

Interestingly (not really surprising given the training set), fitting AlphaFold models of individual domains and maximizing correlation seems to give essentially identical results to a crystal structure, so that may be worth considering in a pinch.

Refining Cs is also a valid approach, assuming you have sufficient resolution, and MagCalEM is also useful.

DanielAsarnow · August 16, 2025, 6:00pm

There are several good points made in the magCalEM paper. All three methods (gold diffraction, X-ray standards, Cs refinement) can converge on an accurate value, but only if they are done carefully. IMO Cs refinement using high-resolution datasets is the most straightforward, followed by the magCalEM approach. However, magCalEM is not that easy, depending on the actual magnification. It’s great if the pixel size is roughly 0.8-1.0 Å, but if it’s larger you have to use aliasing, and if it’s smaller I have trouble positioning the diffraction peaks consistently because they have significant spread, so different runs give different pixel sizes.

Fitting crystal structures only agreed after averaging the results from 10 structures, presumably due to detector distance error. I have found there is also a bias depending on whether you work “up” or “down” in pixel size, due to the local optimization used for map fitting in Chimera.

The Cs refinement value can be validated internally by reprocessing with the corrected pixel size; the refined Cs should then be very, very close to the manufacturer-provided Cs. Usually, to get the value, I open up the .cs file in Python and average the Cs across the exposure groups. One could also average across particles to give more weight to groups with more particles.
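
A minimal sketch of that averaging, assuming the particle .cs file loads with `np.load` as a NumPy structured array. The field names `ctf/cs_mm` and `ctf/exp_group_id` are assumptions here, so check `particles.dtype.names` on your own file. The function name and the path in the usage comment are hypothetical.

```python
import numpy as np


def mean_cs(particles, cs_field="ctf/cs_mm", group_field="ctf/exp_group_id"):
    """Return (mean over exposure groups, mean over particles) of Cs in mm."""
    cs = particles[cs_field]
    groups = particles[group_field]
    # Collapse to one value per exposure group, then average the groups equally.
    per_group = [cs[groups == g].mean() for g in np.unique(groups)]
    return float(np.mean(per_group)), float(cs.mean())


# Usage (hypothetical path):
# particles = np.load("path/to/particles.cs")
# by_group, by_particle = mean_cs(particles)
# print(f"Cs by group: {by_group:.3f} mm, by particle: {by_particle:.3f} mm")
```

The per-particle mean weights each exposure group by its particle count; the per-group mean treats all groups equally.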

olibclarke · August 16, 2025, 7:26pm

I agree that Cs is the way to go if you have a high res dataset, but if you don’t, fitting a crystal structure will work in a pinch, and has the advantage of not requiring any reprocessing (just a map). That is useful if you want to calibrate existing maps already in the EMDB, for structural comparison.

In practice I definitely haven’t found it necessary to average 10 crystal structures - I get pretty consistent results no matter which structure I use (even AlphaFold models seem to give almost identical results). phenix.magref seems to work pretty well and fast as an alternative to doing it manually in Chimera.

carlos · August 17, 2025, 7:19am

I did not analyse this thoroughly, but I noticed that in the pool of 10 DPS structures in that paper, most PDB codes start with 1 (so they are quite old) and have pretty bad validation scores in the PDB. It makes sense that they blame the detector distance for the variations. In X-ray crystallography we start in reciprocal space, with distances that are measured in cm and are easy to get accurate nowadays, and the wavelength is determined with more precision than you need. Goniometers and detectors have also evolved to give more reliable, fast and stable data collection. Radiation damage is almost not a problem these days. Basically I agree with @olibclarke that you should not need more than one crystal structure, provided that it is relatively recent. If the validation scores are good, I’d expect the structure to be good for calibration of the pixel size, too. There is no way the bonds between atoms can inflate or shrink globally. (I am not saying that all the old structures are bad, though!)

olibclarke · August 17, 2025, 12:15pm

To add to this - they don’t specify whether they used the oligomer or the monomer for fitting, which for cage structures like DPS could be important - I would favor using the monomer, to avoid any influence of the crystal contacts, cryo conditions etc on the assembly as a whole.

Mark-A-Nakasone · August 19, 2025, 8:32am

The magCalEM paper from MRC-LMB is a great reference. I often save the HexAuFoils for this reason: gold diffraction is easier to work with from HexAuFoil than from UltrAuFoil.

We usually do not correct for Cs in the microscope; that is materials science (Spherical aberration correction in a scanning transmission electron microscope using a sculpted thin film - ScienceDirect). At least it brings the costs down. I have used a Krios where they told me Cs=0 because it’s corrected. These forums have discussed how each microscope is unique, with its own true Cs - not always 2.7 but sometimes 2.5, etc.

DanielAsarnow · August 19, 2025, 7:30pm

I am skeptical that the OEM-provided Cs would be wrong by 200 microns; I think it’s still pixel size error. It’s extremely common to provide users with only an approximate pixel size based on a single-point gold diffraction estimate, and these estimates are often systematically (slightly) too large.

Usually I’ve seen a non-zero value like 0.001 used for cryo-EM with a Cs (image) corrector.