Particle Extraction Pixel Size for Combining Datasets (K3 & Falcon 4)

krishna.inampudi · March 28, 2025, 10:28pm

I’ve collected two datasets for my project. The first dataset, acquired on a GATAN K3 detector with a pixel size of 0.86 Å, exhibits biased particle orientation. To address this, I collected a second dataset at a -20° tilt on a Falcon 4 detector with a pixel size of 0.743 Å.

To combine these datasets for an Ab-Initio reconstruction, I need to determine the appropriate box size (in pixels) for particle extraction. Could you please advise me on the optimal pixel number for particle extraction to ensure compatibility between the two datasets?

Mark-A-Nakasone · March 29, 2025, 9:18pm

I am not sure that combining these two datasets is advisable.

You have different pixel sizes, different detectors (Falcon 4 at 4096x4096 vs. K3 at 4092x5760), and probably a different Cs too.

What were the total dose and the number of fractions? I would prefer to use EPU or SerialEM to collect 0-degree and tilted data in the same session (same hardware). Not much would need to be changed, and 20 degrees is on the low side. You can get to 30-40 degrees without changing much (because of the drumming effect, you may have to decrease the dose or tweak the exposure time).

Your 2D classes from either dataset should fit with plenty of extra space in the extraction box (you don’t want to lose the high-frequency information).

This would depend on the maximum diameter of your particle. The problem here will be working with two different box sizes, which is not ideal. You may have to compromise on an extraction box width (in pixels) that is too big for the higher mag and too small for the lower mag.

I would just process the tilted data set (20deg) and see how that goes.

krishna.inampudi · March 31, 2025, 8:50pm

Thank you, Mark. I processed only the tilted data and it is not very promising.
I will try to do the other options that you mentioned. Thank you.

Mark-A-Nakasone · April 1, 2025, 8:33am

I usually process the 0-degree and tilted data individually too.

Which one has more anisotropy in the map, or the more complete Fourier sphere?

If the tilted dataset is still bad, perhaps you will have to increase the alpha tilt to 30-40 degrees.

Also, you may not need equal (50/50) numbers of micrographs from both sets; it could be 70% from tilted and 30% from 0 degrees.

The new Thermo Fisher EPU allows you to collect a mix of 0-degree and tilted data (e.g. one square can be at 0 and another at 30), but I still prefer to keep these separate.

If it is a big complex (~0.5 MDa or more), you could try sub-tomogram averaging.

hbridges1 · April 1, 2025, 9:11am

Hi @krishna.inampudi! Thanks for your question about merging datasets with differing pixel sizes.

If you wish to combine two datasets for use in CryoSPARC that have different pixel sizes, it is important to find compatible box sizes and downsampling so that both datasets end up with the same box size in pixels, and the same pixel sizes (Å/pix) within 0.0001 Å of each other.

We can suggest the following process to allow you to identify box sizes and Fourier cropping for each dataset that might result in a close enough match:

A. consider a list of even integer box sizes (in pixels) for dataset 2 (0.743 Å/pix) within a typically useful range for cryoEM (e.g. 250-700 or whatever is appropriate for your sample)
B. calculate a rescaled list of box sizes (in pixels) after correcting the Å/pix to match dataset 1 (0.860) and rounding to the closest even integer box size
C. calculate the list of precise rescaled Å/pix per box size
D. look for the rescaled box sizes for dataset 2 (B) that have the closest matches in rescaled Å/pix (C) to dataset 1 (0.860)
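
Steps A to D can be sketched in a few lines of Python. This is only an illustration and not part of CryoSPARC: the function name `find_matching_boxes`, its parameters, and the default 250-700 search range are made up for this example, and it assumes the nominal pixel sizes quoted in this thread (0.743 and 0.860 Å/pix) are accurate. The rescaled pixel size in step C is taken as (box A × 0.743) / box B.

```python
def find_matching_boxes(px_fine=0.743, px_coarse=0.860,
                        box_min=250, box_max=700, tol=1e-4):
    """Hypothetical helper: list (box_a, box_b, rescaled_px, error) tuples
    whose rescaled pixel size is within `tol` of `px_coarse`, best first."""
    matches = []
    # Step A: even box sizes (in pixels) for the finer-pixel dataset.
    start = box_min + box_min % 2
    for box_a in range(start, box_max + 1, 2):
        # Step B: equivalent box at the coarser pixel size,
        # rounded to the closest even integer.
        box_b = 2 * round(box_a * px_fine / px_coarse / 2)
        # Step C: precise pixel size after rescaling box_a to box_b.
        rescaled_px = box_a * px_fine / box_b
        # Step D: keep candidates that are close enough to px_coarse.
        error = abs(rescaled_px - px_coarse)
        if error <= tol:
            matches.append((box_a, box_b, rescaled_px, error))
    return sorted(matches, key=lambda m: m[3])


for box_a, box_b, px, err in find_matching_boxes()[:5]:
    print(f"box A = {box_a}, box B = {box_b}, rescaled = {px:.5f} A/pix, error = {err:.1e}")
```

With those values, for example, a 294-pixel box for dataset 2 rescaled to 254 pixels comes out at roughly 0.86001 Å/pix, which is inside the 0.0001 Å tolerance.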

If you find a solution where the Å/pix are the same to 0.0001 Å or better, then you can go ahead and extract dataset 2 using the box size from step A, Fourier cropped to the box size from step B, and extract dataset 1 with the box size from step B.

In order to get a close enough match in pixel size, you may have to use a box that is smaller or larger than what you would ideally want for your target. While combining data with different pixel sizes sometimes helps to improve map quality, in other cases just one of the datasets is found to be good enough on its own.

If you plan to use reference-based motion correction as part of your processing pipeline, the particles from each collection will need to be run in separate RBMC jobs as this job type cannot handle different movie dimensions or total doses. Each dataset in RBMC will also need appropriate Fourier cropping so that they both end up with matching box sizes (in pixels) and can be recombined for further downstream processing.