Relationship between extraction boxsize and reference pixelsize/boxsize

Quick question relating to the following line on the automated workflows page:

Extraction box size

Note: if using our reference, this should be 310.6 divided by the pixel size of the motion corrected micrographs (rounded to the nearest even pixel - e.g. 361.18 —> 362).

Can anyone elaborate on why the extraction box size and the reference are related? What property of the reference is being compensated for here (reference box size, reference Å/px, etc.)? Further, if I were to create a new reference for a new target class, how should I think about what the reference box size and pixel size should be in relation to my desired extraction box size within the workflow?

Hi @samhaysom,

Thanks for reaching out about automated data processing! Heterogeneous Refinement does not currently scale the input maps to have the same physical extent – it only scales them to have the same box size in number of pixels. Because of this, if you use our references, your particle images must have the same (or as close as possible) physical extent as the references: 310.6 Å.

If you use your own references, then the particles must have the same physical extent as those references.
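For anyone wanting to script the rule quoted above, here is a minimal sketch of the arithmetic. The function name `extraction_box_px` is made up for illustration and is not part of cryoSPARC, and the 0.86 Å/px pixel size is only an example value. It divides the target physical extent by the micrograph pixel size and rounds to the nearest even number of pixels.

```python
def extraction_box_px(extent_angstrom: float, pixel_size: float) -> int:
    """Return the box size in pixels closest to extent_angstrom / pixel_size
    that is also an even number.

    Hypothetical helper for illustration only; not a cryoSPARC function.
    """
    raw = extent_angstrom / pixel_size
    # Halve, round to the nearest integer, then double: the result is always even.
    # Note: if raw is exactly an odd integer, both neighbouring even values are
    # equally close and Python's round() picks one of them (ties go to even).
    return 2 * round(raw / 2)


# Example: reference extent of 310.6 Å, micrographs at an assumed 0.86 Å/px.
# 310.6 / 0.86 is about 361.16, which rounds to the even value 362.
print(extraction_box_px(310.6, 0.86))
```

With your own references, the same calculation applies with 310.6 replaced by the physical extent of your reference (its pixel size multiplied by its box size).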

Best,
Kye

Ah, I see. So, to clarify: if I have a reference with a box size of 240 px but my particles have a box size of 300 px, then at the instantiation of the hetref job the reference will be resampled to 300 px but retain the same pixel size? So it will effectively increase the size of the reference relative to the particles? This is assuming the hetref job is also run with a refinement box size of 300 px.

Hi @samhaysom,

Particle images and the reference volume input to a Heterogeneous Refinement will have their pixel size and number of pixels scaled to maintain their current physical box extent, but match the number of pixels specified in the Refinement box size (voxels) parameter. Therefore, you need to ensure the volume and the particles have the same physical extent as one another before supplying them to Heterogeneous Refinement.

Here are a few examples. For each example, I give the values of:

  • Particles: pixel size, extraction box size, physical extent
  • Volume: pixel size, box size, physical extent

which are necessary to keep the volume and particles at the correct sizes compared to each other (that is, to keep their physical extent the same).

Scenario 1 - particles and volume have same pixel sizes

Particles: 1Å/px, 300 px, physical extent = 300 Å
Volume: 1Å/px, 300 px, physical extent = 300 Å
Refinement box size: 128 px
Particles and volume scaled: 2.3438Å/px, 128 px, physical extent = 300 Å

Scenario 2 - volume has a larger pixel size

Particles: 1Å/px, 300 px, physical extent = 300 Å
Volume: 1.5Å/px, 200 px, physical extent = 300 Å
Refinement box size: 128 px
Particles and volume scaled: 2.3438Å/px, 128 px, physical extent = 300 Å
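The two scenarios above can be reproduced with a short sketch. This only mirrors the arithmetic described in this post (physical extent is kept, pixel count is set to the refinement box size); the `Box` class and `rescale` function are hypothetical names for illustration and say nothing about how cryoSPARC implements the resampling internally.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A particle stack or volume described by pixel size and box size."""
    pixel_size: float  # Å/px
    n_pixels: int      # box size in pixels (or voxels along one edge)

    @property
    def extent(self) -> float:
        """Physical extent of the box in Å."""
        return self.pixel_size * self.n_pixels


def rescale(box: Box, refinement_box_px: int) -> Box:
    """Keep the physical extent, change the pixel count to refinement_box_px."""
    return Box(pixel_size=box.extent / refinement_box_px,
               n_pixels=refinement_box_px)


# Scenario 2: volume has a larger pixel size but the same 300 Å extent.
particles = rescale(Box(pixel_size=1.0, n_pixels=300), 128)
volume = rescale(Box(pixel_size=1.5, n_pixels=200), 128)

# Both end up at 300 / 128 = 2.34375 Å/px with a 300 Å extent.
print(particles.pixel_size, particles.extent)
print(volume.pixel_size, volume.extent)
```

Because both inputs start with the same extent, they end up with the same pixel size after rescaling, which is the condition needed for the volume and particles to stay at the correct relative size.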

Best,
Kye

Thank you that is very helpful clarification. Could you provide an example of what happens if the reference and particle physical extents are not matched? By my calculation my example would be as follows

Scenario 3 - reference has a smaller extent than the particles

Particles: 1Å/px, 300 px, physical extent = 300 Å
Volume: 1Å/px, 240 px, physical extent = 240 Å
Refinement box size: 300 px
Particles scaled: 1Å/px, 300 px, physical extent = 300 Å
Volume scaled: 0.8Å/px, 300 px, physical extent = 240 Å

Within the actual refinement the effect is a magnification of the reference relative to the particles. A volume with a physical extent of 240 Å is magnified to match particles with physical extents of 300 Å (or vice versa, particles shrunk relative to the reference). Within the heterogeneous refinement, the initial reference ends up 1.25x the size of the particles.

Is it only heterogeneous refinement that works this way, or is this also the case for non-uniform/homogeneous refinements? Is this information captured anywhere in the cryoSPARC documentation? I cannot see any mention of it in the heterogeneous refinement, non-uniform refinement or homogeneous refinement pages.

I have just run a quick heterogeneous refinement and this does appear to be the case. I’ve used the same reference but either cropped or expanded the box size with volume tools:

| | Å/px | Box size (px) | Physical extent (Å) | Magnification relative to particles |
|---|---|---|---|---|
| Particles | 0.94 | 330 | 310.2 | |
| Reference 1 | 0.7981 | 380 | 303.278 | |
| Reference 2 | 0.7981 | 200 | 159.62 | |
| Reference 3 | 0.7981 | 1000 | 798.1 | |
| Refinement | | 128 | | |
| Particles scaled | 2.4234375 | 128 | 310.2 | 1 |
| Reference 1 scaled | 2.369359375 | 128 | 303.278 | 1.022823944 |
| Reference 2 scaled | 1.24703125 | 128 | 159.62 | 1.943365493 |
| Reference 3 scaled | 6.23515625 | 128 | 798.1 | 0.388673099 |

Looking at the volume slices from the first couple of iterations, Class 1 (using reference 2) is magnified and Class 2 (using reference 3) is shrunk vs Class 0 (using reference 1)
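For reference, the magnification figures I quote are just the ratio of the particle extent to each reference extent. A small sketch of that calculation follows; `apparent_magnification` is a made-up helper name, and the ratio is my own way of estimating the effect, not something reported by cryoSPARC.

```python
def apparent_magnification(particle_extent: float, reference_extent: float) -> float:
    """Ratio of particle extent to reference extent (both in Å).

    A value above 1 means the reference appears enlarged relative to the
    particles once both are resampled to the same number of pixels; a value
    below 1 means it appears shrunk. Hypothetical helper for illustration.
    """
    return particle_extent / reference_extent


particle_extent = 0.94 * 330  # Å/px times box size in px

# Reference box sizes in px, all at 0.7981 Å/px.
reference_boxes = {"Reference 1": 380, "Reference 2": 200, "Reference 3": 1000}

for name, box_px in reference_boxes.items():
    reference_extent = 0.7981 * box_px
    mag = apparent_magnification(particle_extent, reference_extent)
    print(f"{name}: extent {reference_extent:.3f} Å, magnification {mag:.4f}")

# Scenario 3 from my earlier post: 300 Å particles against a 240 Å reference.
print(apparent_magnification(300.0, 240.0))
```

The same ratio gives the 1.25x figure from Scenario 3.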

Hi @samhaysom,

Is it only heterogeneous refinement that works this way, or is this also the case for non-uniform/homogeneous refinements? Is this information captured anywhere in the cryoSPARC documentation? I cannot see any mention of it in the heterogeneous refinement, non-uniform refinement or homogeneous refinement pages.

Right now this is only the case for Heterogeneous Refinement. Homogeneous and Non-uniform Refinement will resample the initially provided volume, as necessary, to match the specs of the particles.

Changing Heterogeneous Refinement to have the same behaviour as Homo/NU-refinement is a consideration for a future version of Heterogeneous Refinement.

Best,
Kye

Hi, Samhaysom, where did you get the data in your table? Lan