Seeking advice for merging datasets with slight deviation in pixel size

jimhbean · October 18, 2023, 10:30pm

Howdy all

I was hoping to seek some guidance on merging datasets with different pixel sizes. Firstly, to my knowledge, CryoSPARC doesn’t have tools for this; if I am wrong please enlighten me, and may I suggest such tools be implemented in future.

I am following doi: 10.1107/S2059798319010519 as a reference; the process seems to be:

- Set the dataset with the largest pixel size as the reference, and the second dataset as the data to be rescaled (in this case, the reference data pixel size = 1.4 Å, the re-scale data = 1.36 Å).
- Generate reconstructions for both datasets (already done in CryoSPARC).
- Determine the ‘rescale factor’ using ChimeraX to iteratively change the pixel size of the re-scale map and fit it to the reference map, optimising the correlation coefficient (done, and = 1.394 Å).
- Use rescale tools such as relion_image_handler to rescale the re-scale dataset micrographs, using the rescale factor (1.394 Å in this case) as the --angpix argument and the reference dataset pixel size as the rescale target (e.g. --rescale_angpix 1.4).
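
The arithmetic behind that last step can be sketched as follows. This is only an illustration: the 4096-pixel micrograph width and the round-to-nearest-pixel rule are assumptions made for the example, not a description of what relion_image_handler does internally.

```python
def rescaled_size(n_pixels: int, angpix_in: float, angpix_target: float):
    """Return (new size in pixels, pixel size actually achieved in Å)."""
    extent = n_pixels * angpix_in          # physical width in Å, unchanged by rescaling
    new_n = round(extent / angpix_target)  # the output needs a whole number of pixels
    return new_n, extent / new_n

# Hypothetical 4096-pixel-wide micrograph, calibrated 1.394 Å/pix, target 1.4 Å/pix
new_n, achieved = rescaled_size(4096, 1.394, 1.4)
print(new_n, achieved)
```

Because the output size has to be an integer, the achieved pixel size can differ slightly from the requested one, so it is worth checking what ends up in the output header.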

I have done this, however the output rescaled .mrc micrographs now have non-cubic voxels (1.4, 1.4, 1.36). Does anyone know if this is likely to be a problem in data processing? My gut says yes.

Curious to know if anyone has done this successfully, and if there is any support in CryoSPARC for this.
Cheers,
James.

olibclarke · October 19, 2023, 1:03pm

Micrographs are 2D - they have pixels, not voxels - what are the three dimensions you are referring to?

jimhbean · October 23, 2023, 8:17pm

Hi Oli

They still have a voxel dimension in Z written to their headers, which remained unchanged during scaling. Going by the header, the micrographs don’t appear as perfectly flat 2D planes but as 2D planes of thickness = voxel size, and during rescaling this scale didn’t change in Z, making the voxels/pixels appear anisotropic. I came to the conclusion that it probably doesn’t matter, and this seems correct.
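
For anyone wanting to check this on their own files, here is a minimal sketch of how the three voxel sizes fall out of an MRC header. It assumes the standard MRC2014 layout (sampling MX, MY, MZ at byte offset 28, cell dimensions at byte offset 40) and little-endian byte order; the header below is a made-up one built in memory, not one of my actual micrographs.

```python
import struct

def mrc_voxel_size(header: bytes):
    """Return the (x, y, z) voxel size in Å implied by an MRC header."""
    # Words 8-10: sampling MX, MY, MZ (int32); words 11-13: cell dimensions in Å (float32)
    mx, my, mz = struct.unpack_from("<3i", header, 28)
    xlen, ylen, zlen = struct.unpack_from("<3f", header, 40)
    return xlen / mx, ylen / my, zlen / mz

# Hypothetical 4078 x 4078 single-section micrograph: the X/Y cell was rescaled
# to 1.4 Å/pix while the Z cell still holds the original 1.36 Å.
header = bytearray(1024)
struct.pack_into("<3i", header, 0, 4078, 4078, 1)    # NX, NY, NZ
struct.pack_into("<3i", header, 28, 4078, 4078, 1)   # MX, MY, MZ
struct.pack_into("<3f", header, 40, 4078 * 1.4, 4078 * 1.4, 1.36)

print([round(v, 3) for v in mrc_voxel_size(header)])  # → [1.4, 1.4, 1.36]
```

With only one section (NZ = 1) the Z value describes nothing in the image data itself, which fits with it not mattering downstream.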

Cheers.

olibclarke · October 23, 2023, 8:20pm

Ah, I see what you mean now - I’d never noticed that, sorry!

hbridges1 · October 25, 2023, 12:22pm

Hi @jimhbean,

It seems like you are making some progress towards merging your datasets! Unfortunately it is difficult to comment on the output dimensions of your images after RELION rescaling, and there is presently no specific tool in CryoSPARC to facilitate rescaling. There may be different ways of merging, but for CryoSPARC to refine multiple datasets together you need:

1. Final particle stacks with the same box size in pixels
2. Final particle stacks with the same dimensions in Å

There is a little leeway for point (2) but to merge the images, CryoSPARC requires the two Å/pix to be within 0.0001 Å of each other. This may be achieved by extracting with a box of the same dimensions in Å for both datasets, then downsampling the smaller pixel data to match the box size of the larger pixel data.
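
As a rough illustration of the two requirements, the check can be written as below. This is a sketch of the rule as described above, not CryoSPARC code; the function name and the example values are made up.

```python
def can_merge(box_a: int, apix_a: float, box_b: int, apix_b: float,
              tol: float = 1e-4) -> bool:
    """True if two particle stacks satisfy both requirements described above."""
    same_box = box_a == box_b                   # identical box size in pixels
    same_apix = abs(apix_a - apix_b) <= tol     # Å/pix within the stated tolerance
    return same_box and same_apix

print(can_merge(504, 1.0625, 504, 1.0625))  # same box and pixel size
print(can_merge(504, 1.0500, 504, 1.0625))  # pixel sizes differ by 0.0125 Å
```

With equal box sizes in pixels, matching Å/pix also means matching box dimensions in Å.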

Things are more complicated if one or both datasets were not originally processed at their calibrated pixel sizes, because CryoSPARC will currently not allow you to just rescale or update the pixel size in your particle stacks, and you would need to go back to motion correction and CTF estimation of your movies before re-extraction of your particles.

As an example, let’s say you have two datasets processed at 1.05 Å/pix but at the end you find the second dataset was really at 1.0625 Å/pix (according to calibration in ChimeraX). The way I would handle this is re-import the movies for dataset 2 with the corrected pixel size, re-run motion correction and CTF estimation, and then during particle re-extraction use “Force re-extract CTFs from micrographs”.

Dataset 1 at 1.05 Å/pix could be extracted with a box of 510 rescaled to 504 (giving a final pixel size of 1.0625), and this can be combined with dataset 2 at 1.0625 Å/pix extracted with a box size of 504. In this case both datasets end up with exactly the same pixel size and box size and CryoSPARC can refine them together.
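
The numbers in this example can be checked with a few lines of arithmetic. The helper names here are made up for illustration and are not CryoSPARC parameters.

```python
import math

def cropped_apix(apix: float, box: int, cropped_box: int) -> float:
    """Pixel size after Fourier cropping: the extent in Å stays fixed."""
    return apix * box / cropped_box

def matching_box(box: int, apix: float, target_apix: float) -> int:
    """Cropped box size (in pixels) that brings `apix` closest to `target_apix`."""
    return round(box * apix / target_apix)

new_box = matching_box(510, 1.05, 1.0625)    # dataset 1: 510 px extracted at 1.05 Å/pix
final_apix = cropped_apix(1.05, 510, new_box)

print(new_box, round(final_apix, 4))         # → 504 1.0625
assert math.isclose(510 * 1.05, 504 * 1.0625)  # both boxes span 535.5 Å
```

The ratio 1.0625 / 1.05 is exactly 85/84, which is why a box of 510 (6 x 85) crops cleanly to 504 (6 x 84).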

I hope that is helpful!

jimhbean · October 26, 2023, 2:05am

All good, reading back over my post I can see why it would seem confusing! It did not seem to cause any problems, so far as I can tell.

jimhbean · October 26, 2023, 2:40am

Hi @hbridges1
Thanks so much for your detailed reply.
I am a little confused as to how the “Force re-extract CTFs from micrographs” option rescales the data to be equivalent. The datasets will now have different pixel sizes written to them, right?

Also, by downsample, do you mean to Fourier crop the box such that the pixel size in both cases is equal? It occurs to me this probably wouldn’t work, as the box dimensions would be different, I think?

Cheers,
James.

hbridges1 · October 26, 2023, 5:39pm

Hi @jimhbean

Those are great questions! Let me explain the reason why I suggested using “Force re-extract CTFs from micrographs”:

Let’s say you already have a particle stack that you produced at an incorrect pixel size, and you go ahead and re-import the movies at the correct pixel size, re-run motion correction and CTF estimation. Now the pixel size and CTF are right for these micrographs, but the original CryoSPARC particle stacks themselves contain information about their pixel size and CTF.

When you re-extract particles, CryoSPARC has to pick which of these sources to take information from, and by default it uses the pixel size from the micrographs and the CTF information from the particles. Turning on “Force re-extract CTFs from micrographs” tells it to get the CTF information from your newly-corrected micrographs.

Now, on to the question about box sizes: You are correct that I was referring to Fourier cropping. As Fourier cropping changes the particle image dimensions in pixels, but keeps the same distance in Å, you can extract boxes from both datasets so that they have the same distance in Å. One of the datasets will need a larger box in terms of pixels to achieve this, and this larger box can be Fourier cropped to match the box size in pixels of the other dataset.
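
To make the Fourier cropping idea concrete, here is a minimal NumPy sketch. It is not CryoSPARC's implementation; it assumes square images with even box sizes, and the random array simply stands in for a particle image.

```python
import numpy as np

def fourier_crop(img: np.ndarray, new_box: int) -> np.ndarray:
    """Downsample a square, even-sized image by cropping its Fourier transform."""
    box = img.shape[0]
    ft = np.fft.fftshift(np.fft.fft2(img))       # move zero frequency to the centre
    start = (box - new_box) // 2
    ft = ft[start:start + new_box, start:start + new_box]  # keep the low frequencies
    out = np.fft.ifft2(np.fft.ifftshift(ft)).real
    return out * (new_box / box) ** 2            # keep the mean grey level unchanged

rng = np.random.default_rng(0)
particle = rng.normal(size=(510, 510))           # stand-in for a 510 px box at 1.05 Å/pix
cropped = fourier_crop(particle, 504)

print(cropped.shape)                             # → (504, 504)
print(1.05 * 510 / 504)                          # new pixel size in Å/pix
```

The box still covers the same distance in Å; only the sampling changes, so the pixel size grows by the ratio of the two box sizes.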