Local motion correction of subparticles?

Hi,

What is the best workflow to perform local motion correction of subparticles (either reference-based or otherwise)?

Ideally, I would like to calculate trajectories from the full particle (which has more signal, and more stable trajectories), but then extract matched sets of subparticles using the improved local alignment parameters and dose weighting.

Currently, the best way I can think of to do this is to align/extract in an oversized box, reconstruct, then use Volume Alignment Tools to recenter, and crop around that center using the Downsample tool. However, this is (1) very space- and time-inefficient and (2) fails completely above a certain box size (1126 px), beyond which reconstruction of the whole particle is not currently possible in cryoSPARC.

Is there a better way to do this currently?

In a future version, would it be possible to pair a single full-particle reference with a symmetry-expanded particle set after Volume Alignment Tools, so that each set of subparticles is matched with the appropriate full reference, in either Local Motion or RBMC?

Cheers
Oli

Hi @olibclarke,

Sorry for the delayed response! We haven’t tried this ourselves, but we think your workflow should work. As you mentioned, it will likely be space- and time-inefficient, but to our knowledge it is probably the only way to approach this currently. What box size would you need to use to achieve this for your dataset?

We tried to be careful about units and normalization in our formulation of RBMC, so in principle using hyperparameters from the full particle to motion correct the sub-particles should be okay. Moreover, it seems like hyperparameters that are “in the ballpark” of right tend to give reasonable results. That said, less signal is fundamentally harder to align, so I don’t know if that effect would manifest to a significant degree. If you do attempt to do this, please let us know how it goes.

Additionally, we have noted a feature request for sub-particles and symmetry expanded mates in RBMC.

All the best,
Kye

Hi Kye,

The problem is that the native box size I would need for this would be on the order of 1400 pixels - and I would then need to refine before cropping the subparticles out, as otherwise the shifts will be inaccurate (due to drift during RBMC, as discussed elsewhere).

I guess I could run RBMC, downsample 2×, and refine to get “good enough” alignments, then crop out subparticles from the original RBMC particles using the alignments from the downsampled refinement, as otherwise refinement will fail at the original box size…

But there is another problem, I think - in order to start RBMC, I need to provide a matching set of particles & volume, but I can’t reconstruct from 1400 px particles as I will run out of GPU RAM. I guess I can just extract the particles at that size, and crop/pad the volume from a smaller box size to match? EDIT: actually just tried this, and it requires one additional step - because Vol Tools only operates on one map (not all the passthrough maps), separate Vol Tools jobs need to be run for the full map and both half maps, and the outputs then recombined using the lower-level slots. EDIT2:… but when I try this, it doesn’t work :frowning: see screenshot
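The crop/pad itself is conceptually simple - something like this plain-numpy sketch (toy box sizes rather than my 256/1400 px ones, and assuming the box size difference is even):

```python
import numpy as np

def pad_volume(vol, new_box):
    """Zero-pad a cubic map symmetrically so its box matches a larger
    particle box; the pixel size is unchanged and the map stays centered.
    Assumes new_box minus the old box size is even."""
    pad = (new_box - vol.shape[0]) // 2
    return np.pad(vol, pad, mode="constant", constant_values=0.0)

vol = np.ones((64, 64, 64), dtype=np.float32)  # toy stand-in for the smaller map
big = pad_volume(vol, 160)                     # toy stand-in for the large box
print(big.shape)  # (160, 160, 160)
```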

Cheers
Oli

EDIT3: I tried a different approach - performing the pre-RBMC reconstruction directly in a reduced box size - but this fails with Ewald sphere (EWS) correction enabled:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 115, in cryosparc_master.cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/refine/newrun.py", line 1174, in cryosparc_master.cryosparc_compute.jobs.refine.newrun.run_homo_reconstruct
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 3279, in cryosparc_master.cryosparc_compute.engine.newengine.reconstruct
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 3292, in cryosparc_master.cryosparc_compute.engine.newengine.reconstruct
  File "cryosparc_master/cryosparc_compute/engine/newengine.py", line 482, in cryosparc_master.cryosparc_compute.engine.newengine.EngineThread.setup_ews_correct
AssertionError: TODO must check box size convention for ewald sphere radius calculation

Hey @olibclarke,

We are working on this internally but wanted to see if you were able to get this working.

Was this request spurred by your recent RyR preprint?

Best,
Kye

Great! It actually wasn’t that, but it would be useful there too, absolutely, for further improving resolution in the corner subparticles :blush:

And no, wasn’t able to get it working

Happy to share data if needed!

Hi @olibclarke,

It turns out that currently, this kind of workflow can be done, though it is not as streamlined as it could be, and we appreciate you bringing it to our attention! I have been able to piece together a workflow that performs sub-particle RBMC on EMPIAR-10793:

  1. After obtaining a homogeneous particle stack, I extracted particles at a box size of 800 px and F-cropped to 600 px.
  2. Using the RBMC job, I calculated hyperparameters, experimental dose weights, and motion trajectories for the full particle.
  3. I symmetry expanded the particles with the correct symmetry operator.
  4. I created a mask for the subparticle and used Volume Alignment Tools to shift the subparticles to the center of the box.
  5. I used the Apply Trajectories job with the hyperparameters/dose weights/trajectories from the RBMC of the full particles, and the particles/volume/mask outputs from the Volume Alignment Tools job.
  6. Lastly, I used a Downsample job to real-space crop the particles to a more manageable box size before moving on to local refinement/3D classification.

Note that for the experimental dose weights to be applied correctly to subparticles, you must currently use the same physical box size that was used for calculation of the hyperparameters and experimental dose weights of the full particle (i.e., no downsampling or cropping in the Apply Trajectories job unless you have removed the experimental dose weights). Alternatively, you can omit the EDWs as an input to Apply Trajectories; it will then apply a standard model-based dose-weighting curve, which in part defeats the purpose of using RBMC. This is why the Downsample job with cropping is recommended at the end, to save on disk space and computational expense.
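In case it helps to picture what F-cropping does: it is just cropping in Fourier space, so the field of view is kept while the pixel size grows. A rough plain-numpy sketch (toy sizes; this is not cryoSPARC’s actual implementation):

```python
import numpy as np

def fourier_crop(img, new_size):
    """Downsample a square image by cropping the centre of its Fourier
    transform (the spirit of an F-crop): the field of view is unchanged,
    the pixel size grows by old/new. Assumes even sizes."""
    old = img.shape[0]
    lo = (old - new_size) // 2
    F = np.fft.fftshift(np.fft.fft2(img))
    Fc = F[lo:lo + new_size, lo:lo + new_size]
    # rescale so the mean intensity is preserved after the inverse transform
    return np.fft.ifft2(np.fft.ifftshift(Fc)).real * (new_size / old) ** 2

rng = np.random.default_rng(0)
img = rng.standard_normal((800, 800))   # toy 800 px "particle"
small = fourier_crop(img, 600)          # F-cropped to 600 px
print(small.shape)  # (600, 600)
```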

Cheers,
Kye

Thanks Kye, this is very helpful - thanks for figuring it out!

Between steps 2 and 3, did you re-refine with the RBMC-improved full particles? If not, would this be possible (to improve alignments prior to local refinement), or is it not recommended? I guess it doesn’t matter, as the trajectories are already baked in at that point?

Also, did you compare local refinements for the RBMC-corrected & cropped subparticles and those extracted from the original micrographs - did you see the expected improvements in resolution?

Hi @olibclarke,

I did not, as I only had RBMC do the hyperparameter and dose weight calculations and not actually perform the motion correction.

My goal was to try to limit the storage space overhead with as few extractions as possible but still obtain a comparable result to show that the workflow can be done. In theory, you should be able to use RBMC’d particles as inputs.

I did compare results; they were about the same. I think any improvement in map quality will be target/dataset dependent.

Best,
Kye

Thanks heaps Kye! We will test on some of our cases, very helpful to have this available!

Hi Kye, one more query about step 2 - you mentioned that you just ran this up to the hyperparameter estimation stage, and then used apply trajectories. I thought the trajectories were calculated after hyperparameter estimation? Are the trajectories actually generated during hyperparameter estimation, and only applied during the motion-correct particles step?

Hi @olibclarke,

Trajectories are calculated and applied in the last step of RBMC, where the particles’ motion is corrected.

Best,
Kye

right - that’s what I thought - but you said:

My goal was to try to limit the storage space overhead with as few extractions as possible but still obtain a comparable result to show that the workflow can be done. In theory, you should be able to use RBMC’d particles as inputs.

So I feel like I’m missing something?

Hi @olibclarke,

After talking with @hsnyder more about this, I realized I made a mistake and should change the guidance above. When I was doing this, I had run one RBMC job just to calculate hyperparameters and EDWs, and a separate job to motion correct the full particles. When writing up these steps, I missed that second job in the workflow.

After RBMC corrects the particle motion, there will be trajectories in the particles.cs file under the field name motion/path, and these are absolutely necessary for Apply Trajectories. Additionally, this field is duplicated and retained during symmetry expansion (I verified this). So, for step 2, run a full RBMC job so that trajectories for the full particle are present during symmetry expansion.
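Conceptually, symmetry expansion just duplicates each particle row once per symmetry operator, motion/path included. A minimal numpy sketch, with a structured array standing in for the particles.cs dataset (the field values and the second field name here are made up for illustration):

```python
import numpy as np

# Structured array standing in for a motion-corrected particles.cs;
# each row carries the path to its full-particle trajectory.
particles = np.array(
    [("J10/motion/traj_000.npy", 0.51), ("J10/motion/traj_001.npy", 0.49)],
    dtype=[("motion/path", "U64"), ("location/center_x_frac", "f4")],
)

# C2-style symmetry expansion: duplicate every row once per symmetry operator
n_sym = 2
expanded = np.repeat(particles, n_sym)

print(len(expanded))  # 4
# each expanded copy keeps the trajectory path of its parent particle
print(expanded["motion/path"][0] == expanded["motion/path"][1])  # True
```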

  1. After obtaining a homogeneous particle stack, I extracted particles at a box size of 800 px and F-cropped to 600 px (this extraction is likely unnecessary, unless you want to extract with recentering and reconstruct to verify everything is centered in the box before RBMC).
  2. Run RBMC on the full particles as normal. I F-cropped the particles from their original 800 px box to 256 px at this step, just as a way to save space.
  3. I symmetry expanded the particles with the correct symmetry operator.
  4. I created a mask for the subparticle and used Volume Alignment Tools to shift the subparticles to the center of the box.
  5. I used the Apply Trajectories job with the hyperparameters/dose weights/trajectories from the RBMC of the full particles, and the particles/volume/mask outputs from the Volume Alignment Tools job.
  6. Lastly, I used a Downsample job to real-space crop the particles to a more manageable box size before moving on to local refinement/3D classification.
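For clarity, the real-space crop in the last step (unlike the F-crop in step 2) keeps the pixel size and only trims the field of view - a plain-numpy sketch of the idea:

```python
import numpy as np

def real_space_crop(img, new_size):
    """Trim a square image to a smaller centered box. Unlike Fourier
    cropping, the pixel size is unchanged; only the field of view shrinks."""
    lo = (img.shape[0] - new_size) // 2
    return img[lo:lo + new_size, lo:lo + new_size]

img = np.arange(600 * 600, dtype=np.float32).reshape(600, 600)
sub = real_space_crop(img, 200)  # e.g. cut a subparticle box out of a 600 px box
print(sub.shape)  # (200, 200)
```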

Got it, this makes sense - it also makes clear that it is necessary to run symmetry expansion on the RBMC’d particle set - one can’t just use a symmetry-expanded set of particles from the RBMC input.

Did you recenter particles during RBMC? This is the default, but I wonder if it will lead to any issues downstream if sym expansion is performed and a mask/map provided for subparticle RBMC without re-refining (as volumes can be slightly offset after RBMC)

Got it, this makes sense - it also makes clear that it is necessary to run symmetry expansion on the RBMC’d particle set - one can’t just use a symmetry-expanded set of particles from the RBMC input.

Absolutely - or copy the info over post hoc using cs-tools, but that is more work than you probably want to do.

Did you recenter particles during RBMC? This is the default, but I wonder if it will lead to any issues downstream if sym expansion is performed and a mask/map provided for subparticle RBMC without re-refining (as volumes can be slightly offset after RBMC)

Good question. I did leave recenter particles on for full particle RBMC. Apply Trajectories will not recenter the particles. It will use whatever is present in the fields location/center(x/y)_frac as the centers and apply the trajectories of the full particle from there. So, I think that for proper centering, particles would have to be extracted post VAT and prior to Apply Trajectories. I am currently testing this out and will report back.
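To illustrate what I mean by centers: the frac fields are fractional micrograph coordinates, so recentering on extraction amounts to folding the refined pixel shifts back into them, roughly like this (a sketch only; the sign/axis convention here is my guess):

```python
def recenter(center_frac, shift_px, mic_size_px):
    """Fold a refined shift (in micrograph pixels) back into the fractional
    picking coordinate, per axis. Sign convention here is a guess."""
    return center_frac + shift_px / mic_size_px

# e.g. a 24 px refined shift on a micrograph 4096 px wide
cx = recenter(0.500, 24.0, 4096)
print(cx)  # 0.505859375
```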

it is not as streamlined as it could be

This comment holds even more true.

After testing the centering issue, I was able to get a 0.1 Å improvement over the previous local refinement I did with no subparticle RBMC; YMMV. Although my goal was to limit the number of extractions, it looks like that is not currently possible. Okay, I think this is the final, final workflow.

  1. After obtaining a homogeneous particle stack, I extracted particles at a box size of 800 px and F-cropped to 600 px (this extraction is likely unnecessary, unless you want to extract with recentering and reconstruct to verify everything is centered in the box before RBMC).
  2. Run RBMC on the full particles as normal. I F-cropped the particles from their original 800 px box to 256 px at this step, just as a way to save space.
  3. I symmetry expanded the particles with the correct symmetry operator.
  4. I created a mask for the subparticle and used Volume Alignment Tools to shift the subparticles to the center of the box.
  5. I re-extracted particles with Recenter particle shifts enabled; I did this at a box size of 800 px, downsampled to 600 px.
  6. I used the Apply Trajectories job with the hyperparameters/dose weights/trajectories from the RBMC of the full particles, the newly re-extracted particles, and the volume/mask outputs from the Volume Alignment Tools job.
  7. Lastly, I used a Downsample job to real-space crop the particles to a more manageable box size before moving on to local refinement/3D classification.

Thanks Kye for doing all this! I wonder if redoing the full particle refinement at step 2 (prior to symmetry expansion) might improve things further (as particles always get slightly misaligned after RBMC in my experience, which could have consequences for downstream local refinement of subparticles after RBMC)

Yeah, that might help too, although I didn’t test that.

Hi Kye, after trying this myself I have a couple more queries/thoughts:

a. In step 5, is it necessary to actually re-extract? Won’t the updated coordinates be sufficient for Apply Traj? I am trying it at the moment and it seems to be working without extracting after VAT, but Apply Traj doesn’t give any intermediate outputs, or plot the subparticle trajectories or particle centers on the micrographs, so it is hard to be sure.

Also, I’m not sure if using re-extracted particles is even compatible with Apply Traj if Fourier cropping was initially performed during Patch Motion, as Apply Traj seems to fail if the box size it expects (based on the original movie pixel size) differs from that of the input particles.

b. It would be very useful to have an option to Fourier Crop in apply trajectories. Currently this workflow leads to giant, disk-filling particle stacks if one is using super-res movies with already large particles and symmetry expansion, making this workflow unfortunately impractical for some of the cases we would like to try it on. It would also be helpful if apply trajectories had a progress bar like RBMC - it is currently difficult to estimate how long it will take to complete.

c. Apply Traj produces a bunch of unlabeled plots at the start of the log. What do these signify?



Also, Apply Traj has an option “produce motion diagnostic plots for n movies”, but it does not produce motion diagnostic plots for any movies, regardless of what value this is set to.

d. Apply Traj seems to be running and processing movies, but some movies (just a handful so far) are failing with CUDA allocation errors. This is on a 3090, initial box size 1200 px (super-res). Is this something to be concerned about? Can the incomplete movies be reprocessed later? I guess this also indicates that larger box sizes will be impractical on any card?

Error occurred while processing J54/imported/007171423974844969476_24nov04c_grid1_100_00006gr_00077sq_v03_00005hln_00027enn.frames.tif
Traceback (most recent call last):
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 851, in _attempt_allocation
    return allocator()
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1054, in allocator
    return driver.cuMemAlloc(size)
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_OUT_OF_MEMORY] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/exx/cryosparc/cryosparc_worker/cryosparc_compute/jobs/pipeline.py", line 59, in exec
    return self.process(item)
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_applytraj.py", line 253, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.run_applytraj.run_apply_traj_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_applytraj.py", line 261, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.run_applytraj.run_apply_traj_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_applytraj.py", line 262, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.run_applytraj.run_apply_traj_multi.motionworker.process
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/motioncorrection.py", line 742, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.motioncorrection.motion_correction_apply_traj_patches
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/motioncorrection.py", line 763, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.motioncorrection.motion_correction_apply_traj_patches
  File "cryosparc_master/cryosparc_compute/gpu/gpucore.py", line 398, in cryosparc_master.cryosparc_compute.gpu.gpucore.EngineBaseThread.ensure_allocated
  File "/home/exx/cryosparc/cryosparc_worker/cryosparc_compute/gpu/gpuarray.py", line 376, in empty
    return device_array(shape, dtype, stream=stream)
  File "/home/exx/cryosparc/cryosparc_worker/cryosparc_compute/gpu/gpuarray.py", line 332, in device_array
    arr = GPUArray(shape=shape, strides=strides, dtype=dtype, stream=stream)
  File "/home/exx/cryosparc/cryosparc_worker/cryosparc_compute/gpu/gpuarray.py", line 127, in __init__
    super().__init__(shape, strides, dtype, stream, gpu_data)
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/devicearray.py", line 103, in __init__
    gpu_data = devices.get_context().memalloc(self.alloc_size)
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1372, in memalloc
    return self.memory_manager.memalloc(bytesize)
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1056, in memalloc
    ptr = self._attempt_allocation(allocator)
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 863, in _attempt_allocation
    return allocator()
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 1054, in allocator
    return driver.cuMemAlloc(size)
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 348, in safe_cuda_api_call
    return self._check_cuda_python_error(fname, libfn(*args))
  File "/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 408, in _check_cuda_python_error
    raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [CUresult.CUDA_ERROR_OUT_OF_MEMORY] Call to cuMemAlloc results in CUDA_ERROR_OUT_OF_MEMORY
Marking as incomplete and continuing…

e. If one attempts to only process a subset of movies (in this case 50), Apply Traj fails with an error:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 129, in cryosparc_master.cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/motioncorrection/run_applytraj.py", line 340, in cryosparc_master.cryosparc_compute.jobs.motioncorrection.run_applytraj.run_apply_traj_multi
  File "/home/exx/cryosparc/cryosparc_worker/cryosparc_tools/cryosparc/dataset.py", line 1487, in mask
    assert len(mask) == len(self), f"Mask with size {len(mask)} does not match expected dataset size {len(self)}"
AssertionError: Mask with size 50 does not match expected dataset size 10438
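Judging by the assertion, the boolean mask is being checked against the full particle dataset rather than the movie subset - in plain numpy terms the requirement seems to be something like this (numbers taken from the error above):

```python
import numpy as np

dataset_rows = 10438            # size of the full particle dataset
full_mask = np.zeros(dataset_rows, dtype=bool)
full_mask[:50] = True           # a valid mask flags rows of the *full* set

# a 50-element mask (one per selected movie, say) has the wrong length:
short_mask = np.ones(50, dtype=bool)
print(len(short_mask) == dataset_rows)  # False - hence the AssertionError

picked = np.arange(dataset_rows)[full_mask]
print(len(picked))  # 50
```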

I don’t completely understand this part - can you elaborate a little more on why this is the case?