Topaz Train job going through but outputting flat curves

wjnicol · July 22, 2024, 8:20pm

Hello everyone,

I am currently trying to train protein-specific particle pickers with the Topaz Train job and to better understand the CryoSPARC integration of Topaz, as I am new to CryoSPARC.

My original plan and train of thought was to use a previously generated “clean” set of particles that yielded a good-quality map that went to high resolution:

Selecting clean particle coordinates

Ran a 2D Classification job with the “clean particles”
Selected the 2D classes with various poses, trying my best to include the “rare” classes

This outputs, I believe, particle coordinates (not a particle stack) that, in combination with the micrographs, will allow training of a Topaz model.

Generating denoised micrographs for Topaz training

Trained a Micrograph Denoiser on a subset of 100 exposures from the same starting exposures used throughout the data processing to reach final resolution (essentially same raw material).
Applied the trained denoiser model to all exposures used for data processing

Setup Topaz training

First off, it is not clear to me whether Topaz uses particle stacks (already extracted) or particle coordinates and then extracts from the micrographs. It seems to me it is the latter, but I’d like to be sure.

In order to troubleshoot more efficiently, I’ve been running Topaz Train on a small dataset of 100 exposures. When I run it on the 100 denoised exposures with template-picked or blob-picked particles, I get the following warning:

WARNING: 57947 particle coordinates are out of the micrograph dimensions. Did you scale the micrographs and particle coordinates correctly?

And I get flat or no curves at all.

Now when I tick off “Use denoised micrographs”, it seems to work fine. I gather from this that it is a scaling problem. I compared a denoised micrograph and a non-denoised one in Chimera and with IMOD’s header command, and indeed the denoised one is 1549x1100 px vs. 640x454 px for the non-denoised one. I don’t understand where this downscaling comes from. Do I need to downscale my particles?

Misc questions

Can a Topaz model trained on denoised images yield good results on non-denoised ones? (My assumption is no.)
If I point towards the denoised micrographs (output from denoising job) but untick “use denoised micrographs” in the Topaz Train job, what is supposed to take precedence?
What does Topaz consider as a training example? A micrograph or a particle?

Thank you very much for the help,

William

kstachowski · July 29, 2024, 5:26pm

Hi @wjnicol,

First off, welcome to the community!

Looking through your workflow – you are correct in terms of selecting a clean particle stack for TOPAZ. The outputs of a Select 2D job include the 2D classes (templates) you see in the job, along with all of the extracted particle images that make up those classes and their associated metadata (such as each particle’s location in the micrograph). TOPAZ training only requires the particle locations – these can be obtained from particle picking (e.g. manual) or after extracting and processing a particle stack.

The warning you are getting comes from a scaling mismatch: our Micrograph Denoiser and TOPAZ Train/Extract both downsample the micrographs for processing, but to different sizes. Particles picked on the smaller, denoised micrographs will be outside the bounds of the downsampled TOPAZ mics.
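To make the mismatch concrete, here is a minimal sketch of the kind of bounds check that such a warning implies. It is not the actual CryoSPARC or TOPAZ code: the function names `count_out_of_bounds` and `rescale_coords` and the example coordinates are hypothetical, and only the two image sizes (1549x1100 and 640x454) come from this thread. The rescaling step is there purely to show that the coordinates and the image are on different pixel grids; it is not a recommended workaround.

```python
import numpy as np

def count_out_of_bounds(coords_xy, width, height):
    """Count (x, y) coordinates that fall outside a width x height image."""
    coords_xy = np.asarray(coords_xy, dtype=float)
    x, y = coords_xy[:, 0], coords_xy[:, 1]
    inside = (x >= 0) & (x < width) & (y >= 0) & (y < height)
    return int((~inside).sum())

def rescale_coords(coords_xy, src_shape, dst_shape):
    """Map (x, y) coordinates from a src (width, height) grid to a dst grid."""
    coords_xy = np.asarray(coords_xy, dtype=float)
    scale = np.array([dst_shape[0] / src_shape[0], dst_shape[1] / src_shape[1]])
    return coords_xy * scale

# The two micrograph sizes reported in this thread, as (width, height).
large = (1549, 1100)
small = (640, 454)

# Hypothetical picks expressed on the larger pixel grid.
coords = np.array([[100.0, 100.0], [800.0, 500.0], [1500.0, 1000.0]])

# Checked directly against the smaller image, some picks land outside it.
print(count_out_of_bounds(coords, *small))

# Once mapped onto the smaller grid, the same picks are all inside.
print(count_out_of_bounds(rescale_coords(coords, large, small), *small))
```

With these made-up picks, two of the three coordinates are out of bounds before rescaling and none after, which is the same pattern as a large out-of-bounds count in the warning.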

As of v4.5, denoised micrographs produced by our Micrograph Denoiser in CryoSPARC are not compatible with TOPAZ training and extraction. You can technically use our denoised micrographs in TOPAZ (i.e. the job will launch), but this performs very poorly. If you would like to use denoised micrographs for TOPAZ training and extraction, you would need to use the TOPAZ Denoise job to generate denoised mics. You do not need to downsample your particles. For more information on the specifics of how TOPAZ works, you can check out Tristan Bepler’s guides on the TOPAZ GitHub page.

Misc questions

Can a Topaz model trained on denoised images yield good results on non-denoised ones? (My assumption is no.)

No, it will not.

If I point towards the denoised micrographs (output from denoising job) but untick “use denoised micrographs” in the Topaz Train job, what is supposed to take precedence?

If you are running TOPAZ through CS, the non-denoised mics will be used. This selection is possible because the Micrograph Denoiser output group will contain the denoised and noisy micrographs.

What does Topaz consider as a training example? A micrograph or a particle?

A particle’s location is used in conjunction with the parameter “Training radius”, which tells TOPAZ how many downsampled pixels are considered for the positive label.
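As a rough illustration of that idea (a sketch only, not TOPAZ’s own implementation), the snippet below marks every pixel within a given radius of each particle center as positive on a downsampled grid. The function name `positive_label_mask`, the 20x20 grid, the two centers, and the radius of 3 are all made-up values for demonstration.

```python
import numpy as np

def positive_label_mask(shape, centers_xy, radius):
    """Return a boolean (height, width) mask that is True within `radius`
    pixels of any (x, y) particle center."""
    height, width = shape
    yy, xx = np.mgrid[0:height, 0:width]
    mask = np.zeros(shape, dtype=bool)
    for cx, cy in centers_xy:
        # Pixels inside the disk around this center count as positive.
        mask |= (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
    return mask

# Two hypothetical particle centers on a 20x20 downsampled micrograph.
mask = positive_label_mask((20, 20), [(5, 5), (14, 10)], radius=3)

# Number of positively labelled pixels, and the fraction of the image they cover.
print(int(mask.sum()), float(mask.mean()))
```

In this picture, each particle coordinate contributes a small disk of positive pixels, and a larger radius means more positive pixels per particle.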

Best,
Kye

wjnicol · July 30, 2024, 5:42pm

Thank you Kye for the detailed response.

I’m currently running benchmarks to see whether picking on denoised micrographs helps in our case. From what I understand, it’s best for us not to use Topaz in conjunction with CS’s Micrograph Denoiser. The Deep Picker implementation integrates better with the overall CS workflows we use.

Best,

Will