Picking / Classification of particles (too) close to each other

Hey all,

What would your workflow be if the first 2D classification of a dataset yields not only good classes of single particles, but also classes that seem to contain two particles very close to each other? Is there a way to split these classes into images with recentered particles? Should I use template picking with the good averages as templates and decrease the minimum separation distance? For the blob picker it was set to 0.3 (particle diameter 120-200 Å), which I thought would be small enough.
Or should I ditch these images completely, since the signals of both particles will interfere with each other during alignment (in Fourier space)?
P.S.: These are phase plate data recorded at <0.1 µm defocus, hence no CTF correction.
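
To get a feel for how many picks are affected, one option is to flag every pick that has a neighbour closer than a chosen fraction of the particle diameter. The following is a minimal sketch in plain Python, assuming the separation is expressed as a fraction of the particle diameter (as the 0.3 above suggests) and that pick centres are available as (x, y) coordinates in Å. The function `find_close_pairs` and the example coordinates are made up for illustration and are not part of any picking package.

```python
import math
from itertools import combinations

def find_close_pairs(coords, diameter, min_sep_fraction):
    """Return index pairs of picks closer than min_sep_fraction * diameter.

    coords: list of (x, y) pick centres, in the same unit as diameter.
    """
    cutoff = min_sep_fraction * diameter
    close = []
    # Brute-force check of every pair; fine for a few hundred picks per micrograph
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if math.dist(a, b) < cutoff:
            close.append((i, j))
    return close

# Hypothetical picks (in Å) on one micrograph, not real data
picks = [(100.0, 100.0), (130.0, 100.0), (500.0, 500.0), (520.0, 640.0)]

# With a 200 Å diameter and a fraction of 0.3 the cutoff is 60 Å
pairs = find_close_pairs(picks, diameter=200.0, min_sep_fraction=0.3)

# Indices of all picks that take part in at least one close pair
crowded = sorted({i for pair in pairs for i in pair})
```

Raising `min_sep_fraction` towards 1.0 flags every pick whose neighbour sits within one full diameter, which is roughly the population that ends up in the "double" classes.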

I’d abandon them.

Use the single, well-centered particles to train a Topaz or crYOLO model to improve your picking. You likely have some miscentered picks, and improving the picking could improve downstream results (in my experience it is often beneficial for crowded/aggregated samples and small particles).

Great, thanks. How will this work if my sample probably has orientation bias? Could I try to make an ab initio model and use that for picking? If I use only the averages, I guess I will miss all the other orientations.

I would try and see how you go. Quite often even if you have some orientation bias, there will be particles from other orientations in those good averages.

With crYOLO/Topaz trained on these averages, or from an ab initio?

crYOLO/Topaz.

Any hint which one to use?

Both are good. I use Topaz more often, but either will give good results with good training data.

In addition to better picking (or abandoning them, if you have a large dataset and they are a small fraction), I would select the doubles and re-run 2D with a small window (65 or so). Iterate. They will become centered, and when they do, extract with recentering. If you redo 2D with recentering, the problem could arise again (good that 2D provides diversity!), so you can just skip 2D for these. A well-centered particle will align well to a nice 3D volume, even if there is a particle right next to it. And dense micrographs make for great CTFs and more data.

I tried to use topaz train but somehow I cannot get it to work. The job is running but is stuck at the following state (nothing changed over the weekend):

Which GPU are you using? This may be the culprit?

I have two GTX 1070s installed.
I used all the standard settings. Is there something that would speed things up? And is there anything I have to specify if the data are phase plate data?

P.S.: I tried the job on 10 micrographs and it worked, though it took 2h to finish, and the extract/inspect jobs showed quite a large number of unpicked particles that I would have picked manually. I set the number of particles per mic to 300. But applying it to all 2500 mics would take 500h…

Which bit took 2h? Training or extraction? You only need to do training once, and I would often train on only 50-100 mics with several thousand particles.
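
As an illustration of picking such a training subset, here is a minimal sketch that draws a random set of micrographs and reports how many curated particles it contains. It assumes you already have a count of curated picks per micrograph. The function `choose_training_mics`, the file names, and the per-micrograph counts are hypothetical and only serve the example.

```python
import random

def choose_training_mics(picks_per_mic, n_mics, seed=0):
    """Pick a random subset of micrographs and report its particle count.

    picks_per_mic: dict mapping micrograph name -> number of curated picks.
    """
    rng = random.Random(seed)          # fixed seed so the subset is reproducible
    names = sorted(picks_per_mic)
    subset = rng.sample(names, min(n_mics, len(names)))
    total = sum(picks_per_mic[name] for name in subset)
    return subset, total

# Hypothetical dataset: 2500 micrographs with 40 curated picks each
dataset = {f"mic_{i:04d}.mrc": 40 for i in range(2500)}

subset, total = choose_training_mics(dataset, n_mics=100)
print(len(subset), "micrographs,", total, "particles")
```

If the total comes out well below a few thousand particles, increasing `n_mics` is the obvious knob before starting training.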

Also, what settings did you use? The default settings for computation often spawn a lot of subprocesses and lead to system lockup in my hands. I would recommend 2 threads & 2 processes as a starting point.

Regarding results, this will very much depend on the threshold used in Topaz Extract, which may require some tweaking. The estimated number of particles per mic in training is also an important parameter to tune.
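
One way to approach the threshold tweaking is to count how many picks survive at a few candidate thresholds and compare that with what you see by eye. The sketch below assumes each pick carries a numeric score where higher means more confident. The function `picks_per_threshold` and the score values are invented for illustration and do not come from a real extraction job.

```python
def picks_per_threshold(scores, thresholds):
    """Count how many picks have a score at or above each threshold."""
    return {t: sum(1 for s in scores if s >= t) for t in thresholds}

# Hypothetical pick scores for one micrograph, not real output
scores = [-3.2, -1.0, -0.4, 0.1, 0.8, 1.5, 2.7]

counts = picks_per_threshold(scores, thresholds=[-2, 0, 2])
for threshold, n in counts.items():
    print(f"threshold {threshold}: {n} picks kept")
```

A threshold that keeps far fewer picks than you would select manually is probably too strict, which may be what happened with the missed particles mentioned earlier.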

Training took 2h.
I used the default settings.
I will try 100 mics
expected nr of particles = 800
nr of parallel processes = 2
nr of CPUs = 2

OK, if it is the training that took 2h, that is fine: you only need to do this once, and you never need to do it on the whole dataset. Extraction should be faster, so I would not extrapolate from training to extraction based on the number of mics.
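
The difference between the two ways of estimating the runtime can be written out as a small calculation. This is only a sketch: the 2h, 10 mics, and 2500 mics come from the posts above, while the per-micrograph extraction time is a placeholder that you would replace with your own measurement. Both function names are hypothetical.

```python
def naive_total_hours(hours_measured, mics_measured, mics_total):
    """Linear extrapolation of the whole job, as in the 500 h estimate."""
    return hours_measured * mics_total / mics_measured

def train_once_total_hours(training_hours, extract_seconds_per_mic, mics_total):
    """Training is paid once; only extraction scales with the dataset size."""
    return training_hours + extract_seconds_per_mic * mics_total / 3600

# 2 h for 10 mics, scaled linearly to 2500 mics
naive = naive_total_hours(2, 10, 2500)
print(naive)  # 500.0

# extract_seconds_per_mic=10 is a placeholder, NOT a measured value;
# training_hours=2 reuses the 10-mic run and will differ for a larger training set
realistic = train_once_total_hours(2, 10, 2500)
```

The point is only the shape of the formula: the training term is constant, and the term that grows with the number of micrographs is the extraction step.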

extraction was pretty fast afterwards, true
