Possible speed-up of Extract From Micrographs


The Event Log of any Extract From Micrographs job suggests that the job simply goes through all micrographs in order and extracts the coordinates found in the Particle input.

This is probably fine early on in the process, say just after picking, when one has many particles per micrograph and almost no micrographs without particle coordinates. But after enough rounds of sorting and selecting particles, it is easy to end up with very few particles per micrograph, and with many micrographs that no longer contribute any particle to the current set. When one needs to re-extract a small set of particles, the Extract job still proceeds the same way and goes through all micrographs one by one, which is pretty slow. Would it be possible to do it the other way around? From the Particles input, look up the micrographs the particles came from, and then extract only from these.
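To make the idea concrete, here is a minimal sketch of the lookup described above, assuming each particle record carries an identifier for its source micrograph. The record layout and the function name `group_particles_by_micrograph` are hypothetical, chosen for illustration; this is not CryoSPARC's API or its actual implementation.

```python
from collections import defaultdict


def group_particles_by_micrograph(particles):
    """Map each micrograph identifier to the coordinates picked on it.

    `particles` is an iterable of (micrograph_id, x, y) tuples. This layout
    is an assumption made for the sketch, not a CryoSPARC data format.
    """
    by_micrograph = defaultdict(list)
    for micrograph_id, x, y in particles:
        by_micrograph[micrograph_id].append((x, y))
    return dict(by_micrograph)


# Hypothetical particle records: (micrograph identifier, x, y).
particles = [
    ("mic_0007", 120, 340),
    ("mic_0042", 512, 88),
    ("mic_0007", 900, 15),
]

# Only micrographs that still hold at least one particle appear as keys,
# so the loop never visits a micrograph without coordinates.
for micrograph_id, coords in group_particles_by_micrograph(particles).items():
    print(micrograph_id, len(coords))
```

With this kind of index, the work scales with the number of particles and of micrographs that contain them, rather than with the total number of micrographs in the input.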

It might be difficult to automatically choose between these two modes, so a simple switch would be ok (and its default value can be the current behavior of the Extract job).

Or maybe another job type could generate the list of micrographs that contain the particles from a small set of particles? I took a quick look at Exposure Sets Tool and Exposure Group Utilities, but I am not sure they can do this.

Thank you!

You can do it with the Manually Curate Exposures job, where you can select based on how many particles there are per micrograph.
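As a rough illustration of what selecting by particle count amounts to, the sketch below counts particles per micrograph and keeps the micrographs that reach a threshold. The function name `select_micrographs` and the `min_particles` parameter are hypothetical; this does not reproduce how the Manually Curate Exposures job works internally.

```python
from collections import Counter


def select_micrographs(particle_micrograph_ids, min_particles=1):
    """Return the micrographs holding at least `min_particles` particles.

    `particle_micrograph_ids` has one entry per particle, naming the
    micrograph it was picked from (an assumed layout for this sketch).
    """
    counts = Counter(particle_micrograph_ids)
    return sorted(m for m, n in counts.items() if n >= min_particles)


# One hypothetical micrograph identifier per remaining particle.
ids = ["mic_0007", "mic_0042", "mic_0007", "mic_0113"]
print(select_micrographs(ids))                   # every micrograph with a particle
print(select_micrographs(ids, min_particles=2))  # only the better-populated ones
```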

Thank you, this is what I was looking for; I was just not looking in the right job. It solves the problem, albeit with an intermediate step.

Hi @Guillaume. Could you post which CryoSPARC version you’re using and which specific extraction job? The extraction jobs already “skip” micrographs with no particles in them: they don’t bother reading the micrograph from disk. In my testing, these skipped micrographs consume minimal processing time. What’s the time discrepancy between a job that uses the Manually Curate Exposures trick mentioned by boggild and a job that just goes through all micrographs?

Hi @hsnyder. I noticed after I tried this suggestion that the time difference was indeed negligible, so I marked this as solved, and this is why I did not follow up.
