Separating particles from different datasets

I have a large dataset of ~15k images. During collection I used different parameters which are recorded in the file name. I have already assigned exposure sets, and because I used a 5x5x3 pattern there are 75 sets for each collection setting. Usually I run separate extraction jobs for the collection groups, but when there are multiple sets of classifications, Relion imports, etc. it becomes impractical.

Is there any way for me to split particles based on micrograph name without changing the exposure set assignments they already have and without running a new extraction?

Do you mean

75 sets[, one set] for each collection setting [of three parameters]

such that you would end up with 75 non-overlapping sets of particles?

Sorry, by 5x5x3 I mean 25 holes and 3 shots per hole. There are 75 image shift groups for each time I adjusted the microscope. In this case, I adjusted the microscope 3 times and there are also 3 sets of parameters. I imported the micrographs separately at first, and created exposure groups for them using an offset so that there are 225 unique shift groups.
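To make the arithmetic concrete, here is a minimal sketch of one way such an offset scheme could be laid out. The numbering itself (0-based adjustment, hole and shot indices mapped to a 1-based ID) is an assumption for illustration, not necessarily the scheme used above. With a layout like this, the microscope adjustment can be recovered from a group ID by integer division.

```python
HOLES = 25                                   # 5x5 hole pattern
SHOTS_PER_HOLE = 3
GROUPS_PER_ADJUSTMENT = HOLES * SHOTS_PER_HOLE  # 75 image shift groups

def exposure_group_id(adjustment, hole, shot):
    """Hypothetical layout: 0-based indices in, unique 1-based group ID out."""
    if not (0 <= hole < HOLES and 0 <= shot < SHOTS_PER_HOLE):
        raise ValueError("hole or shot index out of range")
    offset = adjustment * GROUPS_PER_ADJUSTMENT
    return offset + hole * SHOTS_PER_HOLE + shot + 1

def adjustment_of(group_id):
    """Recover the 0-based adjustment index from a group ID."""
    return (group_id - 1) // GROUPS_PER_ADJUSTMENT

# Three adjustments give 3 * 75 = 225 distinct IDs.
all_ids = {
    exposure_group_id(a, h, s)
    for a in range(3)
    for h in range(HOLES)
    for s in range(SHOTS_PER_HOLE)
}
```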

After extraction I processed them all together, eventually exporting and re-importing them. I would now like to split up the data based on the microscope adjustments so I can compare the resolution. I know I can re-extract using separate jobs, or re-do the exposure groups by first splitting on the “scope setting” part of the file names and then again on the image shift group part, but both are pretty inconvenient. I guess I basically want to split by filename like in the exposure groups utility, but without overwriting the exposure group IDs.

BTW I also use the particle sets tool at the end to bootstrap several sets with the same number of particles in order to get good resolution comparisons, and the microscope stuff I’m testing has to do with 1) energy filter slit width and 2) frequency of coma-free alignment.
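For reference, the equal-size idea can be sketched outside the UI roughly as follows. This is not how the particle sets tool is implemented; it is an illustration that draws without replacement (so strictly subsampling rather than a classical bootstrap), and the group names and ID arrays are made up.

```python
import numpy as np

def equal_size_subsets(groups, n_replicates=3, seed=0):
    """Draw n_replicates random subsets per group, all of the same size.

    groups maps a label to an array of particle IDs. The common size is
    the size of the smallest group, so every comparison uses the same
    number of particles.
    """
    rng = np.random.default_rng(seed)
    n = min(len(ids) for ids in groups.values())
    return {
        name: [
            np.sort(rng.choice(ids, size=n, replace=False))
            for _ in range(n_replicates)
        ]
        for name, ids in groups.items()
    }

# Hypothetical groups of particle IDs with unequal sizes.
example_groups = {
    "slit10": np.arange(0, 500),
    "slit20": np.arange(500, 1200),
}
subsets = equal_size_subsets(example_groups, n_replicates=4, seed=1)
```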

Another way of saying it is that what I want is just to pull out the particles for particular micrographs without doing an extraction. A job where I could link all the particles and the right set of exposures and then get just those particles from those particular exposures would do the trick.
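As a rough illustration of that operation, here is a sketch on a toy NumPy structured array. It assumes the particle table carries a micrograph path field and an exposure group field (named `location/micrograph_path` and `ctf/exp_group_id` here, which should be checked against the real file), and the file name tokens such as `slit20` are invented. The function `split_by_micrograph_name` is a hypothetical helper, not part of any package.

```python
import re
import numpy as np

# Toy stand-in for a particle table; field names are assumptions.
particles = np.array(
    [
        (1, b"mics/slit20_cfa1_0001.mrc", 12),
        (2, b"mics/slit20_cfa1_0002.mrc", 13),
        (3, b"mics/slit10_cfa2_0001.mrc", 87),
        (4, b"mics/slit10_cfa2_0002.mrc", 88),
        (5, b"mics/slit20_cfa1_0001.mrc", 12),
    ],
    dtype=[
        ("uid", "<u8"),
        ("location/micrograph_path", "S64"),
        ("ctf/exp_group_id", "<u4"),
    ],
)

def split_by_micrograph_name(table, pattern,
                             path_field="location/micrograph_path"):
    """Group rows by the first capture group of pattern in the path.

    Rows are only selected, never modified, so every other column
    (including the exposure group ID) is carried over unchanged.
    Rows whose path does not match are left out of the result.
    """
    regex = re.compile(pattern)
    keys = []
    for raw in table[path_field]:
        name = raw.decode() if isinstance(raw, bytes) else str(raw)
        match = regex.search(name)
        keys.append(match.group(1) if match else None)
    result = {}
    for key in sorted(k for k in set(keys) if k is not None):
        mask = np.array([k == key for k in keys], dtype=bool)
        result[key] = table[mask]
    return result

by_slit = split_by_micrograph_name(particles, r"(slit\d+)")
```

Because the split is a pure row selection, the existing group assignments survive, which is the property the extraction-free route needs.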

Extract from micrographs actually works exactly this way already…maybe Inspect particles will do it?

Inspect picks doesn’t work because the pick stats may be missing. Reassign particles to micrographs also fails when some particles aren’t mapped to any provided micrograph.

Update 2
Curate exposures actually works, but it’s a bit of a pain: there is no non-interactive way to run the job, so one must wait for it to load the 10k+ images, click accept all, and then click done. I think it’s a little more convenient than extracting into a 2 pixel box and replacing the data blob in low-level input, though.


Hey @DanielAsarnow,

I think I understand what you’re trying to do, and I believe I have a potential method of doing this in the UI without having to resort to modifying the .cs files directly.
You can use the Exposure Group Utilities job to split up the particles based on whatever you like, and make sure that the Split Outputs by Exposure Group parameter is on. You can then use the outputs, and connect any combination of them to downstream jobs. Once connected to those jobs, override each particle input group’s “CTF” result slot using the low-level results editor in the Job Builder.

Doing this lets you split up the particles however you want while still retaining their original exposure_group_ids.

Example parameter settings where I split up particles into 4 groups, used two of them in a refinement and overrode their CTF slots to retain their original exposure groups:

(following job is a Class2D but it’s the same idea)

(following is the actual refinement where I ran it with the two separate particle groups, but it’s still showing as only one exposure group since I overrode the CTF slot)
