I used the Particle Sets tool to split a giant dataset, and I am trying to run each subset in jobs that write to scratch, queued on different nodes. The unique particles are “locked” while one job writes to scratch, presumably because they are treated as particles from the same parent job. But since they are unique particles queued to different nodes’ scratch, shouldn’t they be allowed to write simultaneously? I understand it’s a niche problem, but I would benefit from a better explanation or a simple workaround. Maybe if I performed and failed a task with each subset, each subset would get a unique parent job ID?
Hi @CryoEM1,
See one of my previous questions here: Fewer particles doesn't reduce SSD space required?
The Particle Sets Tool doesn’t actually “split” the particle stack; it only selects a subset that still references the original particle stack. If you truly want to “split” the stack into smaller stacks, you need to use the Downsample Particles job.
Awesome, thanks for the link. “Downsample Particles” seemed to run extraction even when all values were left at their defaults (null), so that job type is slow and duplicates data. Instead, I used the “split particles randomly” option in RELION’s Subset Selection job and imported the 5 resulting stacks, which worked well.
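For anyone who prefers to do that split outside the RELION GUI, here is a minimal command-line sketch that randomly divides a particles .star file into smaller files for separate import. It assumes the third-party `starfile` Python package and a RELION 3.1-style file with `data_optics` and `data_particles` blocks; the file names and the number of subsets are placeholders, not anything from the thread above.

```python
# Sketch: randomly split a RELION particles .star file into N smaller .star files,
# which can then be imported as independent particle stacks.
# Assumes the third-party "starfile" package (pip install starfile) and a
# RELION 3.1+ file containing "optics" and "particles" data blocks.
import numpy as np
import starfile

N_SPLITS = 5  # number of subsets (placeholder)

# Read both data blocks; always_dict=True keeps the block-name -> DataFrame mapping.
star = starfile.read("particles.star", always_dict=True)  # input path is a placeholder
optics = star["optics"]          # copied verbatim into every output file
particles = star["particles"]

# Shuffle the particle rows, then cut them into N roughly equal chunks.
shuffled = particles.sample(frac=1, random_state=0).reset_index(drop=True)
chunks = np.array_split(shuffled, N_SPLITS)

for i, chunk in enumerate(chunks, start=1):
    out = f"particles_split{i}.star"
    starfile.write({"optics": optics, "particles": chunk}, out)
    print(f"wrote {len(chunk)} particles to {out}")
```

Each output file keeps the full optics block, so the subsets should import and refine independently of one another.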