Setup info: Cluster managed by Slurm
Software: CryoSPARC 4.1.2, Topaz 0.2.5
GPU: Tesla V100
I am currently attempting to run some Topaz training on a subset of my single particle cryo-EM dataset. The workflow I follow is:
- Prepare motion corrected, CTF estimated micrographs
- Run manual particle picking (using the Manual Picker job)
- Extract chosen particles (using the Extract From Micrographs job)
- Feed particles and micrographs into a Topaz train job
- All parameters are left at default, except for expected number of particles
This works very well in some cases (e.g. training finished in ~2 hours on a dataset of 328 micrographs), but in others (e.g. a dataset of 248 micrographs) the job seems to get stuck in preprocessing, so the hang does not appear to be linked to the number of input micrographs. When I investigated, the final message in the output log was ‘Preprocessing over 8 processes…’, after which the job hung for at least 12 hours (I cancelled it at that point). In the job’s ‘preprocessed’ folder I can see that the job produced 242 of the 248 micrographs and then stopped making progress. A subsequent re-run hung at 214/248 micrographs. Throughout, the job still shows as ‘running’ and does not report any errors.
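To narrow down which micrographs never get preprocessed, a small script along these lines could compare the input micrograph names against the contents of the ‘preprocessed’ folder. This is only a sketch: the function name `missing_micrographs` is made up for illustration, and matching files by their name without the extension is an assumption about how the preprocessed outputs are named, which may not hold for every setup.

```python
from pathlib import Path


def missing_micrographs(input_names, preprocessed_names):
    """Return the input micrographs that have no preprocessed counterpart.

    Files are matched by stem (file name without its extension). This
    matching rule is an assumption, not something confirmed for the
    Topaz train job's output naming.
    """
    done = {Path(name).stem for name in preprocessed_names}
    return sorted(name for name in input_names if Path(name).stem not in done)


# Illustrative names only; in practice both lists would come from listing
# the input micrograph directory and the job's 'preprocessed' folder,
# e.g. [p.name for p in Path(some_dir).iterdir()].
example_inputs = ["mic_001.mrc", "mic_002.mrc", "mic_003.mrc"]
example_preprocessed = ["mic_001.mrc", "mic_003.mrc"]

for name in missing_micrographs(example_inputs, example_preprocessed):
    print(name)
```

If the same micrographs are missing on every run, that would point at specific problematic files; if the missing set changes between runs (as the 242/248 versus 214/248 counts might suggest), the cause is more likely in the parallel preprocessing itself.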
Has anyone encountered this before? Is there any way this issue could be fixed or avoided?
Thanks for your help!