Topaz Extract fails on entire dataset

Hi all,

I have been training Topaz on a subset of 20 micrographs from my dataset, and it has worked fine. Running a Topaz Extract job with the resulting model on that same subset also works.
However, the job fails on the entire dataset (about 7.5k images) with exactly the same parameters.

Here is the error output; it is the same no matter what I change:

[CPU: 252.2 MB]  Starting extraction by running command /home/upiv/.conda/envs/topaz/bin/topaz extract --radius 38 --threshold -6 --up-scale 4 --assignment-radius -1 --min-radius 5 --max-radius 100 --step-radius 5 --num-workers 8 --device 0 --model /raid0/cryoem/David/P3/J51/models/model_epoch07.sav -o /raid0/cryoem/David/P3/J65/topaz_particles_prediction.txt [MICROGRAPH PATHS EXCLUDED FOR LEGIBILITY]
[CPU: 252.4 MB]  Traceback (most recent call last):
[CPU: 252.4 MB]  File "/home/.conda/envs/topaz/bin/topaz", line 33, in <module>
[CPU: 252.4 MB]  sys.exit(load_entry_point('topaz-em==0.2.5', 'console_scripts', 'topaz')())
[CPU: 252.4 MB]  File "/home/.conda/envs/topaz/lib/python3.8/site-packages/topaz/", line 148, in main
[CPU: 252.4 MB]  args.func(args)
[CPU: 252.4 MB]  File "/home/.conda/envs/topaz/lib/python3.8/site-packages/topaz/commands/", line 288, in main
[CPU: 252.4 MB]  for path,score,coords in nms_iterator(stream, radius, threshold, pool=pool):
[CPU: 252.4 MB]  File "/home/.conda/envs/topaz/lib/python3.8/site-packages/topaz/commands/", line 79, in nms_iterator
[CPU: 252.4 MB]  for name,score,coords in pool.imap_unordered(process, scores):
[CPU: 252.4 MB]  File "/home/.conda/envs/topaz/lib/python3.8/multiprocessing/", line 868, in next
[CPU: 252.4 MB]  raise value
[CPU: 252.4 MB]  struct.error: unpack requires a buffer of 1024 bytes

The part of the error shown in red reads as follows:

[CPU: 252.4 MB]  Traceback (most recent call last):
  File "cryosparc_worker/cryosparc_compute/", line 85, in
  File "/home/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/", line 1109, in run_topaz_wrapper_extract
  File "/home/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/", line 98, in run_process
    assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 1 (/home/.conda/envs/topaz/bin/topaz extract --radius 38 --threshold -6 --up-scale 4 --assignment-radius -1 --min-radius 5 --max-radius 100 --step-radius 5 --num-workers 8 --device 0 --model /raid0/cryoem/David/P3/J51/models/model_epoch07.sav -o /raid0/c…)

Any help would be more than welcome. Thank you very much in advance.

Hi David,

I’d suggest splitting your micrographs into two sets (~3750 each) and running Topaz Extract on each set separately. Topaz Extract tends to fail if you have more than 5k micrographs.
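
If you want to prepare the split outside the GUI, here is a minimal Python sketch of dividing a list of micrograph file names into near-equal batches. The helper name `split_into_batches` and the file names are hypothetical, made up for illustration; how you then feed each batch to a separate job depends on your setup.

```python
def split_into_batches(paths, n_batches):
    """Split paths into n_batches contiguous batches of near-equal size."""
    paths = sorted(paths)
    size, extra = divmod(len(paths), n_batches)
    batches = []
    start = 0
    for i in range(n_batches):
        # The first `extra` batches get one additional item each.
        end = start + size + (1 if i < extra else 0)
        batches.append(paths[start:end])
        start = end
    return batches


# Hypothetical file names standing in for the ~7.5k micrographs.
names = [f"mic_{i:04d}.mrc" for i in range(7500)]
batches = split_into_batches(names, 2)
print([len(b) for b in batches])  # [3750, 3750]
```

Every micrograph ends up in exactly one batch, so the picks from the separate runs can be combined afterwards without duplicates.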



Hi Vamsee,

Thanks for the suggestion! It did not work for me, but I found a solution (or at least a workaround): I ran Topaz preprocess on the command line and then ran the Topaz Extract job on the preprocessed directory, entering its absolute path in the corresponding GUI option.
I am posting it here so that people can refer to it in the future if they run into the same issue.
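
For anyone following the same route, here is a rough Python sketch of how the preprocess command could be assembled. It is not the exact command I ran: the directories are placeholders, and the downsampling factor of 4 is an assumption chosen to match the `--up-scale 4` in the extract command above. The `-s`, `-d` and `-o` options are used as in the Topaz tutorial; please check `topaz preprocess --help` for your installed version before relying on them.

```python
import shlex


def build_preprocess_command(micrographs, dest_dir, scale=4, device=0,
                             topaz_bin="topaz"):
    """Return the argument list for a `topaz preprocess` call.

    scale    : downsampling factor (assumed 4 here, see note above)
    device   : GPU index
    dest_dir : directory that will receive the preprocessed micrographs
    """
    return [
        topaz_bin, "preprocess",
        "-s", str(scale),
        "-d", str(device),
        "-o", str(dest_dir),
        *[str(m) for m in micrographs],
    ]


# Placeholder paths, purely for illustration.
cmd = build_preprocess_command(
    ["/path/to/micrographs/mic_0001.mrc", "/path/to/micrographs/mic_0002.mrc"],
    "/path/to/preprocessed",
)
print(shlex.join(cmd))

# To actually run it (not executed here):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than one shell string avoids quoting problems with unusual file names. The absolute path of the output directory is what then goes into the GUI option of the Topaz Extract job.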