Topaz Extract fatal error

Hi,

I’m running a Topaz Extract job on our cluster (cryoSPARC v4.5.3) with the default settings, but the job failed with this error:
FATAL: while executing starter: while initializing starter command: while copying engine configuration: engine configuration too big > 1048448
I have ~9800 micrographs to pick particles from.
What could be causing this error? Thank you!

@liningsdu Please can you post the output of the command

cryosparcm eventlog P199 J999

where you replace P199 and J999 with the actual project and job IDs of the failed Topaz Extract job.

Hi,
Here is the output of the command:

[Wed, 03 Jul 2024 17:45:20 GMT]  License is valid.
[Wed, 03 Jul 2024 17:45:20 GMT]  Launching job on lane single_gpu target single_gpu ...
[Wed, 03 Jul 2024 17:45:20 GMT]  Launching job on cluster single_gpu
[Wed, 03 Jul 2024 17:45:20 GMT]  
====================== Cluster submission script: ========================
==========================================================================
#!/bin/bash
#SBATCH --job-name=cryosparc_P5_J193
#SBATCH --output=/data/SBS/dinosaur/datasets/cryoem_data/CS-cryoem_data/J193/output.txt
#SBATCH --error=/data/SBS/dinosaur/datasets/cryoem_data/CS-cryoem_data/J193/error.txt
#SBATCH --mem=64GB
#SBATCH --time=4:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=1
#SBATCH --partition=gpu
#SBATCH --gres=lscratch:100,gpu:a100:1
/data/dinosaur/apps/cryosparc/cryosparc_worker/bin/cryosparcw run --project P5 --job J193 --master_hostname cn1559 --master_command_core_port 39024 > /data/SBS/dinosaur/datasets/cryoem_data/CS-cryoem_data/J193/job.log 2>&1 
==========================================================================
==========================================================================
[Wed, 03 Jul 2024 17:45:20 GMT]  -------- Submission command: 
/usr/local/slurm/bin/sbatch /data/SBS/dinosaur/datasets/cryoem_data/CS-cryoem_data/J193/queue_sub_script.sh
[Wed, 03 Jul 2024 17:45:21 GMT]  -------- Cluster Job ID: 
29915694
[Wed, 03 Jul 2024 17:45:21 GMT]  -------- Queued on cluster at 2024-07-03 13:45:21.647585
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB] Job J193 Started
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB] Master running v4.5.3, worker running v4.5.3
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB] Working in directory: /data/SBS/dinosaur/datasets/cryoem_data/CS-cryoem_data/J193
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB] Running on lane single_gpu
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB] Resources allocated:
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB]   Worker:  single_gpu
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB]   CPU   :  [0, 1, 2, 3, 4, 5, 6, 7]
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB]   GPU   :  [0]
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB]   RAM   :  [0]
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB]   SSD   :  False
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB] --------------------------------------------------------------
[Wed, 03 Jul 2024 17:49:37 GMT] [CPU RAM used: 85 MB] Importing job module for job type topaz_extract...
[Wed, 03 Jul 2024 17:49:41 GMT] [CPU RAM used: 221 MB] Job ready to run
[Wed, 03 Jul 2024 17:49:41 GMT] [CPU RAM used: 221 MB] ***************************************************************
[Wed, 03 Jul 2024 17:49:41 GMT] [CPU RAM used: 221 MB] Topaz is a particle detection tool created by Tristan Bepler and Alex J. Noble.
Citations:
- Bepler, T., Morin, A., Rapp, M. et al. Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs. Nat Methods 16, 1153-1160 (2019) doi:10.1038/s41592-019-0575-8
- Bepler, T., Noble, A.J., Berger, B. Topaz-Denoise: general deep denoising models for cryoEM. bioRxiv 838920 (2019) doi: https://doi.org/10.1101/838920

Structura Biotechnology Inc. and cryoSPARC do not license Topaz nor distribute Topaz binaries. Please ensure you have your own copy of Topaz licensed and installed under the terms of its GNU General Public License v3.0, available for review at: https://github.com/tbepler/topaz/blob/master/LICENSE.
***************************************************************
[Wed, 03 Jul 2024 17:49:47 GMT] [CPU RAM used: 236 MB] Starting Topaz process using version 0.2.5a...
[Wed, 03 Jul 2024 17:49:47 GMT] [CPU RAM used: 236 MB] Using preprocessed micrographs from  J192/preprocessed
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 238 MB] Found 9874 processed micrograph(s) in /data/SBS/dinosaur/datasets/cryoem_data/CS-cryoem_data/J192/preprocessed
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 238 MB] An additional 0 micrograph(s) require preprocessing. Results will be saved to /data/SBS/dinosaur/datasets/cryoem_data/CS-cryoem_data/J192/preprocessed
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 238 MB] --------------------------------------------------------------
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 238 MB] No micrographs require preprocessing. Skipping.
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 238 MB] Inverting negative staining...
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 241 MB] Inverting negative staining complete.
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 241 MB] --------------------------------------------------------------
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 241 MB] Starting extraction...
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 241 MB] Starting extraction by running command /usr/local/apps/topaz/0.2.5/bin/topaz extract --radius 16 --threshold -6 --up-scale 8 --assignment-radius -1 --min-radius 5 --max-radius 100 --step-radius 5 --num-workers 4 --device 0 --model /data/SBS/dinosaur/datasets/cryoem_data/CS-cryoem_data/J192/models/model_epoch19.sav -o /data/SBS/dinosaur/datasets/cryoem_data/CS-cryoem_data/J193/topaz_particles_prediction.txt [9874 MICROGRAPH PATHS EXCLUDED FOR LEGIBILITY]
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 243 MB] FATAL:   while executing starter: while initializing starter command: while copying engine configuration: engine configuration too big > 1048448
[Wed, 03 Jul 2024 17:49:48 GMT] [CPU RAM used: 243 MB] Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 115, in cryosparc_master.cryosparc_compute.run.main
  File "/gpfs/gsfs12/users/dinosaur/apps/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/run_topaz.py", line 1182, in run_topaz_wrapper_extract
    utils.run_process(extract_command)
  File "/gpfs/gsfs12/users/dinosaur/apps/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/topaz_utils.py", line 99, in run_process
    assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 255 (/usr/local/apps/topaz/0.2.5/bin/topaz extract --radius 16 --threshold -6 --up-scale 8 --assignment-radius -1 --min-radius 5 --max-radius 100 --step-radius 5 --num-workers 4 --device 0 --model /data/SBS/dinosaur/datasets/cryoem_data/CS-vem0302a-…)

Did you ask your HPC/IT support for help with this error?

The job may have failed because the command line is very long: the topaz extract command in your log passes all 9874 micrograph paths as arguments. You may want to try

  1. splitting the Topaz Extract job’s input micrographs in half using an Exposure Sets Tool job with the Action parameter set to split and the Split num. batches parameter set to 2
  2. then running two Topaz Extract jobs, each using the same model as the failed Topaz Extract job and each extracting from a different one of the two split micrograph sets.
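
To get a rough idea of how many batches might be needed, the arithmetic behind the split can be sketched in Python. This is only an illustration, not how cryoSPARC or the Exposure Sets Tool work internally. The function names are hypothetical. The limit value is copied from the error message, and treating it as a cap on the total size of the command-line arguments is an assumption.

```python
from typing import List, Sequence

# Value quoted in the error message from the log. Treating it as a cap on the
# total size of the command-line arguments is an assumption.
LIMIT_BYTES = 1048448


def command_size(base_args: Sequence[str], paths: Sequence[str]) -> int:
    """Approximate command-line size in bytes: each argument plus one separator."""
    return sum(len(arg.encode("utf-8")) + 1 for arg in (*base_args, *paths))


def split_into_batches(paths: Sequence[str], num_batches: int) -> List[List[str]]:
    """Split paths into num_batches contiguous batches of near-equal size."""
    if num_batches < 1:
        raise ValueError("num_batches must be at least 1")
    size, extra = divmod(len(paths), num_batches)
    batches, start = [], 0
    for i in range(num_batches):
        # The first `extra` batches take one additional path each.
        end = start + size + (1 if i < extra else 0)
        batches.append(list(paths[start:end]))
        start = end
    return batches


def min_batches_under_limit(
    base_args: Sequence[str], paths: Sequence[str], limit: int = LIMIT_BYTES
) -> int:
    """Smallest number of batches for which every batch's command fits under limit."""
    for n in range(1, max(len(paths), 1) + 1):
        batches = split_into_batches(paths, n)
        if all(command_size(base_args, batch) < limit for batch in batches):
            return n
    raise ValueError("even a single path does not fit under the limit")
```

For example, given the full list of micrograph paths in `paths` and the fixed part of the command in `base_args` (such as `["topaz", "extract", "--radius", "16"]`), `min_batches_under_limit(base_args, paths)` gives a candidate value for the Split num. batches parameter. The result depends on how long the actual micrograph paths are.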

Does this help?

Hi @wtempel ,

I split my micrographs into two parts, and now the Topaz Extract job works. Thank you!
