Topaz Training Issue

My Topaz training job fails with the following log:

[Thu, 03 Jul 2025 01:00:21 GMT] License is valid.
[Thu, 03 Jul 2025 01:00:21 GMT] Launching job on lane default target YXZ-gpu …
[Thu, 03 Jul 2025 01:00:21 GMT] Running job on master node hostname YXZ-gpu
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] Job J483 Started
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] Master running v4.7.1, worker running v4.7.1
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] Working in directory: /home/xinzhe/data/P4/J483
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] Running on lane default
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] Resources allocated:
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] Worker: YXZ-gpu
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] CPU : [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] GPU : [0]
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] RAM : [0]
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] SSD : False
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] --------------------------------------------------------------
[Thu, 03 Jul 2025 01:00:23 GMT] [CPU RAM used: 90 MB] Importing job module for job type topaz_train…
[Thu, 03 Jul 2025 01:00:27 GMT] [CPU RAM used: 255 MB] Job ready to run
[Thu, 03 Jul 2025 01:00:27 GMT] [CPU RAM used: 255 MB] ***************************************************************
[Thu, 03 Jul 2025 01:00:27 GMT] [CPU RAM used: 255 MB] Topaz is a particle detection tool created by Tristan Bepler and Alex J. Noble.
Citations:

  • Bepler, T., Morin, A., Rapp, M. et al. Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs. Nat Methods 16, 1153-1160 (2019) doi:10.1038/s41592-019-0575-8
  • Bepler, T., Noble, A.J., Berger, B. Topaz-Denoise: general deep denoising models for cryoEM. bioRxiv 838920 (2019) doi:10.1101/838920

Structura Biotechnology Inc. and cryoSPARC do not license Topaz nor distribute Topaz binaries. Please ensure you have your own copy of Topaz licensed and installed under the terms of its GNU General Public License v3.0, available for review at: https://github.com/tbepler/topaz/blob/master/LICENSE.


[Thu, 03 Jul 2025 01:00:32 GMT] [CPU RAM used: 361 MB] Starting Topaz process using version 0.3.7…
[Thu, 03 Jul 2025 01:00:32 GMT] [CPU RAM used: 361 MB] Random seed used is 915930905
[Thu, 03 Jul 2025 01:01:03 GMT] [CPU RAM used: 361 MB] --------------------------------------------------------------
[Thu, 03 Jul 2025 01:01:03 GMT] [CPU RAM used: 361 MB] Starting preprocessing…
[Thu, 03 Jul 2025 01:01:03 GMT] [CPU RAM used: 361 MB] Using a downsampling factor of 9
[Thu, 03 Jul 2025 01:01:03 GMT] [CPU RAM used: 361 MB] Starting micrograph preprocessing by running command /home/xinzhe/miniconda3/envs/topaz/bin/topaz preprocess --scale 9 --niters 200 --num-workers 0 -o /home/xinzhe/data/P4/J483/preprocessed [17311 MICROGRAPH PATHS EXCLUDED FOR LEGIBILITY]
[Thu, 03 Jul 2025 01:01:03 GMT] [CPU RAM used: 361 MB] Preprocessing over 2 processes…
[Thu, 03 Jul 2025 07:52:21 GMT] [CPU RAM used: 364 MB] Inverting negative staining…
[Thu, 03 Jul 2025 07:52:21 GMT] [CPU RAM used: 368 MB] Inverting negative staining complete.
[Thu, 03 Jul 2025 07:52:21 GMT] [CPU RAM used: 368 MB] Micrograph preprocessing command complete.
[Thu, 03 Jul 2025 07:54:03 GMT] [CPU RAM used: 370 MB] Starting particle pick preprocessing by running command /home/xinzhe/miniconda3/envs/topaz/bin/topaz convert --down-scale 9 --threshold 0 -o /home/xinzhe/data/P4/J483/topaz_particles_processed.txt /home/xinzhe/data/P4/J483/topaz_particles_raw.txt
[Thu, 03 Jul 2025 07:54:11 GMT] [CPU RAM used: 370 MB] Particle pick preprocessing command complete.
[Thu, 03 Jul 2025 07:54:11 GMT] [CPU RAM used: 370 MB] Preprocessing done in 24788.346s.
[Thu, 03 Jul 2025 07:54:11 GMT] [CPU RAM used: 370 MB] --------------------------------------------------------------
[Thu, 03 Jul 2025 07:54:11 GMT] [CPU RAM used: 370 MB] Starting train-test splitting…
[Thu, 03 Jul 2025 07:54:11 GMT] [CPU RAM used: 370 MB] Starting dataset splitting by running command /home/xinzhe/miniconda3/envs/topaz/bin/topaz train_test_split --number 3442 --seed 915930905 --image-dir /home/xinzhe/data/P4/J483/preprocessed /home/xinzhe/data/P4/J483/topaz_particles_processed.txt
[Thu, 03 Jul 2025 07:54:17 GMT] [CPU RAM used: 370 MB] # splitting 17213 micrographs with 668725 labeled particles into 13771 train and 3442 test micrographs
[Thu, 03 Jul 2025 07:59:55 GMT] [CPU RAM used: 370 MB] # writing: /home/xinzhe/data/P4/J483/preprocessed/008205213482525112558_FoilHole_2983753_Data_2964339_2964341_20240504_073538_fractions_train.txt
[Thu, 03 Jul 2025 07:59:56 GMT] [CPU RAM used: 370 MB] # writing: /home/xinzhe/data/P4/J483/preprocessed/008205213482525112558_FoilHole_2983753_Data_2964339_2964341_20240504_073538_fractions_test.txt
[Thu, 03 Jul 2025 07:59:56 GMT] [CPU RAM used: 370 MB] # writing: /home/xinzhe/data/P4/J483/preprocessed/image_list_train.txt
[Thu, 03 Jul 2025 07:59:57 GMT] [CPU RAM used: 370 MB] # writing: /home/xinzhe/data/P4/J483/preprocessed/image_list_test.txt
[Thu, 03 Jul 2025 07:59:58 GMT] [CPU RAM used: 370 MB]
Dataset splitting command complete.
[Thu, 03 Jul 2025 07:59:58 GMT] [CPU RAM used: 370 MB] Train-test splitting done in 347.095s.
[Thu, 03 Jul 2025 07:59:58 GMT] [CPU RAM used: 370 MB] --------------------------------------------------------------
[Thu, 03 Jul 2025 07:59:58 GMT] [CPU RAM used: 370 MB] Starting training…
[Thu, 03 Jul 2025 07:59:58 GMT] [CPU RAM used: 370 MB] Starting training by running command /home/xinzhe/miniconda3/envs/topaz/bin/topaz train --train-images /home/xinzhe/data/P4/J483/image_list_train.txt --train-targets /home/xinzhe/data/P4/J483/topaz_particles_processed_train.txt --test-images /home/xinzhe/data/P4/J483/image_list_test.txt --test-targets /home/xinzhe/data/P4/J483/topaz_particles_processed_test.txt --num-particles 70 --learning-rate 0.0002 --minibatch-size 128 --num-epochs 10 --method GE-binomial --slack -1 --autoencoder 0 --l2 0.0 --minibatch-balance 0.0625 --epoch-size 5000 --model resnet8 --units 32 --dropout 0.0 --bn on --unit-scaling 2 --ngf 32 --num-workers -1 --cross-validation-seed 915930905 --radius 3 --num-particles 70 --device 0 --no-pretrained --save-prefix=/home/xinzhe/data/P4/J483/models/model -o /home/xinzhe/data/P4/J483/train_test_curve.txt
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] # Loading model: resnet8
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] # Model parameters: units=32, dropout=0.0, bn=on
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] # Receptive field: 71
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] # Using device=0 with cuda=True
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] # When using GPU to load data, we only load in this process. Setting num_workers = 0.
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] # Training…
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] # source split p_observed num_positive_regions total_regions
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] Traceback (most recent call last):
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/bin/topaz", line 33, in <module>
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] sys.exit(load_entry_point('topaz-em==0.3.7', 'console_scripts', 'topaz')())
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/topaz/main.py", line 148, in main
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] args.func(args)
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/topaz/commands/train.py", line 140, in main
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] classifier = train_model(classifier, args.train_images, args.train_targets, args.test_images, args.test_targets,
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/topaz/training.py", line 607, in train_model
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] num_positive_regions, total_regions, num_images = report_data_stats(train_images_path, train_targets_path, test_images_path, test_targets_path,
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/topaz/training.py", line 284, in report_data_stats
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] train_targets = file_utils.read_coordinates(train_targets_path)
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/topaz/utils/files.py", line 201, in read_coordinates
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] particles = pd.read_csv(path, sep='\t', dtype={'image_name':str})
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] return _read(filepath_or_buffer, kwds)
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 620, in _read
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] parser = TextFileReader(filepath_or_buffer, **kwds)
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] self._engine = self._make_engine(f, self.engine)
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1880, in _make_engine
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] self.handles = get_handle(
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] File "/home/xinzhe/miniconda3/envs/topaz/lib/python3.9/site-packages/pandas/io/common.py", line 873, in get_handle
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] handle = open(
[Thu, 03 Jul 2025 08:00:02 GMT] [CPU RAM used: 370 MB] FileNotFoundError: [Errno 2] No such file or directory: '/home/xinzhe/data/P4/J483/topaz_particles_processed_train.txt'
[Thu, 03 Jul 2025 08:00:04 GMT] [CPU RAM used: 371 MB] Traceback (most recent call last):
File "cryosparc_master/cryosparc_compute/run.py", line 129, in cryosparc_master.cryosparc_compute.run.main
File "/home/xinzhe/software/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/run_topaz.py", line 384, in run_topaz_wrapper_train
utils.run_process(train_command)
File "/home/xinzhe/software/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/topaz_utils.py", line 99, in run_process
assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 1 (/home/xinzhe/miniconda3/envs/topaz/bin/topaz train --train-images /home/xinzhe/data/P4/J483/image_list_train.txt --train-targets /home/xinzhe/data/P4/J483/topaz_particles_processed_train.txt --test-images /home/xin…)

It seems that the wrapper passes the following input files to the topaz train command:

--train-images /home/xinzhe/data/P4/J483/image_list_train.txt --train-targets /home/xinzhe/data/P4/J483/topaz_particles_processed_train.txt --test-images /home/xinzhe/data/P4/J483/image_list_test.txt --test-targets /home/xinzhe/data/P4/J483/topaz_particles_processed_test.txt

whereas the train-test splitting step wrote its output to different paths, inside the preprocessed/ subdirectory:

[Thu, 03 Jul 2025 07:59:55 GMT] [CPU RAM used: 370 MB] # writing: /home/xinzhe/data/P4/J483/preprocessed/008205213482525112558_FoilHole_2983753_Data_2964339_2964341_20240504_073538_fractions_train.txt
[Thu, 03 Jul 2025 07:59:56 GMT] [CPU RAM used: 370 MB] # writing: /home/xinzhe/data/P4/J483/preprocessed/008205213482525112558_FoilHole_2983753_Data_2964339_2964341_20240504_073538_fractions_test.txt
[Thu, 03 Jul 2025 07:59:56 GMT] [CPU RAM used: 370 MB] # writing: /home/xinzhe/data/P4/J483/preprocessed/image_list_train.txt
[Thu, 03 Jul 2025 07:59:57 GMT] [CPU RAM used: 370 MB] # writing: /home/xinzhe/data/P4/J483/preprocessed/image_list_test.txt
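
One way to confirm the mismatch before re-running the job is to check which of the input files named in the train command actually exist on disk. The following is only a minimal sketch: the helper name missing_inputs is hypothetical, the four flags are the ones that appear in the log above, and the command string has to be pasted in from your own job log.

```python
import shlex
from pathlib import Path

# Flags of `topaz train` whose values are input file paths (as seen in the log above).
INPUT_FLAGS = ("--train-images", "--train-targets", "--test-images", "--test-targets")


def missing_inputs(command: str) -> dict[str, Path]:
    """Return {flag: path} for every input flag whose file does not exist.

    Hypothetical helper for illustration; it only handles the
    space-separated `--flag value` form used in the log.
    """
    tokens = shlex.split(command)
    missing = {}
    for i, token in enumerate(tokens[:-1]):
        if token in INPUT_FLAGS:
            path = Path(tokens[i + 1])
            if not path.is_file():
                missing[token] = path
    return missing


if __name__ == "__main__":
    # Shortened for illustration; paste the full command from the job log instead.
    cmd = (
        "topaz train "
        "--train-images /home/xinzhe/data/P4/J483/image_list_train.txt "
        "--train-targets /home/xinzhe/data/P4/J483/topaz_particles_processed_train.txt"
    )
    for flag, path in missing_inputs(cmd).items():
        print(f"{flag}: missing {path}")
```

If every input flag is reported as missing while the files under preprocessed/ exist, that would be consistent with the wrapper and the installed Topaz version disagreeing about where train_test_split writes its output.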

I’m using cryoSPARC v4.7.1 and Topaz 0.3.7.

Topaz 0.2.5 is the most recent version currently supported by the CryoSPARC wrapper (version-specific installation instructions). CryoSPARC developers are aware of users’ interest in support for more recent Topaz versions.
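
To check whether a given Topaz environment is within the supported range, something along these lines could be run with that environment's Python interpreter. It is a sketch under assumptions: the distribution name topaz-em is taken from the traceback above, the 0.2.5 limit from this reply, the helper names are made up for illustration, and the comparison only handles plain dotted numeric versions.

```python
from importlib.metadata import PackageNotFoundError, version

MAX_SUPPORTED = "0.2.5"  # most recent Topaz version supported by the wrapper, per this thread


def parse(v: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '0.3.7' into (0, 3, 7)."""
    return tuple(int(part) for part in v.split("."))


def is_supported(installed: str, max_supported: str = MAX_SUPPORTED) -> bool:
    """True if the installed version is not newer than the supported one."""
    return parse(installed) <= parse(max_supported)


if __name__ == "__main__":
    try:
        # 'topaz-em' is the distribution name shown in the traceback.
        installed = version("topaz-em")
    except PackageNotFoundError:
        print("topaz-em is not installed in this environment")
    else:
        status = "supported" if is_supported(installed) else "newer than supported"
        print(f"Topaz {installed}: {status} (limit {MAX_SUPPORTED})")
```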
