Topaz error: Subprocess exited with status 1

Dear all,

I am trying to train Topaz with 2700 micrographs and about 3000 particles. I keep getting the error:

in run_process assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"

AssertionError: Subprocess exited with status 1.

Does anyone have any idea what the problem could be? I am certain it is not an installation or Topaz version problem, since other people can use Topaz on the same setup without issues. Could there be something wrong with my micrographs? I am using default settings with 1 expected particle per micrograph.

Any idea or suggestion would be greatly appreciated.

Kind regards

@Panoskre In case you are running topaz via the CryoSPARC wrapper, please can you post:

  • output of the command
    /path/to/topaz --version
  • the CryoSPARC version
  • additional context for the error, replacing P99 and J199 with the actual CryoSPARC project and job IDs, respectively, in the following commands:
    cryosparcm joblog P99 J199 | tail -n 40
    cryosparcm eventlog P99 J199 | tail -n 40
    

@Panoskre is using the CryoSPARC instance that I operate.

Topaz version: 0.2.5a
CryoSPARC version: 4.6.2

================= CRYOSPARCW =======  2025-02-13 13:28:21.972709  =========
Project P100 Job J1023
Master cryosparcmaster Port 39002
===========================================================================
MAIN PROCESS PID 2384876
========= now starting main process at 2025-02-13 13:28:21.973416
topaz.run_topaz cryosparc_compute.jobs.jobregister
MONITOR PROCESS PID 2384878
========= monitor process now waiting for main process
========= sending heartbeat at 2025-02-13 13:28:23.000400
========= sending heartbeat at 2025-02-13 13:28:33.013967
========= sending heartbeat at 2025-02-13 13:28:43.027943
***************************************************************
Transparent hugepages setting: always madvise [never]

Running job on hostname %s gpu02
**** handle exception rc
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 129, in cryosparc_master.cryosparc_compute.run.main
  File "/opt/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/run_topaz.py", line 332, in run_topaz_wrapper_train
    utils.run_process(split_command)
  File "/opt/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/topaz_utils.py", line 99, in run_process
    assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 1 (/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/bin/topaz train_test_split --number 109 --seed 1420656222 --image-dir /mnt/xxx/data04/xxxx/cryosparc/CS-xxxx-cftr-vesicles/J566 /mnt/xxx/data04/xxxx/cryosparc/CS-xxx-cftr-vesicles/J1023/topaz…)
set status to failed
========= main process now complete at 2025-02-13 13:28:53.042279.
========= monitor process now complete at 2025-02-13 13:28:53.046268.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "006231731896023480250_FoilHole_27059231_Data_27066474_26_20250128_225111_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "001904820508562434347_FoilHole_27054202_Data_27065497_16_20250128_173455_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "005613296379436869938_FoilHole_27061933_Data_27065500_33_20250129_022033_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "004390728773744558122_FoilHole_27042162_Data_27065500_12_20250128_060528_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "017992005504552028817_FoilHole_27032796_Data_27065497_35_20250127_230817_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "006152693300718219366_FoilHole_28124267_Data_27066474_25_20250129_065353_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "002424809494787319258_FoilHole_27060774_Data_27065497_22_20250129_005032_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "009997577164604924640_FoilHole_27037549_Data_27065497_12_20250128_023151_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "000170584651708938755_FoilHole_27030159_Data_27066474_10_20250127_205906_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "006280494022743254927_FoilHole_27058042_Data_27066474_3_20250128_211804_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "013654983446909551547_FoilHole_27027190_Data_27065500_22_20250127_181135_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "016886265913036348510_FoilHole_27048767_Data_27065497_21_20250128_123328_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "005226442435868559764_FoilHole_27060829_Data_27065497_1_20250129_005729_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] WARNING: no micrograph found matching image name "003369386825400160251_FoilHole_27058044_Data_27065500_10_20250128_212755_fractions_patch_aligned_doseweighted". Skipping it.
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] Traceback (most recent call last):
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] File "/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/bin/topaz", line 8, in <module>
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] sys.exit(main())
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] File "/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/topaz_extlib/envs/topaz-py3.6/lib/python3.6/site-packages/topaz/main.py", line 148, in main
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] args.func(args)
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] File "/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/topaz_extlib/envs/topaz-py3.6/lib/python3.6/site-packages/topaz/commands/train_test_split.py", line 128, in main
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] image_list_train = pd.DataFrame({'image_name': image_names_train, 'path': paths_train})
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] File "/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/topaz_extlib/envs/topaz-py3.6/lib/python3.6/site-packages/pandas/core/frame.py", line 468, in __init__
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] mgr = init_dict(data, index, columns, dtype=dtype)
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] File "/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/topaz_extlib/envs/topaz-py3.6/lib/python3.6/site-packages/pandas/core/internals/construction.py", line 283, in init_dict
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] File "/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/topaz_extlib/envs/topaz-py3.6/lib/python3.6/site-packages/pandas/core/internals/construction.py", line 78, in arrays_to_mgr
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] index = extract_index(arrays)
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] File "/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/topaz_extlib/envs/topaz-py3.6/lib/python3.6/site-packages/pandas/core/internals/construction.py", line 397, in extract_index
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] raise ValueError("arrays must all be same length")
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] ValueError: arrays must all be same length
[Thu, 13 Feb 2025 12:28:48 GMT] [CPU RAM used: 280 MB] Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 129, in cryosparc_master.cryosparc_compute.run.main
  File "/opt/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/run_topaz.py", line 332, in run_topaz_wrapper_train
    utils.run_process(split_command)
  File "/opt/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/topaz_utils.py", line 99, in run_process
    assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 1 (/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/bin/topaz train_test_split --number 109 --seed 1420656222 --image-dir /mnt/xxx/data04/xxxx/cryosparc/CS-xxxx-cftr-vesicles/J566 /mnt/xx/data04/xxxx/cryosparc/CS-xxx-cftr-vesicles/J1023/topaz…)
[Thu, 13 Feb 2025 12:30:29 GMT]  License is valid.
[Thu, 13 Feb 2025 12:30:29 GMT]  Launching job on lane RTX4090 target gpu02 ...
[Thu, 13 Feb 2025 12:30:29 GMT]  Running job on remote worker node hostname gpu02
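
The "no micrograph found matching image name" warnings and the ValueError: arrays must all be same length above are likely two symptoms of one problem: if train_test_split keeps every labeled image name but only collects a path for names that match a file in --image-dir, the two columns passed to pd.DataFrame end up with different lengths. A minimal sketch of that failure mode (names and paths hypothetical, not topaz's actual code):

    import pandas as pd

    # Hypothetical illustration: three labeled image names, but only two of
    # them matched a file in --image-dir.
    image_names_train = ['mic_001', 'mic_002', 'mic_003']
    paths_train = ['/data/mic_001.mrc', '/data/mic_003.mrc']

    # pandas refuses to build a DataFrame from columns of unequal length,
    # reproducing the ValueError in the traceback above:
    pd.DataFrame({'image_name': image_names_train, 'path': paths_train})
    # ValueError: arrays must all be same length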

Here is another error that may be related:

[CPU:  268.0 MB  Avail: 222.27 GB] # splitting 548 micrographs with 563 labeled particles into 439 train and 109 test micrographs
[CPU:  268.0 MB  Avail: 222.26 GB] Traceback (most recent call last):
[CPU:  268.0 MB  Avail: 222.26 GB]   File "/programs/x86_64-linux/topaz/0.2.5a/bin/topaz", line 33, in <module>
[CPU:  268.0 MB  Avail: 222.26 GB]     sys.exit(load_entry_point('topaz-em==0.2.5a0', 'console_scripts', 'topaz')())
[CPU:  268.0 MB  Avail: 222.25 GB]   File "/programs/x86_64-linux/topaz/0.2.5a/lib/python3.9/site-packages/topaz/main.py", line 148, in main
[CPU:  268.0 MB  Avail: 222.24 GB]     args.func(args)
[CPU:  268.0 MB  Avail: 222.15 GB]   File "/programs/x86_64-linux/topaz/0.2.5a/lib/python3.9/site-packages/topaz/commands/train_test_split.py", line 108, in main
[CPU:  268.0 MB  Avail: 222.23 GB]     targets_train = pd.concat(groups_train, 0)
[CPU:  268.0 MB  Avail: 222.22 GB] TypeError: concat() takes 1 positional argument but 2 were given
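
This second traceback points at a version incompatibility rather than missing files: since pandas 2.0, everything after the first argument of concat() is keyword-only, so the positional axis in topaz 0.2.5a's train_test_split.py raises exactly this TypeError. A sketch of the mismatch and the usual workarounds (patching the call as below, or pinning pandas < 2 in the Topaz environment):

    import pandas as pd

    groups_train = [pd.DataFrame({'x': [1]}), pd.DataFrame({'x': [2]})]

    # Fails on pandas >= 2.0, where everything after the first argument of
    # concat() is keyword-only:
    #   targets_train = pd.concat(groups_train, 0)
    #   TypeError: concat() takes 1 positional argument but 2 were given

    # Works on both old and new pandas:
    targets_train = pd.concat(groups_train, axis=0)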

Dear @wtempel

Apologies for the late reply; it is as @biocit explained. I still cannot work out why Topaz is not working. It is trying to find the ..._doseweighted.mrc files and, since it cannot find them, it crashes (…fractions_patch_aligned_doseweighted". Skipping it.). These doseweighted files are only present in my motion correction jobs.
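
A quick way to sanity-check the name matching (with /path/to/J566 as a hypothetical stand-in for the --image-dir of the failing command) is to compare one of the skipped names against what is actually on disk:

    # Does any file in the image directory carry one of the skipped names?
    ls /path/to/J566 | grep -c 'FoilHole_27059231_Data_27066474'
    # Compare the actual file names (and their suffixes) against the labels:
    ls /path/to/J566 | head -n 5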

Best regards

@biocit Farther up in the event log, did the topaz preprocess and topaz convert commands complete successfully?
Please can you post the output of the command:

cryosparcm eventlog P100 J1023 | head -n 60

I am puzzled by the mention of various paths for the topaz executable (/programs/x86_64-linux/topaz/0.2.5a_cu11.3_py36/bin/topaz in one traceback, /programs/x86_64-linux/topaz/0.2.5a/bin/topaz in the other): do the log excerpts correspond to different jobs?

Dear @wtempel

I finally managed to solve the problem. For some reason, whenever I included the absolute path of my micrographs, I was getting this error. When I removed the absolute path and ran Topaz train without it, the job completed without any problems.
Once again, thank you for your time and for looking into it.

Best regards.

Thanks @Panoskre for confirming the resolution of this issue.
Out of curiosity:
With "whenever I included the absolute path of my micrographs", were you referring to the Absolute path of directory containing preprocessed directory parameter?

Dear @wtempel,

That is correct, it was the "Absolute path of directory containing preprocessed directory" parameter.
I thought it was something essential for Topaz to run, but it appears that when I point it at the location of my micrographs, it crashes the job.

Best regards

I see. Absolute path of directory containing preprocessed directory in this context refers to output from the topaz preprocess command. The parameter enables reuse of output from a previous topaz preprocess run. If the parameter is left blank, topaz preprocess will be initiated automatically by the CryoSPARC job that wraps topaz.
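
For reference, a standalone run that would produce such preprocessed output might look like the following (paths hypothetical):

    # -s sets the downsampling factor, -o the destination directory:
    topaz preprocess -s 4 -o /path/to/preprocessed/micrographs/ /path/to/raw/micrographs/*.mrc

The parameter would then point at the directory containing that preprocessed output, not at the raw or motion-corrected micrographs themselves.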