Topaz Train fails after manual picking

Hi everyone,
Does anyone have an idea what the problem might be?

I’m trying to use Topaz Train to pick only the start and end points of protein fibers.
I manually picked ~4000 particles that correspond to these extremities and used them to train Topaz, but the training job always fails, even after I tried splitting the micrographs. I’ve attached the logs. Has anyone encountered this problem when training Topaz, or does anyone have an idea what could be going wrong?

Thanks in advance for your help!

[CPU: 257.4 MB Avail: 227.93 GB]

WARNING: no micrograph found matching image name "013117323519443128231_FoilHole_2494191_Data_2455121_27_20240505_182653_EER_patch_aligned_doseweighted". Skipping it.
WARNING: no micrograph found matching image name "012290055039585680394_FoilHole_2494174_Data_2455121_11_20240505_173626_EER_patch_aligned_doseweighted". Skipping it.
WARNING: no micrograph found matching image name "010384470680063819598_FoilHole_2494204_Data_2455112_24_20240505_173738_EER_patch_aligned_doseweighted". Skipping it.

Traceback (most recent call last):
  File "/programs/x86_64-linux/topaz/0.2.5/bin/topaz", line 8, in <module>
    sys.exit(main())
  File "/programs/x86_64-linux/topaz/0.2.5/topaz_extlib/miniconda3-4.8.2-b5qb/lib/python3.8/site-packages/topaz/main.py", line 148, in main
    args.func(args)
  File "/programs/x86_64-linux/topaz/0.2.5/topaz_extlib/miniconda3-4.8.2-b5qb/lib/python3.8/site-packages/topaz/commands/train_test_split.py", line 128, in main
    image_list_train = pd.DataFrame({'image_name': image_names_train, 'path': paths_train})
  File "/programs/x86_64-linux/topaz/0.2.5/topaz_extlib/miniconda3-4.8.2-b5qb/lib/python3.8/site-packages/pandas/core/frame.py", line 468, in __init__
    mgr = init_dict(data, index, columns, dtype=dtype)
  File "/programs/x86_64-linux/topaz/0.2.5/topaz_extlib/miniconda3-4.8.2-b5qb/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 283, in init_dict
    return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
  File "/programs/x86_64-linux/topaz/0.2.5/topaz_extlib/miniconda3-4.8.2-b5qb/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 78, in arrays_to_mgr
    index = extract_index(arrays)
  File "/programs/x86_64-linux/topaz/0.2.5/topaz_extlib/miniconda3-4.8.2-b5qb/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 397, in extract_index
    raise ValueError("arrays must all be same length")
ValueError: arrays must all be same length

[CPU: 257.4 MB Avail: 227.94 GB]

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 129, in cryosparc_master.cryosparc_compute.run.main
  File "/appdata/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/run_topaz.py", line 332, in run_topaz_wrapper_train
    utils.run_process(split_command)
  File "/appdata/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/topaz_utils.py", line 99, in run_process
    assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 1 (/programs/x86_64-linux/topaz/0.2.5/bin/topaz train_test_split --number 50 --seed 1761975313 --image-dir /data/Projects-CryoEM/CS-ndh-2/J625/extract /data/Projects-CryoEM/CS-ndh-2/J636/topaz_particles_processed.txt)
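
A note on reading this log: the "no micrograph found" warnings and the final ValueError may be related. The error is raised while building a table from image_names_train and paths_train, so it seems plausible that skipped micrographs leave one list shorter than the other. A minimal sketch for listing which image names in the particle file have no matching file in the image directory is below. It assumes the particle file is tab-separated with an image_name header column (as the traceback suggests) and that micrograph file names are the image name plus an extension; the function name unmatched_image_names is made up for illustration and is not part of Topaz or CryoSPARC.

```python
import csv
from pathlib import Path


def unmatched_image_names(particles_file, image_dir):
    """Return image names from the particle table that have no file in image_dir.

    Assumption: particles_file is tab-separated with an 'image_name' column,
    and each micrograph is stored as '<image_name>.<extension>'.
    """
    # Collect file names without their extension, e.g. 'abc.mrc' -> 'abc'
    available = {p.stem for p in Path(image_dir).iterdir() if p.is_file()}

    missing = []
    seen = set()
    with open(particles_file, newline="") as fh:
        for row in csv.DictReader(fh, delimiter="\t"):
            name = row["image_name"]
            if name in seen:
                continue  # report each micrograph only once
            seen.add(name)
            if name not in available:
                missing.append(name)
    return missing
```

Called with the two paths from the failing command (the --image-dir directory and topaz_particles_processed.txt), a non-empty result would point to picks on micrographs that are absent from the directory, or to file names that differ from the names in the particle file.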

Hi,

One possible cause: could Topaz have been installed with an incorrect version of Python?

Cheers

Hi,

Thanks for your answer.

I’m using Topaz version 0.2.5, and here are the Python versions currently on the system:

$ python3 --version
Python 3.10.12

$ python --version
Python 2.7.2

$ which python
/programs/x86_64-linux/python/2.7.2/bin.capsules/python

Do you think the problem could be that these Python versions are too old (2.7) or too recent (3.10)?

Thanks!

This looks like possibly the same issue - maybe the solution described here will also work for you:

Hi,

As far as I know, the conda environment for Topaz 0.2.5 should be configured with Python 3.6. See also the discussion that olibclarke mentioned, where the software was recompiled using Python 3.6.

Cheers

P.S. See the "Topaz (Bepler, et al)" page of the CryoSPARC Guide.
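
On the Python version question: the system-wide python and python3 are not necessarily what Topaz runs with. The traceback earlier in the thread shows paths under lib/python3.8 inside the Topaz miniconda directory, which suggests the launcher uses its own interpreter. One way to check is to read the "#!" line of the topaz launcher script. The sketch below assumes the launcher is a plain Python script with a shebang line; interpreter_from_shebang is a hypothetical helper name, not something provided by Topaz.

```python
import shlex
from pathlib import Path


def interpreter_from_shebang(script_path):
    """Return the interpreter named on a script's '#!' line, or None if absent."""
    with open(script_path, "r", errors="replace") as fh:
        first_line = fh.readline().strip()
    if not first_line.startswith("#!"):
        return None
    parts = shlex.split(first_line[2:])
    if not parts:
        return None
    # For '#!/usr/bin/env python3' the interpreter is the second field
    if Path(parts[0]).name == "env" and len(parts) > 1:
        return parts[1]
    return parts[0]
```

Running the returned interpreter with --version would then show the Python version that Topaz itself uses, which can be compared against the version recommended for Topaz 0.2.5.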