Topaz Train - TypeError: concat() takes 1 positional argument but 2 were given

I support a group of CryoSPARC users who have recently run into an issue using Topaz Train. Here is the output of the failed Topaz Train job.

Micrograph preprocessing command complete.

Starting particle pick preprocessing by running command /users/r/a/rcat/miniconda3/envs/topaz/bin/topaz convert --down-scale 4 --threshold 0 -o /netfiles/rcat_lab/cryosparc/cs-data/J185/topaz_particles_processed.txt /netfiles/rcat_lab/cryosparc/cs-data/J185/topaz_particles_raw.txt

Particle pick preprocessing command complete.

Preprocessing done in 301.530s.

--------------------------------------------------------------

Starting train-test splitting...

Starting dataset splitting by running command /users/r/a/rcat/miniconda3/envs/topaz/bin/topaz train_test_split --number 17 --seed 541036979 --image-dir /netfiles/rcat_lab/cryosparc/cs-data/J185/preprocessed /netfiles/rcat_lab/cryosparc/cs-data/J185/topaz_particles_processed.txt

# splitting 85 micrographs with 434 labeled particles into 68 train and 17 test micrographs
Traceback (most recent call last):
  File "/users/r/a/rcat/miniconda3/envs/topaz/bin/topaz", line 8, in <module>
    sys.exit(main())
  File "/users/r/a/rcat/miniconda3/envs/topaz/lib/python3.8/site-packages/topaz/main.py", line 148, in main
    args.func(args)
  File "/users/r/a/rcat/miniconda3/envs/topaz/lib/python3.8/site-packages/topaz/commands/train_test_split.py", line 108, in main
    targets_train = pd.concat(groups_train, 0)
TypeError: concat() takes 1 positional argument but 2 were given

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_compute.run.main
  File "/gpfs2/scratch/rcat/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/run_topaz.py", line 307, in run_topaz_wrapper_train
    utils.run_process(split_command)
  File "/gpfs2/scratch/rcat/cryosparc/cryosparc_worker/cryosparc_compute/jobs/topaz/topaz_utils.py", line 98, in run_process
    assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 1 (/users/r/a/rcat/miniconda3/envs/topaz/bin/topaz train_test_split --number 17 --seed 541036979 --image-dir /netfiles/rcat_lab/cryosparc/cs-data/J185/preprocessed /netfiles/rcat_lab/cryosparc/cs-data/J185/topaz_particles_processed.txt)
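
Looking at the traceback, the failing line is inside topaz itself, and the error looks like a pandas API change rather than a PyTorch problem: as of pandas 2.0, concat() accepts only its first argument positionally, so the positional axis in pd.concat(groups_train, 0) raises exactly this TypeError. A minimal sketch of the failure (assuming pandas >= 2.0; the variable names just mirror the traceback, and the DataFrame contents are made-up stand-ins for topaz's per-micrograph particle tables):

import pandas as pd

# Stand-ins for the per-micrograph particle tables that topaz splits
df = pd.DataFrame({"image_name": ["mic1", "mic1"], "x_coord": [10, 20]})
groups_train = [df, df]

# pd.concat(groups_train, 0)  # TypeError on pandas >= 2.0:
#                             # concat() takes 1 positional argument but 2 were given

# The keyword form works on both old and new pandas:
targets_train = pd.concat(groups_train, axis=0)
print(targets_train)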

I should mention that this user group had previously been running into an issue with Topaz Denoise jobs - that issue turned out to be related to PyTorch. The Topaz installation instructions on the CryoSPARC site suggest using Python 3.6 and installing via conda - when following those steps, PyTorch was built against a version of CUDA that was too old for our A100 cards (these cards require CUDA 11+).

I created a new conda environment using Python 3.8 (hoping to get a newer version of PyTorch). I then had to install everything via pip rather than conda install, as I was only getting the CPU version of PyTorch when installing via conda, even when specifying cudatoolkit.
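
The PyTorch packages came from the cu116 wheel index; the pip commands were along these lines (the index URL is PyTorch's official one for CUDA 11.6 wheels):

pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install topaz-em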

Here is the output of pip freeze

certifi==2022.12.7
charset-normalizer==3.1.0
future==0.18.3
idna==3.4
joblib==1.2.0
numpy==1.24.3
pandas==2.0.1
Pillow==9.5.0
python-dateutil==2.8.2
pytz==2023.3
requests==2.28.2
scikit-learn==1.2.2
scipy==1.10.1
six==1.16.0
threadpoolctl==3.1.0
topaz-em==0.2.5
torch==1.13.1+cu116
torchaudio==0.13.1+cu116
torchvision==0.14.1+cu116
typing_extensions==4.5.0
tzdata==2023.3
urllib3==1.26.15

That solved the issue we were seeing with Denoise jobs “failing successfully” and leaving the “denoised_micrographs” directory empty. My concern is that the newer package versions in this environment may have introduced an issue with Topaz Train jobs. I don’t actually use the software much myself, so I’m unsure where to look next.

Any help or advice on getting this environment working for Topaz jobs would be appreciated. Thank you in advance to the community here.

-Travis

Not directly related to your question, but I have run into this issue with lots of PyTorch and TensorFlow software, and found that conda installs the CPU version when it doesn’t detect any CUDA-capable devices. We are using a cluster with a shared filesystem, so I SSH’d to one of the GPU nodes and installed the conda environment for Topaz there.
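
If reinstalling from a GPU node isn’t convenient, an alternative that may work, assuming the CPU fallback is driven by conda’s __cuda virtual package (which conda sets from the detected driver), is to override the detected CUDA version when solving on a non-GPU login node, e.g.:

CONDA_OVERRIDE_CUDA=11.3 conda install -c pytorch pytorch torchvision cudatoolkit=11.3

(The version here is just an example; match it to the driver on your GPU nodes.)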

Here is my conda environment that is working properly with Topaz:

_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
blas                      1.0                         mkl    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2022.12.7            ha878542_0    conda-forge
certifi                   2021.5.30        py36h5fab9bb_0    conda-forge
cudatoolkit               11.1.1              ha002fc5_11    conda-forge
dataclasses               0.8                pyh787bdff_2    conda-forge
ffmpeg                    4.3                  hf484d3e_0    pytorch
freetype                  2.12.1               hca18f0e_1    conda-forge
future                    0.18.2           py36h5fab9bb_3    conda-forge
gmp                       6.2.1                h58526e2_0    conda-forge
gnutls                    3.6.13               h85f3911_1    conda-forge
icu                       70.1                 h27087fc_0    conda-forge
intel-openmp              2023.0.0         h9e868ea_25371  
joblib                    1.2.0              pyhd8ed1ab_0    conda-forge
jpeg                      9e                   h0b41bf4_3    conda-forge
lame                      3.100             h166bdaf_1003    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
lerc                      3.0                  h9c3ff4c_0    conda-forge
libblas                   3.9.0           1_h86c2bf4_netlib    conda-forge
libcblas                  3.9.0           5_h92ddd45_netlib    conda-forge
libdeflate                1.10                 h7f98852_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
libhwloc                  2.9.0                hd6dc26d_0    conda-forge
libiconv                  1.17                 h166bdaf_0    conda-forge
liblapack                 3.9.0           5_h92ddd45_netlib    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libpng                    1.6.39               h753d276_0    conda-forge
libsqlite                 3.40.0               h753d276_0    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libtiff                   4.3.0                h0fcbabc_4    conda-forge
libuv                     1.44.2               h166bdaf_0    conda-forge
libwebp-base              1.2.4                h166bdaf_0    conda-forge
libxml2                   2.10.3               h7463322_0    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
mkl                       2023.0.0         h6d00ec8_25399  
ncurses                   6.3                  h27087fc_1    conda-forge
nettle                    3.6                  he412f7d_0    conda-forge
ninja                     1.11.1               h924138e_0    conda-forge
numpy                     1.19.5           py36hfc0c790_2    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openh264                  2.1.1                h780b84a_0    conda-forge
openjpeg                  2.5.0                h7d73246_0    conda-forge
openssl                   1.1.1t               h0b41bf4_0    conda-forge
pandas                    1.1.5            py36h284efc9_0    conda-forge
pillow                    8.3.2            py36h676a545_0    conda-forge
pip                       20.0.2                   py36_1    conda-forge
python                    3.6.15          hb7a2778_0_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.6                     2_cp36m    conda-forge
pytorch                   1.10.2          py3.6_cuda11.1_cudnn8.0.5_0    pytorch
pytorch-mutex             1.0                        cuda    pytorch
pytz                      2022.7.1           pyhd8ed1ab_0    conda-forge
readline                  8.1.2                h0f457ee_0    conda-forge
scikit-learn              0.24.2           py36hc89565f_1    conda-forge
scipy                     1.5.3            py36h81d768a_1    conda-forge
setuptools                49.6.0           py36h5fab9bb_3    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sqlite                    3.40.0               h4ff8645_0    conda-forge
tbb                       2021.8.0             hf52228f_0    conda-forge
threadpoolctl             3.1.0              pyh8a188c0_0    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
topaz                     0.2.5                      py_0    tbepler
torchvision               0.11.3               py36_cu111    pytorch
typing_extensions         4.1.1              pyha770c72_0    conda-forge
wheel                     0.34.2                   py36_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h3eb15da_6    conda-forge

This is working for us with A100s and A40s on CUDA 11.3.

When I made the conda environment, I believe I pinned Python 3.6, and then ran:

conda install numpy pandas scikit-learn
conda install -c pytorch pytorch torchvision

For use with CryoSPARC, we recommend “wrapping” the topaz command in a shell script as described here for better control over the command’s environment.
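
A minimal sketch of such a wrapper (the conda path and environment name are placeholders for your installation):

#!/usr/bin/env bash
# Activate the conda environment that holds topaz, then forward all
# arguments from CryoSPARC to the real topaz entry point.
source /path/to/miniconda3/etc/profile.d/conda.sh
conda activate topaz
exec topaz "$@"

Make the script executable (chmod +x) and set Path to Topaz executable to the script instead of the topaz binary itself.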

What were the Version and Build of the pytorch package installed this way? 1.10.2 and py3.6_cuda11.3_cudnn8.2.0_0, respectively?

Hi,

I’m running into the same issue with topaz 0.2.5a. I just joined a lab that manages its cluster via SBGrid. I don’t have any control over how topaz was installed, but I do know it was not installed through conda.

What could I do to resolve the error? (See below.)

Starting Topaz process using version 0.2.5a...
Random seed used is 58879618
--------------------------------------------------------------
Starting preprocessing...

Starting micrograph preprocessing by running command /programs/x86_64-linux/system/sbgrid_bin/topaz preprocess --scale 4 --niters 200 --num-workers 8 -o /data/work/kmartin/cryosparc/CS-XX/J63/preprocessed [10 MICROGRAPH PATHS EXCLUDED FOR LEGIBILITY]

Preprocessing over 2 processes...
Inverting negative staining...
Inverting negative staining complete.

Micrograph preprocessing command complete.

Starting particle pick preprocessing by running command /programs/x86_64-linux/system/sbgrid_bin/topaz convert --down-scale 4 --threshold 0 -o /data/work/kmartin/cryosparc/CS-XX/J63/topaz_particles_processed.txt /data/work/kmartin/cryosparc/CS-XX/J63/topaz_particles_raw.txt

Particle pick preprocessing command complete.

Preprocessing done in 83.719s.
--------------------------------------------------------------
Starting train-test splitting...

Starting dataset splitting by running command /programs/x86_64-linux/system/sbgrid_bin/topaz train_test_split --number 2 --seed 58879618 --image-dir /data/work/kmartin/cryosparc/CS-XX/J63/preprocessed /data/work/kmartin/cryosparc/CS-XX/J63/topaz_particles_processed.txt

# splitting 10 micrographs with 2060 labeled particles into 8 train and 2 test micrographs
Traceback (most recent call last):
  File "/programs/x86_64-linux/topaz/0.2.5_cu11.2/bin/topaz", line 33, in <module>
    sys.exit(load_entry_point('topaz-em==0.2.5', 'console_scripts', 'topaz')())
  File "/programs/x86_64-linux/topaz/0.2.5_cu11.2/lib/python3.9/site-packages/topaz/main.py", line 148, in main
    args.func(args)
  File "/programs/x86_64-linux/topaz/0.2.5_cu11.2/lib/python3.9/site-packages/topaz/commands/train_test_split.py", line 108, in main
    targets_train = pd.concat(groups_train, 0)
TypeError: concat() takes 1 positional argument but 2 were given
[CPU:  233.9 MB]
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_compute.run.main
  File "/net/cemaster/data/software/cryoSPARC/V2/cryosparc2_worker/cryosparc_compute/jobs/topaz/run_topaz.py", line 307, in run_topaz_wrapper_train
    utils.run_process(split_command)
  File "/net/cemaster/data/software/cryoSPARC/V2/cryosparc2_worker/cryosparc_compute/jobs/topaz/topaz_utils.py", line 98, in run_process
    assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 1 (/programs/x86_64-linux/system/sbgrid_bin/topaz train_test_split --number 2 --seed 58879618 --image-dir /data/work/kmartin/cryosparc/CS-XX/J63/preprocessed /data/work/kmartin/cryosparc/CS-XX/J63/topaz_particles_processed.txt)

I’m totally lost. Using a conda environment, I had never run into this kind of problem before.
Any help would be welcome…

Thank you all

Kevin

Welcome to the forum @KevinM.
Did you try the following?

  1. installing an additional copy of topaz into a fresh conda environment
  2. creating a wrapper script around the conda-based topaz installation
  3. pointing Path to Topaz executable to the wrapper script

Hey @wtempel, thanks for your response.
The problem is that I have no administration privilege on that cluster.
Topaz does not seem to be installed in a conda environment: “conda activate topaz” returns “conda: command not found”.

I created the wrapper script and pointed Path to Topaz executable at it, but since conda is not available, I’m kind of stuck.

You would need to install your own copy of conda software, such as Miniforge, then create a new conda environment as described in the topaz repository. These steps do not require admin privileges: the conda software and topaz can both be installed in your home directory.
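
For example (the download URL is Miniforge’s official latest-release redirect; the final command follows the topaz repository’s conda instructions):

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh -b -p "$HOME/miniforge3"
source "$HOME/miniforge3/etc/profile.d/conda.sh"
conda create -n topaz python=3.6
conda activate topaz
conda install topaz -c tbepler -c pytorch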

Hi,

I finally ended up using topaz 0.2.4 instead of 0.2.5a. It works now.