We were having this problem: Topaz errors on cryoSPARC v5.0.2
(Summary: Topaz Cross validation failed in 5.0.2 even though the Topaz train sub-jobs completed successfully. This was fixed in 5.0.4)
I updated to 5.0.4 and now we get a different problem: the Topaz train sub-jobs themselves fail.
The parameter being optimized is "Epoch size", with "Initial value to begin with" set to 50 and "Value to increment parameter by" set to 50.
In the training jobs launched by the cross-validation job, Topaz dies on launch with the error:
topaz train: error: argument --epoch-size: invalid int value: '150.0'
Inside the job, Topaz is being launched with:
Starting training by running command /common/app/topaz/0.2.5/bin/topaz train
--epoch-size 150.0 --k-fold 2 --fold 1 --learning-rate 0.0002
--minibatch-size 128 --num-epochs 10 --method GE-binomial --slack -1.0
--autoencoder 0.0 --l2 0.0 --minibatch-balance 0.0625 --model resnet8
--units 32 --dropout 0.0 --bn on --unit-scaling 2 --ngf 32 --num-workers 4
--cross-validation-seed 906145623 --radius 3 --num-particles 40
-o /<...>/J84/cv/model_n150.0_fold1_train_test_curve.txt --device 1
--train-images <...>J90/image_list_train.txt
--train-targets <...>J90/topaz_particles_processed_train.txt
--test-images <...>J90/image_list_test.txt
--test-targets <...>J90/topaz_particles_processed_test.txt
--save-prefix=<...>/J90/models/model
It looks to me like the problem is that the epoch size is being passed to Topaz as a float ('150.0') rather than an int.
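For what it's worth, the failure is easy to reproduce outside cryoSPARC. Topaz declares --epoch-size as an integer argument, and Python's int() rejects a string like "150.0", so argparse aborts with exactly this message. A minimal sketch (the sweep arithmetic here is my guess at how a float-typed parameter field would produce "150.0"; the parser below is a stand-in, not Topaz's actual code):

```python
import argparse

# Stand-in for topaz's CLI: --epoch-size declared with type=int,
# as the error message implies.
parser = argparse.ArgumentParser(prog="topaz train")
parser.add_argument("--epoch-size", type=int)

# If the swept value is computed in float arithmetic, formatting it
# yields a trailing ".0" that int() cannot parse:
value = 50.0 + 2 * 50.0        # initial + step * increment -> 150.0
try:
    parser.parse_args(["--epoch-size", str(value)])   # "150.0"
except SystemExit:
    # argparse prints "invalid int value: '150.0'" and exits,
    # matching the job log above.
    pass

# Coercing to int before building the command line avoids it:
args = parser.parse_args(["--epoch-size", str(int(value))])  # "150"
```

So if the cross-validation job formats the swept parameter without casting it to int first, that would explain the log exactly.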
Is this something to do with the way we're setting up the job, or with the cross-validation job itself?