Since updating to CryoSPARC 4.6.0, I’ve twice seen a 2D classification job that never finishes. There’s no error message; the job just sits at the start of the final iteration (iteration 80 in this case) and doesn’t move. It has been sitting there for 3 days, so it’s not just an issue of user impatience!
I’ve only seen this happen with 2D classification jobs. The first time it happened, I killed the job and cloned it, and the clone ran successfully. Has anyone else seen this since the update? I don’t know whether it’s something funky with CryoSPARC or with our cluster, but the fact that it seems specific to 2D classification makes me guess it’s some type of bug in CryoSPARC.
Edit: I probably should have included this. This is what the end of the job log looks like at the moment:
[CPU: 27.56 GB]
Done Full Iteration 79 took 396.605s for 160000 images
[CPU: 27.53 GB]
Outputting results…
[CPU: 27.54 GB]
Output particles to J40/J40_079_particles.cs
[CPU: 27.54 GB]
Output class averages to J40/J40_079_class_averages.cs, J40/J40_079_class_averages.mrc
[CPU: 27.54 GB]
Clearing previous iteration…
[CPU: 27.54 GB]
Deleting last_output_file_path_abs: /data/kollman/frames2/calise/CS-24sep10a-***/J40/J40_078_particles.cs
[CPU: 27.54 GB]
Deleting last_output_file_path_abs: /data/kollman/frames2/calise/CS-24sep10a-***/J40/J40_078_class_averages.cs
[CPU: 27.54 GB]
Deleting last_output_file_path_abs: /data/kollman/frames2/calise/CS-24sep10a-***/J40/J40_078_class_averages.mrc
[CPU: 27.54 GB]
Removed output results for P835 J40
[CPU: 27.56 GB]
Start of Iteration 80
[CPU: 27.56 GB]
– DEV 0 THR 0 NUM 683000 TOTAL 4812.6075 ELAPSED 7069.9769 –
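
In case anyone wants to check their own jobs for the same symptom, here is a rough Python sketch of how the log above could be scanned. It is not a CryoSPARC API: the function names are made up for illustration, and it assumes the log text contains "Start of Iteration N" and "Done Full Iteration N" lines in the same form as the excerpt above. The path passed to `seconds_since_modified` is whatever file holds your job log, which I am not assuming here.

```python
import os
import re
import time


def last_iteration_status(log_text):
    """Return (last_started, last_finished) iteration numbers found in the log text.

    Either value is None if no matching line is present.
    Assumes line formats like those in the log excerpt above.
    """
    started = [int(n) for n in re.findall(r"Start of Iteration (\d+)", log_text)]
    finished = [int(n) for n in re.findall(r"Done Full Iteration (\d+)", log_text)]
    last_started = max(started) if started else None
    last_finished = max(finished) if finished else None
    return last_started, last_finished


def iteration_in_progress(log_text):
    """True if the most recently started iteration has no matching 'Done' line."""
    started, finished = last_iteration_status(log_text)
    if started is None:
        return False
    return finished is None or started > finished


def seconds_since_modified(path):
    """Seconds since the file at `path` was last written to."""
    return time.time() - os.path.getmtime(path)
```

A job would look suspicious if `iteration_in_progress` is true and `seconds_since_modified` on the log file is much larger than a normal iteration takes (about 400 s per iteration in my case, versus 3 days of no movement).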