Ab initio reconstruction keeps failing with multiple errors


Hello,

I’m having trouble processing a set of long stacked particles. I’ve attached some representative 2D classes. No matter what I try, I haven’t been able to get a reasonable ab initio reconstruction. I’ve looked through many posts from users with similar issues, but I’m still stuck. I’m working in CryoSPARC v4.7.1.

At first, the ab initio job failed with this error:

ValueError: Detected NaN values in engine.compute_error.
41783040 NaNs in total, 90 particles with NaNs.

I checked for corrupt particles, but none were found.
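
(For reference, a standalone version of that check can be sketched with mrcfile and numpy; the glob pattern below is only a placeholder for wherever the extracted stacks live, not an actual CryoSPARC path.)

# Sketch: scan extracted particle stacks for non-finite (NaN/Inf) values.
import glob
import numpy as np
import mrcfile

bad = []
for path in glob.glob("extract/*_particles.mrc"):
    with mrcfile.mmap(path, mode="r") as mrc:
        n_bad = int(np.count_nonzero(~np.isfinite(np.asarray(mrc.data))))
        if n_bad:
            bad.append((path, n_bad))

for path, n_bad in bad:
    print(f"{path}: {n_bad} non-finite values")
print(f"{len(bad)} stacks with non-finite values")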

Following suggestions from the forum, I changed a few settings. I set the Noise model (white, symmetric or coloured) to white, and I turned off both Enforce non-negativity and Center structures in real space.

After doing that, the job failed with the following assertion error:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 129, in cryosparc_master.cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/abinit/run.py", line 302, in cryosparc_master.cryosparc_compute.jobs.abinit.run.run_homo_abinit
  File "/cluster/software/cryosparc/cryosparc-durielab/v4.7.1/cryosparc_worker/cryosparc_compute/noise_model.py", line 119, in get_noise_estimate
    assert n.all(n.isfinite(ret))
AssertionError

For context, I have close to 20,000 particles in this ab initio job. During the upstream 2D classification, I used a circular mask of 800 Å. The particles were extracted in a 2000 pixel box and downsampled to 500 pixels. I used the template picker upstream. I am happy to share more information about any upstream jobs if needed. Thank you!
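
(For reference, the downsampling arithmetic is sketched below; the raw pixel size there is only a placeholder, since I have not listed the actual value here.)

# Rough geometry of the ab initio inputs (illustrative values only).
raw_apix = 1.0       # Å/px at extraction -- placeholder, not the real value
box_raw = 2000       # extraction box, px
box_ds = 500         # box after downsampling, px
mask_diam = 800.0    # 2D classification mask diameter, Å

apix_ds = raw_apix * box_raw / box_ds   # 4x downsampling scales the pixel size by 4x
mask_px = mask_diam / apix_ds           # mask diameter in downsampled pixels

print(f"downsampled pixel size: {apix_ds:.2f} Å/px")
print(f"mask covers {mask_px:.0f} of {box_ds} px in the downsampled box")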

@Radha09 Have you already applied patch 250814 for v4.7.1, which includes a fix that may apply to the error you observed?

Hi @wtempel,

I installed the patch, but still no luck. I'm now getting this error:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 129, in cryosparc_master.cryosparc_compute.run.main
  File "cryosparc_master/cryosparc_compute/jobs/abinit/run.py", line 302, in cryosparc_master.cryosparc_compute.jobs.abinit.run.run_homo_abinit
  File "/cluster/software/cryosparc/cryosparc-durielab/v4.7.1/cryosparc_worker/cryosparc_compute/noise_model.py", line 119, in get_noise_estimate
    assert n.all(n.isfinite(ret))
AssertionError

This happens in a fresh ab initio job on the same dataset.

As a test, I cloned an older ab initio job in the same workspace that had previously finished fine with 1 class. The clone still runs when I keep it at 1 class, but if I rerun it with 3 classes, I hit the same error and the classes look very streaky and odd.

Do you have any suggestions on what might be going on or what to try next?

@Radha09 Please can you post the output of this sequence of commands for the post-patch job failure:

csprojectid=P99 # replace with actual project ID
csjobid=J199 # replace with id of the job that failed after patching
cryosparcm cli "get_job('$csprojectid', '$csjobid', 'job_type', 'version', 'instance_information', 'status',  'params_spec', 'errors_run', 'started_at')" 

Hi @wtempel, I have reached out to my sysadmin. They said the CryoSPARC VM cannot reach the cluster login node; they are still looking into it and will get back to me soon with the output of the command sequence.
My 2D classification jobs in the same workspace run fine.

I have attached snapshots of two consecutive ab initio checkpoints. There is a drastic difference between these checkpoints.


This is part of the log (the divide-by-zero warnings near the end stand out; a minimal sketch of how they can arise follows after the log):

gpufft: creating new cufft plan (plan id 2   pid 194602) 
	gpu_id  0 
	ndims   2 
	dims    500 500 0 
	inembed 500 500 0 
	istride 1 
	idist   250000 
	onembed 500 500 0 
	ostride 1 
	odist   250000 
	batch   300 
	type    C2C 
	wkspc   automatic 
	Python traceback:

<string>:1: UserWarning: Cannot manually free CUDA array; will be freed when garbage collected
========= sending heartbeat at 2025-11-21 16:52:01.460638
========= sending heartbeat at 2025-11-21 16:52:11.488466
========= sending heartbeat at 2025-11-21 16:52:21.513651
========= sending heartbeat at 2025-11-21 16:52:31.540546
========= sending heartbeat at 2025-11-21 16:52:41.566519
========= sending heartbeat at 2025-11-21 16:52:51.591912
========= sending heartbeat at 2025-11-21 16:53:01.617953
========= sending heartbeat at 2025-11-21 16:53:11.647193
========= sending heartbeat at 2025-11-21 16:53:21.675057
========= sending heartbeat at 2025-11-21 16:53:31.702346
========= sending heartbeat at 2025-11-21 16:53:41.730731
========= sending heartbeat at 2025-11-21 16:53:51.757958
========= sending heartbeat at 2025-11-21 16:54:01.784508
========= sending heartbeat at 2025-11-21 16:54:11.811302
========= sending heartbeat at 2025-11-21 16:54:21.834806
========= sending heartbeat at 2025-11-21 16:54:31.862772
========= sending heartbeat at 2025-11-21 16:54:41.890757
========= sending heartbeat at 2025-11-21 16:54:51.916945
========= sending heartbeat at 2025-11-21 16:55:01.944139
========= sending heartbeat at 2025-11-21 16:55:11.971371
========= sending heartbeat at 2025-11-21 16:55:22.001148
========= sending heartbeat at 2025-11-21 16:55:32.027698
========= sending heartbeat at 2025-11-21 16:55:42.051439
========= sending heartbeat at 2025-11-21 16:55:52.077367
========= sending heartbeat at 2025-11-21 16:56:02.103826
========= sending heartbeat at 2025-11-21 16:56:12.134160
========= sending heartbeat at 2025-11-21 16:56:22.155663
========= sending heartbeat at 2025-11-21 16:56:32.181880
========= sending heartbeat at 2025-11-21 16:56:42.208061
========= sending heartbeat at 2025-11-21 16:56:52.234065
========= sending heartbeat at 2025-11-21 16:57:02.260066
========= sending heartbeat at 2025-11-21 16:57:12.287950
========= sending heartbeat at 2025-11-21 16:57:22.311058
========= sending heartbeat at 2025-11-21 16:57:32.338069
========= sending heartbeat at 2025-11-21 16:57:42.364857
========= sending heartbeat at 2025-11-21 16:57:52.392035
========= sending heartbeat at 2025-11-21 16:58:02.420059
========= sending heartbeat at 2025-11-21 16:58:12.439065
========= sending heartbeat at 2025-11-21 16:58:22.466070
========= sending heartbeat at 2025-11-21 16:58:32.494651
========= sending heartbeat at 2025-11-21 16:58:42.522423
========= sending heartbeat at 2025-11-21 16:58:52.548078
/cluster/software/cryosparc/cryosparc-durielab/v4.7.1/cryosparc_worker/cryosparc_compute/util/logsumexp.py:41: RuntimeWarning: divide by zero encountered in log
  return n.log(wa * n.exp(a - vmax) + wb * n.exp(b - vmax) ) + vmax
<string>:1: RuntimeWarning: divide by zero encountered in double_scalars
========= sending heartbeat at 2025-11-21 16:59:02.574986
========= sending heartbeat at 2025-11-21 16:59:12.604058
========= sending heartbeat at 2025-11-21 16:59:22.630060
========= sending heartbeat at 2025-11-21 16:59:32.657825
========= sending heartbeat at 2025-11-21 16:59:42.684058
========= sending heartbeat at 2025-11-21 16:59:52.709066
========= sending heartbeat at 2025-11-21 17:00:02.737252
========= sending heartbeat at 2025-11-21 17:00:12.761276
========= sending heartbeat at 2025-11-21 17:00:22.785832
========= sending heartbeat at 2025-11-21 17:00:32.812061
========= sending heartbeat at 2025-11-21 17:00:42.838796
========= sending heartbeat at 2025-11-21 17:00:52.865070
========= sending heartbeat at 2025-11-21 17:01:02.890066
========= sending heartbeat at 2025-11-21 17:01:12.918903
========= sending heartbeat at 2025-11-21 17:01:22.946068
========= sending heartbeat at 2025-11-21 17:01:32.975066
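
The divide-by-zero warnings above seem relevant. Below is a minimal sketch of how that kind of warning turns into -inf and then NaN; it is not CryoSPARC's actual code path, only the same functional form as the logsumexp.py line quoted in the warning.

import numpy as np

def weighted_logsumexp(a, b, wa, wb):
    # Same shape as the expression in the warning:
    # log(wa * exp(a - vmax) + wb * exp(b - vmax)) + vmax
    vmax = max(a, b)
    return np.log(wa * np.exp(a - vmax) + wb * np.exp(b - vmax)) + vmax

# If one input is -inf and the other carries zero weight, the sum inside the
# log is exactly 0: numpy emits "divide by zero encountered in log" and the
# result is -inf. A later (-inf) - (-inf) then produces NaN.
x = weighted_logsumexp(-np.inf, 0.0, 1.0, 0.0)
print(x)       # -inf
print(x - x)   # nan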

Did you hear back from the admins?
Alternatively, you may

  1. Download the questionable ab initio job's error report.
  2. Share the job report with the CryoSPARC developers privately: if you can, upload the zip file to your institution's file-sharing service and send me a personal message with the download link. If that's not possible, please let me know and I can make arrangements on our end.