3D Flex Reconstruction fails during iteration 0

Hello all,

I am running into an issue while using 3D Flex on CryoSPARC v4.5.1. When I run 3D Flex Reconstruction, I get this event log:
[CPU: 11.94 GB]
====== Load particle data =======

[CPU: 11.95 GB]
Reading in all particle data on the fly from files…

[CPU: 11.95 GB]
Loading a ParticleStack with 96000 items…

[CPU: 11.95 GB]
SSD cache ACTIVE at /beegfs/scratch/mohilab/cryosparc_cache/instance_mohi-gpu01.lsi.umich.edu:39001 (10 GB reserve) (10 TB quota)
│ Cache usage │ Amount │
│ Total / Usable │ 97.79 TiB / 9.54 TiB │
│ Used / Free │ 5.22 TiB / 4.32 TiB │
│ Hits / Misses │ 238.42 GiB / 0.00 B │
│ Acquired / Required │ 238.42 GiB / 0.00 B │
Progress: [▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇] 20/20 (100%)
Elapsed: 0h 00m 13s
Active jobs: P263-J677
SSD cache complete for 20 file(s)

[CPU: 12.08 GB]

[CPU: 12.08 GB]
Preparing all particle CTF data…

[CPU: 12.09 GB]
Parameter “Force re-do GS split” was off. Using input split…

[CPU: 12.09 GB]
Split A contains 48000 particles

[CPU: 12.09 GB]
Split B contains 48000 particles

[CPU: 12.09 GB]
Setting up particle poses…

[CPU: 12.10 GB]
====== High resolution flexible refinement =======

[CPU: 12.10 GB]
Max num L-BFGS iterations was set to 20

[CPU: 12.10 GB]
Starting L-BFGS.

[CPU: 12.10 GB]
Reconstructing half-map A

[CPU: 12.10 GB]
Iteration 0 : 47000 / 48000 particles

**** Kill signal sent by CryoSPARC (ID: ) ****

Job is unresponsive - no heartbeat received in 180 seconds.

When I looked at the metadata log tab, I also saw this alert after many lines of heartbeat messages:

/lsi/local/pkg/cryosparc/cryosparc_worker/bin/cryosparcw: line 150: 476151 Segmentation fault (core dumped) python -c "import cryosparc_compute.run as run; run.run()" "$@"

Is this a memory issue? The documentation says: “GPU memory use is relatively limited during training time, but at reconstruction time the GPU must be able to fit at least 2x the size of a volume at the full resolution box size.” If so, are there any workarounds, given that my particles are relatively large (800 pixels)?
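To put a rough number on the documentation's "2x the size of a volume" figure, here is a minimal sketch. It assumes volumes are stored as float32 (4 bytes per voxel), which is an assumption and not something stated in the log or the quoted documentation, and the function names are made up for illustration. The result is only the floor quoted above; the job will hold other buffers as well, so real usage is presumably higher.

```python
def volume_bytes(box_size: int, bytes_per_voxel: int = 4) -> int:
    """Bytes for one cubic volume of box_size^3 voxels.

    bytes_per_voxel=4 assumes float32 storage (an assumption).
    """
    return box_size ** 3 * bytes_per_voxel


def min_gpu_gib(box_size: int, n_volumes: int = 2) -> float:
    """Lower bound in GiB for holding n_volumes full-size volumes."""
    return n_volumes * volume_bytes(box_size) / 2 ** 30


if __name__ == "__main__":
    # 800 is the original box size; 400 and 200 are downsampled candidates.
    for box in (800, 400, 200):
        print(f"box {box}: at least {min_gpu_gib(box):.2f} GiB for 2 volumes")
```

Because the cost grows with the cube of the box size, halving the box (800 to 400) cuts this floor by a factor of 8.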

A test would be to downsample your particles to maybe 200 pixels and run the reconstruction with that stack. If it completes, the box size is the problem. If you’re not close to the sampling (Nyquist) limit, you can try downsampling so that the limit sits just beyond your resolution.
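As a sketch of how to pick a box size that keeps the sampling limit "just beyond your resolution": downsampling by Fourier cropping scales the pixel size by old_box / new_box, and the Nyquist limit is twice the pixel size. The pixel size of 1.0 Å used below is purely hypothetical (the thread does not give it), and the helper names and the 0.8 safety margin are illustrative assumptions, not CryoSPARC settings.

```python
import math


def downsampled_pixel_size(pixel_size_a: float, orig_box: int, new_box: int) -> float:
    """Pixel size (Å) after resampling from orig_box to new_box pixels."""
    return pixel_size_a * orig_box / new_box


def nyquist_a(pixel_size_a: float) -> float:
    """Nyquist limit (Å): the finest resolution the sampling can represent."""
    return 2.0 * pixel_size_a


def smallest_box_for_resolution(pixel_size_a: float, orig_box: int,
                                target_res_a: float, margin: float = 0.8) -> int:
    """Smallest even box whose Nyquist limit is at most margin * target_res_a.

    margin < 1 keeps Nyquist somewhat finer than the target resolution;
    the value 0.8 is an arbitrary illustrative choice.
    """
    max_pixel = margin * target_res_a / 2.0
    box = math.ceil(pixel_size_a * orig_box / max_pixel)
    box += box % 2  # round up to an even box size
    return min(box, orig_box)


if __name__ == "__main__":
    apix = 1.0  # hypothetical original pixel size in Å; substitute your own
    for new_box in (400, 200):
        px = downsampled_pixel_size(apix, 800, new_box)
        print(f"box {new_box}: {px:.2f} Å/pix, Nyquist {nyquist_a(px):.2f} Å")
    print("suggested box:", smallest_box_for_resolution(apix, 800, 4.5))
```

With your real pixel size plugged in, this shows whether a given downsampled box can still represent the resolution you expect to reach.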

So in terms of the 3D Flex pipeline, would I just run a downsampling job on the particles from Flex Data Prep and then put that particle stack into Flex Reconstruction? Or do I need to run the downsampled particles through a new Flex Data Prep job and a new Flex Train job?

So I did some old-fashioned trial and error (please let me know if this is correct)…

  1. Using the original particle set, I downsampled the particles from 800 to 400 pixels.

  2. I then reran Flex Data Prep with a training box size of 200 pixels (i.e., the same as in the previous Flex Data Prep and 3D Flex Training jobs).

  3. Using the 3D Flex model from the original Flex Training job and the new downsampled prepared particles, I ran a new 3D Flex Reconstruction job.

So far, it looks like it's working, as it is already at iteration 3. I will let you know how it finishes, but I think you are correct that it was a memory issue due to box size.

Thank you for the help.

Just following up: the job ran to completion.

The only question I still have is: is there any way I can run this job using the original particle stack? The final FSC resolution of the Flex Reconstruction job is about the same as that of the Non-uniform Refinement job whose output I used as the initial particle stack (~4.5 Å), so I’m not sure whether I would see an improvement if it were run with the unbinned data. Any advice is greatly appreciated.