Flex refinement on particles not used for model training

Hello,

I'm trying to use 3D Flex. When I input my whole dataset (1.4 million particles) into my 3D Flex training job, I get a RAM allocation error.
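
As a rough back-of-the-envelope check on the RAM allocation error, here is a minimal sketch of how the memory footprint scales with particle count. It assumes the training job holds the whole particle stack in memory as square float32 images, which is not confirmed in this thread. The 128 px box size is a hypothetical value, and `stack_ram_gb` is an illustrative helper, not a CryoSPARC function.

```python
def stack_ram_gb(n_particles: int, box_px: int, bytes_per_px: int = 4) -> float:
    """Approximate RAM in GB (1e9 bytes) for a stack of square images.

    Assumes one float32 value (4 bytes) per pixel and ignores any overhead.
    """
    return n_particles * box_px * box_px * bytes_per_px / 1e9


# Particle counts come from this thread; the box size is a hypothetical example.
BOX_PX = 128
full = stack_ram_gb(1_424_000, BOX_PX)
one_class = stack_ram_gb(264_000, BOX_PX)

print(f"full dataset: {full:.1f} GB, one class: {one_class:.1f} GB")
```

Under these assumptions the footprint grows linearly with the number of particles and quadratically with the box size, so a smaller training box reduces memory much faster than dropping particles does.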

To work around that problem, I tried training on only one class of my dataset (264k particles) and then running a 3D Flex reconstruction on my whole dataset.

However, when I do so, I get an error.

Is it possible to fix this?

I tried training with one class and reconstructing with another, and that worked.

I don't understand why it sometimes works and sometimes doesn't. Is it only possible to reconstruct with a dataset of similar size to the one used for training?

Please can you post the outputs of these commands as text?

csprojectid=P9 # replace with actual project ID
csjobid=J199 # replace with id of the failed job
cryosparcm eventlog $csprojectid $csjobid | tail -n 60
cryosparcm cli "get_job('$csprojectid', '$csjobid', 'job_type', 'version', 'instance_information', 'status',  'params_spec', 'errors_run', 'input_slot_groups')"

cryosparcm eventlog $csprojectid $csjobid | tail -n 60

Output:

[Tue, 11 Mar 2025 15:03:18 GMT] Upsampled mask
[Tue, 11 Mar 2025 15:03:19 GMT] Upsampled tetramesh
[Tue, 11 Mar 2025 15:03:21 GMT] [CPU RAM used: 3040 MB] ====== Load particle data =======
[Tue, 11 Mar 2025 15:03:23 GMT] [CPU RAM used: 3524 MB] Reading in all particle data on the fly from files…
[Tue, 11 Mar 2025 15:03:23 GMT] [CPU RAM used: 3524 MB] Loading a ParticleStack with 1424000 items…
[Tue, 11 Mar 2025 15:03:23 GMT] [CPU RAM used: 3524 MB] ──────────────────────────────────────────────────────────────
SSD cache ACTIVE at /scratch/cryo_sparc/instance_lvx0970b:39001 (10 GB reserve)
┌─────────────────────┬─────────────────────┐
│ Cache usage │ Amount │
├─────────────────────┼─────────────────────┤
│ Total / Usable │ 1.82 TiB / 1.81 TiB │
│ Used / Free │ 1.80 TiB / 1.59 GiB │
│ Hits / Misses │ 849.37 GiB / 0.00 B │
│ Acquired / Required │ 849.37 GiB / 0.00 B │
└─────────────────────┴─────────────────────┘
Progress: [▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇] 285/285 (100%)
Elapsed: 0h 00m 00s
Active jobs: P6-J613, P6-J604
SSD cache complete for 285 file(s)
──────────────────────────────────────────────────────────────
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4205 MB] Done.
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4205 MB] Preparing all particle CTF data…
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4272 MB] Preparing Gold Standard Split…
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4272 MB] Set A is smaller than set B by 180 particles (0.0126 percent difference relative to the total dataset).
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4272 MB] Particles have input alignments3D/split assigned, so reusing pre-existing split.
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4278 MB] Split A contains 711910 particles
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4278 MB] Split B contains 712090 particles
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4278 MB] Setting up particle poses…
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4416 MB] ====== High resolution flexible refinement =======
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4416 MB] Max num L-BFGS iterations was set to 20
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4416 MB] Starting L-BFGS.
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4416 MB] Reconstructing half-map A
[Tue, 11 Mar 2025 15:03:36 GMT] [CPU RAM used: 4416 MB] Iteration 0 : 132000 / 711910 particles
[Tue, 11 Mar 2025 15:03:42 GMT] [CPU RAM used: 90 MB] WARNING: io_uring support disabled (not supported by kernel), I/O performance may degrade
[Tue, 11 Mar 2025 15:10:17 GMT] [CPU RAM used: 7211 MB] Traceback (most recent call last):
File "cryosparc_master/cryosparc_compute/run.py", line 129, in cryosparc_master.cryosparc_compute.run.main
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/cryosparc_compute/jobs/flex_refine/run_highres.py", line 217, in run
flexmod.do_hr_refinement_flex(numiter=params['flex_bfgs_num_iters'])
File "cryosparc_master/cryosparc_compute/jobs/flex_refine/flexmod.py", line 1638, in cryosparc_master.cryosparc_compute.jobs.flex_refine.flexmod.do_hr_refinement_flex.lambda7
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/scipy/optimize/_lbfgsb_py.py", line 199, in fmin_l_bfgs_b
res = _minimize_lbfgsb(fun, x0, args=args, jac=jac, bounds=bounds,
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/scipy/optimize/_lbfgsb_py.py", line 309, in _minimize_lbfgsb
sf = _prepare_scalar_function(fun, x0, jac=jac, args=args, epsilon=eps,
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/scipy/optimize/_optimize.py", line 402, in _prepare_scalar_function
sf = ScalarFunction(fun, x0, args, grad, hess,
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/scipy/optimize/_differentiable_functions.py", line 166, in __init__
self._update_fun()
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/scipy/optimize/_differentiable_functions.py", line 262, in _update_fun
self._update_fun_impl()
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/scipy/optimize/_differentiable_functions.py", line 163, in update_fun
self.f = fun_wrapped(self.x)
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/scipy/optimize/_differentiable_functions.py", line 145, in fun_wrapped
fx = fun(np.copy(x), *args)
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/scipy/optimize/_optimize.py", line 78, in __call__
self._compute_if_needed(x, *args)
File "/home/cryosparc_user/Cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/scipy/optimize/_optimize.py", line 72, in _compute_if_needed
fg = self.fun(x, *args)
File "cryosparc_master/cryosparc_compute/jobs/flex_refine/flexmod.py", line 1638, in cryosparc_master.cryosparc_compute.jobs.flex_refine.flexmod.do_hr_refinement_flex.lambda7
File "cryosparc_master/cryosparc_compute/jobs/flex_refine/flexmod.py", line 1606, in cryosparc_master.cryosparc_compute.jobs.flex_refine.flexmod.errfunc_flex
IndexError: index 265000 is out of bounds for axis 0 with size 265000

cryosparcm cli "get_job('$csprojectid', '$csjobid', 'job_type', 'version', 'instance_information', 'status',  'params_spec', 'errors_run', 'input_slot_groups')"

Output:

{'_id': '67cfe890051da66339b7e7c2', 'errors_run': [{'message': 'index 265000 is out of bounds for axis 0 with size 265000', 'warning': False}], 'input_slot_groups': [{'connections': [{'group_name': 'flex_model', 'job_uid': 'J428', 'slots': [{'group_name': 'flex_model', 'job_uid': 'J428', 'result_name': 'checkpoint', 'result_type': 'flex_model.checkpoint', 'slot_name': 'checkpoint', 'version': 'F'}]}], 'count_max': 1, 'count_min': 1, 'description': '', 'name': 'flex_model', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'checkpoint', 'optional': False, 'title': 'Checkpoint', 'type': 'flex_model.checkpoint'}], 'title': '3DFlex model', 'type': 'flex_model'}, {'connections': [{'group_name': 'particles', 'job_uid': 'J603', 'slots': [{'group_name': 'particles', 'job_uid': 'J603', 'result_name': 'blob_fullres', 'result_type': 'particle.blob', 'slot_name': 'blob_fullres', 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J603', 'result_name': 'ctf', 'result_type': 'particle.ctf', 'slot_name': 'ctf', 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J603', 'result_name': 'alignments3D', 'result_type': 'particle.alignments3D', 'slot_name': 'alignments3D', 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J603', 'result_name': 'blob_train', 'result_type': 'particle.blob', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J603', 'result_name': 'blob_train_ctf', 'result_type': 'particle.blob', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J603', 'result_name': 'blob', 'result_type': 'particle.blob', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J603', 'result_name': 'alignments2D', 'result_type': 'particle.alignments2D', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J603', 'result_name': 'location', 'result_type': 'particle.location', 'slot_name': None, 'version': 'F'}, {'group_name': 'particles', 'job_uid': 'J603', 'result_name': 'pick_stats', 'result_type': 'particle.pick_stats', 'slot_name': None, 'version': 'F'}]}], 'count_max': 1, 'count_min': 1, 'description': 'Particle stacks to use. Multiple stacks will be concatenated.', 'name': 'particles', 'repeat_allowed': False, 'slots': [{'description': '', 'name': 'blob_fullres', 'optional': False, 'title': 'Particle data blobs', 'type': 'particle.blob'}, {'description': '', 'name': 'ctf', 'optional': False, 'title': 'Particle ctf parameters', 'type': 'particle.ctf'}, {'description': '', 'name': 'alignments3D', 'optional': False, 'title': 'Particle 3D alignments', 'type': 'particle.alignments3D'}], 'title': 'Prepared particles', 'type': 'particle'}], 'instance_information': {'CUDA_version': '11.8', 'available_memory': '166.51GB', 'cpu_model': 'AMD Ryzen Threadripper 3960X 24-Core Processor', 'driver_version': '12.3', 'gpu_info': [{'id': 0, 'mem': 25435766784, 'name': 'NVIDIA GeForce RTX 3090', 'pcie': '0000:01:00'}, {'id': 1, 'mem': 25438126080, 'name': 'NVIDIA GeForce RTX 3090', 'pcie': '0000:4e:00'}], 'ofd_hard_limit': 262144, 'ofd_soft_limit': 1024, 'physical_cores': 24, 'platform_architecture': 'x86_64', 'platform_node': 'lvx0970b', 'platform_release': '4.18.0-513.11.1.el8_9.x86_64', 'platform_version': '#1 SMP Wed Jan 10 22:58:54 UTC 2024', 'total_memory': '251.23GB', 'used_memory': '81.65GB'}, 'job_type': 'flex_highres', 'params_spec': {'flex_do_noflex_recon': {'value': False}}, 'project_uid': 'P6', 'status': 'failed', 'uid': 'J604', 'version': 'v4.6.2'}

Thanks for your answer.
The outputs you asked for are posted above.
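
One possible reading of the traceback is a size mismatch. The failing array has 265000 entries, close to the 264k particles used for training, while the reconstruction job loaded 1424000 particles. The sketch below reproduces the same NumPy error under the assumption that the trained model stores one row per training particle and that the reconstruction indexes into it by particle position. This is a guess about the cause, not something confirmed in this thread, and `latents`, `latent_dim`, and `lookup` are hypothetical names.

```python
import numpy as np

n_train = 265_000    # size reported in the IndexError
n_recon = 1_424_000  # particles loaded by the reconstruction job
latent_dim = 2       # hypothetical latent dimensionality

# Hypothetical per-particle table, one row per training particle.
latents = np.zeros((n_train, latent_dim), dtype=np.float32)


def lookup(particle_index: int) -> np.ndarray:
    """Return the row for one particle; fails past the training set size."""
    return latents[particle_index]


lookup(n_train - 1)  # last training particle: valid

message = ""
try:
    lookup(n_train)  # first particle beyond the training set
except IndexError as err:
    message = str(err)
    print(message)
```

If this reading is right, it would also explain why reconstructing with a different class can work: a particle set no larger than the training set never indexes past the end of the table.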