Hi, we use cryoSPARC 2.4.2 on cluster nodes, each with 4 × GTX 1080 Ti GPUs, 20 CPUs, and 256 GB of memory. The nodes run CUDA 9.1 and jobs are submitted via SLURM.
Most things work with a 216-pixel box, but so far every homo_refinement attempt with a 432-pixel box has failed:
Initially, a “no heartbeat received in 30 seconds” error appeared after the first iteration, at the “Computing FSCs” step of the second iteration.
I then tried increasing the memory allocation in cluster.sh, so that the job gets 72 GB instead of the requested 24 GB. Now “LogicError: cuCtxCreate failed: invalid device ordinal” appears earlier, at the initial scale estimation stage.
The same “LogicError: cuCtxCreate failed: invalid device ordinal” error also appears with the 216-pixel box when running Non-uniform refinement.
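In case it helps narrow down the device ordinal error, below is a minimal sketch of a check that could be run inside a SLURM job on one of the nodes. It assumes SLURM restricts the GPUs through the CUDA_VISIBLE_DEVICES environment variable, which I have not confirmed for our cluster. The function names are hypothetical helpers, not part of cryoSPARC. When CUDA_VISIBLE_DEVICES is set, CUDA renumbers the visible devices from 0, so a job granted one GPU can only use ordinal 0, and requesting ordinal 1, 2 or 3 would fail.

```python
import os


def visible_gpu_ordinals(env=None):
    """Return the CUDA ordinals usable by this process.

    Returns None when CUDA_VISIBLE_DEVICES is unset, meaning the
    process is not restricted and the answer cannot be derived here.
    Assumption: every listed entry refers to a valid device.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    entries = [e.strip() for e in raw.split(",") if e.strip()]
    # Visible devices are renumbered 0..n-1 regardless of physical index.
    return list(range(len(entries)))


def ordinal_is_valid(ordinal, env=None):
    """True/False if the ordinal can be checked, None if unrestricted."""
    ordinals = visible_gpu_ordinals(env)
    if ordinals is None:
        return None
    return ordinal in ordinals


if __name__ == "__main__":
    print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
    for requested in range(4):  # our nodes have 4 GPUs
        print("ordinal", requested, "valid:", ordinal_is_valid(requested))
```

If this reports that only ordinal 0 is valid while the job asks for a higher GPU index, the mismatch would be between the SLURM GPU allocation and the device index passed to the job, rather than a memory problem.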
Perhaps the internal “ram_gb” parameter can be increased?
Any tips on what else to try would be highly appreciated!
Many thanks for any help!