Template picker not running after update to v2.1.0

Dear community,

I updated cryoSPARC to the latest version yesterday and suddenly cannot use the template picker any more.
Is anyone else experiencing this problem?
Since refinement still works, it does not seem to be a general/system problem.

Traceback (most recent call last):
  File "cryosparc2_worker/cryosparc2_compute/run.py", line 78, in cryosparc2_compute.run.main
  File "cryosparc2_worker/cryosparc2_compute/jobs/template_picker_gpu/run.py", line 111, in cryosparc2_compute.jobs.template_picker_gpu.run.run
  File "cryosparc2_worker/cryosparc2_compute/jobs/template_picker_gpu/run.py", line 208, in cryosparc2_compute.jobs.template_picker_gpu.run.run
  File "/data/tarek/cryosparc2/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 115, in __init__
    onembed, ostride, odist, self.fft_type, self.batch)
  File "/data/tarek/cryosparc2/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 222, in cufftPlanMany
    cufftCheckStatus(status)
  File "/data/tarek/cryosparc2/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 110, in cufftCheckStatus
    raise cufftExceptions[status]
cufftAllocFailed

Best,
t.

Hi Tarek,

Thanks for reporting this; we’re looking into it and will keep you updated.

Regards,
Suhail

Hi Tarek,

I was unable to reproduce this issue in our instance of v2.1.0. The template picker job has not been changed for some time - what version were you running before updating? Did you manage to run the template picker job on the exact same dataset without errors before the update?

The issue may be dataset-specific, which you can confirm by trying to run the Template Picker job on the T20S Tutorial Dataset. What was the particle diameter that you were using on the failed job?

The error is caused by a failed GPU memory allocation. Running watch -n 1 nvidia-smi on the machine the job is running on will let you monitor GPU memory usage while the job runs. The template picker job should tune the batch size to fit in the available memory, but if you see the memory usage spike to 100% right before the crash, it may point to an issue with this tuning process. Occasionally, simply re-running the job will work.
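
If it is easier to check from inside the worker's Python environment than with nvidia-smi, the snippet below is a minimal sketch (it assumes pycuda is importable from the worker's Anaconda environment and that the job runs on GPU 0) that reports how much memory the FFT plan has to fit into:

# Minimal sketch: report free vs. total memory on the GPU the job uses.
# Assumes pycuda is importable (e.g. from cryosparc2_worker/deps/anaconda)
# and that the job runs on GPU 0.
import pycuda.driver as cuda

cuda.init()
ctx = cuda.Device(0).make_context()
try:
    free, total = cuda.mem_get_info()  # both values are in bytes
    print("GPU 0: %.0f MiB free of %.0f MiB" % (free / 2.0**20, total / 2.0**20))
finally:
    ctx.pop()

Anything else holding memory on that GPU (an X server, another job) reduces what the batch-size tuning has to work with.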

Please let me know what you find!

Best,
Ali

Hi,

I’ve also run into a similar problem. Template picking worked for the tutorial up through the local motion correction job, but taking particles and micrographs from the ‘inspect particle picks’ job and passing them directly to the ‘extract from micrographs’ job (box size 256 px) results in the following error:

Starting multithreaded pipeline ...
Started pipeline
GPU 0 using a batch size of 1024
-- 0.0: processing J1/imported/Frame2_0041.mrc
-- 0.1: processing J1/imported/Frame2_0042.mrc

Traceback (most recent call last):
  File "cryosparc2_compute/jobs/runcommon.py", line 738, in run_with_except_hook
    run_old(*args, **kw)
  File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "cryosparc2_compute/jobs/pipeline.py", line 53, in stage_target
    work = processor.process(item)
  File "cryosparc2_compute/jobs/extract/run.py", line 268, in process
    cuda_dev = self.cuda_dev, ET=self.ET, timer=timer, mic_idx=mic_idx, batch_size = self.batch_size)
  File "cryosparc2_compute/jobs/extract/extraction_gpu.py", line 208, in do_extract_particles_single_mic_gpu
    skcuda.fft.ifft( ET.output_f_gpu[:batch_size], ET.output_gpu[:batch_size], ifft_plan ,scale=True)
  File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 298, in ifft
    return _fft(x_gpu, y_gpu, plan, cufft.CUFFT_INVERSE, y_gpu.size/plan.batch)
  File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/fft.py", line 198, in _fft
    int(y_gpu.gpudata))
  File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 301, in cufftExecC2R
    cufftCheckStatus(status)
  File "/home/dgl/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cufft.py", line 110, in cufftCheckStatus
    raise cufftExceptions[status]
cufftExecFailed
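
For what it is worth, here is a rough back-of-envelope estimate (not cryoSPARC's actual allocation scheme; it just assumes single-precision C2R transforms over the full 256-pixel box with no extra padding) of the data buffers a batched inverse FFT at these settings needs, before the cuFFT work area and the rest of the job's buffers:

# Rough estimate only: assumes single-precision C2R transforms over the
# full box with no extra padding; not cryoSPARC's actual memory layout.
box = 256      # extraction box size in pixels
batch = 1024   # batch size reported by the job

real_bytes = box * box * 4                # float32 output per particle
complex_bytes = box * (box // 2 + 1) * 8  # complex64 input per particle

total_mib = batch * (real_bytes + complex_bytes) / 2.0**20
print("data buffers alone: ~%.0f MiB (plus the cuFFT work area)" % total_mib)

That is on top of whatever the desktop session and the rest of the job already hold on the card.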

Can you check the compute mode of your GPUs? They should be in Default (not exclusive-process) mode.
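
If nvidia-smi is not convenient on that machine, the following minimal sketch (assuming pycuda is importable from the worker environment) queries the compute mode directly through the CUDA driver API:

# Minimal sketch: query the compute mode of GPU 0 via the CUDA driver API.
# Assumes pycuda is importable from the worker environment.
# Returned values: 0 = Default, 1 = Exclusive Thread (deprecated),
# 2 = Prohibited, 3 = Exclusive Process.
import pycuda.driver as cuda

cuda.init()
mode = cuda.Device(0).get_attribute(cuda.device_attribute.COMPUTE_MODE)
print("GPU 0 compute mode: %d" % mode)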

Looks like it is in Default mode.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.54                 Driver Version: 396.54                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro P4000        Off  | 00000000:73:00.0  On |                  N/A |
| 46%   33C    P8     6W / 105W |    765MiB /  8111MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1321      G   /usr/lib/xorg/Xorg                           396MiB |
|    0      2139      G   compiz                                       190MiB |
|    0      2394      G   ...quest-channel-token=6741064447546151985   176MiB |
+-----------------------------------------------------------------------------+

Hi @Bruk and @tarek,

Is it possible to get an update on this issue? Does it still happen?

Hi,
For me it is working again. I am using v2.4.0 at the moment.

Hi @tarek,

That’s great. Thank you for getting back to me!