Hi,
I am new to this forum and also new to CryoEM in general.
I am an experienced X-ray crystallographer and normally manage the IT for our crystallography software suites. However, as one of our PhD students is now trying CryoEM, I am also responsible for CryoSparc.
I set up CryoSparc on our AlphaFold computer in a Master / Cluster configuration, because I hope that we can acquire more hardware soon. Currently, we have the following setup: NVIDIA® GeForce RTX™ 3090 24GB / Intel® Core™ i9 11900KF / 64GB RAM 3466 MHz / 4TB SSD.
We are running the NVIDIA driver 510.54 with CUDA 11.6 (because of AlphaFold) and are queuing all the GPU jobs (AF & CryoSparc) via Slurm. CryoSparc is on the latest patch, 220315.
The tutorial dataset worked beautifully; however, the first real dataset is causing trouble. We collected over 15TB of data at the ESRF, and regardless of how we try to run the patch motion job, it very often fails.
I find a couple of errors in the job logs, but cannot really interpret them. Maybe someone can help?
I’ll often find a couple of these:
LZWDecode: Wrong length of decoded string: data probably corrupted at scanline 1023.
I guess this means, some images are broken?
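One way to test that guess outside of CryoSparc would be to try decoding every frame of every movie and list the files that fail. The following is only a minimal sketch: it assumes the movies are TIFF files that Pillow can read (Pillow is a third-party package and not part of CryoSparc), and the function names are made up for illustration. A clean pass does not prove a file is intact, since a decoder may only warn on some kinds of damage.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def load_all_frames(path: Path) -> int:
    """Decode every frame of a TIFF with Pillow and return the frame count.

    Assumption: Pillow is installed and supports this detector's TIFF flavour.
    """
    from PIL import Image, ImageSequence  # third-party, imported lazily

    count = 0
    with Image.open(path) as im:
        for frame in ImageSequence.Iterator(im):
            frame.load()  # forces the actual (LZW) decode, not just header parsing
            count += 1
    return count


def find_unreadable(
    paths: Iterable[Path],
    loader: Callable[[Path], int] = load_all_frames,
) -> List[Tuple[Path, str]]:
    """Return (path, error message) for every file the loader fails on."""
    bad = []
    for path in paths:
        try:
            loader(path)
        except Exception as exc:  # broad on purpose: any failure marks the file
            bad.append((path, f"{type(exc).__name__}: {exc}"))
    return bad
```

It could be called as `find_unreadable(sorted(Path("/path/to/movies").glob("*.tif")))`, where the directory is a placeholder. Files that show up in the result could then be compared against the copy at the facility or excluded from the import.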
These also occur often:
HOST ALLOCATION FUNCTION: using cudrv.pagelocked_empty
exception in cufft.Plan.del:
I don’t think my memory is running out: htop, gpustat and glances all look relatively fine.
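Since interactive tools only show a snapshot, a short spike could be missed. A rough way to rule that out would be to log the available host memory at fixed intervals while the job runs. This is a sketch for Linux only: it reads `/proc/meminfo`, the function names are hypothetical, and it covers host RAM but not GPU memory.

```python
import time
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse the content of /proc/meminfo into {field: numeric value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def log_available_memory(interval_s: float = 5.0, samples: int = 10) -> None:
    """Print MemAvailable and SwapFree in GiB at fixed intervals."""
    for _ in range(samples):
        with open("/proc/meminfo") as fh:
            info = parse_meminfo(fh.read())
        avail = info.get("MemAvailable", 0) / 1024**2  # kB -> GiB
        swap = info.get("SwapFree", 0) / 1024**2
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp}  MemAvailable={avail:.1f} GiB  SwapFree={swap:.1f} GiB", flush=True)
        time.sleep(interval_s)
```

Running `log_available_memory()` in a second terminal with its output redirected to a file would give a timeline that can be lined up against the time the job fails.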
Finally it breaks like this:
/home/cryosparcuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/multiprocessing/process.py:99: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`).
  self._target(*self._args, **self._kwargs)
Traceback (most recent call last):
  File "/home/cryosparcuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/multiprocessing/queues.py", line 242, in _feed
    send_bytes(obj)
  File "/home/cryosparcuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/cryosparcuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header + buf)
  File "/home/cryosparcuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Any hints?
Best & thanks
Jan