Incomplete particle extraction

Hi,

I recently encountered a problem during particle extraction from micrographs. Here is the screenshot showing the error message.

I suspect the error is caused by the GPUs running out of memory. GPU memory usage was about 5 GB when the job was launched and increased gradually as the job ran. Once one or more of the GPUs ran out of memory, the error messages started to appear.

With the CPU version of the job, particle extraction completed for all micrographs without any issue, and system memory usage stayed about the same throughout the job. It seems that the GPU version of the particle extraction job does not release GPU memory after it finishes extracting from previous batches of micrographs.

Below is some information about my system:

CPU: AMD Threadripper Pro 5975WX, 32 Core
Memory: 256 GB
GPUs: 4x NVIDIA RTX A5500, each with 24 GB of VRAM
CryoSPARC 4.1.1
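
For anyone who wants to confirm the same pattern, the gradual climb in GPU memory can be watched from a terminal. This is a minimal sketch, not part of CryoSPARC: the function name `flag_full_gpus` and the 90% threshold are assumptions for illustration, and it only parses the CSV output of `nvidia-smi`.

```shell
#!/usr/bin/env bash
# Flag GPUs whose used memory is at or above a percentage of the total.
# Reads lines of "index, used_MiB, total_MiB" on stdin, as printed by:
#   nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader,nounits
# flag_full_gpus is a hypothetical helper name; the default threshold is arbitrary.
flag_full_gpus() {
    local threshold_pct="${1:-90}"
    awk -F', *' -v limit="$threshold_pct" '
        # Skip malformed rows with a zero total, then compare the usage percentage.
        $3 > 0 && ($2 * 100 / $3) >= limit {
            printf "GPU %s: %d/%d MiB used\n", $1, $2, $3
        }'
}
```

While the extraction job runs, it could be called in a loop, for example `while sleep 30; do nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader,nounits | flag_full_gpus 90; done`.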

I don’t know whether this issue depends on the version of CryoSPARC. I never encountered this problem with CryoSPARC v2 and v3, but the datasets I worked with then were much smaller than this one (~2000 movies before vs 6453 movies here). So it is possible that in those cases particle extraction completed before the GPUs ran out of memory.

Please let me know if you have any suggestions or ways to fix this problem.

Thanks.

I also sometimes have this issue.
But normally, the incomplete micrographs go to a separate output, “Micrographs incomplete”.
I simply rerun the job on the incomplete micrographs and then pool the extracted particles from both jobs.

We’re seeing the same problem. Yet for us it’s only happening on one of two identical (and brand new) 4xGPU processing servers (they use A5000 24GB). So I’m wondering if this relates to CryoSPARC installations, OS, or hardware being different in ways that have been hidden so far.

@drichman @Flow @YYang You may be encountering an issue that has been fixed with patch 230110.

Thanks @wtempel that was it! Patch 230110 fixed it.

I am using the current version, v4.3.1, but I still have the same problem. Thanks in advance for any suggestions. L.

@Lan Please can you email us the corresponding job report.

@Lan The event log you sent us contained a FileNotFoundError. Please can you post the outputs of the following commands for the relevant file:

ls -l $(readlink -e /full/path/to/file.mrc)
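
One caveat with this command: if `readlink -e` cannot resolve the path, the substitution expands to nothing and `ls -l` then lists the current directory instead of the file. A guarded variant could look like the sketch below. The function name `check_link` is made up for illustration, and `readlink -e` assumes GNU coreutils.

```shell
#!/usr/bin/env bash
# Show the resolved target of a path, or report that it cannot be resolved.
# check_link is a hypothetical helper, not a CryoSPARC command.
check_link() {
    local path="$1"
    local target
    # readlink -e fails (non-zero exit, no output) if any path component is missing.
    if target=$(readlink -e -- "$path"); then
        ls -l -- "$target"
    else
        echo "cannot resolve: $path" >&2
        # Show the link entry itself, if it exists, so its stored target is visible.
        ls -l -- "$path" 2>/dev/null
        return 1
    fi
}
```

Called as `check_link /full/path/to/file.mrc`, it prints the resolved file when the link is intact and returns a non-zero status when it is not.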

[cryosparc_user@r16763 ~]$ ls -l $(readlink -e /full/path/to/file.mrc)
total 40
drwxr-xr-x. 4 cryosparc_user cryosparc_user 4096 Oct 20 2022 Desktop
drwxr-xr-x. 3 cryosparc_user cryosparc_user 4096 Sep 23 15:39 Documents
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Aug 14 2022 Downloads
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Jul 5 2022 Music
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Nov 4 2022 Pictures
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Jul 5 2022 Public
drwxrwxr-x. 4 cryosparc_user cryosparc_user 4096 Nov 2 2022 pyem
drwxrwxr-x. 3 cryosparc_user cryosparc_user 4096 Oct 20 2022 software
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Jul 5 2022 Templates
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Jul 5 2022 Videos
[cryosparc_user@r16763 ~]$ ls -l $(readlink -e /full/path/to/004394951061012498658_FoilHole_259379_Data_259517_259519_20210302_002050_aligned_DW.mrc)
total 40
drwxr-xr-x. 4 cryosparc_user cryosparc_user 4096 Oct 20 2022 Desktop
drwxr-xr-x. 3 cryosparc_user cryosparc_user 4096 Sep 23 15:39 Documents
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Aug 14 2022 Downloads
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Jul 5 2022 Music
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Nov 4 2022 Pictures
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Jul 5 2022 Public
drwxrwxr-x. 4 cryosparc_user cryosparc_user 4096 Nov 2 2022 pyem
drwxrwxr-x. 3 cryosparc_user cryosparc_user 4096 Oct 20 2022 software
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Jul 5 2022 Templates
drwxr-xr-x. 2 cryosparc_user cryosparc_user 4096 Jul 5 2022 Videos
[cryosparc_user@r16763 ~]$

lrwxrwxrwx. 1 cryosparc_user cryosparc_user 115 Oct 25 18:00 /data/D6/CS-d6/J43/imported/004394951061012498658_FoilHole_259379_Data_259517_259519_20210302_002050_aligned_DW.mrc -> /run/media/cryosparc_user/LL3/xx/xx/D6_20210302/FoilHole_259379_Data_259517_259519_20210302_002050_aligned_DW.mrc

The link is shown in red, flashing text.
LL3 was not connected during this test.

To avoid duplicating data (and required storage capacity), data may be linked to rather than copied from the source of the import. There is a caveat: the source data need to be “connected” whenever any processing is performed. In other words, processing of imported data will fail if the import source is not connected.
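
To check whether an import source is still reachable before starting a job, the links in the job's `imported` directory can be scanned for unresolved targets. The following is a minimal sketch under that assumption; `list_broken_links` is a hypothetical helper name, not a CryoSPARC tool.

```shell
#!/usr/bin/env bash
# List symbolic links directly inside a directory whose targets do not exist.
# list_broken_links is a hypothetical helper; pass the job's "imported" directory.
list_broken_links() {
    local import_dir="$1"
    local link
    for link in "$import_dir"/*; do
        # -L: the entry is a symlink; ! -e: following it does not reach a file.
        if [ -L "$link" ] && [ ! -e "$link" ]; then
            printf '%s -> %s\n' "$link" "$(readlink "$link")"
        fi
    done
}
```

For the job discussed in this thread, that would be `list_broken_links /data/D6/CS-d6/J43/imported`. Any line it prints names a link whose source (for example, a disconnected external drive) is currently unavailable.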