I’m experiencing some performance issues in “extract from micrographs” in the latest version.
Previously, I could extract 957k particles from 1150 micrographs in 802 seconds. Currently I have a job trying to extract ~5E6 particles from 4400 micrographs, which has processed 2750 of 4487 micrographs in 55,739 seconds. These are different subsets of the same data, with the same box size. It’s a large number of particles, but I’d expect roughly linear scaling given the other parameters are the same.
I also think there are some issues with inspect picks and curate micrographs. I’m not completely sure because the final measured runtimes seem reasonable but the initial “running” phase takes a very long time.
Just as a data point, I think I have always had those performance issues with particle extraction, even in previous versions. I’m currently trying to extract 2.4M particles from 5000 micrographs, and it’s projected to take ~35,000s parallelized over 4 GPUs. In Relion, that would take a few minutes split over 40 CPUs.
There is definitely some non-linearity. My previous extraction job on a subset of 180 micrographs from the same dataset (same box size, etc.) took 172s to extract 62,000 particles, using a single GPU…
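To put a number on that non-linearity, here is a rough sketch that extrapolates the 180-micrograph job linearly to the 5000-micrograph job. It assumes perfect scaling across GPUs and a constant cost per micrograph, which is an idealization; the function name `linear_projection` is just for illustration and is not part of any cryoSPARC tooling.

```python
def linear_projection(ref_seconds, ref_micrographs, ref_gpus,
                      target_micrographs, target_gpus):
    """Expected runtime if cost per micrograph were constant and GPUs scaled perfectly."""
    # GPU-seconds spent per micrograph in the reference job
    gpu_seconds_per_mic = ref_seconds * ref_gpus / ref_micrographs
    # Spread the target workload evenly over the target GPUs (idealized assumption)
    return gpu_seconds_per_mic * target_micrographs / target_gpus


# Reference job from the post: 180 micrographs in 172 s on 1 GPU
expected = linear_projection(172, 180, 1, 5000, 4)
# Projected runtime reported for the 5000-micrograph job on 4 GPUs
projected = 35000

print(f"linear expectation:  {expected:.0f} s")
print(f"reported projection: {projected} s ({projected / expected:.1f}x slower)")
```

Under those assumptions the large job should take roughly 1200 s, so the reported projection is around 29 times slower than linear scaling would predict.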
Here are the numbers for the two jobs I was comparing. I will clone and re-run the original job with the smaller number of images and see whether the slowdown seems attributable to the cryoSPARC version or just to the size of the data.
Image size: 5760x4092 (K3 1x)
Box size: 288 -> 72 vs 288 -> 48
Particles / image: 858.4 vs 840.2
Images: 1115 vs 4487
Time: 801.7s vs 91261.2s
Time per image: 0.719s vs 20.34s
GPUs: 2 (GTX 1080 Ti)
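For anyone who wants to compare their own jobs the same way, the per-image figures above can be reproduced with a few lines of Python. This is only a sketch using the numbers listed in this post; the dictionary keys and the `throughput` helper are made-up names, and the total particle count is estimated as images × particles per image.

```python
# Numbers copied from the comparison above
jobs = {
    "earlier job": {"images": 1115, "particles_per_image": 858.4, "seconds": 801.7},
    "current job": {"images": 4487, "particles_per_image": 840.2, "seconds": 91261.2},
}


def throughput(job):
    """Return seconds per image and milliseconds per particle for one job."""
    particles = job["images"] * job["particles_per_image"]  # estimated total
    return {
        "s_per_image": job["seconds"] / job["images"],
        "ms_per_particle": 1000.0 * job["seconds"] / particles,
    }


stats = {name: throughput(job) for name, job in jobs.items()}
for name, s in stats.items():
    print(f"{name}: {s['s_per_image']:.3f} s/image, {s['ms_per_particle']:.2f} ms/particle")

# How much slower the current job is per image
slowdown = stats["current job"]["s_per_image"] / stats["earlier job"]["s_per_image"]
print(f"slowdown per image: {slowdown:.1f}x")
```

Since the particles per image are nearly identical in both jobs, the per-particle slowdown comes out about the same as the per-image slowdown, roughly 28x.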
I repeated the first job (1115 micrographs) by creating a new extraction job (not a clone), then setting up the same input parameters. The job took 4946.49s, so around 6 times longer in 2.13.2 than 2.12.
I also have the slow extraction problem (~35s per image). In addition, I use 3 GPUs, but two of them died at the beginning with no error messages. The images assigned to those 2 GPUs did not get extracted, and the program just kept running with only 1 GPU working. Perhaps the GPU issue is due to something wrong with my workstation.