Performance regressions in v2.13

#1

I’m experiencing some performance issues in “extract from micrographs” in the latest version.

Previously, I could extract 957k particles from 1150 micrographs in 802 seconds. Currently I have a job trying to extract ~5E6 particles from 4400 micrographs, which has processed 2750 of 4487 micrographs in 55,739 seconds. These are different subsets of the same data, with the same box size. It’s a large number of particles, but I’d expect roughly linear scaling given the other parameters are the same.
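For reference, the linear-scaling expectation can be worked out from the figures quoted above. This is only a rough sketch: it uses the approximate counts given in this post, treats time per micrograph as the unit of comparison (reasonable only because the box size and particle density are described as similar), and the function and variable names are illustrative.

```python
def seconds_per_micrograph(elapsed_s, micrographs):
    """Average wall-clock seconds spent per micrograph."""
    return elapsed_s / micrographs

# Earlier job: 1150 micrographs in 802 s.
baseline = seconds_per_micrograph(802, 1150)

# Current job so far: 2750 micrographs in 55,739 s.
current = seconds_per_micrograph(55_739, 2750)

# If scaling were linear, all 4487 micrographs would take about this long.
projected_linear_total = baseline * 4487

# How much slower each micrograph is in the current job.
slowdown = current / baseline

print(f"baseline: {baseline:.2f} s/micrograph")
print(f"current:  {current:.2f} s/micrograph")
print(f"linear projection for 4487 micrographs: {projected_linear_total:.0f} s")
print(f"slowdown: {slowdown:.1f}x")
```

On these numbers the current job is running at roughly 29 times the per-micrograph cost of the earlier one, whereas linear scaling would put the whole job at a little over 3000 seconds.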

I also think there are some issues with inspect picks and curate micrographs. I’m not completely sure, because the final measured runtimes seem reasonable, but the initial “running” phase takes a very long time.


#2

Just as a data point, I think I have always had those performance issues with particle extraction, even in previous versions. I’m currently trying to extract 2.4M particles from 5000 micrographs, and it’s projected to take ~35000s parallelized over 4 GPUs. In Relion that would be a few minutes split over 40 CPUs.

Oli


#3

Does it seem linear with the particle count? I’ve never really worked with this many particles before my current dataset. Maybe there’s something causing poor scaling with a sudden inflection point.


#4

Definitely there is some nonlinearity. My previous extraction job on a subset of 180 micrographs from the same dataset (same box etc.) took 172s to extract 62000 particles, using a single GPU…


#5

And looking at the log file of the current big extraction, it took 1318s for the first 200 mics (spread over 4 GPUs). So about 30x slower, particle for particle, for the extraction with more particles.
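The comparison between the 180-micrograph job and the first 200 micrographs of the large job can be sketched as follows. Two assumptions are made that the thread does not confirm: that work divides evenly across the 4 GPUs, and that the first 200 micrographs of the large job hold the job-wide average number of particles (2.4M over 5000 micrographs). All names are illustrative.

```python
def gpu_seconds_per_micrograph(wall_s, micrographs, gpus):
    """GPU-seconds per micrograph, assuming ideal scaling across GPUs."""
    return wall_s * gpus / micrographs

# Small job: 180 micrographs, 172 s, 1 GPU.
small = gpu_seconds_per_micrograph(172, 180, 1)

# Large job, first 200 micrographs: 1318 s, 4 GPUs.
large = gpu_seconds_per_micrograph(1318, 200, 4)

per_mic_ratio = large / small

# Average particles per micrograph (the large-job value is an assumption
# for the first 200 micrographs, taken from the job-wide totals).
small_density = 62_000 / 180
large_density = 2_400_000 / 5000

# Cost per particle in GPU-seconds, large job relative to small job.
per_particle_ratio = (large / large_density) / (small / small_density)

print(f"per micrograph: {per_mic_ratio:.1f}x slower")
print(f"per particle:   {per_particle_ratio:.1f}x slower")
```

Under these assumptions the large job comes out roughly 28x slower per micrograph and roughly 20x slower per particle, which is the same order of magnitude as the estimate above.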


#6

Hi @DanielAsarnow, @olibclarke,

In v2.13+, the Inspect Picks job includes calculations to calibrate the picking score. Can you try running this job with the parameter turned off and compare speeds?

Thanks for reporting. We’re going to do some benchmarks and see what’s wrong here. Can you help us narrow down potential testing datasets?

What size were the images?
What box size did you use to extract?
How many particle picks per image were being extracted?
How many GPUs did you use?


#7

Here are the numbers for the two jobs I was comparing. I will clone and re-run the original job with the smaller number of images and see if the difference seems attributable to the cryoSPARC version or just the size of the data.

Image size: 5760x4092 (K3 1x)
Box size: 288 -> 72 vs 288 -> 48
Particles / image: 858.4 vs 840.2
Images: 1115 vs 4487
Time: 801.7s vs 91261.2s
Time per image: 0.719s vs 20.34s
GPUs: 2 (GTX 1080 Ti)

Update

I repeated the first job (1115 micrographs) by creating a new extraction job (not a clone), then setting up the same input parameters. The job took 4946.49s, so around 6 times longer in 2.13.2 than 2.12.


#8

Hi @DanielAsarnow, thanks for that information. We’re definitely seeing the same ~20s per image time and are investigating what’s wrong. Will update when we know more.


#9

I also have the slow extraction problem (~35s per image). Also, I use 3 GPUs, but two of them died at the beginning (no error messages). The images handled by those 2 GPUs did not get extracted, and the program just keeps running with 1 GPU working. Perhaps the GPU issue is caused by something wrong with my workstation.