Performance issues with reading files

Dear cryoSPARC team,

I am using cryoSPARC in a cluster installation on an HPC system that uses a Ceph disk cluster for processing.
The Ceph cluster performs single-threaded reads at 4-5 Gb/s for normal system operations like cat and cp, and around 50 Gb/s for parallel reads from one node.

When it comes to cryoSPARC, which only does single-threaded reads, the performance is rather disappointing, with speeds of 0.3-1.2 Gb/s.
Although cryoSPARC is rather fast in general, it ends up bottlenecked by this lack of read bandwidth.
My guess is that the problem lies in the Python 2.7 code?

Am I the only one who sees these slow file reads in cryoSPARC?

If it is a general problem, are there any plans to improve this in the future, e.g. moving to Python 3 and/or using parallel reads of particle stacks?

Cheers,
Jesper

Hey @jelka,

Thanks for reporting this! We’ve never worked with this filesystem before, so bear with me. Which job did you notice these speeds in? Is it during caching? If you can post the logs where this speed is reported, that would be great.

Regarding Python modules, if the issue is during caching, then maybe knowing the functions we use might help: we use shutil.copyfile() (https://docs.python.org/2.7/library/shutil.html), and we use os.path.getsize() to get the size of each file we try to copy (https://docs.python.org/2.7/library/os.path.html).
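
Roughly, the relevant calls look like this; the paths are illustrative examples only, not the actual caching code:

```python
import os
import shutil

# Sketch of the copy-to-cache step; paths below are hypothetical.
src = "/ceph/project/particles/stack_0001.mrcs"   # hypothetical file on the Ceph cluster
dst = "/scratch/ssd_cache/stack_0001.mrcs"        # hypothetical SSD cache location

size_bytes = os.path.getsize(src)   # size of the file about to be copied
shutil.copyfile(src, dst)           # copies in fixed-size chunks (16 KB in Python 2.7)
print("copied %d bytes" % size_bytes)
```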

Thanks for the reply @stephan,

I forgot to tell you that I am not using any SSD caches at all. The nodes are not configured for this, but rely on the otherwise fast Ceph disk cluster. Is this a problem?

shutil.copyfile has, as far as I know, only a default buffer size of 16 KB. Maybe increasing it would speed things up a bit? E.g. Python 3.8's shutil uses a larger buffer and, on Linux, platform-specific fast-copy syscalls such as os.sendfile.

Or, since cryoSPARC only installs on Linux systems, maybe a native OS copy command could be used through os.system or subprocess?
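
Something along these lines is what I have in mind; just a sketch, the buffer size and the use of cp are my own suggestions:

```python
import shutil
import subprocess

def copy_with_big_buffer(src, dst, buffer_size=1024 * 1024):
    # shutil.copyfileobj accepts an explicit chunk size, so a 1 MB buffer can be
    # used instead of the 16 KB default that copyfile uses in Python 2.7.
    with open(src, "rb") as fsrc, open(dst, "wb") as fdst:
        shutil.copyfileobj(fsrc, fdst, buffer_size)

def copy_with_cp(src, dst):
    # Or hand the copy off to the native cp command on Linux.
    subprocess.check_call(["cp", src, dst])
```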

Cheers,
Jesper

Hi @jelka,

Thanks for the information, that helps. Can you report which jobs you’re seeing this slowdown in? Can you also share some information about your dataset (size, number of particles)?

Slow bandwidth seems to be an issue in all cryoSPARC jobs, but it is of course more dominant with larger datasets and in jobs that need a lot of reads.
3D Variability is one of the jobs where it really stands out. A dataset of 2.5 TB and 530,000 particles takes ~10 days to complete 20 iterations, and if one monitors CPU, GPU and network usage, they are all close to idle the whole time.

Hi @jelka,

A 10-day job in cryoSPARC is quite an outlier - there is definitely something not quite right. As a reference, on our systems 600,000 particles with box size 360 can be processed in 3D Var with 6 classes in ~8 hours on 1x GV100 GPU. The particles in this case are cached on local SSD.

One tip especially for 3D var is to first run “downsample particles” to reduce the box size as much as possible - 3D var runtime is at least quadratic in the box size.
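
As a rough illustration of the payoff, using example box sizes only and just the quadratic lower bound mentioned above:

```python
# Example box sizes, assuming the "at least quadratic in box size" scaling above.
full_box = 360
downsampled_box = 180
speedup_lower_bound = (float(full_box) / downsampled_box) ** 2
print("Downsampling %d -> %d px should give at least a %.1fx speedup"
      % (full_box, downsampled_box, speedup_lower_bound))   # at least 4.0x
```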

On your system, are you aware of any particularly large penalty for opening and closing files during reading? One change we are hoping to make in cryoSPARC is the way it opens and closes files when there are thousands of different particle stacks being processed. On file systems with high file metadata or file operation costs (aside from the read itself) this might make a big difference.
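
To illustrate the general idea (this is only a sketch of reusing open file handles to amortize metadata costs, not how cryoSPARC is or will be implemented):

```python
import collections

class FileHandleCache(object):
    """Keep up to max_open particle stack files open and reuse the handles."""

    def __init__(self, max_open=64):
        self.max_open = max_open
        self.handles = collections.OrderedDict()   # path -> open file object

    def read(self, path, offset, nbytes):
        f = self.handles.pop(path, None)
        if f is None:
            f = open(path, "rb")            # pay the open/metadata cost only once per file
        self.handles[path] = f              # mark as most recently used
        while len(self.handles) > self.max_open:
            _, oldest = self.handles.popitem(last=False)
            oldest.close()                  # evict the least recently used handle
        f.seek(offset)
        return f.read(nbytes)
```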

@apunjani @jelka Is the box really ~1138 px (sqrt(2.5*2^40 / 5.3*10^5 / 4))?
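
Spelled out as a quick check (2.5 TB over 530,000 particles at 4 bytes per pixel):

```python
# Implied box size if the 2.5 TB were a single-precision particle stack.
box = (2.5 * 2**40 / 5.3e5 / 4) ** 0.5
print(box)   # ~1138.7 px
```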

Hi @apunjani,

I have now downsampled the particles and it looks a whole lot better and will probably finish within ~8 hours. Although it seems that bandwidth is still the bottleneck, as CPU and GPU usage are still at a crawl.

It is clear there is a latency/penalty problem with opening and closing files upon reads. It is probably related to the remote reading of metadata from the MDS server in Ceph. But as mentioned in the beginning, I do not see this for native system operations. Nor do I see it in, for instance, Relion or cisTEM, which pull some very nice bandwidth numbers off the Ceph cluster.

The plans for optimizing cryoSPARC reads sound interesting. That might just do the trick.

Thanks,
Jesper

@DanielAsarnow
Sorry for the misunderstanding. The box size is 320 px. It was the whole dataset that was 2.5TB.
I guess the particle stack will be just above 600GB then.

//Jesper

Hi @jelka,
Thanks for the update - yes, most likely the way cryoSPARC reads/opens/closes files is the culprit. We’re hoping to change this in the next version.

Thanks @apunjani and the rest of the team.
It looks like this was improved a lot in v2.15.
Now I see ~10 Gb/s performance in cryoSPARC.

Cheers,
Jesper
