Cryosparc becomes laggy and unresponsive when using Topaz

Hi,

When we try to use Topaz train in the CryoSPARC GUI, the GUI frequently becomes laggy and unresponsive - buttons do not respond, jobs take longer to start, etc. Is this someone other people have encountered, and if so are there any tips for resolving it? Topaz doesn’t seem to be using an exorbitant amount of RAM, and although the CPU utilization is high it doesn’t look out of the ordinary.

Cheers
Oli

(actually trying it now - it is not just cryoSPARC that becomes laggy and unresponsive, it is the system as a whole - e.g tab-completion on the command line becomes much slower, for example)

Also it would be great to be able to set a default path for the Topaz executable, rather than having to locate it every time I want to run topaz

1 Like

Hi @olibclarke,

Can you try running Topaz as a standalone executable and observing if similar performance problems arise? It is possible that file system transfers in Topaz may have slowed down due to another process working on the cluster.

Regards,
Jay Yoo

I can do that, however this is not running on a cluster, it is a standalone workstation and it is the only process running.

@olibclarke I haven’t noticed any lag with Topaz. Are you seeing this with 2.15? or is this after the upgrade to 2.16 beta? I haven’t upgraded yet.

No I am seeing this in both 2.15 and 2.16…

For example - while running a Topaz extract job, clearing or killing other jobs, or dragging inputs onto other jobs becomes nearly impossible, with extreme lag

Hi @olibclarke,

When you notice this lagginess, can you send us the outputs of cryosparcm log command_core ?

Will do. Interestingly if I clone a previously completed Topaz extract job and run that, I do not see the lagginess - I guess something is being cached somehow?

Oli

Hi Oli,

The reason why cloning a job and re-running is different is because Cryosparc saves pre-processed images, so ‘topaz preprocess’ doesn’t need to be run again. This tells me that the problem you are having is likely from ‘topaz preprocess’. Can you try running a new job and reducing the ‘Number of parallel threads’ in the ‘Preprocessing parameters’ section of Topaz Train? It may just be that your CPU cache is hitting its limit, thus not letting other instructions run.

Thanks,
-Alex

1 Like

I will try that, thanks Alex!

Oli

I think that is it… also the preprocessing seems to take the vast majority of the time for Topaz extract… if I run Topaz extract on 2200 mics, it takes ~2hrs, if I re-run it with same mics but different model (avoiding pre-processing), it takes 5min.

@stephan there was nothing informative in the command_core log, however I do notice that topaz seems to be using way more threads than I ask for - for example this is the output of top during a topaz train job where I asked for 4 parallel threads:

It seems like it is launching [number of threads] X [number of workers] individual topaz processes - @alexjamesnoble is this what is supposed to happen? maybe this is why it is slowing down so much? It seems like there is no way to control the num-workers param in cryoSPARC

Hi @olibclarke,

It seems that [number of threads] X [number of workers] individual processes is intended. The [number of threads], controlled by the “Number of parallel threads” parameter simply spawns more instances of the preprocessing command except with different micrographs to preprocess. Once the instance is spawned, Topaz seems to run [number of worker] of its own processes. The “num-workers” parameters from Topaz can be controlled in cryoSPARC using the “Number of CPUs” parameter. This is the same parameter name used by the Topaz GUI to control the num-workers parameter.

Regards,
Jay Yoo

@jyoo - Ok I see! The “Number of CPUs” parameter is right next to a slider switching between CPU and GPU, making it unclear (at least to me) that this pertains to runs where one is using only the GPU.

Also the “Number of CPUs” parameter is under “training parameters” but it is used during preprocessing - maybe it would be better moving it up adjacent to the “number of threads” parameter so this is clearer?

Hi @olibclarke,

I do agree that it is unclear what the number of CPUs parameter does exactly. Moving the parameter could be a more effective way of communicating what the parameter does but the parameter name itself may warrant a change as well.

Hope this helps with your lag and unresponsiveness you’ve been experiencing.

Regards,
Jay Yoo

Hi @jyoo - it does - I would also suggest altering the defaults. E.g. for Topaz extract the defaults are 8 workers with 8 threads each, meaning it spawns 64 (!) topaz processes, which is going to make most systems feel a bit laggy…

@jyoo one other thing - Topaz extract and train seem to skip preprocessing if a job is cloned and re-run, making things much faster. However this is only true if no parameters are altered. For testing different training parameters in particular it would be handy if Topaz train was configured to just use the cached preprocessed mics unless the micrograph input or downsampling has been changed, I think.

Cheers
Oli