Building a job takes longer than usual

Hello, everyone. I have a question about building jobs.
In my project, I have 20 workspaces and ~4,000 jobs in total. When I build a new job, it is very slow. For example, when I drag a map from a job's outputs into the new job, it takes nearly half a minute for the map to appear in the new job's dialog. I suspect my job count is too large. How can I deal with that? Is there a way to move some jobs elsewhere for storage?

I also tried creating a new project, which is faster than before, but data in different projects is not easy to link together.

Now I often spend nearly 3 minutes building one job, which is really awful. Can someone help me?

Hi @Yuqi

Sorry to hear you’re having trouble using cryoSPARC. Could you clarify: is the user interface itself slow (e.g., lagging when scrolling or opening job dialogs), or is it only certain server interactions (such as dropping an output group into an input slot, setting a parameter, etc.)?

Are you running cryoSPARC in a single workstation configuration, meaning the master components (web application, database, etc.) run on the same machine as the jobs? If so, high CPU, memory, or network load could cause the web application (and any other process running on the machine) to slow down or become unresponsive.

If you have a utility such as htop installed, it can provide more insight into the various processes running on the machine and their CPU/memory usage.
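For example, from a shell on the master node (htop may need to be installed separately; free and ps are standard utilities):

htop                      # interactive, per-process view of CPU and memory usage
free -h                   # memory usage in human-readable units
ps aux | grep cryosparc   # list the running CryoSPARC master processes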

Additionally, if you can reply with any errors you see in the following log, that would be appreciated:

cryosparcm log command_core | tail -n 100

- Suhail

Thanks for your patient reply! I run cryoSPARC jobs on my school’s GPU cluster, which has a master node and many GPU nodes. I mean the server interactions (such as dropping an output group into an input slot, setting a parameter, etc.) are slow. The computation itself is very fast with Tesla GV100 cards.

The master node information (lscpu and free -h output) is below.
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel® Xeon® Silver 4110 CPU @ 2.10GHz
Stepping: 4
CPU MHz: 2100.000
BogoMIPS: 4200.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 11264K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_ppin intel_pt ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke spec_ctrl intel_stibp flush_l1d
              total        used        free      shared  buff/cache   available
Mem:            62G         25G         27G        3.1G        8.9G         32G
Swap:           63G        584M         63G

Hi @Yuqi,

Thank you for the additional information. We’ve identified a possible cause for the noticeable delay and will work on getting a fix ready as soon as possible. I will post in this topic again once a fix has been released.

- Suhail


The reason is that every time a new input is dragged into a job, the master updates the job relationship map. This gets noticeably slow as the job count in a project approaches 1000. Deleting jobs won’t help: deleted jobs remain in the map with all of their connections.
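If you are curious how large that map has grown, you can count a project’s job documents directly in the database. This is only a sketch: the jobs collection and the project_uid/deleted field names are my guesses at the schema, so adjust as needed (P1 is a placeholder project UID).

cryosparcm mongo   # opens a mongo shell connected to the CryoSPARC database
db.jobs.count({project_uid: 'P1'})                 // all job documents for P1, including deleted ones
db.jobs.count({project_uid: 'P1', deleted: true})  // deleted jobs that still persist in the map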

My solution, as a user, is to keep the total job count as low as possible. This can be achieved by reusing jobs: for example, a classification or selection job can be cleared and reused once its usefulness has expired. If a job serves no purpose in the project, I clear it, press ‘b’, and disconnect its inputs. Note that it is still connected to its child jobs.

One can even deliberately build a block of jobs as a reusable functional unit for one type of task.

Care needs to be taken to prevent circular references. This can be achieved by adding a buffering job (a quick one, such as an exposure curation, pick inspection, or particle sets job) to the end of the reused job block. When the buffering job’s input is cleared, the block is disconnected from the downstream jobs and can be reused.

Thank you very much, @ZhijieLi! I don’t know what pressing ‘b’ means; I use the GUI rather than the command line. How do I delete the connection to the parent jobs?

Hi Yuqi,

The ‘b’ key is a shortcut for entering the job editing interface (it seems to be the only way). The job needs to be editable, meaning it is not yet queued, or it has finished/failed/been killed and then been cleared.

You can simply highlight the job card in the web interface with a left click, clear it if necessary, then press ‘b’ to enter the job building interface.

This might be the only shortcut key I know of. It would be nice if someone listed all the keyboard shortcuts.


Hi @ZhijieLi,
You can also edit a building job by clicking the dark purple ‘Building’ button in the middle of the job card. Thanks!

@spunjani
Ah, I see. Thank you!

Hi Suhail, has this issue been fixed? I still have the problem now, on v4.5.1.

@JuenZhang Due to significant changes between the version available in 2021 and v4.5.1, could you please describe your observations and connection details:

  1. specific UI actions that are followed by a delay
  2. the affected sections of the UI (please feel free to post screenshots)
  3. What happens during delays, such as: window goes blank, window updates steadily but slowly, input attempts fail to be recognized, etc
  4. What URL do you use to connect to the CryoSPARC UI:
    • does the URL include localhost or 127.0.0.1 (or another 127.*.*.* address)?
    • does the URL start with http:// or with https://?
    • does the URL include a colon followed by a port number, like :39000?
  5. How many jobs does the workspace include?

Hi Wtempel,
Thanks for your response.

  1. specific UI actions that are followed by a delay
    Every step becomes slow, including opening the job building interface. When I drag a particle output into a particle stack input, nothing happens at first, and it takes >5 s for the particles to show up. When I click to queue the job, it takes >5 s for the job to start running.
  3. What happens during delays, such as: window goes blank, window updates steadily but slowly, input attempts fail to be recognized, etc
    Sometimes the window goes blank. Most of the time, it just takes a long time to go to the next step.
  4. What URL do you use to connect to the CryoSPARC UI:
  • does the URL include localhost or 127.0.0.1 (or another 127.*.*.* address)?
  • does the URL start with http:// or with https://?
  • does the URL include a colon followed by a port number, like :39000?

I have something like 127.*.*.*:39000, with no http:// or https:// prefix.

  5. How many jobs does the workspace include?

1300 jobs.
When I create a new project, everything is fine, so it is not because of the hardware.

Thank you so much.

Thanks @JuenZhang for these details.
How much RAM does the CryoSPARC master computer have, according to the command
free -h?
What are the web browser type and version?

Are you using ssh local port forwarding, or does the browser run on the CryoSPARC master computer?
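For reference, local port forwarding to a CryoSPARC master typically looks like the following, where the username and hostname are placeholders:

ssh -N -L 39000:localhost:39000 your_username@master-hostname

after which the UI would be reachable at http://localhost:39000 in a browser on your own machine.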

Hi, thank you.
I am using Chrome, not on the master computer; I simply type 127.*.*.*:39000 into the browser’s address bar.
I don’t think it is because of the browser or the master computer hardware, because when I create a new project everything becomes normal. It should be due to too many jobs.
Best regards!

@JuenZhang Please can you try upgrading CryoSPARC to the latest version and let us know if you continue to observe the problem?
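For reference, an update is typically run on the master node along these lines (a sketch; consult the guide for steps specific to your installation, and back up your database first):

cryosparcm status | grep -i version   # show the currently installed version
cryosparcm update                     # download and install the latest release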