Cannot queue "Select 2D classes" job

When I queue a “Select 2D Classes” job with CryoSPARC v4.4.1 on a local workstation, I get a ServerError: enqueue job error. The message briefly says that the job is an interactive job and must be queued on the master node, and then the job returns to the Building state.
I tried restarting the computer. Initially that worked for one day, then the problem came back. Even after reinstalling CryoSPARC, the problem is still the same. I have tried many times, and only occasionally can the job be queued. I have used this version of CryoSPARC for several months, but have only encountered this problem in the last two weeks.
I have seen a similar discussion about being unable to queue interactive jobs, but those problems were solved by restarting. In contrast, my problem persists, so I cannot proceed with further analysis.

Hi @ychang,

Thank you for reporting. Can you please try to reproduce this while making sure the job card is visible in the main browse view? With the job selected and in view, you should be able to right click → ‘Queue’ or press ‘Queue’ in the sidebar footer.

Hi @sdawood, thanks for your prompt reply. Yes, I can see the card in the main browse view. It says “Building”. If I right click, it gives me a list of actions. If I go to Queue job > Queue on default, I get the same problem: the job cannot be queued (ServerError) and it goes back to the “Building” state. This problem started shortly after our collaborator began remotely helping me with some analysis for a different project. On one occasion, when I couldn’t queue a job and it went back to the Building state, he could queue the same job remotely.
BTW, I’ve had another problem recently. When I run local refinement, the GPU reaches 88 °C and then disappears from the nvidia-smi -l monitor, but the local refinement clock keeps counting and the job never finishes. The computer cannot be restarted by clicking restart. I have to physically hold the power button to shut it off and then turn it back on. Other refinements, such as heterogeneous refinement and non-uniform refinement, are fine. I am not sure whether this problem is related to being unable to queue interactive jobs, or whether it is a hardware issue or a software issue.
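
For anyone seeing the same symptom, a temperature log that runs up to the moment the GPU drops out may help tell a hardware fault from a software one. The following is only a minimal sketch, assuming `nvidia-smi` is on the PATH. The function names and the 85 °C threshold are illustrative choices, not part of CryoSPARC or any NVIDIA recommendation.

```python
import subprocess
import time

# nvidia-smi query that prints one CSV line per GPU: index, temperature (C), utilization (%).
QUERY_CMD = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_readings(csv_text):
    """Parse 'index, temp, util' lines into a list of (index, temp_c, util_pct)."""
    readings = []
    for line in csv_text.strip().splitlines():
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 3:
            continue  # skip malformed lines
        try:
            readings.append((int(fields[0]), int(fields[1]), int(fields[2])))
        except ValueError:
            continue  # skip non-numeric values such as "[N/A]"
    return readings


def hot_gpus(readings, limit_c=85):
    """Return indices of GPUs at or above limit_c (threshold is an assumption)."""
    return [idx for idx, temp, _ in readings if temp >= limit_c]


def monitor(interval_s=5, max_samples=None):
    """Print a timestamped reading every interval_s seconds."""
    count = 0
    while max_samples is None or count < max_samples:
        stamp = time.strftime("%H:%M:%S")
        try:
            out = subprocess.run(
                QUERY_CMD, capture_output=True, text=True, check=True
            ).stdout
            readings = parse_readings(out)
            print(stamp, readings, "hot:", hot_gpus(readings), flush=True)
        except (subprocess.CalledProcessError, FileNotFoundError) as err:
            # A failing nvidia-smi call may mean the GPU is no longer visible.
            print(stamp, "nvidia-smi failed:", err, flush=True)
        count += 1
        time.sleep(interval_s)
```

Calling `monitor()` in a separate terminal with its output redirected to a file, then starting the local refinement, leaves a record of the last temperatures seen before the GPU is lost.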

The workaround that I found was to go back and queue up my next job. So if my next job would be a 2D classification, I would start setting that up, then go back and queue up the Select 2D Classes job. After I did that, the select job would properly go onto the interactive queue. I still have this issue pop up every now and again, but for whatever reason, it’s not as prevalent as it was. We did restart CS a few times since then for other reasons, but we never had a solid fix for it.

@jrgib Thank you for sharing your trick with me. I have just tested it, and it actually worked! This may explain why I could occasionally get a job through without being able to figure out how. At least for now, I can proceed with further analysis until the bug is fixed.

The problem is finally fixed. It seems to have been related to a fault in the motherboard, which also caused the GPU to overheat. With the new motherboard, the problem has disappeared completely.
