Refused connection after computer restart (to be confirmed)

I’m not sure if the current problem I’m getting is related to this nvcc, but I have been getting an error similar to the one discussed here: Refused connection after computer restart

I’m able to restart the job by deleting the file and restarting CryoSPARC, but this also means that my job has been cut off in the middle (about 4k-5k incomplete micrographs, about 2k completed) twice in the last three days.
(This happens every time I come back to work the next day.)

Why does this occur, and how do I prevent it so my job can be completed?

For reference, we also have a second computer with an older version of CryoSPARC running another job, and it has been working fine.

Please can you provide additional information:

  • the job type for which the error occurred
  • description of the error
  • the name of the file deleted
  • did a computer restart occur (as in the related topic)
    • did the computer shutdown and reboot spontaneously or
    • was the computer restarted manually (if so, how and why)
  • if a computer restart did not occur, please email us the job report

Sorry for the delayed reply.
The job was “Patch Motion Correction”.
Name of the deleted file: “cryosparc-supervisor-56b93a0fbe8d80f91a710e9ca648d345.sock”

No restart, shutdown, or reboot occurred.
I sent the job report via email. (1 micrograph incomplete after another restart following deletion of the .sock file)

@stjiafle I noticed that you more recently posted other questions on the forum. Are you still concerned about this particular question?

No, I’m able to solve this regularly by removing the tmp .sock file, and so far no job has been interrupted since.

Manual removal of the file should be necessary only under rare circumstances, such as a “disorderly” system shutdown, and should otherwise be avoided.
Instead, following a “regular” cryosparcm stop (without system reboot), one should check for orphaned CryoSPARC-related processes with the following command (run under the same Linux account as the CryoSPARC processes)

ps xww | grep -e cryosparc -e mongo

and kill (but not kill -9) the relevant processes. After the kill, the CryoSPARC-related supervisord process may take some 10 seconds to exit, but the related sock file should be deleted automatically in the process.
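
The check-then-kill sequence described above could be wrapped roughly as follows. This is only a sketch: the helper names orphan_pids and stop_gracefully are made up for illustration, and the /tmp location of the sock file is an assumption based on the "tmp .sock file" mentioned earlier in this thread. Only ps, grep, awk, kill, and sleep are used.

```shell
#!/bin/sh
# Sketch only: run as the same Linux account that owns the CryoSPARC
# processes, and only after `cryosparcm stop` has returned.

# Hypothetical helper: reads `ps xww` output on stdin and prints the PIDs of
# lines mentioning cryosparc or mongo, skipping the grep process itself.
orphan_pids() {
    grep -e cryosparc -e mongo | grep -v grep | awk '{print $1}'
}

# Hypothetical helper: sends the default signal (SIGTERM, never -9) to each
# PID given, then waits about 10 seconds so supervisord can exit and remove
# its sock file. Does nothing when called without arguments.
stop_gracefully() {
    [ "$#" -gt 0 ] || return 0
    kill "$@"
    sleep 10
}

# Usage (commented out so that nothing is killed by accident):
#   pids=$(ps xww | orphan_pids)
#   echo "$pids"                          # review the list first
#   stop_gracefully $pids
#   ls /tmp/cryosparc-supervisor-*.sock   # assumed location; should be gone
```

Reviewing the PID list before calling stop_gracefully matters because the grep pattern is broad and may match unrelated processes whose command line happens to contain "mongo" or "cryosparc".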

Ok, thank you.
I will try this the next time it happens. So far, cryosparcm stop and cryosparcm restart have usually given me the same error message, but next time I will try to find the related processes first and kill them.