Issues with login on the CS web interface for some users

Dear CS

We are running into some login issues with our CS implementation here at our university.
We are using CryoSPARC v3.2.0+210629, tied together with SLURM and a compute backend.

We have 23+ users accessing the web interface, running all kinds of different jobs in CS.
We have seen odd behaviour where some users suddenly cannot log in while others have no issue. Who is affected seems to be very random, but once you are denied login, the failure is very consistent.

The only solution to get back to normal operation is to restart CS.

When checking the webapp log, I cannot see the failing users' login attempts.

Kind regards
Tim

Hi @timfolsen,

This is quite odd. Could you please provide the following information to help us debug this:

  • What does the popup message state when the user is unable to login? ‘User not found’ or ‘Incorrect password’?
  • What browser and version is the user running?
  • If the user opens up their browser console (right click → ‘inspect element’ → console tab), are there any error messages? If so, please reply with the content
  • If you run cryosparcm listusers do you see all users listed?
  • Are there any database errors? cryosparcm log database | tail -n 100

- Suhail

Hi Suhail

Yes, it’s very strange.
There is no popup message, error, or any other kind of message when clicking the login button.
We have tried Firefox 94.01, Google Chrome 96.0.4664.45, and Safari.
There are no error messages in the inspector on the login page.
cryosparcm listusers lists all users, including those who cannot log in.
There are no DB errors when running cryosparcm log database | grep error

The system is RHEL 8.4 and CryoSPARC is started via systemd through two service files: cryosparc-env.service and cryosparc-supervisor.service. I have just restarted the service and everyone can log in again, but the issue will reappear after some weeks…
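
One caveat on the grep error check above: grep is case-sensitive by default, so lines containing ERROR or Exception would be missed. A minimal sketch of a broader filter follows. The function name scan_log and the keyword list are assumptions for illustration, not part of CryoSPARC.

```shell
# Filter log text on stdin for lines that look like problems, ignoring case.
# ASSUMPTION: the keyword list below is illustrative, not an official list.
scan_log() {
  # -i: case-insensitive, -E: extended regex.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -iE 'error|exception|refused|fail' || true
}
```

It could stand in for the plain grep, for example: cryosparcm log database | scan_log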

Hi @timfolsen,

Thanks for the detailed reply. Any errors in the web application logs? cryosparcm log webapp | tail -n 100. The only case I’ve seen this happen is when the database unexpectedly quits.

- Suhail

Hi Suhail

I can see some connection refused errors when running cryosparcm log webapp | tail -n 100:

error in interactive request { RequestError: Error: connect ECONNREFUSED x.x.x.x:44019
    at new RequestError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:14:15)
  error:
    at new RequestError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:14:15)
error in interactive request { RequestError: Error: connect ECONNREFUSED x.x.x.x:42881
    at new RequestError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:14:15)
  error:
    at new RequestError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:14:15)
error in interactive request { StatusCodeError: 500 - "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2 Final//EN\">\n<title>500 Internal Server Error</title>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\n"
    at new StatusCodeError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:32:15)
  message: '500 - "<!DOCTYPE HTML PUBLIC \\"-//W3C//DTD HTML 3.2 Final//EN\\">\\n<title>500 Internal Server Error</title>\\n<h1>Internal Server Error</h1>\\n<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\\n"',
  error: '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>500 Internal Server Error</title>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\n'

It states that the server (or DB?) may be overloaded… We actually do see some refresh issues in the CS webapp alongside this login issue, so it may be related.
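
To see whether the refused connections in an excerpt like the one above hit a single endpoint or many, the lines can be tallied per host:port. This is only a sketch: count_refused is a hypothetical helper name, and it relies solely on the "ECONNREFUSED host:port" text visible in the log.

```shell
# Tally "ECONNREFUSED host:port" occurrences in log text read from stdin.
# count_refused is a hypothetical helper, not a CryoSPARC command.
count_refused() {
  # -o prints only the matching part of each line, so unrelated text is dropped;
  # uniq -c counts identical endpoints, and sort -rn lists the busiest first.
  grep -oE 'ECONNREFUSED [^ ]+:[0-9]+' | sort | uniq -c | sort -rn
}
```

For example: cryosparcm log webapp | tail -n 100 | count_refused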

Hi @timfolsen,

That HTTP 500 error is just a generic message servers usually send when there’s an unexpected error while attempting to serve a request. This 500 error was sent from one of the interactive jobs - is there one running on your instance? If so, try checking the output of cryosparcm joblog P1 J1 (replacing the project and job ID) to get a more detailed description of what went wrong.

With regard to your primary query, it’s very unusual: if something were wrong with the network, webapp, or database, none of the users should be able to log in, not just some. Is there anything unique about your networking setup that could help us debug this? Perhaps an institutional firewall, or users accessing the cryoSPARC web application through different means (SSH port forward, VPN, etc.)?

Regards,
Suhail

Hi -

I have the very same issue, with one small difference: when this happens, no user can log in, but users with active cookies can still access CryoSPARC. I bet this is the same behavior OP has.

Restarting CryoSPARC (at least the database) helps, but it kills existing jobs (no heartbeat) and we usually have 10+ jobs running, so it’s a major issue. It would be nice at least to have a way to restart the database without affecting existing running jobs.

P.S. We have a pretty busy setup and a fair-sized database (300+ GB).

Hi all,
I also see the webapp become unstable under heavy load.

Our MongoDB is 550+ GB and growing, with 20+ simultaneous users.

I might be wrong, but I believe this is due to the single-threaded Node.js webapp process getting overloaded.
A way to fix this might be to load balance Node.js through NGINX:
5 Tips to Increase Node.js Application Performance

It would be great if CryoSPARC could implement that or something similar.

In my case it is enough to restart the database, leaving the webapp intact.
So I would blame the database, not Node.js.

Sorry to sidetrack this discussion with my webapp problems.
I also had MongoDB problems, right up until I put the database on a fast NVMe drive installed locally in the master node. Then those problems went away :wink:

Hi All

Since we upgraded to CS 3.3.1 we have not seen this odd behavior.

Jelka,
you were right. Restarting the database fixes my issue, but restarting the webapp fixes it too, with less harm. Thank you very much for the tip.
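
For reference, the webapp-only restart mentioned above could be wrapped roughly as follows. This is a hedged sketch: it assumes that cryosparcm restart accepts a single service name such as webapp or database on your version, which should be verified before relying on it. The function name restart_service and the DRY_RUN variable are made up for illustration.

```shell
# Restart one CryoSPARC service instead of the whole instance.
# ASSUMPTION: "cryosparcm restart <service>" accepts a service name such as
# "webapp" or "database" on this version; check your install first.
# restart_service and DRY_RUN are hypothetical names, not CryoSPARC features.
restart_service() {
  svc="${1:-webapp}"   # default to the webapp, the less disruptive option in this thread
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "cryosparcm restart ${svc}"
  else
    cryosparcm restart "${svc}"
  fi
}
```

Running DRY_RUN=1 restart_service webapp first prints the command without touching the running instance.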