Login issues on the CS web interface for some users

timfolsen · November 26, 2021, 1:41pm

Dear CS

We are running into some login issues with our CS installation here at our university.
We are using CryoSPARC v3.2.0+210629, tied together with SLURM and a compute backend.

We have 23+ users accessing the web interface, running all kinds of different jobs in CS.
We have seen odd behaviour where some users suddenly cannot log in while others have no issue. Who is affected seems random, but once a user is denied login, the failure is consistent for that user.

The only way to get back to normal operation is to restart CS.

When checking the webapp log, I cannot see the failing users’ login attempts.

Kind regards
Tim

sdawood · November 26, 2021, 4:07pm

Hi @timfolsen,

This is quite odd. Could you please provide the following information to help us debug this:

What does the popup message state when the user is unable to log in? ‘User not found’ or ‘Incorrect password’?
What browser and version is the user running?
If the user opens up their browser console (right click → ‘inspect element’ → console tab), are there any error messages? If so, please reply with the content.
If you run cryosparcm listusers do you see all users listed?
Are there any database errors? cryosparcm log database | tail -n 100

- Suhail

timfolsen · November 29, 2021, 2:44pm

Hi Suhail

Yes, it’s very strange.
There is no popup message, error, or any other type of message when clicking the login button.
We have tried Firefox 94.01, Google Chrome 96.0.4664.45, and Safari.
There are no error messages in the inspector on the login page.
cryosparcm listusers lists all users, including those who cannot log in.
There are no DB errors when running cryosparcm log database | grep error
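
Note that grep error is case-sensitive and only matches that literal word, so it can miss relevant lines. A broader scan might look like the sketch below. It assumes the database log uses MongoDB's older plain-text format, with a one-letter severity field (F, E, W, I, D) after the timestamp. That assumption, the keyword list, and the name suspicious_lines are illustrative and not taken from CryoSPARC.

```python
import re

# Assumed line shape (MongoDB plain-text logs, an assumption for this setup):
#   <timestamp> <severity> <component> [<context>] <message>
SEVERITY_RE = re.compile(r"^\S+\s+([FEWID])\s")

# Hypothetical keyword list; extend it to suit your logs.
KEYWORD_RE = re.compile(r"error|exception|fatal|refused", re.IGNORECASE)


def suspicious_lines(lines):
    """Return lines with severity E or F, or containing a failure keyword."""
    hits = []
    for line in lines:
        match = SEVERITY_RE.match(line)
        is_severe = match is not None and match.group(1) in ("E", "F")
        if is_severe or KEYWORD_RE.search(line):
            hits.append(line.rstrip("\n"))
    return hits
```

The function takes any iterable of lines, for example an open file handle on a saved copy of the database log.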

The system is RHEL 8.4, and CryoSPARC is started via systemd through two service files: cryosparc-env.service and cryosparc-supervisor.service. I have just restarted the service and everyone can log in again, but the issue will reappear after some weeks…

sdawood · November 30, 2021, 3:08pm

Hi @timfolsen,

Thanks for the detailed reply. Any errors in the web application logs? cryosparcm log webapp | tail -n 100. The only case I’ve seen this happen is when the database unexpectedly quits.

- Suhail

timfolsen · December 1, 2021, 12:08pm

Hi Suhail

I can see some connection refused errors when running cryosparcm log webapp | tail -n 100:

error in interactive request { RequestError: Error: connect ECONNREFUSED x.x.x.x:44019
    at new RequestError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:14:15)
  error:
    at new RequestError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:14:15)
error in interactive request { RequestError: Error: connect ECONNREFUSED x.x.x.x:42881
    at new RequestError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:14:15)
  error:
    at new RequestError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:14:15)
error in interactive request { StatusCodeError: 500 - "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2 Final//EN\">\n<title>500 Internal Server Error</title>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\n"
    at new StatusCodeError (/opt/cryosparc_master/cryosparc_webapp/bundle/programs/server/npm/node_modules/request-promise-core/lib/errors.js:32:15)
  message: '500 - "<!DOCTYPE HTML PUBLIC \\"-//W3C//DTD HTML 3.2 Final//EN\\">\\n<title>500 Internal Server Error</title>\\n<h1>Internal Server Error</h1>\\n<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\\n"',
  error: '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>500 Internal Server Error</title>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>\n'

It states that the server (or DB?) may be overloaded. We do see some refresh issues in the CS webapp alongside this login issue, so the two may be related.
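
One way to get an overview of an excerpt like the one above is to tally the errors by type. The following is a minimal sketch, not part of CryoSPARC: the function name summarize_webapp_errors and both regular expressions are assumptions based only on the log lines shown in this thread.

```python
import re
from collections import Counter

# Patterns derived from the excerpt above; other log formats may need changes.
REFUSED_RE = re.compile(r"ECONNREFUSED\s+(\S+):(\d+)")
STATUS_RE = re.compile(r"StatusCodeError:\s+(\d{3})")


def summarize_webapp_errors(lines):
    """Count refused connections per port and HTTP status errors per code."""
    refused_by_port = Counter()
    status_codes = Counter()
    for line in lines:
        for _host, port in REFUSED_RE.findall(line):
            refused_by_port[int(port)] += 1
        for code in STATUS_RE.findall(line):
            status_codes[int(code)] += 1
    return refused_by_port, status_codes
```

Grouping the refusals by port can show whether they all target one service or several different ports, as in the excerpt above (44019 and 42881).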

sdawood · December 8, 2021, 1:09pm

Hi @timfolsen,

That HTTP 500 error is just a generic message servers usually send when there’s an unexpected error while attempting to serve a request. This 500 error was sent from one of the interactive jobs - is there one running on your instance? If so, try checking the output of cryosparcm joblog P1 J1 (replacing the project and job ID) to get a more detailed description of what went wrong.

Regarding your primary query, it’s very unusual: if something were wrong with the network, webapp, or database, none of the users should be able to log in, not just some. Is there anything unique about your networking setup that can help us debug this? For example, an institutional firewall, or users accessing the cryoSPARC web application through different means (SSH port forward, VPN, etc.)?
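
To compare access paths (direct, SSH port forward, VPN), it can help to check from each client whether the web application port accepts TCP connections at all. Below is a minimal sketch using only the Python standard library. The function name is hypothetical, and the host and port must be supplied by you; none of these values come from this thread.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError subclasses on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A False result only says that the TCP connection failed (refused, filtered, or timed out). A True result does not prove that login works, only that the port is reachable from that client.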

Regards,
Suhail

filonovd · February 21, 2022, 5:12pm

Hi -

I have the very same issue, with one small difference: when this happens, no user can log in, but users with active cookies can still access CryoSPARC. I bet this is the same behavior the OP sees.

Restarting CryoSPARC (at least the database) helps, but it kills existing jobs (no heartbeat) and we usually have 10+ jobs running, so it’s a major issue. It would be nice at least to have a way to restart the database without affecting existing running jobs.

P.S. We have a pretty busy setup and a fair-sized database (300+ GB).

jelka · February 21, 2022, 9:02pm

Hi all,
I also see the webapp become unstable under heavy load.

Our MongoDB is 550+ GB and growing, with 20+ simultaneous users.

I might be wrong, but I believe this is due to the single-threaded node.js webapp process getting overloaded.
A way to fix this might be to load balance node.js through NGINX; see the article “5 Tips to Increase Node.js Application Performance”.

It would be great if CryoSPARC could implement that or something similar.

filonovd · February 22, 2022, 3:57am

In my case it is enough to restart the database, leaving the webapp intact.
So I would blame the database, not node.js.

jelka · February 22, 2022, 8:43am

Sorry to sidetrack this discussion with my webapp problems.
I also had MongoDB problems, right up until I put the database on a fast NVMe drive installed locally in the master node. Then those problems went away.

timfolsen · February 22, 2022, 8:52am

Hi All

Since we upgraded to CS 3.3.1 we have not seen this odd behavior.

filonovd · February 25, 2022, 2:56pm

Jelka,
You were right. Restarting the database fixes my issue, but restarting the webapp fixes it too, with less harm. Thank you very much for the tip.