Database: ERROR (spawn error) during update

closed

#1

Hi,

We tried to update cryoSPARC on the cluster and got a “database: ERROR (spawn error)” message. I’ve seen this error in a few other posts here (#2054), but the proposed solution is not clear to me. The user installing the update is the same one who did the original installation, and no cryoSPARC jobs were running when the update was done.

The error appeared after the “cryosparcm update” command:


Successfully updated master from version v2.3.2 to version v2.4.0.

Starting cryoSPARC System master process…
CryoSPARC is not already running.
database: ERROR (spawn error)
command_core: started
cryosparc command core startup complete.
command_vis: started
command_proxy: started
webapp: started

CryoSPARC master started.
From this machine, access the webapp at
http://localhost:39100
From other machines on the network, access at
http://ws-0001.cm.cluster:39100

Startup can take several minutes. Point your browser to the address
and refresh until you see the cryoSPARC web interface.
CryoSPARC is running.
Stopping cryosparc.
command_proxy: stopped
command_core: stopped
command_vis: stopped
webapp: stopped
Shut down
Starting cryoSPARC System master process…
CryoSPARC is not already running.
database: ERROR (spawn error)
command_core: started
cryosparc command core startup complete.
command_vis: started
command_proxy: ERROR (spawn error)
webapp: started

Our system is:

  • Cuda: cuda91
  • OS: RHEL 7.4
  • Cluster: Slurm

Does anyone have a suggestion, please?

Best,
Nicolas


#2

Hi @ncoudray,

Please provide logs from the database.

cryosparcm log database


#3

Hi - The first lines are below (it seems to be running in a loop; the log is much bigger, but it appears to just keep printing the same messages):

2018-10-16T17:25:47.820-0400 I NETWORK [conn16432] received client metadata from 192.168.0.6:47481 conn16432: { driver: { name: "PyMongo", version: "3.4.0" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server 7.4 Maipo", architecture: "x86_64", version: "3.10.0-693.17.1.el7.x86_64" }, platform: "CPython 2.7.15.final.0" }
2018-10-16T17:25:47.851-0400 I NETWORK [thread1] connection accepted from 192.168.0.6:47485 #16433 (21 connections now open)
2018-10-16T17:25:47.855-0400 I NETWORK [conn16433] received client metadata from 192.168.0.6:47485 conn16433: { driver: { name: "nodejs", version: "2.2.34" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "3.10.0-693.17.1.el7.x86_64" }, platform: "Node.js v8.9.4, LE, mongodb-core: 2.1.18" }
2018-10-16T17:25:47.863-0400 I NETWORK [thread1] connection accepted from 192.168.0.6:47487 #16434 (22 connections now open)
2018-10-16T17:25:47.864-0400 I NETWORK [conn16434] received client metadata from 192.168.0.6:47487 conn16434: { driver: { name: "nodejs", version: "2.2.34" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "3.10.0-693.17.1.el7.x86_64" }, platform: "Node.js v8.9.4, LE, mongodb-core: 2.1.18" }
2018-10-16T17:25:47.865-0400 I NETWORK [thread1] connection accepted from 192.168.0.6:47489 #16435 (23 connections now open)
2018-10-16T17:25:47.865-0400 I NETWORK [conn16435] received client metadata from 192.168.0.6:47489 conn16435: { driver: { name: "nodejs", version: "2.2.34" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "3.10.0-693.17.1.el7.x86_64" }, platform: "Node.js v8.9.4, LE, mongodb-core: 2.1.18" }
2018-10-16T17:25:48.018-0400 I - [conn16431] end connection 192.168.0.6:47477 (23 connections now open)
2018-10-16T17:25:48.440-0400 I - [conn16432] end connection 192.168.0.6:47481 (22 connections now open)
2018-10-16T17:25:49.452-0400 I NETWORK [thread1] connection accepted from 192.168.0.6:47491 #16436 (22 connections now open)
2018-10-16T17:25:49.470-0400 I NETWORK [thread1] connection accepted from 192.168.0.6:47493 #16437 (23 connections now open)
2018-10-16T17:25:49.471-0400 I NETWORK [conn16437] received client metadata from 192.168.0.6:47493 conn16437: { driver: { name: "PyMongo", version: "3.4.0" }, os: { type: "Linux", name: "Red Hat Enterprise Linux Server 7.4 Maipo", architecture: "x86_64", version: "3.10.0-693.17.1.el7.x86_64" }, platform: "CPython 2.7.15.final.0" }
2018-10-16T17:25:49.488-0400 I NETWORK [conn16436] received client metadata from 192.168.0.6:47491 conn16436: { driver: { name: "nodejs", version: "3.0.0-rc0" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "3.10.0-693.17.1.el7.x86_64" }, platform: "Node.js v8.9.4, LE, mongodb-core: 3.0.0-rc0" }
2018-10-16T17:25:49.510-0400 I - [conn16433] end connection 192.168.0.6:47485 (23 connections now open)
2018-10-16T17:25:49.510-0400 I - [conn16435] end connection 192.168.0.6:47489 (23 connections now open)
2018-10-16T17:25:49.510-0400 I - [conn16436] end connection 192.168.0.6:47491 (22 connections now open)
2018-10-16T17:25:50.483-0400 I - [conn16434] end connection 192.168.0.6:47487 (20 connections now open)
2018-10-16T17:25:50.496-0400 I - [conn16437] end connection 192.168.0.6:47493 (19 connections now open)

Thanks for checking


#4

You’re right, the database logs don’t show much. Could you provide the logs for command_core from just after you’ve tried starting cryoSPARC?


#5

See below - is this the log you meant?

$ module load curl/7.60.0
$ module load gcc/6.1.0
$ module load slurm
$ module load cuda91/toolkit/9.1.85
$ module load tiff/3.9.7
$ cryosparcm start
Starting cryoSPARC System master process…
CryoSPARC is not already running.
database: ERROR (spawn error)
command_core: started
cryosparc command core startup complete.
command_vis: started
command_proxy: started
webapp: started

CryoSPARC master started.
From this machine, access the webapp at
http://localhost:39100
From other machines on the network, access at
http://ws-0001.cm.cluster:39100
Startup can take several minutes. Point your browser to the address
and refresh until you see the cryoSPARC web interface.

And, just in case:

$ cryosparcm status

CryoSPARC System master node installed at
/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master
Current cryoSPARC version: v2.4.0

cryosparcm process status:

command_core STARTING
command_proxy FATAL Exited too quickly (process log may have details)
command_vis RUNNING pid 377459, uptime 0:00:03
database FATAL Exited too quickly (process log may have details)
watchdog_dev STOPPED Not started
webapp STARTING
webapp_dev STOPPED Not started


global config variables:

export CRYOSPARC_LICENSE_ID="7daa6fa4-2ec4-11e8-8ddf-3f6b71da007b"
export CRYOSPARC_MASTER_HOSTNAME="ws-0001.cm.cluster"
export CRYOSPARC_DB_PATH="/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_database"
export CRYOSPARC_BASE_PORT=39100
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_INSECURE=false


#6

My apologies, I should’ve been more clear. Please run cryosparcm log command_core. Documentation for this is here:


#7

Ah, ok, sorry. Here it is:

$ cryosparcm log command_core
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/site-packages/werkzeug/serving.py", line 577, in __init__
    self.address_family), handler)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/SocketServer.py", line 417, in __init__
    self.server_bind()
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/BaseHTTPServer.py", line 108, in server_bind
    SocketServer.TCPServer.server_bind(self)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/SocketServer.py", line 431, in server_bind
    self.socket.bind(self.server_address)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 98] Address already in use
/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
COMMAND CORE STARTED ===  2018-10-16 20:15:58.639109  ==========================
*** BG WORKER START
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "cryosparc2_command/command_core/__init__.py", line 169, in start
    app.run(host="0.0.0.0", port=port, threaded=True, passthrough_errors=False)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/site-packages/flask/app.py", line 841, in run
    run_simple(host, port, self, **options)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/site-packages/werkzeug/serving.py", line 814, in run_simple
    inner()
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/site-packages/werkzeug/serving.py", line 774, in inner
    fd=fd)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/site-packages/werkzeug/serving.py", line 660, in make_server
    passthrough_errors, ssl_context, fd=fd)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/site-packages/werkzeug/serving.py", line 577, in __init__
    self.address_family), handler)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/SocketServer.py", line 417, in __init__
    self.server_bind()
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/BaseHTTPServer.py", line 108, in server_bind
    SocketServer.TCPServer.server_bind(self)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/SocketServer.py", line 431, in server_bind
    self.socket.bind(self.server_address)
  File "/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 98] Address already in use
/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master/deps/anaconda/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.23) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)

#8

It looks like either another instance of cryoSPARC is installed on the same port and is still running, or this instance of cryoSPARC has left some orphaned processes behind.
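The `socket.error: [Errno 98] Address already in use` at the end of the command_core traceback means something was already listening on the port command_core tried to bind. A minimal Python sketch of how one might check which ports near the base port are taken is shown here. The helper names `port_is_free` and `busy_ports` are hypothetical, and the window of 10 ports starting at CRYOSPARC_BASE_PORT is an assumption, not something confirmed in this thread.

```python
import errno
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Mirror the standard library HTTP server, which enables address
    # reuse before binding; a listening socket still blocks the bind.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        sock.bind((host, port))
        return True
    except socket.error as exc:
        if exc.errno == errno.EADDRINUSE:
            return False
        raise  # e.g. permission errors are not "port busy"
    finally:
        sock.close()


def busy_ports(base_port, count=10, host="0.0.0.0"):
    """List the ports in [base_port, base_port + count) that cannot be bound.

    count=10 is an assumed window size, used only for illustration.
    """
    return [p for p in range(base_port, base_port + count)
            if not port_is_free(p, host)]


if __name__ == "__main__":
    # 39100 is the CRYOSPARC_BASE_PORT reported earlier in this thread.
    print(busy_ports(39100))
```

If this prints any ports while cryoSPARC is supposedly stopped, some process is still holding them, which would be consistent with a leftover instance or orphaned processes.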

Follow this to ensure all processes are killed before you start it again:


#9

I did, but I see no orphaned processes when I run ps -ax | grep "supervisord".

Where can I find the *.sock file to delete? I don’t see it in the installation files, nor in the home of the supervisor.


#10

Ah, sorry - there were indeed orphaned processes that needed to be killed. I missed them the first time because I grepped for the full “supervisor_id” name, but only its first letters appear in the process list.
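For anyone hitting the same pitfall, here is a small Python sketch of searching a process listing by a short prefix of the name instead of the full name, so truncated entries are still found. The helper names `find_by_prefix` and `list_processes`, the default prefix length of 8, and the sample line in the comment are all hypothetical; only `ps -ax` comes from this thread.

```python
import subprocess


def find_by_prefix(ps_output, name, prefix_len=8):
    """Return listing lines containing the first prefix_len characters of name."""
    prefix = name[:prefix_len]
    return [line for line in ps_output.splitlines() if prefix in line]


def list_processes():
    """Return the text printed by `ps -ax` as a string."""
    return subprocess.check_output(["ps", "-ax"]).decode("utf-8", "replace")


if __name__ == "__main__":
    # A truncated (hypothetical) entry such as "... supervis" would be
    # missed by a search for "supervisord" but is matched by its prefix.
    for line in find_by_prefix(list_processes(), "supervisord"):
        print(line)
```

A shorter prefix matches more loosely, so check each reported line before killing anything.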

Anyway, thanks a lot for the help!

Best,
Nicolas


#11

Hi, I ran into a similar spawn error when trying to start cryoSPARC. Can anyone assist?