CS instance failed to launch

Hello,

I would very much appreciate your help troubleshooting a problem we encountered after upgrading to the latest CS. Both cryosparcm test install and cryosparcm test workers P6 passed, but the CS instance is stuck in the launch state when we open it in Firefox. The command_core.log shows some errors about mongo.db. Here is an example:

2023-06-25 22:02:43,419 COMMAND.DATA         dump_workspaces      INFO     | Exported all workspaces in P6 to /DATA01/cryosparc_hz/P46/workspaces.json in 0.01s
2023-06-25 22:02:43,446 COMMAND.USER         get_user_state_var   ERROR    | Caught exception while trying to find user instance_tester
2023-06-25 22:02:43,446 COMMAND.USER         get_user_state_var   ERROR    | Traceback (most recent call last):
2023-06-25 22:02:43,446 COMMAND.USER         get_user_state_var   ERROR    |   File "/app/apps/rhel7/cryosparc/cryosparc2_master/cryosparc_command/command_core/__init__.py", line 1028, in get_user_state_var
2023-06-25 22:02:43,446 COMMAND.USER         get_user_state_var   ERROR    |     value = mongo.db['users'].find_one({'_id' : user_id}, {state_key : 1})['state'].get(key, default_value)
2023-06-25 22:02:43,446 COMMAND.USER         get_user_state_var   ERROR    | TypeError: 'NoneType' object is not subscriptable
2023-06-25 22:02:43,450 COMMAND.CORE         run                  INFO     | Received task layout_tree with 4 args and 0 kwargs
2023-06-25 22:02:43,771 COMPUTE.COMMON       param_set_spec_value INFO     | Setting parameter J342.test_tensorflow with value False of python type <class 'bool'> and param type boolean
2023-06-25 22:02:43,774 COMMAND.CORE         job_set_param        WARNING  | SKIPPING background task dump_job_database with 2 and 0 kwargs - already scheduled
2023-06-25 22:02:43,793 COMPUTE.COMMON       param_set_spec_value INFO     | Setting parameter J342.test_pytorch with value False of python type <class 'bool'> and param type boolean
2023-06-25 22:02:43,796 COMMAND.CORE         job_set_param        WARNING  | SKIPPING background task dump_job_database with 2 and 0 kwargs - already scheduled
2023-06-2

Please advise how to fix this problem. Thank you!

Here is one additional piece of information that may help with troubleshooting. Before we upgraded to CS 4.2, we rebuilt the MongoDB database in CS 3.3.2 with help from wtempel. After we imported the original projects into CS 3.3.2, we noted that each original project was given a different project ID. For example, P46 was renamed P6 after being imported into CS 3.3.2 with the new database. However, P46 was found in the cryosparc_home directory, and P6 was nowhere to be found. I wonder if this may be causing the problem after we upgraded to CS 4.2.

Hi @haomingz,

Looks like for some reason one of the users is missing an internal “state” variable. You can try:

  1. cryosparcm icli
  2. db.users.update_many({"state": {"$exists": False}}, {"$set": {"state": {}}})

This should initialize the state var and hopefully fix the error you’re seeing.
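
If you want to verify the fix, a quick sanity check from the same icli session (a sketch, not an official procedure; db is the database handle that icli exposes, and count_documents is standard pymongo) is:

  # Count user documents still missing the "state" field;
  # this should return 0 once the update_many above has run.
  db.users.count_documents({"state": {"$exists": False}})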

As for the projects changing IDs after updating and reattaching, this is the expected behaviour as attached projects are assigned a new project ID when added to an instance.


Hi Wong,

Thanks for your quick response. Here is the output following your instructions:
[cryosparc@lnx00013 run]$ cryosparcm icli
Python 3.8.15 | packaged by conda-forge | (default, Nov 22 2022, 08:49:35)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.33.0 -- An enhanced Interactive Python. Type '?' for help.

connecting to localhost:39002 …
cli, rtp, db, gfs and tools ready to use

In [1]: db.users.update_many({"state": {"$exists": False}}, {"$set": {"state": {}}})
Out[1]: <pymongo.results.UpdateResult at 0x7f21365567c0>

After restarting CS 4.2, the instance still would not launch in Firefox. I did not find any errors in command_core.log, as shown below:
2023-06-26 10:54:05,871 COMMAND.MAIN start INFO | === STARTED ===
2023-06-26 10:54:05,873 COMMAND.BG_WORKER background_worker INFO | === STARTED ===
2023-06-26 10:54:05,874 COMMAND.CORE run INFO | === STARTED TASKS WORKER ===

 * Serving Flask app "command_core" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
2023-06-26 10:54:08,378 COMMAND.MAIN startup INFO | Starting CryoSPARC v4.2.1+230621
2023-06-26 10:54:08,378 COMMAND.MAIN startup INFO | platform_node : lnx00013
2023-06-26 10:54:08,378 COMMAND.MAIN startup INFO | platform_release : 3.10.0-1160.90.1.el7.x86_64
2023-06-26 10:54:08,378 COMMAND.MAIN startup INFO | platform_version : #1 SMP Fri Mar 17 08:39:44 UTC 2023
2023-06-26 10:54:08,378 COMMAND.MAIN startup INFO | platform_architecture : x86_64
2023-06-26 10:54:08,378 COMMAND.MAIN startup INFO | physical_cores : 48
2023-06-26 10:54:08,378 COMMAND.MAIN startup INFO | max_cpu_freq : 3900.0
2023-06-26 10:54:08,378 COMMAND.MAIN startup INFO | total_memory : 503.34GB
2023-06-26 10:54:08,378 COMMAND.MAIN startup INFO | available_memory : 489.42GB
2023-06-26 10:54:08,379 COMMAND.MAIN startup INFO | used_memory : 11.93GB
2023-06-26 10:54:08,379 COMMAND.MAIN startup INFO | version : v4.2.1+230621
2023-06-26 10:54:08,658 COMMAND.STARTUP startup INFO | CryoSPARC instance ID: 47a000a5-1c36-4bfc-9699-1ecb41602e05
2023-06-26 10:54:08,658 COMMAND.SCHEDULER get_gpu_info INFO | UPDATING WORKER GPU INFO
2023-06-26 10:54:08,659 COMMAND.JOBS update_all_job_sizes INFO | UPDATING ALL JOB SIZES IN 10s
2023-06-26 10:54:08,660 COMMAND.DATA export_all_projects INFO | EXPORTING ALL PROJECTS IN 60s…
2023-06-26 10:54:31,378 COMMAND.DATA dump_project INFO | Exporting project P6
2023-06-26 10:54:31,380 COMMAND.DATA dump_project INFO | Exported project P6 to /DATA01/cryosparc_hz/P46/project.json in 0.00s
2023-06-26 10:54:31,935 COMMAND.DATA dump_project INFO | Exporting project P8
2023-06-26 10:54:31,937 COMMAND.DATA dump_project INFO | Exported project P8 to /DATA01/cryosparc_hz/P47/project.json in 0.00s
2023-06-26 10:54:31,941 COMMAND.DATA dump_project INFO | Exporting project P9
2023-06-26 10:54:31,942 COMMAND.DATA dump_project INFO | Exported project P9 to /DATA01/cryosparc_hz/P48/project.json in 0.00s
2023-06-26 10:54:32,070 COMMAND.DATA dump_project INFO | Exporting project P10
2023-06-26 10:54:32,071 COMMAND.DATA dump_project INFO | Exported project P10 to /data2/cryosparc_home/P32/project.json in 0.00s
2023-06-26 10:54:32,177 COMMAND.DATA dump_project INFO | Exporting project P11
2023-06-26 10:54:32,178 COMMAND.DATA dump_project INFO | Exported project P11 to /data2/cryosparc_home/P43/project.json in 0.00s
2023-06-26 10:54:34,079 COMMAND.DATA dump_project INFO | Exporting project P12
2023-06-26 10:54:34,081 COMMAND.DATA dump_project INFO | Exported project P12 to /data2/cryosparc_home/P39/project.json in 0.00s
2023-06-26 10:54:36,862 COMMAND.DATA dump_project INFO | Exporting project P13
2023-06-26 10:54:36,863 COMMAND.DATA dump_project INFO | Exported project P13 to /DATA01/CS_Project/P9/project.json in 0.00s

Could this be a firewall or communication problem? Please advise.

Hi @haomingz,

Could you please paste some recent lines of the app.log and app_api.log? They are in the same folder as your command_core.log. You can view them using
cryosparcm log app
and
cryosparcm log app_api

Also please paste your browser console logs by pressing F12 and refreshing the CryoSPARC UI tab.

The output of cryosparcm log app is as follows:

GET /assets/ChevronUp.970e0798.js 200 2.450 ms - 517
GET /assets/Authenticated.b49ce3b7.js 200 6.377 ms - -
GET /api/actions/users/view-options?granularity=job&view=table 200 2.836 ms - 2
GET /api/actions/users/view-options?granularity=job&view=table 200 2.836 ms - 2
GET /assets/Notifications.15714d9f.js 200 1.098 ms - -
GET /assets/Notifications.15714d9f.js 304 0.341 ms - -
GET /assets/ChevronRight.5d8a3128.js 304 3.724 ms - -
GET /assets/SidebarLeftOpen.3b7701dd.js 200 5.438 ms - 589
GET /assets/CheckCircle.4333580b.js 200 1.938 ms - 315
GET /assets/Authenticated.b49ce3b7.js 200 0.918 ms - -
GET /assets/Toggle.svelte_svelte&type=style&lang.0c323f51.js 304 2.375 ms - -
GET /assets/Spinner.ad908aa0.js 304 1.485 ms - -
GET /assets/Icon.66a18c0b.js 304 4.081 ms - -
GET /assets/Exclamation.f5bfcd2d.js 304 2.577 ms - -
GET /assets/TagCount.7a3f3e41.js 304 1.981 ms - -
GET /assets/ChevronUp.970e0798.js 304 2.698 ms - -
GET /assets/Logout.d4a6768e.js 304 2.360 ms - -
GET /assets/Tag.2b4e7c42.js 304 3.577 ms - -
GET /assets/Database.55c9be70.js 304 0.823 ms - -
GET /assets/Search.fa01495d.js 304 2.100 ms - -
GET /assets/SidebarLeftOpen.3b7701dd.js 304 1.877 ms - -
GET /assets/CheckCircle.4333580b.js 304 1.749 ms - -
GET /assets/LightningBolt.e8bda490.js 304 2.077 ms - -
GET /assets/ChevronRight.5d8a3128.js 304 1.175 ms - -
GET /assets/index.5036669d.js 304 1.473 ms - -
GET /api/actions/users/state?attribute=quickAccess 200 1.649 ms - 2
GET /api/actions/users/state?attribute=quickAccess 200 1.649 ms - 2
GET /api/utility/click_wrap 200 0.105 ms - 14
GET /api/utility/click_wrap 200 0.105 ms - 14
POST /api/cmd/core/get_license_live_enabled 200 20.613 ms - 15
POST /api/cmd/core/get_license_live_enabled 200 20.613 ms - 15
POST /api/cmd/core/get_license_ecl_enabled 200 13.792 ms - 16
POST /api/cmd/core/get_license_ecl_enabled 200 13.792 ms - 16
POST /api/cmd/core/get_update_tag 200 372.702 ms - 74
POST /api/cmd/core/get_update_tag 200 372.702 ms - 74
GET /websocket 400 5.017 ms - -
GET /websocket 400 5.412 ms - -
GET /websocket 400 6.900 ms - -
GET /websocket 400 5.681 ms - -
GET /websocket 400 5.178 ms - -
Waiting for data... (interrupt to abort)

The output of cryosparcm log app_api:

cryoSPARC Application API server running
cryoSPARC Application API server running
cryoSPARC Application API server running
cryoSPARC Application API server running
cryoSPARC Application API server running
cryoSPARC Application API server running
Waiting for data… (interrupt to abort)

Currently the UI shows a blank page: [screenshot]

The output of F12 and refreshing the CryoSPARC UI tab: [screenshot of the browser console]

Hi @haomingz,

The hanging loader you are seeing in the UI is something we often encounter when network traffic is being blocked due to a firewall or reverse proxy.
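
One way to test this outside the browser (a sketch, not an official diagnostic; replace your-proxy-host with the hostname you use for the UI) is to ask the proxy for a WebSocket upgrade with curl and see whether it answers 101 or an error like the 400s in your app.log:

  # Request a WebSocket upgrade through the proxy; a proxy that
  # forwards upgrades correctly replies "101 Switching Protocols",
  # while one that drops them typically replies 400.
  curl -i -N \
    -H "Connection: Upgrade" \
    -H "Upgrade: websocket" \
    -H "Sec-WebSocket-Version: 13" \
    -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
    http://your-proxy-host/websocket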

Here is a link to our guide page on User Interface Error Logging; if you could send over these logs, that would be much appreciated.

Also, if you are using a reverse proxy, it would be helpful if you could send over the configuration file.

You can send these files to feedback@structura.bio.

- Kelly


Hi Kelly, I sent you the console and network log files by email. I wonder why we have this issue in CS 4.2; before the upgrade, our CS 3.3.2 worked fine, and we used the same port, 39000. Thanks.

@haomingz We have inspected the files, and it appears that the issues you are experiencing are almost certainly due to your Apache reverse proxy setup not forwarding your WebSocket connections. You will need to update your configuration file to fix this issue.

Here is an example configuration for Apache from our guide that accomplishes this: (Optional) Hosting CryoSPARC Through a Reverse Proxy - CryoSPARC Guide.
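
For illustration only, the WebSocket-forwarding part of such a configuration typically looks something like the sketch below (assuming the CryoSPARC master answers on localhost:39000 and that mod_proxy, mod_proxy_http, mod_proxy_wstunnel and mod_rewrite are enabled; please treat the guide's version as the authoritative reference):

  <VirtualHost *:80>
      ServerName cryosparc.example.com

      RewriteEngine On
      # Proxy requests that carry an "Upgrade: websocket" header
      # to the ws:// backend so the connection can be upgraded.
      RewriteCond %{HTTP:Upgrade} =websocket [NC]
      RewriteRule /(.*) ws://localhost:39000/$1 [P,L]

      # Proxy all remaining HTTP traffic normally.
      ProxyPass / http://localhost:39000/
      ProxyPassReverse / http://localhost:39000/
  </VirtualHost>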

- Kelly


Thanks for your suggestions. I'll have to work with our IT team to find a solution. I'll keep you posted.

Hi Kelly, you were right: the problem was due to the reverse proxy. With a new config file, CS 4.2 launched and is running fine. Thank you very much! Please close this case.
