Database won't start because disk is full

Hi,

I carelessly let the disk with the cryosparc database fill up completely, and now cryosparc can’t start properly after rebooting the server. When I run “cryosparcm start”, I get the following error:

CryoSPARC is running.
Stopping cryoSPARC
Shut down
Starting cryoSPARC System master process…
Error!! The disk where the cryoSPARC database is installed has less than 5GB of free space.
Clear space before starting cryoSPARC.
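
For reference, this is roughly how I have been checking how much space is actually left on the volume holding the database (the path is the database location from my install, taken from the logs below; adjust as needed):

df -h /home/cryosparcuser/cryosparc/cryosparc_database
# shows size, used and available space for whichever filesystem that directory lives on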

Because the web interface can’t launch, I’ve been trying to clear up some space using the command line interface, as described here. However, this also doesn’t work and I get the following error:

/home/cryosparcuser/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/command.py:134: UserWarning: *** CommandClient: (http://egret:39002/api) URL Error [Errno 111] Connection refused, attempt 1 of 3. Retrying in 30 seconds
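
My guess is that the command server simply isn't running because the master never started, so the connection is refused. A quick way to confirm that nothing is listening on that port (39002 in my case, taken from the error above) would be something like:

ss -ltn | grep 39002                                      # no output means nothing is listening
curl -s http://egret:39002/api || echo "connection refused"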

Does anybody have any advice for either starting cryoSPARC or troubleshooting the CLI error? I found a few posts from 2020 (here and here) describing a similar problem, but they didn’t mention any issues with using the command line interface.

Thanks,
Beck

The command “cryosparcm log database” produces the following output:

2024-04-25T15:51:15.797-0400 I COMMAND  [conn10] command meteor.jobs command: getMore { getMore: 122354730004, collection: "jobs", $db: "meteor" } originatingCommand: { find: "jobs", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:122354730004 keysExamined:0 docsExamined:216 numYields:7 nreturned:216 reslen:16772947 locks:{ Global: { acquireCount: { r: 16 } }, Database: { acquireCount: { r: 8 } }, Collection: { acquireCount: { r: 8 } } } protocol:op_query 141ms
2024-04-25T15:51:16.015-0400 I COMMAND  [conn10] command meteor.jobs command: getMore { getMore: 122354730004, collection: "jobs", $db: "meteor" } originatingCommand: { find: "jobs", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:122354730004 keysExamined:0 docsExamined:171 numYields:5 nreturned:171 reslen:16717258 locks:{ Global: { acquireCount: { r: 12 } }, Database: { acquireCount: { r: 6 } }, Collection: { acquireCount: { r: 6 } } } protocol:op_query 105ms
2024-04-25T15:51:16.135-0400 I COMMAND  [conn10] command meteor.jobs command: getMore { getMore: 122354730004, collection: "jobs", $db: "meteor" } originatingCommand: { find: "jobs", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:122354730004 keysExamined:0 docsExamined:148 numYields:5 nreturned:148 reslen:16600047 locks:{ Global: { acquireCount: { r: 12 } }, Database: { acquireCount: { r: 6 } }, Collection: { acquireCount: { r: 6 } } } protocol:op_query 111ms
2024-04-25T15:51:16.228-0400 I COMMAND  [conn9] command meteor.events command: getMore { getMore: 94364112205, collection: "events", $db: "meteor" } originatingCommand: { find: "events", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:94364112205 keysExamined:0 docsExamined:35608 numYields:278 nreturned:35608 reslen:16776767 locks:{ Global: { acquireCount: { r: 558 } }, Database: { acquireCount: { r: 279 } }, Collection: { acquireCount: { r: 279 } } } protocol:op_query 111ms
2024-04-25T15:51:16.360-0400 I COMMAND  [conn10] command meteor.jobs command: getMore { getMore: 122354730004, collection: "jobs", $db: "meteor" } originatingCommand: { find: "jobs", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:122354730004 keysExamined:0 docsExamined:188 numYields:6 nreturned:188 reslen:16677424 locks:{ Global: { acquireCount: { r: 14 } }, Database: { acquireCount: { r: 7 } }, Collection: { acquireCount: { r: 7 } } } protocol:op_query 114ms
2024-04-25T15:51:16.409-0400 I COMMAND  [conn9] command meteor.events command: getMore { getMore: 94364112205, collection: "events", $db: "meteor" } originatingCommand: { find: "events", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:94364112205 keysExamined:0 docsExamined:54992 numYields:429 nreturned:54992 reslen:16777137 locks:{ Global: { acquireCount: { r: 860 } }, Database: { acquireCount: { r: 430 } }, Collection: { acquireCount: { r: 430 } } } protocol:op_query 148ms
2024-04-25T15:51:16.576-0400 I COMMAND  [conn9] command meteor.events command: getMore { getMore: 94364112205, collection: "events", $db: "meteor" } originatingCommand: { find: "events", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:94364112205 keysExamined:0 docsExamined:57069 numYields:445 nreturned:57069 reslen:16776341 locks:{ Global: { acquireCount: { r: 892 } }, Database: { acquireCount: { r: 446 } }, Collection: { acquireCount: { r: 446 } } } protocol:op_query 128ms
2024-04-25T15:51:16.890-0400 I COMMAND  [conn9] command meteor.events command: getMore { getMore: 94364112205, collection: "events", $db: "meteor" } originatingCommand: { find: "events", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:94364112205 keysExamined:0 docsExamined:66907 numYields:522 nreturned:66907 reslen:16777182 locks:{ Global: { acquireCount: { r: 1046 } }, Database: { acquireCount: { r: 523 } }, Collection: { acquireCount: { r: 523 } } } protocol:op_query 128ms
2024-04-25T15:51:17.082-0400 I COMMAND  [conn9] command meteor.events command: getMore { getMore: 94364112205, collection: "events", $db: "meteor" } originatingCommand: { find: "events", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:94364112205 keysExamined:0 docsExamined:49890 numYields:389 nreturned:49890 reslen:16777198 locks:{ Global: { acquireCount: { r: 780 } }, Database: { acquireCount: { r: 390 } }, Collection: { acquireCount: { r: 390 } } } protocol:op_query 145ms
2024-04-25T15:51:18.606-0400 I COMMAND  [conn9] command meteor.events command: getMore { getMore: 94364112205, collection: "events", $db: "meteor" } originatingCommand: { find: "events", skip: 0, snapshot: true, $readPreference: { mode: "secondaryPreferred" }, $db: "meteor" } planSummary: COLLSCAN cursorid:94364112205 keysExamined:0 docsExamined:68373 numYields:534 nreturned:68373 reslen:16777190 locks:{ Global: { acquireCount: { r: 1070 } }, Database: { acquireCount: { r: 535 } }, Collection: { acquireCount: { r: 535 } } } protocol:op_query 105ms
2024-04-25T15:51:19.186-0400 I NETWORK  [conn6] end connection 127.0.0.1:53670 (5 connections now open)
2024-04-25T15:51:19.204-0400 I NETWORK  [conn10] end connection 127.0.0.1:39564 (4 connections now open)
2024-04-25T15:51:19.204-0400 I NETWORK  [conn5] end connection 127.0.0.1:39556 (3 connections now open)
2024-04-25T15:51:19.204-0400 I NETWORK  [conn8] end connection 127.0.0.1:53686 (2 connections now open)
2024-04-25T15:51:19.206-0400 I NETWORK  [conn9] Error sending response to client: SocketException: Broken pipe. Ending connection from 127.0.0.1:53700 (connection id: 9)
2024-04-25T15:51:19.206-0400 I NETWORK  [conn9] end connection 127.0.0.1:53700 (1 connection now open)
2024-04-25T15:51:19.232-0400 I NETWORK  [conn7] Error sending response to client: SocketException: Broken pipe. Ending connection from 127.0.0.1:53672 (connection id: 7)
2024-04-25T15:51:19.232-0400 I NETWORK  [conn7] end connection 127.0.0.1:53672 (0 connections now open)
2024-04-25T15:51:19.302-0400 I NETWORK  [listener] connection accepted from 127.0.0.1:39580 #11 (1 connection now open)
2024-04-25T15:51:19.322-0400 I NETWORK  [conn11] received client metadata from 127.0.0.1:39580 conn11: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.6.23" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "20.04" } }
2024-04-25T15:51:19.340-0400 I ACCESS   [conn11] Successfully authenticated as principal cryosparc_user on admin from client 127.0.0.1:39580
2024-04-25T15:51:19.343-0400 I ACCESS   [conn11] Successfully authenticated as principal cryosparc_user on admin from client 127.0.0.1:39580
2024-04-25T15:51:19.345-0400 I NETWORK  [conn11] end connection 127.0.0.1:39580 (0 connections now open)
2024-04-25T15:51:22.024-0400 W FTDC     [ftdc] Uncaught exception in 'FileStreamFailed: Failed to write to interim file buffer for full-time diagnostic data capture: /home/cryosparcuser/cryosparc/cryosparc_database/diagnostic.data/metrics.interim.temp' in full-time diagnostic data capture subsystem. Shutting down the full-time diagnostic data capture subsystem.
2024-04-25T15:51:39.410-0400 I NETWORK  [listener] connection accepted from 127.0.0.1:43930 #12 (1 connection now open)
2024-04-25T15:51:39.411-0400 I NETWORK  [conn12] received client metadata from 127.0.0.1:43930 conn12: { driver: { name: "PyMongo", version: "3.13.0" }, os: { type: "Linux", name: "Linux", architecture: "x86_64", version: "5.15.0-105-generic" }, platform: "CPython 3.8.17.final.0" }
2024-04-25T15:51:39.412-0400 I NETWORK  [conn12] end connection 127.0.0.1:43930 (0 connections now open)
2024-04-25T15:51:39.413-0400 I NETWORK  [listener] connection accepted from 127.0.0.1:56786 #13 (1 connection now open)
2024-04-25T15:51:39.413-0400 I NETWORK  [conn13] received client metadata from 127.0.0.1:56786 conn13: { driver: { name: "PyMongo", version: "3.13.0" }, os: { type: "Linux", name: "Linux", architecture: "x86_64", version: "5.15.0-105-generic" }, platform: "CPython 3.8.17.final.0" }
2024-04-25T15:51:39.414-0

You can manually delete the *patch_aligned.mrc (not the *patch_aligned_doseweighted.mrc!) micrographs in a project to free up some space in an emergency; only Patch CTF should need the non-dose-weighted micrographs. If you then find you need them again, once you’ve cleared everything up and got CryoSPARC back up, you can re-run patch motion.
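
Something like this, run from the project directory, should tell you how much space those non-dose-weighted micrographs are taking before you delete anything (the layout is just how my projects look; adjust the path and pattern to yours):

find . -name "*patch_aligned.mrc" ! -name "*doseweighted*" -print0 | du -ch --files0-from=- | tail -1
# once you're happy with what it matches, re-run the find with -delete instead of piping to du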

I did this once with no ill-effects (although I don’t have the projects and database on the same disk…)

Thank you for the suggestion! Unfortunately, I imported the *patch_aligned_doseweighted.mrc micrographs directly from a cryoSPARC live session, so there are no *patch_aligned.mrc files for me to delete. I’ll definitely keep this in mind in the future, though. Do you think there might be other files that are safe to delete, such as extracted particles? I could easily rerun those when cryoSPARC starts again. I’ll have a difficult time finding them without the web interface though, since there are 600 jobs in the project…

Extracted particles should be OK; at least the *.mrc stacks. Just remember to clear the job when you get back up. :wink:

Finding them will be easy: in the project folder, just run ls -l */extract/*patch_aligned_doseweighted_particles.mrc
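
and if you want to see how much space you'd get back before deleting anything:

du -ch */extract/*patch_aligned_doseweighted_particles.mrc | tail -1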

That was a good idea, thank you! Unfortunately, I’m still getting the error that the database has less than 5 GB of free space. Now that I think about it, the database is on a separate disk from the actual project data, so would deleting some of these files actually help? I’m not sure what I can delete on the disk that actually contains the database, and commands like “cryosparcm compact” don’t work because cryoSPARC hasn’t started.
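
In the meantime, this is roughly how I have been trying to see what is actually using space on the database disk (starting from /home because that is where my database lives; adjust to wherever yours is mounted):

sudo du -xh --max-depth=2 /home 2>/dev/null | sort -h | tail -20
# -x keeps du on a single filesystem so it only counts the disk in question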

If the database is on the OS disk, you can try clearing the package manager cache…

If you’re using a recent Ubuntu, you can also try briefly disabling the swapfile.

Sorry, now on phone so struggling with autoincorrect.

I’m using Ubuntu, and the package manager appears to be apt-get. From what I can find online, “apt-get clean” is the command to clear the cache, but unfortunately it doesn’t seem to have removed anything from the disk, and the database still can’t start. The command “apt-cache stats” gives the following output:

Total package names: 129724 (3,632 k)
Total package structures: 126804 (5,579 k)
  Normal packages: 93411
  Pure virtual packages: 2651
  Single virtual packages: 16440
  Mixed virtual packages: 2298
  Missing: 12004
Total distinct versions: 108403 (9,539 k)
Total distinct descriptions: 199312 (4,783 k)
Total dependencies: 663612/163305 (15.9 M)
Total ver/file relations: 57947 (1,391 k)
Total Desc/File relations: 2704 (64.9 k)
Total Provides mappings: 64427 (1,546 k)
Total globbed strings: 251572 (6,564 k)
Total slack space: 98.4 k
Total space accounted for: 49.5 M
Total buckets in PkgHashTable: 50503
  Unused: 4712
  Used: 45791
  Utilization: 90.6699%
  Average entries: 2.76919
  Longest: 60
  Shortest: 1
Total buckets in GrpHashTable: 50503
  Unused: 3822
  Used: 46681
  Utilization: 92.4321%
  Average entries: 2.77895
  Longest: 12
  Shortest: 1
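
For what it’s worth, the downloaded packages that “apt-get clean” removes live under /var/cache/apt/archives, so a more direct way to see how much that cache was ever going to free is:

du -sh /var/cache/apt/archives
# if this is already small, cleaning it cannot recover much space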

I tried disabling the swapfile using “swapoff -a”, but this also didn’t allow cryoSPARC to start.
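
Looking back, I suspect “swapoff -a” on its own doesn’t free any disk space, because the swapfile still occupies its blocks; if the database really were on the OS disk, something like this is presumably what was meant (assuming the default /swapfile on recent Ubuntu):

sudo swapoff -a        # stop using swap
ls -lh /swapfile       # the file still takes up this much space on disk
sudo rm /swapfile      # only if you are sure you can recreate it later (fallocate, mkswap, swapon)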

Unfortunately, I know very little about Linux (and computers in general), and I really want to learn more about the fundamentals of information technology so that I can troubleshoot problems like this more effectively. You seem to be really knowledgeable in this area, so would you have any recommendations for resources/relevant topics I should look into?

When you try to start, is it still the same error about disk space?

If you have large trained model sets for conda environments (e.g. Model Angelo), they can run to several gigabytes and can be moved off the disk temporarily. Another thought occurred to me, though: if CryoSPARC tried updating the database while the disk was full, it may have corrupted it. I’m not sure about that; I hope one of the CryoSPARC team can offer more suggestions.
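
A quick way to spot those is something like the following (the paths are just common defaults; conda environments and downloaded weights may live elsewhere depending on how they were installed):

du -sh ~/miniconda3/envs/* ~/.cache/torch 2>/dev/null | sort -h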

The solution was much easier than I thought. I realized that the cryosparc database wasn’t the only thing on this disk - every user’s home directory was also on the same disk. Simply clearing the trash from my own directory freed up enough space for cryosparc to start properly.

Thank you rbs_sci for all of your suggestions!

@cbeck Glad you resolved this issue without finding the database to have been corrupted in the process.
You may want to

  1. keep a close eye on disk usage going forward (a minimal example check is sketched after this list)
  2. ensure database journaling is enabled, which has become the default in CryoSPARC v4.4
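
For point 1, a minimal sketch of the kind of check you could run from cron (the path and threshold are only examples; adjust to your install and wire the warning into whatever alerting you use):

#!/bin/bash
# warn when the filesystem holding the CryoSPARC database drops below 20 GB free
DB_PATH=/home/cryosparcuser/cryosparc/cryosparc_database
AVAIL_KB=$(df --output=avail -k "$DB_PATH" | tail -1 | tr -d ' ')
if [ "$AVAIL_KB" -lt $((20 * 1024 * 1024)) ]; then
    echo "Low disk space on the CryoSPARC database volume: $((AVAIL_KB / 1024)) MB free" >&2
fi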