What is mongo DB? Migrating cryosparc to new server

orangeboomerang · October 31, 2024, 9:47am

hi,

I am following this guide on backing up cryosparc, and it works fine. I have an output .archive file that is 36GB, however the total size of our cryosparc projects is 9.6TB. So clearly this archive isn’t the data compressed, so what is this backup?

I am migrating cryosparc to a new server and upgrading to the latest version. In my simple mind I reasoned I could simply detach each project, keep them somewhere safe, install the new cryosparc at the new location, then import each project one-by-one. Is this not the best way to migrate? In the past, to back up a project I have also simply copied the project folder, removed the lock file, and reimported without issue, but I guess I should be archiving instead. Should I be using detach or archive?

Why are there then options for “migrating” and why does that not involve any detach/attach? It’s all very confusing.

wtempel · October 31, 2024, 6:14pm

Information in this post applies to CryoSPARC v4.

The database backup contains information that is stored in the CryoSPARC database, which is implemented using MongoDB. The database contains:

CryoSPARC login user records
the job scheduler configuration
absolute paths to CryoSPARC project directories
job-related information: inter-job relationships, job input connections, job parameters, job status, event log text and images, etc.

The database does not store actual micrograph or volume files.
The database is used as an information source for the CryoSPARC app (“GUI”) and a destination for information created by active CryoSPARC jobs.
Job-related database records are continuously exported to CryoSPARC project directories.

Briefly, archive/unarchive assumes that neither the project’s database records nor project directory changes between archiving and unarchiving. Attach recreates database records based on information previously exported (on any CryoSPARC instance) from the database to the project directory.
There are, broadly speaking, two alternatives for the migration of a CryoSPARC instance that contains active projects.

Alternative A involves transfer of the database, either via “file-level” transfer (OS-level copy or filesystem snapshot, details), or via creation, transfer and restoration of a database “dump” (cryosparcm backup, cryosparcm restore). In either case, the migration succeeds only if the state of the database is compatible with the state of the project directories. CryoSPARC malfunction and corruption of the database and/or project directories will likely occur when:

project directories were modified after the database backup or copy was created
an attached project directory is, via restoration of an outdated backup, rolled back to an older state
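
The first condition can be checked roughly before a restore. What follows is a minimal Python sketch, not a CryoSPARC tool: it assumes that file modification times are a fair proxy for "modified after the backup was created", and the function name and any paths passed to it are hypothetical.

```python
import os


def files_newer_than_backup(project_dir, backup_file):
    """Return paths under project_dir modified after backup_file was written.

    Assumption: modification time (mtime) is a good-enough proxy for
    "changed after the database backup". Hypothetical helper, not part
    of CryoSPARC.
    """
    backup_mtime = os.path.getmtime(backup_file)
    newer = []
    for root, _dirs, files in os.walk(project_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.getmtime(path) > backup_mtime:
                    newer.append(path)
            except OSError:
                # File vanished or is unreadable (e.g. broken symlink); skip it.
                continue
    return sorted(newer)
```

An empty result suggests that nothing under the project directory changed after the backup file was written. A non-empty result is a hint that the backup may be out of date relative to that project directory and should not be restored without a closer look.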

If one follows migration alternative A with a compatible database copy or backup, then, before copying or backing up the database, the CryoSPARC archive project action needs to be applied on the “source” CryoSPARC instance to each project whose project directory's absolute path will change during the migration. Archived projects need to be unarchived at the destination CryoSPARC instance.
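
To decide which projects need the archive step, one can compare each project directory's absolute path on the source with its planned path on the destination. A small Python sketch, assuming the two path listings are compiled by hand; the project UIDs and paths below are made up for illustration.

```python
import os


def projects_to_archive(source_paths, destination_paths):
    """Return UIDs of projects whose absolute directory path will change.

    Both arguments map a project UID to an absolute directory path.
    A project missing from destination_paths is treated as changed.
    """
    changed = []
    for uid, src in source_paths.items():
        dst = destination_paths.get(uid)
        # normpath ignores cosmetic differences such as a trailing slash.
        if dst is None or os.path.normpath(dst) != os.path.normpath(src):
            changed.append(uid)
    return sorted(changed)


# Hypothetical example: only P2 moves to a different absolute path.
source = {"P1": "/data/cryosparc/CS-one", "P2": "/data/cryosparc/CS-two"}
destination = {"P1": "/data/cryosparc/CS-one/", "P2": "/bulk/cryosparc/CS-two"}
print(projects_to_archive(source, destination))  # ['P2']
```

Projects that keep the same absolute path on the destination are left out of the result.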

In case the migration involves a change in CryoSPARC workers, one may update worker nodes (“managed workers”) and/or clusters of workers. One may also remove obsolete scheduler lanes or remove individual target nodes. Ensure that any remaining scheduler lane contains at least one functional scheduler target. One may also connect additional worker nodes or worker clusters.

Alternative B does not involve a copy of the old database, but involves detaching projects from the source instance and attaching projects to the destination instance. In addition, CryoSPARC logins and scheduler lane configurations need to be created.
[updated 2024-11-05 to specify when the archive project action should be applied.]