Backups for changing from CentOS 7

We have two single-workstation setups that are still running CentOS 7 and obviously need to be moved to a newer OS so we can keep updating CryoSPARC. I’ve included some info on the system at the end of this post in case it’s needed.

I’ve essentially inherited administration of these systems for my lab, and I am not a system administrator, so I’m being particularly paranoid about this switch. I’ve read a fair number of posts but I’m still unsure about some aspects (ok, all aspects but I have assistance with swapping the OS to Rocky 8 and installing CryoSPARC).

Do I need to save a copy of each Project Directory in addition to a copy of the database? I was planning on:

  1. Detaching each project from the CryoSPARC instance
  2. Copying the project directories to an external drive
  3. Swapping the OS and installing CS
  4. Transferring the directories back onto the machine
  5. Attaching them to the new CS install
  6. Updating the symlinks for imported data in active projects

To get a valid copy of the database, is running cryosparcm backup --dir=/path/to/external/storage the only thing I need to do? Or do I need to copy the entire cryosparc_database folder?

Is there anything else I need to back up or make note of? (The raw data are stored elsewhere in long-term storage, and only active projects have the raw data locally.)

Thanks in advance, and sorry for the novice questions!

$ uname -a
Linux spgpu 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 17:46:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Master found in /opt/cryosparc2/cryosparc_master

$ grep CRYOSPARC_DB_PATH config.sh
export CRYOSPARC_DB_PATH="/opt/cryosparc2/cryosparc_database"
$ cat version
v4.7.1

Is the processed data on zpools? If so, could you just import the zpool with the detached projects after reinstalling the OS?

sudo zpool list

returns two results: one of two data drives (i.e. data1) and an SSD. If the other half of the data is stored on the drive not listed (i.e. data2), do I just need to transfer it to data1? Would that just be the project directories?

You may want to review the Rocky Linux life cycle schedule and consider that CryoSPARC also runs on some newer versions of Rocky Linux.
Please review the CryoSPARC Instance Migration Guide for suggestions applicable to your use case.
Some additional information about your storage is relevant to selecting the optimal backup and update strategy.
What are the outputs of these commands?

df -hT /opt/cryosparc2/cryosparc_database
df -hT /opt/cryosparc2/cryosparc_master
df -hT /project/parent/dir # use actual path where projects are stored

Project directories contain data that are not stored in the database. Therefore, a database backup is not a substitute for a project directory backup.
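
If you do decide to copy a project directory to external storage, a minimal sketch like the following may help confirm that the copy is complete. It assumes GNU cp, find, sort and sha256sum are available; the function names manifest and copy_and_verify are made up for this example, and the paths you pass in are your own. Symlinks are copied as links, so imported raw data is not duplicated.

```shell
# Sketch only: manifest and copy_and_verify are hypothetical helper names.

manifest() {
    # Print "checksum  ./relative/path" for every regular file under $1,
    # sorted by path. Symlinks are not followed and not checksummed.
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

copy_and_verify() {
    src="$1"
    dest="$2"
    mkdir -p "$dest" || return 1
    # cp -a preserves permissions, timestamps and symlinks
    # (links are copied as links, their targets are not copied).
    cp -a "$src"/. "$dest"/ || return 1
    # Compare the checksum manifests of source and copy.
    if [ "$(manifest "$src")" = "$(manifest "$dest")" ]; then
        echo "OK: $dest matches $src"
    else
        echo "MISMATCH: $dest differs from $src" >&2
        return 1
    fi
}
```

For example: copy_and_verify /path/to/project_dir /path/to/external/storage/project_dir. Checksumming many terabytes takes a long time, so you may prefer to run this per project.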

Please review the detach/attach and archive/unarchive workflows to determine which may be most suitable for your needs. Depending on your storage arrangements and the outputs of the df commands above, neither may be needed. Feel free to ask additional questions here if you need clarification.

Either cryosparcm backup or a directory-level copy is feasible, but the latter method produces a valid backup only if certain requirements are met.
In addition to the validity requirement, a database backup must also be consistent. A database backup is consistent only with the state of project directories at the time of the backup. A database restored from a backup that was created before the latest changes to CryoSPARC project directories is inconsistent and may lead to the corruption of CryoSPARC projects. You may want to stop CryoSPARC before backing up the database and only resume CryoSPARC activity after the backup has been restored.
Depending on your storage arrangements for the database and project directories, a database backup may or may not be required.
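
As a rough sketch of the "stop, then back up" order described above: the script below only prints the planned commands unless RUN=1 is set. It rests on an assumption that you should check against the cryosparcm reference for your version, namely that cryosparcm start database starts only the database service, which the backup command needs in order to dump the database. The helper names run and backup_plan are made up for this example.

```shell
# Sketch only: run and backup_plan are hypothetical helper names.
# By default nothing is executed; the commands are only printed.
# Set RUN=1 to actually run them.

run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

backup_plan() {
    # $1: directory that should receive the backup file
    run cryosparcm stop &&
    # Assumption: this starts the database only, so no jobs can run
    # and project directories cannot change during the backup.
    run cryosparcm start database &&
    run cryosparcm backup "--dir=$1" &&
    run cryosparcm stop
}
```

Running backup_plan /path/to/external/storage prints the four steps in order, so you can review them before executing anything.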
Future readers of this forum topic may want to review the (new CryoSPARC v5) instance recovery tool that may be useful in related use cases.

Thanks so much for the reply!

Got it! I’ll pass the info along to the person helping me swap!

$ df -hT /opt/cryosparc2/cryosparc_database
Filesystem              Type  Size  Used Avail Use% Mounted on
/dev/mapper/centos-root ext4  197G  158G   29G  85% /
$ df -hT /opt/cryosparc2/cryosparc_master
Filesystem              Type  Size  Used Avail Use% Mounted on
/dev/mapper/centos-root ext4  197G  158G   29G  85% /
$ df -hT /data
Filesystem    Type  Size  Used Avail Use% Mounted on
/data         zfs    26T  7.9T   18T  32% /data
$ df -hT /data2
Filesystem    Type  Size  Used Avail Use% Mounted on
/dev/sdg1     ext4   19T  1.2T   17T   7% /data2

I can’t guarantee that the new CryoSPARC instance will have the same ID as the previous one, so I thought detach was the better way to go.

Got it! cryosparcm backup will be the very last thing I do to ensure the state of project directories is consistent.

Thank you!

I strongly recommend allowing more than 29 GB for future growth of the database; the database may grow to hundreds of gigabytes or, in some cases, several terabytes. The OS update and planned database backup/restore may be a good time to add an additional, dedicated storage device (and filesystem outside the OS root filesystem) for the CryoSPARC database.
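
To see how much room the database currently has, a small sketch like this compares the size of a directory with the free space on the filesystem that holds it. It assumes standard du, df and awk; the function name headroom is made up for this example.

```shell
# Sketch only: headroom is a hypothetical helper name.
headroom() {
    dir="$1"
    # du -sk: total size of the directory in KiB.
    used_kb="$(du -sk "$dir" | awk '{print $1}')"
    # df -Pk: one POSIX-format line per filesystem; column 4 is available KiB.
    avail_kb="$(df -Pk "$dir" | awk 'NR==2 {print $4}')"
    # Integer division, so values are rounded down to whole GiB.
    echo "$dir: $((used_kb / 1048576)) GiB used by directory, $((avail_kb / 1048576)) GiB free on filesystem"
}
```

For example: headroom /opt/cryosparc2/cryosparc_database. You may need sudo if your user cannot read the database directory.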

Are you referring to the license ID or the instance ID? The latter is stored in the CryoSPARC database and would be preserved if the same database were to be used (after database backup and restore) before and after the OS update.