Migrate to a new location

Hi -
I have a question regarding migrating the whole CryoSPARC database and content to a new server. It looks like these instructions are for versions < 3.3 and that the newer instructions should be used instead. However, in the new instructions, I only see how to transfer a specific project. Can we transfer all of them at once, together with the database, and if so, what would be the best protocol to follow (for version 4.3.0)?
Thanks,
Best,
Nicolas

Please can you provide additional information:

  1. Is this a “single workstation”-type CryoSPARC instance or does the instance use separate workers?
  2. Are the following stored on “local” (inside the server that is to be replaced) or networked storage:
    1. cryosparc_master/
    2. The directory defined by $CRYOSPARC_DB_PATH
    3. CryoSPARC project directories
    4. Directories with raw data
  3. Can you ensure that the paths of the items above can remain the same before and after the move to the new server?

Outputs of these commands could be helpful for further discussion:

cryosparcm status | grep -v LICENSE
cryosparcm cli "get_scheduler_targets()"

When we have moved a CryoSPARC (master+worker) instance from one server to another, we ensure that the source is the same CryoSPARC version+patch as the destination and that all the mount points are the same between both servers, then we move the cryosparc_database folder from the source to the destination server. You may have to edit the master config file if the hostname has changed. Then we are able to start the instance, use the icli to remove the old lanes/workers, and re-attach the new worker to the master on the destination.
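
Roughly, and only as a non-authoritative sketch of that sequence (hostnames, paths, and the lane name are placeholders; the exact cli call names should be checked against the guide for your CryoSPARC version):

# on the old server: stop CryoSPARC before copying anything
cryosparcm stop

# copy the database directory to the same path on the new server (path and hostname are placeholders)
rsync -a /path/to/cryosparc_database/ new-server:/path/to/cryosparc_database/

# on the new server: edit cryosparc_master/config.sh if CRYOSPARC_MASTER_HOSTNAME changed, then start
cryosparcm start

# remove stale lanes/workers in the interactive CLI (lane name is a placeholder)
cryosparcm icli
#   cli.remove_scheduler_lane('old_lane_name')

# re-attach a worker to the new master (hostnames and port are placeholders)
/path/to/cryosparc_worker/bin/cryosparcw connect --worker worker-host --master new-server --port 39100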


… presumably ensuring that CryoSPARC has already been shut down (with confirmation) on the old server before the data move.

Hi - Thanks both for the replies!

@nimgs-it: When you say “ensure that all the mount points are the same between both servers”, what do you mean exactly? Do you mean preserve the absolute path names?

@wtempel: here’s the detailed info:

  1. The CryoSPARC instance is currently located on a SLURM server with workers (and will be moved to a different SLURM server)
  2. All folders are stored under “/gpfs/data”, but scattered across a range of places and sub-folders under that path
  3. That’s probably the trickiest part: the paths will likely be different (not sure if you mean “relative” or “absolute” paths, but the path names will very likely need to differ; the relative paths of master vs. db vs. projects could potentially be preserved, but probably not relative to the raw data directories)

and regarding the output for

cryosparcm status | grep -v LICENSE
cryosparcm cli "get_scheduler_targets()"

Here it is:


CryoSPARC System master node installed at
/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_master
Current cryoSPARC version: v4.3.0

CryoSPARC is not running.

global config variables:
export CRYOSPARC_HOSTNAME_CHECK="ws-0001.cm.cluster"
export CRYOSPARC_HEARTBEAT_SECONDS=300
export CRYOSPARC_MASTER_HOSTNAME="ws-0001.cm.cluster"
export CRYOSPARC_DB_PATH="/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_database"
export CRYOSPARC_BASE_PORT=39100
export CRYOSPARC_DEVELOP=false
export CRYOSPARC_INSECURE=false

and for the second command:

[{'cache_path': '/tmp/cryosparc/bhabhaekiertlabs', 'cache_quota_mb': None, 'cache_reserve_mb': 80000, 'custom_var_names': [], 'desc': None, 'hostname': 'BP_gpu4_long', 'lane': 'BP_gpu4_long', 'name': 'BP_gpu4_long', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': 'sinfo', 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qsub_cmd_tpl': 'sbatch {{ script_path_abs }}', 'script_tpl': '#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }}            - the complete command string to run the job\n## {{ num_cpu }}            - the number of CPUs needed\n## {{ num_gpu }}            - the number of GPUs needed. \n##                            Note: the code will use this many GPUs starting from dev id 0\n##                                  the cluster scheduler or this script have the responsibility\n##                                  of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n##                                  using the correct cluster-allocated GPUs.\n## {{ ram_gb }}             - the amount of RAM needed in GB\n## {{ job_dir_abs }}        - absolute path to the job directory\n## {{ project_dir_abs }}    - absolute path to the project dir\n## {{ job_log_path_abs }}   - absolute path to the log file for the job\n## {{ worker_bin_path }}    - absolute path to the cryosparc worker command\n## {{ run_args }}           - arguments to be passed to cryosparcw run\n## {{ project_uid }}        - uid of the project\n## {{ job_uid }}            - uid of the job\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -N 1\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH -p gpu4_long\n## #SBATCH --mem={{ (ram_gb*1000)|int }}MB             \n#SBATCH --mem-per-cpu={{ (ram_gb*8000/num_cpu)|int }}MB\n#SBATCH -o {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.out\n#SBATCH -e {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.err\n\navailable_devs=""\nfor devidx in $(seq 0 15);\ndo\n    if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then\n        if [[ -z "$available_devs" ]] ; then\n            available_devs=$devidx\n        else\n            available_devs=$available_devs,$devidx\n        fi\n    fi\ndone\nexport CUDA_VISIBLE_DEVICES=$available_devs\n\n{{ run_cmd }}\n\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'BP_gpu4_long', 'tpl_vars': ['job_uid', 'run_cmd', 'num_cpu', 'command', 'job_log_path_abs', 'num_gpu', 'cluster_job_id', 'job_dir_abs', 'worker_bin_path', 'run_args', 'project_dir_abs', 'project_uid', 'ram_gb'], 'type': 'cluster', 'worker_bin_path': '/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_worker/bin/cryosparcw'}, {'cache_path': '/tmp/cryosparc/bhabhaekiertlabs', 'cache_quota_mb': None, 'cache_reserve_mb': 80000, 'custom_var_names': [], 'custom_vars': {}, 'desc': None, 'hostname': 'BP_gpu4_med', 'lane': 'BP_gpu4_med', 'name': 'BP_gpu4_med', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': 'sinfo', 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch {{ script_path_abs }}', 'script_tpl': '#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }}            - the complete command string to run the job\n## {{ num_cpu }}            - the number of CPUs needed\n## {{ num_gpu }}            - the number 
of GPUs needed. \n##                            Note: the code will use this many GPUs starting from dev id 0\n##                                  the cluster scheduler or this script have the responsibility\n##                                  of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n##                                  using the correct cluster-allocated GPUs.\n## {{ ram_gb }}             - the amount of RAM needed in GB\n## {{ job_dir_abs }}        - absolute path to the job directory\n## {{ project_dir_abs }}    - absolute path to the project dir\n## {{ job_log_path_abs }}   - absolute path to the log file for the job\n## {{ worker_bin_path }}    - absolute path to the cryosparc worker command\n## {{ run_args }}           - arguments to be passed to cryosparcw run\n## {{ project_uid }}        - uid of the project\n## {{ job_uid }}            - uid of the job\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -N 1\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH -p gpu4_medium\n## #SBATCH --mem-per-cpu={{ (ram_gb*10000/num_cpu)|int }}MB\n#SBATCH --mem={{ (ram_gb*5000)|int }}MB             \n#SBATCH -o {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.out\n#SBATCH -e {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.err\n\navailable_devs=""\nfor devidx in $(seq 0 15);\ndo\n    if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then\n        if [[ -z "$available_devs" ]] ; then\n            available_devs=$devidx\n        else\n            available_devs=$available_devs,$devidx\n        fi\n    fi\ndone\nexport CUDA_VISIBLE_DEVICES=$available_devs\n\n{{ run_cmd }}\n\n\n\n\n\n\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'BP_gpu4_med', 'tpl_vars': ['job_uid', 'run_cmd', 'num_cpu', 'command', 'job_log_path_abs', 'num_gpu', 'cluster_job_id', 'job_dir_abs', 'worker_bin_path', 'run_args', 'project_dir_abs', 'project_uid', 'ram_gb'], 'type': 'cluster', 'worker_bin_path': '/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_worker/bin/cryosparcw'}, {'cache_path': '/tmp/cryosparc/bhabhaekiertlabs', 'cache_quota_mb': None, 'cache_reserve_mb': 80000, 'custom_var_names': [], 'custom_vars': {}, 'desc': None, 'hostname': 'BP_gpu4_short', 'lane': 'BP_gpu4_short', 'name': 'BP_gpu4_short', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': 'sinfo', 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch {{ script_path_abs }}', 'script_tpl': '#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }}            - the complete command string to run the job\n## {{ num_cpu }}            - the number of CPUs needed\n## {{ num_gpu }}            - the number of GPUs needed. 
\n##                            Note: the code will use this many GPUs starting from dev id 0\n##                                  the cluster scheduler or this script have the responsibility\n##                                  of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n##                                  using the correct cluster-allocated GPUs.\n## {{ ram_gb }}             - the amount of RAM needed in GB\n## {{ job_dir_abs }}        - absolute path to the job directory\n## {{ project_dir_abs }}    - absolute path to the project dir\n## {{ job_log_path_abs }}   - absolute path to the log file for the job\n## {{ worker_bin_path }}    - absolute path to the cryosparc worker command\n## {{ run_args }}           - arguments to be passed to cryosparcw run\n## {{ project_uid }}        - uid of the project\n## {{ job_uid }}            - uid of the job\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -N 1\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH -p gpu4_short\n## #SBATCH --mem-per-cpu={{ (ram_gb*5000/num_cpu)|int }}MB\n#SBATCH --mem={{ (ram_gb*3000)|int }}MB             \n#SBATCH -o {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.out\n#SBATCH -e {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.err\n\navailable_devs=""\nfor devidx in $(seq 0 15);\ndo\n    if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then\n        if [[ -z "$available_devs" ]] ; then\n            available_devs=$devidx\n        else\n            available_devs=$available_devs,$devidx\n        fi\n    fi\ndone\nexport CUDA_VISIBLE_DEVICES=$available_devs\n\n{{ run_cmd }}\n\n\n\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'BP_gpu4_short', 'tpl_vars': ['job_uid', 'run_cmd', 'num_cpu', 'command', 'job_log_path_abs', 'num_gpu', 'cluster_job_id', 'job_dir_abs', 'worker_bin_path', 'run_args', 'project_dir_abs', 'project_uid', 'ram_gb'], 'type': 'cluster', 'worker_bin_path': '/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_worker/bin/cryosparcw'}, {'cache_path': '/tmp/cryosparc/bhabhaekiertlabs', 'cache_quota_mb': None, 'cache_reserve_mb': 80000, 'custom_var_names': [], 'desc': None, 'hostname': 'BP_gpu4_dev', 'lane': 'BP_gpu4_dev', 'name': 'BP_gpu4_dev', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': 'sinfo', 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qsub_cmd_tpl': 'sbatch {{ script_path_abs }}', 'script_tpl': '#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }}            - the complete command string to run the job\n## {{ num_cpu }}            - the number of CPUs needed\n## {{ num_gpu }}            - the number of GPUs needed. 
\n##                            Note: the code will use this many GPUs starting from dev id 0\n##                                  the cluster scheduler or this script have the responsibility\n##                                  of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n##                                  using the correct cluster-allocated GPUs.\n## {{ ram_gb }}             - the amount of RAM needed in GB\n## {{ job_dir_abs }}        - absolute path to the job directory\n## {{ project_dir_abs }}    - absolute path to the project dir\n## {{ job_log_path_abs }}   - absolute path to the log file for the job\n## {{ worker_bin_path }}    - absolute path to the cryosparc worker command\n## {{ run_args }}           - arguments to be passed to cryosparcw run\n## {{ project_uid }}        - uid of the project\n## {{ job_uid }}            - uid of the job\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -N 1\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH -p gpu4_dev\n#SBATCH --mem-per-cpu={{ (ram_gb*2000/num_cpu)|int }}MB\n## #SBATCH --mem-per-cpu={{ (ram_gb*8000/num_cpu)|int }}MB\n## #SBATCH --mem={{ (ram_gb*1000)|int }}MB             \n#SBATCH -o {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.out\n#SBATCH -e {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.err\n\navailable_devs=""\nfor devidx in $(seq 0 15);\ndo\n    if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then\n        if [[ -z "$available_devs" ]] ; then\n            available_devs=$devidx\n        else\n            available_devs=$available_devs,$devidx\n        fi\n    fi\ndone\nexport CUDA_VISIBLE_DEVICES=$available_devs\n\n{{ run_cmd }}\n\n\n\n\n\n\n\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'BP_gpu4_dev', 'tpl_vars': ['job_uid', 'run_cmd', 'num_cpu', 'command', 'job_log_path_abs', 'num_gpu', 'cluster_job_id', 'job_dir_abs', 'worker_bin_path', 'run_args', 'project_dir_abs', 'project_uid', 'ram_gb'], 'type': 'cluster', 'worker_bin_path': '/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_worker/bin/cryosparcw'}, {'cache_path': '/tmp/cryosparc/bhabhaekiertlabs', 'cache_quota_mb': None, 'cache_reserve_mb': 80000, 'custom_var_names': [], 'custom_vars': {}, 'desc': None, 'hostname': 'BP_gpu8_short', 'lane': 'BP_gpu8_short', 'name': 'BP_gpu8_short', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': 'sinfo', 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch {{ script_path_abs }}', 'script_tpl': '#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }}            - the complete command string to run the job\n## {{ num_cpu }}            - the number of CPUs needed\n## {{ num_gpu }}            - the number of GPUs needed. 
\n##                            Note: the code will use this many GPUs starting from dev id 0\n##                                  the cluster scheduler or this script have the responsibility\n##                                  of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n##                                  using the correct cluster-allocated GPUs.\n## {{ ram_gb }}             - the amount of RAM needed in GB\n## {{ job_dir_abs }}        - absolute path to the job directory\n## {{ project_dir_abs }}    - absolute path to the project dir\n## {{ job_log_path_abs }}   - absolute path to the log file for the job\n## {{ worker_bin_path }}    - absolute path to the cryosparc worker command\n## {{ run_args }}           - arguments to be passed to cryosparcw run\n## {{ project_uid }}        - uid of the project\n## {{ job_uid }}            - uid of the job\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -N 1\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH -p gpu8_short\n## #SBATCH --mem-per-cpu={{ (ram_gb*20000/num_cpu)|int }}MB\n#SBATCH --mem={{ (ram_gb*1000)|int }}MB             \n#SBATCH -o {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.out\n#SBATCH -e {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.err\n\navailable_devs=""\nfor devidx in $(seq 0 15);\ndo\n    if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then\n        if [[ -z "$available_devs" ]] ; then\n            available_devs=$devidx\n        else\n            available_devs=$available_devs,$devidx\n        fi\n    fi\ndone\nexport CUDA_VISIBLE_DEVICES=$available_devs\n\n{{ run_cmd }}\n\n\n\n\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'BP_gpu8_short', 'tpl_vars': ['job_uid', 'run_cmd', 'num_cpu', 'command', 'job_log_path_abs', 'num_gpu', 'cluster_job_id', 'job_dir_abs', 'worker_bin_path', 'run_args', 'project_dir_abs', 'project_uid', 'ram_gb'], 'type': 'cluster', 'worker_bin_path': '/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_worker/bin/cryosparcw'}, {'cache_path': '/tmp/cryosparc/bhabhaekiertlabs', 'cache_quota_mb': None, 'cache_reserve_mb': 80000, 'custom_var_names': ['ram_gb_multiplier'], 'custom_vars': {}, 'desc': None, 'hostname': 'BP_gpu8_medium', 'lane': 'BP_gpu8_medium', 'name': 'BP_gpu8_medium', 'qdel_cmd_tpl': 'scancel {{ cluster_job_id }}', 'qinfo_cmd_tpl': 'sinfo', 'qstat_cmd_tpl': 'squeue -j {{ cluster_job_id }}', 'qstat_code_cmd_tpl': None, 'qsub_cmd_tpl': 'sbatch {{ script_path_abs }}', 'script_tpl': '#!/usr/bin/env bash\n#### cryoSPARC cluster submission script template for SLURM\n## Available variables:\n## {{ run_cmd }}            - the complete command string to run the job\n## {{ num_cpu }}            - the number of CPUs needed\n## {{ num_gpu }}            - the number of GPUs needed. 
\n##                            Note: the code will use this many GPUs starting from dev id 0\n##                                  the cluster scheduler or this script have the responsibility\n##                                  of setting CUDA_VISIBLE_DEVICES so that the job code ends up\n##                                  using the correct cluster-allocated GPUs.\n## {{ ram_gb }}             - the amount of RAM needed in GB\n## {{ job_dir_abs }}        - absolute path to the job directory\n## {{ project_dir_abs }}    - absolute path to the project dir\n## {{ job_log_path_abs }}   - absolute path to the log file for the job\n## {{ worker_bin_path }}    - absolute path to the cryosparc worker command\n## {{ run_args }}           - arguments to be passed to cryosparcw run\n## {{ project_uid }}        - uid of the project\n## {{ job_uid }}            - uid of the job\n##\n## What follows is a simple SLURM script:\n\n#SBATCH --job-name cryosparc_{{ project_uid }}_{{ job_uid }}\n#SBATCH -N 1\n#SBATCH -n {{ num_cpu }}\n#SBATCH --gres=gpu:{{ num_gpu }}\n#SBATCH -p gpu8_medium\n## #SBATCH --mem-per-cpu={{ (ram_gb*20000/num_cpu)|int }}MB\n#SBATCH --mem={{ (ram_gb*1000|float * ram_gb_multiplier|float)|int }}MB             \n#SBATCH -o {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.out\n#SBATCH -e {{ job_dir_abs }}/cryosparc_{{ project_uid }}_{{ job_uid }}.err\n\navailable_devs=""\nfor devidx in $(seq 0 15);\ndo\n    if [[ -z $(nvidia-smi -i $devidx --query-compute-apps=pid --format=csv,noheader) ]] ; then\n        if [[ -z "$available_devs" ]] ; then\n            available_devs=$devidx\n        else\n            available_devs=$available_devs,$devidx\n        fi\n    fi\ndone\nexport CUDA_VISIBLE_DEVICES=$available_devs\n\n{{ run_cmd }}\n\n\n\n\n\n\n', 'send_cmd_tpl': '{{ command }}', 'title': 'BP_gpu8_medium', 'tpl_vars': ['job_uid', 'run_cmd', 'num_cpu', 'command', 'job_log_path_abs', 'num_gpu', 'cluster_job_id', 'job_dir_abs', 'worker_bin_path', 'run_args', 'project_dir_abs', 'ram_gb_multiplier', 'project_uid', 'ram_gb'], 'type': 'cluster', 'worker_bin_path': '/gpfs/data/bhabhaekiertlabs/local_software/CryoSparc/cryosparc2_worker/bin/cryosparcw'}]

Let me know if I forgot anything that could help.

There are multiple options, including one in which you would

  1. on the old server, archive all projects
  2. transfer the database from the old server to the new server
  3. on the new server, unarchive all projects

and in which you could preserve your project UIDs. This approach, however, is subject to some possible pitfalls. For example:

  • you would need to transfer your database to the new master host in a valid way, either via restoration of a database backup or a valid copy of the database directory (a hedged backup/restore sketch follows after this list).
  • for each individual project directory, you would need to ensure that the correct directory with unchanged contents is associated with the correct project during unarchiving. As of CryoSPARC v4.3.1, one might not be prevented from linking a project directory to the incorrect project, with database and project directory corruption as the likely outcome.
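
Regarding the first pitfall: a “valid” database transfer usually means something along the lines of the following sketch (file and directory names are placeholders, and the exact options of these commands should be checked in the guide for your CryoSPARC version):

# on the old master
cryosparcm backup --dir=/path/to/backups --file=cryosparc_db.archive

# copy the backup file to the new master, then restore it into the new instance's database
cryosparcm restore --file=/path/to/backups/cryosparc_db.archive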

For these reasons, I recommend the following approach:

  1. Do not start the migration process before having understood the following steps in their entirety.
  2. Stop all CryoSPARC work on the old server/instance.
  3. Collect and preserve the output of the command
    cryosparcm listusers.
  4. Collect and preserve the output of the command
    cryosparcm cli "get_scheduler_targets()".
  5. For each cluster lane, collect the output of the command
    cryosparcm cluster dump in separate directories.
  6. Back up the database and store the backup in a safe place (a command sketch for steps 3–6 follows after this list). In the best case, the backup will never be needed, but when was the last time everything went 100% according to plan?
  7. Detach each attached project from the old instance.
  8. Permanently shut down the old instance.
  9. After new absolute paths to software directories have been settled, install and start the CryoSPARC master software on the new server. You may reuse your old license ID provided the old instance is permanently shut down.
  10. Similarly, install cryosparc_worker software.
  11. Connect workers with reference to the information collected earlier, potentially using edited (“worker_bin_path”) copies of the cluster_info.json and cluster_script.sh files dumped earlier (see the cluster re-connection sketch below).
  12. Create users with reference to the users list collected earlier.
  13. Attach to the new instance project directories that were earlier detached from the old instance.
  14. If needed, adjust symlinks for each project. Alternatively, you could ensure that the absolute paths to “outside-of-project” files do not change, possibly (untested by myself) with the help of directory symbolic links (see the symlink example below).
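
For steps 3–6, a minimal command sketch (the output file names and backup paths are placeholders; BP_gpu4_long is one of the lanes from the listing above, so repeat the dump for each lane):

# steps 3 and 4: record users and scheduler targets
cryosparcm listusers > users_before_migration.txt
cryosparcm cli "get_scheduler_targets()" > scheduler_targets_before_migration.txt

# step 5: one directory per cluster lane; cryosparcm cluster dump writes cluster_info.json
# and cluster_script.sh into the current working directory
mkdir -p lane_dumps/BP_gpu4_long
cd lane_dumps/BP_gpu4_long
cryosparcm cluster dump BP_gpu4_long
cd -

# step 6: database backup (directory and file name are placeholders; see also the backup/restore sketch above)
cryosparcm backup --dir=/path/to/backups --file=cryosparc_backup.archive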
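
For step 11, re-creating a cluster lane on the new master from the dumped files might look like this (a sketch, assuming a comparable SLURM setup on the new cluster):

# run from a directory containing the dumped cluster_info.json and cluster_script.sh;
# first edit cluster_info.json so "worker_bin_path" points at the new cryosparc_worker/bin/cryosparcw,
# and adjust partitions in cluster_script.sh if they differ on the new cluster
cd lane_dumps/BP_gpu4_long
cryosparcm cluster connect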
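
For step 14, one possible (untested) way to keep old absolute raw-data paths resolvable is a directory symlink on the new server; the paths below are purely hypothetical examples:

# make the old absolute prefix resolve to the new storage location (example paths only)
ln -s /new/storage/raw_data /gpfs/data/old_raw_data_location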

Thanks @wtempel for the detailed step-by-step.

A few questions regarding the second approach:

  • step 7: does it have to be done for each project individually? (We have ~300 projects; is there no way to do it for all of them at once, for example?)
  • if the new CryoSPARC instance on the new server is operated with a new license, is step 8 needed (for example, if instead of moving the 300 projects, we decide to move only the “active” ones and leave the “non-active” ones on the old server)?
  • step 11: this assumes the queueing system is similar on the new vs. old server, right?
  • steps 12-13: do you think the CryoSPARC admin should attach all projects and then assign them to users, or should each user attach their own projects? (In other words, what’s the best way to make sure each project is re-assigned to the proper user?)
  • out of curiosity, it looks like projects cannot be attached to a new CryoSPARC installation if they haven’t been detached from another one beforehand. Why is there such a lock? I can imagine cases where you want to share work without having it disappear from your own installation

Thanks - Best,
Nicolas

Re step 7: You may detach multiple projects using a Python script like:

import os
import sys
from cryosparc_compute.client import CommandClient
from cryosparc_compute import database_management

# connect to the CryoSPARC database ('meteor') and to the command_core server
db = database_management.get_pymongo_client('meteor')['meteor']
cli = CommandClient(os.getenv('CRYOSPARC_MASTER_HOSTNAME'), int(os.getenv('CRYOSPARC_COMMAND_CORE_PORT')))

# projects that are neither archived nor deleted nor already detached
active_projects = [p['uid'] for p in db.projects.find({'$and': [{'archived': False}, {'deleted': False}, {'detached': False}]})]

# confirm before detaching each project; abort on anything other than 'y'
for p in active_projects:
    should_continue = input(f"enter 'y' to detach project {p}: ")
    if should_continue.strip() == 'y':
        cli.detach_project(p)
        print(f"Detached project {p}")
    else:
        sys.exit("Detachment of projects aborted")

Supposing you saved the script to a file detach_active_projects.py, you would run it like:

cryosparcm call python /full/path/to/detach_active_projects.py

Caution: one should run this script only if one fully understands and agrees with what it does.

You do not need to disable the old instance as long as

  1. the new instance has a unique license id.
  2. there is a clear understanding that a given project directory must only be attached to one specific instance at any given time.
  3. old and new instances have their respective separate databases.

Re step 11: Reference to old worker connection information is optional. You may alternatively redesign your worker configuration completely, based on circumstances.

Re steps 12-13: It may be more straightforward to have individual users attach their own projects after they have logged in to the GUI.

A project directory must only be attached to one particular instance at a time; the cs.lock file is intended to enforce this restriction. Simultaneous attachment to multiple instances, for example after defeating the cs.lock mechanism, can corrupt the project directory and the databases of the CryoSPARC instances involved.