Need more information on the cluster_info templates

The cluster_info.json examples that I have questions about are Examples A, B, and C from the documentation.

The documentation is not clear and does not give enough information.
Example configurations without any explanation do not help the user understand how to set up the JSON file.

My questions are:

  1. Which cluster_info.json template should I use?
  2. Under what circumstances should I provide cache_path, send_cmd_tpl, and transfer_cmd_tpl in cluster_info.json?

Especially confusing is "send_cmd_tpl": "ssh loginnode {{ command }}".

Am I supposed to replace loginnode with the actual server IP or FQDN?
What about the {{ command }} part?
I understand that it is Jinja2 template syntax, but what would the command be?

(We have Slurm on our cluster, and installed the CryoSPARC worker on the cluster using Spack.)

Thank you in advance for answering.

In a simple case where the CryoSPARC master host is also a cluster job submission/control host, the definition would literally be:
"send_cmd_tpl": "{{ command }}"
CryoSPARC would then replace the template variable {{ command }} with the appropriate cluster management command for local execution instead of sending the cluster management command to a remote host.
The variables are described in this section of the guide.
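
For illustration, a minimal cluster_info.json for that simple local-submission case with Slurm could look roughly like the sketch below. This is not from your setup: the name, worker_bin_path, and cache_path values are placeholders you would replace with your own, and the Slurm command templates follow the example in the guide.

{
  "name": "slurmcluster",
  "worker_bin_path": "/path/to/cryosparc_worker/bin/cryosparcw",
  "cache_path": "/scratch/cryosparc_cache",
  "send_cmd_tpl": "{{ command }}",
  "qsub_cmd_tpl": "sbatch {{ script_path_abs }}",
  "qstat_cmd_tpl": "squeue -j {{ cluster_job_id }}",
  "qdel_cmd_tpl": "scancel {{ cluster_job_id }}",
  "qinfo_cmd_tpl": "sinfo"
}

Here {{ script_path_abs }} and {{ cluster_job_id }} are variables that CryoSPARC fills in at submission time, and send_cmd_tpl is just {{ command }} because the commands run locally on the master.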


Thank you.

In my case, the master is on a cloud instance (the master is not installed on the cluster).
I believe I have to provide the ssh login details there, before the Jinja2 template syntax {{ command }}.
What about the ssh key? Where should I save my ssh key so that the CryoSPARC master can access it?
In the .ssh directory of the user that has access to the cluster?
For example, the user is cryosparc_cluster_user.
Should I then create the same user on the master node (the system hosting the master) and put the ssh key under that user's .ssh directory?

We use a Linux system.

Thank you.

Even with the master in the cloud and separate from the cluster, it is still assumed that the master and cluster nodes can access the same bulk storage under a common identity and a common path. In other words, cryosparc_cluster_user is also running the CryoSPARC master processes and has the same user id as on the cluster. Is this the case?
The private ssh key must be stored securely. It need not be and should not be stored on the shared bulk storage. Once you have decided on a secure path (on the master’s root file system, for example), make that path readable only to the Linux account that’s running the master processes, and include the path like this in cluster_info.json:

"send_cmd_tpl": "ssh -i /secrets/cryosparc_ssh_key cluster_submit_host {{ command }}"

Assuming

  1. A shared identity and username between the Linux user running the master and the cluster user submitting cluster jobs
  2. The file /secrets/cryosparc_ssh_key is readable to only cryosparc_cluster_user (on the CryoSPARC master in the cloud) and holds the private key
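
To make the remote case concrete, the corresponding cluster_info.json could look roughly like the sketch below. The host name cluster_submit_host and the key path /secrets/cryosparc_ssh_key are the placeholders from this thread, and the worker/cache paths and Slurm command templates follow the example in the guide; replace them with your own values. The point to note is that the plain Slurm commands stay in the qsub/qstat/qdel/qinfo templates, and send_cmd_tpl wraps them in ssh so they run on the cluster submission host.

{
  "name": "slurmcluster",
  "worker_bin_path": "/path/to/cryosparc_worker/bin/cryosparcw",
  "cache_path": "/scratch/cryosparc_cache",
  "send_cmd_tpl": "ssh -i /secrets/cryosparc_ssh_key cluster_submit_host {{ command }}",
  "qsub_cmd_tpl": "sbatch {{ script_path_abs }}",
  "qstat_cmd_tpl": "squeue -j {{ cluster_job_id }}",
  "qdel_cmd_tpl": "scancel {{ cluster_job_id }}",
  "qinfo_cmd_tpl": "sinfo"
}

Keeping the key file readable only to the account running the master (for example, mode 600 and owned by cryosparc_cluster_user) satisfies the permission requirement described above.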

Thanks. I realize now that this is a normal ssh command; I was able to connect to the cluster using -i to provide the ssh key path.