Improvements for clusters

Hi

I would like to request some features that will make our lives easier on clusters.

I like the lane approach, where we can have multiple submission scripts as templates. However, I think it would also be useful to have some variables that can be passed through the GUI to the submission script. For example, as in RELION, if there were an XXXextra1XXX variable (actually, several of them) that could be replaced with any parameter the user wants, it could be entered during the submit-to-lane step.
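To illustrate, here is a hypothetical sketch of how such a template could look on a SLURM cluster; {{ extra1 }} and {{ extra2 }} are made-up names for the requested GUI-supplied values, not an existing feature, and the other {{ ... }} placeholders just follow the usual template style:

    #!/bin/bash
    # Hypothetical template sketch: {{ extra1 }} / {{ extra2 }} would be filled
    # in from the GUI at submit time; the remaining placeholders are the usual
    # per-job values substituted into the submission script.
    #SBATCH --job-name=cryosparc_{{ project_uid }}_{{ job_uid }}
    #SBATCH --cpus-per-task={{ num_cpu }}
    #SBATCH --gres=gpu:{{ num_gpu }}
    #SBATCH --partition={{ extra1 }}
    #SBATCH --time={{ extra2 }}

    {{ run_cmd }}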

Second request is more complicated.
I am using an SSH tunnel to interact with the GUI. The cluster I am using at Stanford University offers another, browser-based method to reach the compute nodes (OnDemand). I would like to use that method for several reasons; however, the web browser cannot communicate properly with the GUI. The URL to reach the node is as follows:

https://login.sherlock.stanford.edu/node/<host-name>/<port>/

After consulting with the cluster admin, I found that the issue is that CryoSPARC tries to load a JavaScript file from the root of the webserver: /bee123456789123456789123456789aa.js?meteor_js_resource=true

It works fine with an SSH tunnel, but with OnDemand it tries to load:

https://login.sherlock.stanford.edu/bee123456789123456789123456789aa.js?meteor_js_resource=true

where the file is actually at:

https://login.sherlock.stanford.edu/node/<nodename>/<port>/bee123456789123456789123456789aa.js?meteor_js_resource=true

Therefore I see a blank page.

If CryoSPARC loaded resources with relative URLs instead of absolute paths, this problem should be resolved.

Can you implement these?

Thank you


Hi, as for your second request, can't you solve it by connecting with a VPN rather than using an SSH tunnel?

Thank you, Marino, for your reply.

The Stanford cluster, and probably many other clusters, has a separate two-tier authentication system. Even on campus we need to tunnel and authenticate. Their web-based system allows a computer to stay authenticated for 30 days, which cuts down a lot of smartphone interaction. But for that, the CryoSPARC web app needs to load its JavaScript resources properly.
Isn't it better anyway to use relative paths for web-based applications? (When we try to load CryoSPARC in a web browser, the JavaScript asks for an absolute path.)

Alpay

More importantly, is it possible to install and run CryoSPARC without tying it to a specific hostname? The nodes on our cluster are stateless, and I would like to be able to run the background instance on any node I want. Many people cannot use CryoSPARC on our cluster because they do not own a node and cannot practically request the same node every time.


Hi @alburse,

Our team is looking into this and will update the thread when we have more information.

Thanks,
Suhail

@alburse We’ve solved this by allowing users to run the master process on the login node. Running on a compute node gives the problems you mentioned with having to update the config.sh file for the master. Plus, running on a compute node results in a lot of wasted cycles since the master process is pretty lightweight and all the serious computation gets submitted through the queue.

The downside is that users do have to choose a non-conflicting port number, and this needs to be accessible from the login node. This is a pretty serious organizational hassle for us and I’d be interested in hearing about other options.
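For example, here is a sketch of how the port choice could be automated, assuming the base port is controlled by CRYOSPARC_BASE_PORT in cryosparc2_master/config.sh and that CryoSPARC uses it as the start of a small port range:

    # Sketch: derive a per-user base port so two masters on the same node are
    # unlikely to collide (this reduces, but does not eliminate, collisions).
    BASE_PORT=$(( 39000 + ( $(id -u) % 2000 ) * 10 ))
    sed -i "s/^export CRYOSPARC_BASE_PORT=.*/export CRYOSPARC_BASE_PORT=${BASE_PORT}/" \
        cryosparc2_master/config.sh
    cryosparcm start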

@sbliven Can you share with us how you run the master process on a login node?

I think running the master process on a compute node is not a big issue; we do that for many other programs. However, running it on a specific compute node is a big issue (you should just get rid of the hostname requirement if possible). I guess if you can run it on a login node, you can also run it on any node, so the problem would be resolved (no need to run the master on a specific compute node). Keeping one CPU of a compute node busy is not a big deal, and there would be no problem with conflicting port numbers or their accessibility if the master runs on a Slurm/PBS/… assigned node. Are these two assumptions (that it can run on any node, and that there is no port problem when run on a compute node) correct?

By the way, running the CryoSPARC master on a login node may not be a good idea. Login nodes have several restrictions, so the CryoSPARC instance is more likely to be terminated, and that may cause database issues. Some CryoSPARC jobs also run directly on the master; those processes are probably too much for login nodes.

Would you also comment on my question from my first message in this thread, about the JavaScript in the CryoSPARC browser interface asking for an absolute path?

Thank you.

Alpay

The absolute path to the JavaScript seems like a bug, but the developers would have to address that.

I think that eliminating CRYOSPARC_MASTER_HOSTNAME from config.sh would be a good idea, although it does place additional burden on the users to figure out which node the master is running on (e.g. for running cryosparcm commands, connecting to the webserver, etc). Probably this is a nontrivial feature request, since it would require the master to somehow inform clients as to the correct callback host.
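For example, a user could locate the master and tunnel to it with something like the sketch below, assuming the master was submitted as a SLURM job named cryosparc_master and serves its web UI on port 39000 (both are assumptions for illustration):

    # Find the node running the master job, then tunnel the web port through
    # the login node (job name, port, and login host are placeholders).
    NODE=$(squeue -u "$USER" --name=cryosparc_master --states=RUNNING -h -o '%N')
    ssh -N -L 39000:"${NODE}":39000 cluster-login-node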

Running the master on a compute node still has the possibility of conflicting ports if two users happen to get assigned cores on the same machine. However, unless you have many CryoSPARC users on the cluster, that is probably a low-probability event.

Unclean terminations can be a problem on both login nodes and compute nodes. Either way one should call cryosparcm stop rather than killing the process. We have relatively permissive login node policies, so unexpected terminations are more likely on the compute nodes (from preemption or expired allocations).
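A sketch of how the batch script that runs the master could guard against that, assuming the scheduler sends SIGTERM before killing the job:

    # Trap the scheduler's SIGTERM and call cryosparcm stop so the database
    # shuts down cleanly on preemption or when the allocation expires
    # (a second stop on exit is harmless).
    trap 'cryosparcm stop' TERM EXIT
    cryosparcm start
    # cryosparcm start daemonizes, so keep the job alive with an interruptible wait
    sleep infinity &
    wait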

Great, I hope you can fix the javascript path bug and also share with us how to run cryosparc on login nodes soon.
Thank you

Alpay

@sbliven

Hi Spencer

Were you able to solve the absolute-path-to-JavaScript issue that I mentioned in this thread? I will be giving a talk at a workshop, and it would be great if this could be resolved soon so I can show people how to run CryoSPARC better on our cluster at Stanford. (I do not think it was resolved in the latest 2.9 version.)

Thank you.

Alpay


Hi @alburse,

Regarding the hostname issue: if you delete the CRYOSPARC_MASTER_HOSTNAME line in the cryosparc2_master/config.sh file, the hostname will be set whenever cryoSPARC is started (cryosparcm start). This works provided that the following command returns the correct hostname on every compute node: echo -e "$(hostname -f)" | tr -d '[:space:]'
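For example, a sketch on a SLURM cluster (the partition name "gpu" is a placeholder):

    # Remove the hard-coded hostname so it is determined at each cryosparcm start.
    sed -i '/^export CRYOSPARC_MASTER_HOSTNAME=/d' cryosparc2_master/config.sh

    # Spot-check that the hostname resolves correctly on a compute node.
    srun -p gpu bash -c 'echo -e "$(hostname -f)" | tr -d "[:space:]"; echo'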


Thank you for the hostname clarification; this is good news, since we no longer depend on a specific node. I would appreciate it if you could also fix the absolute-path-to-JavaScript issue soon.

Best

Alpay

I would just like to add a +1 to this request. Generically, I believe this can be solved by setting the ROOT_URL for Meteor so that CryoSPARC can use a different URL prefix; something like OnDemand could then reverse-proxy the traffic to a running CryoSPARC instance, i.e. https://ondemand/node/$host/$port/ --> http://$host:$port/node/$host/$port/, where CryoSPARC uses the ROOT_URL to serve from /node/$host/$port rather than just /.
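Roughly, a sketch of the requested behaviour (not something CryoSPARC currently supports, since cryosparcm would need to pass the variable through to the webapp):

    # Hypothetical: export a Meteor ROOT_URL carrying the OnDemand prefix before
    # starting, so generated asset URLs live under /node/$host/$port/ instead of
    # the webserver root.
    host=$(hostname -f)
    port=39000
    export ROOT_URL="http://${host}:${port}/node/${host}/${port}/"
    cryosparcm restart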

@sbliven

You said, “The absolute path to the JavaScript seems like a bug, but the developers would have to address that.” Do you have any update on this? @yee379 also seems to be interested. Thank you.

Hi everyone,

Building off this post, does anyone have any use cases where they would like to either edit the cluster submission script or set custom parameters (a key-value table) within the cryoSPARC interface?

If so, please let us know what solution you would prefer. Thanks!

- Suhail

@sdawood yes, I think so. We’re trying to figure out how to deal with permissions on a distributed cluster filesystem, which doesn’t have ACLs. One approach would be restrictive sudoers configurations that would allow the cryosparc user to run some permissions corrections as root after job completion.
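A sketch of what that could look like; cryosparc_fix_perms.sh is a hypothetical root-owned helper that would only adjust ownership and permissions under the project directories:

    # Hypothetical sudoers entry: the cryosparc service account may run exactly
    # one root-owned helper, nothing else, without a password.
    echo 'cryosparc ALL=(root) NOPASSWD: /usr/local/sbin/cryosparc_fix_perms.sh' \
        | sudo tee /etc/sudoers.d/cryosparc
    sudo chmod 0440 /etc/sudoers.d/cryosparc
    sudo visudo -cf /etc/sudoers.d/cryosparc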

Custom parameters will definitely be a nice feature, and I’ve expressed my support in the other thread. However, this can already be handled somewhat by adding cluster lanes for each required configuration, so the permissions and dynamic hostname issues are probably more important for cluster usage.
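For reference, a sketch of that workaround, assuming the usual cluster_info.json / cluster_script.sh pair read by cryosparcm cluster connect from the current directory (lane names and partitions are placeholders):

    # One lane per partition: each directory holds a cluster_info.json whose
    # "name" matches the lane and a cluster_script.sh with the partition
    # hard-coded; the lane is then registered from inside that directory.
    for lane in gpu-short gpu-long; do
        ( cd lanes/"$lane" && cryosparcm cluster connect )
    done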

@sbliven

The original request in this thread was for CryoSPARC to switch from absolute paths to relative URLs when loading resources through the web browser. Would you please look into that too?

@sdawood, the original post indicated that it is impossible to launch the CryoSPARC Meteor webapp with a custom ROOT_URL. This should be an easy fix: simply setting the ROOT_URL environment variable at the right point in the cryosparcm start call should allow users to pass custom URL prefixes.

This is an essential feature for many modern HPC settings where Open OnDemand is required to serve webapps securely. In the absence of this support, most of your users with access to HPC resources have to interact with CryoSPARC through a web browser inside a noVNC virtual desktop inside their local web browser. It's a pretty lousy user experience. Is this on CryoSPARC's radar? When can we expect a fix?

Thanks!

Hi @kmdalton,

Thanks for creating a separate post with more details. We’re always open to feedback and try to incorporate as much user feedback as possible into future releases. That being said, we have a very long to-do list and many ongoing projects so we can’t work on everything at once. I’ve recorded this request in our internal issue tracker and we’ll try to get to it as soon as possible.