But then, if I instead install the ‘worker’ application on the login node (which does have the CUDA package for compiling software and can submit qsub scripts to the GPU cluster, although the login nodes themselves have no GPU devices), this should also work: I install the ‘master’ application on another remote cloud instance, point its SSH configuration directly at the login node, and the PBS job scripts are then directed to the internal GPU clusters.
Sorry, when I mentioned that the GPU nodes (clusters) are not directly available from the login node, I meant that I don’t even know the hostnames of the GPU nodes; they do sit on the same local private network, but they can only be used by submitting jobs through PBS scripts.
Now my master application is running on a cloud instance, and my cluster registers successfully on that instance, given the SSH information for the login node and with the worker application installed on the login node as well. But I encounter an issue: when I run motion correction, it does not find the script file. The ssh command I have is: ssh username@###.###.###.### qsub /path/to/cryosparc_projects/P1/J1/queue_sub_script.sh
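For context, the cluster registration I used is of the shape below — a minimal sketch of a CryoSPARC-style cluster_info.json, where the cluster name, hostname, and all paths are placeholders rather than my real values:

```json
{
    "name": "pbs-cluster",
    "worker_bin_path": "/path/to/cryosparc_worker/bin/cryosparcw",
    "cache_path": "/path/to/scratch",
    "send_cmd_tpl": "ssh username@login-node {{ command }}",
    "qsub_cmd_tpl": "qsub {{ script_path_abs }}",
    "qstat_cmd_tpl": "qstat {{ cluster_job_id }}",
    "qdel_cmd_tpl": "qdel {{ cluster_job_id }}",
    "qinfo_cmd_tpl": "qstat -q"
}
```

My understanding is that `send_cmd_tpl` is what wraps every command (including the qsub above) in the ssh hop to the login node, so if that template or the SSH setup is wrong, the submit script path would be resolved on the wrong machine.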
In the job log, the error is: Failed to launch! 1
Does the cloud instance where the master application sits still need a specific hostname, or can it be an IP address? My login node does have a specific hostname (an actual domain name). Could this be the reason the file is not found? (I have set up passwordless SSH from the master application machine.)
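As a sanity check on the passwordless-SSH side, something like the following could separate an authentication failure from the missing-file error — this is just a sketch, and the helper name, hostname, and script path are placeholders I made up, not my real values. The key part is `-o BatchMode=yes`, which makes ssh fail immediately if key-based auth isn’t working instead of hanging on a password prompt:

```shell
# Hypothetical helper: assemble the qsub-over-ssh command the master would run.
# BatchMode=yes forces non-interactive auth, so a broken key setup fails fast.
build_submit_cmd() {
  local login_node="$1" script_path="$2"
  printf 'ssh -o BatchMode=yes %s qsub %s\n' "$login_node" "$script_path"
}

# Placeholder values -- substitute the real login node and submit script path.
build_submit_cmd "username@login.example.edu" "/path/to/queue_sub_script.sh"
```

Running the printed command by hand from the master machine (and, separately, `ssh -o BatchMode=yes username@login.example.edu test -f /path/to/queue_sub_script.sh`) would show whether the problem is the SSH hop or the file path as seen from the login node.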
About the API: I asked only because I thought, if the two-piece web application setup doesn’t work, perhaps it could be made standalone, with those APIs working independently and wrapped into PBS scripts for job submission. This is kind of like going backward… from the internet era.