Similar to this thread, I’ve tried all the variations of the hostnames, made sure firewalld is off, and confirmed that I can ssh without a password, but I still get the same error. Any help would be greatly appreciated.
./cryosparcw connect --worker sn4622115580 --master cryoem8.ourdomain --port 39000 --nossd
---------------------------------------------------------------
CRYOSPARC CONNECT --------------------------------------------
---------------------------------------------------------------
Attempting to register worker sn4622115580 to command cryoem8.ourdomain:39002
Connecting as unix user myuser
Will register using ssh string: myuser@sn4622115580
If this is incorrect, you should re-run this command with the flag --sshstr <ssh string>
---------------------------------------------------------------
/home/myuser/cryosparc_worker/cryosparc_tools/cryosparc/command.py:135: UserWarning: *** CommandClient: (http://cryoem8.ourdomain:39002/api) HTTP Error 500 Internal Server Error; please check cryosparcm log command_core for additional information.
Response from server: b'\n<html><head>\n<meta type="copyright" content="Copyright (C) 1996-2017 The Squid Software Foundation and contributors">\n<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n<title>ERROR: The requested URL could not be retrieved</title>\n<style type="text/css"><!-- \n /*\n * Copyright (C) 1996-2017 The Squid Software Foundation and contributors\n *\n * Squid software is distributed under GPLv2+ license and includes\n * contributions from numerous individuals and organizations.\n * Please see the COPYING and CONTRIBUTORS files for details.\n */\n\n/*\n Stylesheet for Squid Error pages\n Adapted from design by Free CSS Templates\n http://www.freecsstemplates.org\n Released for free under a Creative Commons Attribution 2.5 License\n*/\n\n/* Page basics */\n* {\n\tfont-family: verdana, sans-serif;\n}\n\nhtml body {\n\tmargin: 0;\n\tpadding: 0;\n\tbackground: #efefef;\n\tfont-size: 12px;\n\tcolor: #1e1e1e;\n}\n\n/* Page displayed title area */\n#titles {\n\tmargin-left: 15px;\n\tpadding: 10px;\n\tpadding-left: 100px;\n\tbackground: url(\'/squid-internal-static/icons/SN.png\') no-repeat left;\n}\n\n/* initial title */\n#titles h1 {\n\tcolor: #000000;\n}\n#titles h2 {\n\tcolor: #000000;\n}\n\n/* special event: FTP success page titles */\n#titles ftpsuccess {\n\tbackground-color:#00ff00;\n\twidth:100%;\n}\n\n/* Page displayed body content area */\n#content {\n\tpadding: 10px;\n\tbackground: #ffffff;\n}\n\n/* General text */\np {\n}\n\n/* error brief description */\n#error p {\n}\n\n/* some data which may have caused the problem */\n#data {\n}\n\n/* the error message received from the system or other software */\n#sysmsg {\n}\n\npre {\n font-family:sans-serif;\n}\n\n/* special event: FTP / Gopher directory listing */\n#dirmsg {\n font-family: courier;\n color: black;\n font-size: 10pt;\n}\n#dirlisting {\n margin-left: 2%;\n margin-right: 2%;\n}\n#dirlisting tr.entry td.icon,td.filename,td.size,td.date {\n border-bottom: 
groove;\n}\n#dirlisting td.size {\n width: 50px;\n text-align: right;\n padding-right: 5px;\n}\n\n/* horizontal lines */\nhr {\n\tmargin: 0;\n}\n\n/* page displayed footer area */\n#footer {\n\tfont-size: 9px;\n\tpadding-left: 10px;\n}\n\n\nbody\n:lang(fa) { direction: rtl; font-size: 100%; font-family: Tahoma, Roya, sans-serif; float: right; }\n:lang(he) { direction: rtl; }\n --></style>\n</head><body id="ERR_CANNOT_FORWARD">\n<div id="titles">\n<h1>ERROR</h1>\n<h2>The requested URL could not be retrieved</h2>\n</div>\n<hr>\n\n<div id="content">\n<p>The following error was encountered while trying to retrieve the URL: <a href="http://cryoem8.ourdomain:39002/api">http://cryoem8.ourdomain:39002/api</a></p>\n\n<blockquote id="error">\n<p><b>Unable to forward this request at this time.</b></p>\n</blockquote>\n\n<p>This request could not be forwarded to the origin server or to any parent caches.</p>\n\n<p>Some possible problems are:</p>\n<ul>\n<li id="network-down">An Internet connection needed to access this domains origin servers may be down.</li>\n<li id="no-peer">All configured parent caches may be currently unreachable.</li>\n<li id="permission-denied">The administrator may not allow this cache to make direct connections to origin servers.</li>\n</ul>\n\n<p>Your cache administrator is <a 
href="mailto:admin@localhost?subject=CacheErrorInfo%20-%20ERR_CANNOT_FORWARD&body=CacheHost%3A%20localhost%0D%0AErrPage%3A%20ERR_CANNOT_FORWARD%0D%0AErr%3A%20%5Bnone%5D%0D%0ATimeStamp%3A%20Mon,%2002%20Dec%202024%2021%3A03%3A36%20GMT%0D%0A%0D%0AClientIP%3A%2010.198.24.97%0D%0A%0D%0AHTTP%20Request%3A%0D%0APOST%20%2Fapi%20HTTP%2F1.1%0AAccept-Encoding%3A%20identity%0D%0AContent-Length%3A%20107%0D%0AUser-Agent%3A%20Python-urllib%2F3.10%0D%0AOriginator%3A%20client%0D%0ALicense-Id%3A%2040c42380-913f-11e9-9dde-5fa0d611b478%0D%0AContent-Type%3A%20application%2Fjson%0D%0AConnection%3A%20close%0D%0AHost%3A%20cryoem8.ourdomain%3A39002%0D%0A%0D%0A%0D%0A">admin@localhost</a>.</p>\n\n<br>\n</div>\n\n<hr>\n<div id="footer">\n<p>Generated Mon, 02 Dec 2024 21:03:36 GMT by localhost (squid)</p>\n<!-- ERR_CANNOT_FORWARD -->\n</div>\n</body></html>\n'
system = self._get_callable("system.describe")()
Traceback (most recent call last):
File "/home/myuser/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 105, in func
with make_json_request(self, "/api", data=data, _stacklevel=4) as request:
File "/home/myuser/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/contextlib.py", line 135, in __enter__
return next(self.gen)
File "/home/myuser/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 226, in make_request
raise CommandError(error_reason, url=url, code=code, data=resdata)
cryosparc_tools.cryosparc.errors.CommandError: *** (http://cryoem8.ourdomain:39002/api, code 500) HTTP Error 500 Internal Server Error; please check cryosparcm log command_core for additional information.
Response from server: b'\n<html><head>\n<meta type="copyright" content="Copyright (C) 1996-2017 The Squid Software Foundation and contributors">\n<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n<title>ERROR: The requested URL could not be retrieved</title>\n<style type="text/css"><!-- \n /*\n * Copyright (C) 1996-2017 The Squid Software Foundation and contributors\n *\n * Squid software is distributed under GPLv2+ license and includes\n * contributions from numerous individuals and organizations.\n * Please see the COPYING and CONTRIBUTORS files for details.\n */\n\n/*\n Stylesheet for Squid Error pages\n Adapted from design by Free CSS Templates\n http://www.freecsstemplates.org\n Released for free under a Creative Commons Attribution 2.5 License\n*/\n\n/* Page basics */\n* {\n\tfont-family: verdana, sans-serif;\n}\n\nhtml body {\n\tmargin: 0;\n\tpadding: 0;\n\tbackground: #efefef;\n\tfont-size: 12px;\n\tcolor: #1e1e1e;\n}\n\n/* Page displayed title area */\n#titles {\n\tmargin-left: 15px;\n\tpadding: 10px;\n\tpadding-left: 100px;\n\tbackground: url(\'/squid-internal-static/icons/SN.png\') no-repeat left;\n}\n\n/* initial title */\n#titles h1 {\n\tcolor: #000000;\n}\n#titles h2 {\n\tcolor: #000000;\n}\n\n/* special event: FTP success page titles */\n#titles ftpsuccess {\n\tbackground-color:#00ff00;\n\twidth:100%;\n}\n\n/* Page displayed body content area */\n#content {\n\tpadding: 10px;\n\tbackground: #ffffff;\n}\n\n/* General text */\np {\n}\n\n/* error brief description */\n#error p {\n}\n\n/* some data which may have caused the problem */\n#data {\n}\n\n/* the error message received from the system or other software */\n#sysmsg {\n}\n\npre {\n font-family:sans-serif;\n}\n\n/* special event: FTP / Gopher directory listing */\n#dirmsg {\n font-family: courier;\n color: black;\n font-size: 10pt;\n}\n#dirlisting {\n margin-left: 2%;\n margin-right: 2%;\n}\n#dirlisting tr.entry td.icon,td.filename,td.size,td.date {\n border-bottom: 
groove;\n}\n#dirlisting td.size {\n width: 50px;\n text-align: right;\n padding-right: 5px;\n}\n\n/* horizontal lines */\nhr {\n\tmargin: 0;\n}\n\n/* page displayed footer area */\n#footer {\n\tfont-size: 9px;\n\tpadding-left: 10px;\n}\n\n\nbody\n:lang(fa) { direction: rtl; font-size: 100%; font-family: Tahoma, Roya, sans-serif; float: right; }\n:lang(he) { direction: rtl; }\n --></style>\n</head><body id="ERR_CANNOT_FORWARD">\n<div id="titles">\n<h1>ERROR</h1>\n<h2>The requested URL could not be retrieved</h2>\n</div>\n<hr>\n\n<div id="content">\n<p>The following error was encountered while trying to retrieve the URL: <a href="http://cryoem8.ourdomain:39002/api">http://cryoem8.ourdomain:39002/api</a></p>\n\n<blockquote id="error">\n<p><b>Unable to forward this request at this time.</b></p>\n</blockquote>\n\n<p>This request could not be forwarded to the origin server or to any parent caches.</p>\n\n<p>Some possible problems are:</p>\n<ul>\n<li id="network-down">An Internet connection needed to access this domains origin servers may be down.</li>\n<li id="no-peer">All configured parent caches may be currently unreachable.</li>\n<li id="permission-denied">The administrator may not allow this cache to make direct connections to origin servers.</li>\n</ul>\n\n<p>Your cache administrator is <a 
href="mailto:admin@localhost?subject=CacheErrorInfo%20-%20ERR_CANNOT_FORWARD&body=CacheHost%3A%20localhost%0D%0AErrPage%3A%20ERR_CANNOT_FORWARD%0D%0AErr%3A%20%5Bnone%5D%0D%0ATimeStamp%3A%20Mon,%2002%20Dec%202024%2021%3A03%3A36%20GMT%0D%0A%0D%0AClientIP%3A%2010.198.24.97%0D%0A%0D%0AHTTP%20Request%3A%0D%0APOST%20%2Fapi%20HTTP%2F1.1%0AAccept-Encoding%3A%20identity%0D%0AContent-Length%3A%20107%0D%0AUser-Agent%3A%20Python-urllib%2F3.10%0D%0AOriginator%3A%20client%0D%0ALicense-Id%3A%2040c42380-913f-11e9-9dde-5fa0d611b478%0D%0AContent-Type%3A%20application%2Fjson%0D%0AConnection%3A%20close%0D%0AHost%3A%20cryoem8.ourdomain%3A39002%0D%0A%0D%0A%0D%0A">admin@localhost</a>.</p>\n\n<br>\n</div>\n\n<hr>\n<div id="footer">\n<p>Generated Mon, 02 Dec 2024 21:03:36 GMT by localhost (squid)</p>\n<!-- ERR_CANNOT_FORWARD -->\n</div>\n</body></html>\n'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/myuser/cryosparc_worker/bin/connect.py", line 78, in <module>
cli = client.CommandClient(host=master_hostname, port=command_core_port, service="command_core")
File "/home/myuser/cryosparc_worker/cryosparc_compute/client.py", line 38, in __init__
super().__init__(service, host, port, url, timeout, headers, cls=NumpyEncoder)
File "/home/myuser/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 97, in __init__
self._reload() # attempt connection immediately to gather methods
File "/home/myuser/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 135, in _reload
system = self._get_callable("system.describe")()
File "/home/myuser/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 108, in func
raise CommandError(
cryosparc_tools.cryosparc.errors.CommandError: *** (http://cryoem8.ourdomain:39002, code 500) Encounted error from JSONRPC function "system.describe" with params ()
Edit: when I use the IP address for the --master option instead of the hostname, it connects. Why would that be?
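A possible lead, since the error page above is generated by Squid and not by CryoSPARC: the request may be going through an HTTP proxy set in the worker's environment. Below is a minimal sketch to check this. It assumes the worker's client relies on Python urllib's default proxy handling (the User-Agent in the error page is Python-urllib/3.10). proxy_for is a hypothetical helper name, not part of CryoSPARC. It should be run on the worker as the same user, in the same shell environment used for cryosparcw connect.

```python
import os
import urllib.request


def proxy_for(host):
    """Return the HTTP proxy urllib would use for http://<host>, or None.

    Uses urllib's own lookup, which reads http_proxy / no_proxy (and the
    upper-case variants) from the environment.
    """
    proxies = urllib.request.getproxies()
    http_proxy = proxies.get("http")
    # No proxy configured, or the host is excluded via no_proxy.
    if http_proxy is None or urllib.request.proxy_bypass(host):
        return None
    return http_proxy


if __name__ == "__main__":
    # Show the raw environment variables urllib consults.
    for name in ("http_proxy", "HTTP_PROXY", "no_proxy", "NO_PROXY"):
        print(f"{name}={os.environ.get(name)!r}")

    # Hostnames taken from the output above; add the master's IP address
    # to this tuple to compare it with the hostnames.
    for host in ("cryoem8.ourdomain", "cryoem8.ourdomain.edu"):
        print(host, "->", proxy_for(host) or "direct connection")
```

If a hostname maps to a proxy while the IP address shows a direct connection, adding the master hostname to no_proxy (or unsetting http_proxy for the CryoSPARC user) would be the thing to try.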
./cryosparcm cli "get_scheduler_targets()"
[{'cache_path': '/scratch/cryosparc-cache', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 1, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 2, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 3, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 4, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 5, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 6, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 7, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'cryoem8.ourdomain.edu', 'lane': 'default', 'monitor_port': None, 'name': 'cryoem8.ourdomain.edu', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95], 'GPU': [0, 1, 2, 3, 4, 5, 6, 7], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 
170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192]}, 'ssh_str': 'myuser@cryoem8.ourdomain.edu', 'title': 'Worker node cryoem8.ourdomain.edu', 'type': 'node', 'worker_bin_path': '/home/myuser/cryosparc_worker/bin/cryosparcw'}, {'cache_path': '/scratch/cryosparc-cache', 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 1, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 2, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 3, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 4, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 5, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 6, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}, {'id': 7, 'mem': 11539054592, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'cryoem7', 'lane': 'cryoem7', 'monitor_port': None, 'name': 'cryoem7', 'resource_fixed': {'SSD': True}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95], 'GPU': [0, 1, 2, 3, 4, 5, 6, 7], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 
118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192]}, 'ssh_str': 'myuser@cryoem7.ourdomain.edu', 'title': 'Worker node cryoem7', 'type': 'node', 'worker_bin_path': '/home/myuser/cryosparc_worker/bin/cryosparcw'}, {'cache_path': None, 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11538923520, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'spgpu2', 'lane': 'box', 'monitor_port': None, 'name': 'spgpu2', 'resource_fixed': {'SSD': False}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], 'GPU': [0], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}, 'ssh_str': 'myuser@spgpu2.ourdomain.edu', 'title': 'Worker node spgpu2', 'type': 'node', 'worker_bin_path': '/home/myuser/cryosparc_worker/bin/cryosparcw'}, {'cache_path': None, 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11538923520, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'spgpu3', 'lane': 'box', 'monitor_port': None, 'name': 'spgpu3', 'resource_fixed': {'SSD': False}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], 'GPU': [0], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}, 'ssh_str': 'myuser@spgpu3.ourdomain.edu', 'title': 
'Worker node spgpu3', 'type': 'node', 'worker_bin_path': '/home/myuser/cryosparc_worker/bin/cryosparcw'}, {'cache_path': None, 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 11538923520, 'name': 'NVIDIA GeForce RTX 2080 Ti'}], 'hostname': 'spgpu4', 'lane': 'box', 'monitor_port': None, 'name': 'spgpu4', 'resource_fixed': {'SSD': False}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], 'GPU': [0], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}, 'ssh_str': 'myuser@spgpu4.ourdomain.edu', 'title': 'Worker node spgpu4', 'type': 'node', 'worker_bin_path': '/home/myuser/cryosparc_worker/bin/cryosparcw'}, {'cache_path': None, 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 3149856768, 'name': 'NVIDIA GeForce GTX 1060 3GB'}], 'hostname': 'exxgpu1', 'lane': 'slow', 'monitor_port': None, 'name': 'exxgpu1', 'resource_fixed': {'SSD': False}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 'GPU': [0], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7]}, 'ssh_str': 'myuser@exxgpu1.ourdomain.edu', 'title': 'Worker node exxgpu1', 'type': 'node', 'worker_bin_path': '/home/myuser/cryosparc_worker/bin/cryosparcw'}, {'cache_path': None, 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 3149856768, 'name': 'NVIDIA GeForce GTX 1060 3GB'}], 'hostname': 'spgpu1', 'lane': 'box', 'monitor_port': None, 'name': 'spgpu1', 'resource_fixed': {'SSD': False}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], 'GPU': [0], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]}, 'ssh_str': 'myuser@spgpu1.ourdomain.edu', 'title': 'Worker node spgpu1', 'type': 'node', 'worker_bin_path': '/scratch/software/cryosparc/cryosparc_worker/bin/cryosparcw'}, {'cache_path': None, 'cache_quota_mb': None, 'cache_reserve_mb': 10000, 'desc': None, 'gpus': [{'id': 0, 'mem': 51041271808, 'name': 'NVIDIA RTX A6000'}, {'id': 1, 'mem': 51041271808, 'name': 'NVIDIA RTX A6000'}, {'id': 2, 'mem': 51041271808, 'name': 'NVIDIA RTX A6000'}, {'id': 3, 'mem': 51041271808, 'name': 'NVIDIA RTX A6000'}, {'id': 4, 'mem': 51041271808, 'name': 'NVIDIA RTX A6000'}, {'id': 5, 'mem': 51041271808, 'name': 'NVIDIA RTX A6000'}, {'id': 6, 'mem': 51041271808, 'name': 'NVIDIA RTX A6000'}, {'id': 7, 'mem': 51041271808, 'name': 'NVIDIA RTX A6000'}], 'hostname': 'cryoem9.ourdomain.edu', 'lane': 'cryoem9', 'monitor_port': None, 'name': 'cryoem9.ourdomain.edu', 'resource_fixed': {'SSD': False}, 'resource_slots': {'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71], 'GPU': [0, 1, 2, 3, 4, 5, 6, 7], 'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 
162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257]}, 'ssh_str': 'exx@cryoem9.ourdomain.edu', 'title': 'Worker node cryoem9.ourdomain.edu', 'type': 'node', 'worker_bin_path': '/home/myuser/cryosparc_worker/bin/cryosparcw'}]
cryoem9 is the new worker. The job status is stuck on “Running job on remote worker hostname cryoem9”.
2024-12-02 16:51:46,560 scheduler_run_core INFO | Now trying to schedule J2056
2024-12-02 16:51:46,561 scheduler_run_job INFO | Scheduling job to cryoem9
2024-12-02 16:51:47,612 scheduler_run_job INFO | Not a commercial instance - heartbeat set to 12 hours.
2024-12-02 16:51:47,935 scheduler_run_job INFO | Launchable! -- Launching.
2024-12-02 16:51:47,943 set_job_status INFO | Status changed for P1.J2056 from queued to launched
2024-12-02 16:51:47,944 app_stats_refresh INFO | Calling app stats refresh url http://cryoem8.ourdomain:39000/api/actions/stats/refresh_job for project_uid P1, workspace_uid None, job_uid J2056 with body {'projectUid': 'P1', 'jobUid': 'J2056'}
2024-12-02 16:51:47,949 app_stats_refresh INFO | code 200, text {"success":true}
2024-12-02 16:51:47,956 run_job INFO | Running P1 J2056
2024-12-02 16:51:47,956 run_job INFO | Running job using: /home/myuser/cryosparc_worker/bin/cryosparcw
2024-12-02 16:51:47,956 run_job INFO | Running job on remote worker node hostname cryoem9
2024-12-02 16:51:47,957 run_job INFO | cmd: bash -c "nohup /home/myuser/cryosparc_worker/bin/cryosparcw run --project P1 --job J2056 --master_hostname cryoem8.ourdomain --master_command_core_port 39002 > /home/workstation/Zuker/CS-zuker/J2056/job.log 2>&1 & "
2024-12-02 16:51:48,529 run_job INFO |
2024-12-02 16:51:48,529 scheduler_run_core INFO | Finished
The job.log has the same errors:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "cryosparc_master/cryosparc_compute/run.py", line 255, in cryosparc_master.cryosparc_compute.run.run
File "cryosparc_master/cryosparc_compute/run.py", line 50, in cryosparc_master.cryosparc_compute.run.main
File "/home/ouruser/cryosparc_worker/cryosparc_compute/jobs/runcommon.py", line 131, in connect
cli = client.CommandClient(master_hostname, int(master_command_core_port), service="command_core")
File "/home/ouruser/cryosparc_worker/cryosparc_compute/client.py", line 38, in __init__
super().__init__(service, host, port, url, timeout, headers, cls=NumpyEncoder)
File "/home/ouruser/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 97, in __init__
self._reload() # attempt connection immediately to gather methods
File "/home/ouruser/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 135, in _reload
system = self._get_callable("system.describe")()
File "/home/ouruser/cryosparc_worker/cryosparc_tools/cryosparc/command.py", line 108, in func
raise CommandError(
cryosparc_tools.cryosparc.errors.CommandError: *** (http://cryoem8.ourdomain.edu:39002, code 500) Encounted error from JSONRPC function "system.describe" with params ()