The log file is probably too large. I think the cause may be that the worker was not properly connected, so I reconnected it with cryosparcw connect --worker localhost --master localhost --port 39000. Most of the jobs were imported, although some had errors, but I still couldn’t run jobs on the GPU. The log shows GPU-related errors, pasted below:
2023-06-21 12:01:03,676 COMMAND.SCHEDULER get_gpu_info INFO | UPDATING WORKER GPU INFO
2023-06-21 12:01:03,677 COMMAND.JOBS update_all_job_sizes INFO | UPDATING ALL JOB SIZES IN 10s
2023-06-21 12:01:03,678 COMMAND.DATA export_all_projects INFO | EXPORTING ALL PROJECTS IN 60s...
2023-06-21 12:01:03,732 COMMAND.SCHEDULER get_gpu_info_run ERROR | Failed to get GPU info on worker.cryosparc.localhost.com
2023-06-21 12:01:03,732 COMMAND.SCHEDULER get_gpu_info_run ERROR | Traceback (most recent call last):
2023-06-21 12:01:03,732 COMMAND.SCHEDULER get_gpu_info_run ERROR | File "/app/apps/rhel7/cryosparc/cryosparc2_master/cryosparc_command/command_core/__init__.py", line 1173, in get_gpu_info_run
2023-06-21 12:01:03,732 COMMAND.SCHEDULER get_gpu_info_run ERROR | value = subprocess.check_output(full_command, stderr=subprocess.STDOUT, shell=shell).decode()
2023-06-21 12:01:03,732 COMMAND.SCHEDULER get_gpu_info_run ERROR | File "/app/apps/rhel7/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/subprocess.py", line 411, in check_output
2023-06-21 12:01:03,732 COMMAND.SCHEDULER get_gpu_info_run ERROR | **kwargs).stdout
2023-06-21 12:01:03,732 COMMAND.SCHEDULER get_gpu_info_run ERROR | File "/app/apps/rhel7/cryosparc/cryosparc2_master/deps/anaconda/envs/cryosparc_master_env/lib/python3.7/subprocess.py", line 512, in run
2023-06-21 12:01:03,732 COMMAND.SCHEDULER get_gpu_info_run ERROR | output=stdout, stderr=stderr)
2023-06-21 12:01:03,732 COMMAND.SCHEDULER get_gpu_info_run ERROR | subprocess.CalledProcessError: Command '['ssh', 'cryosparc@worker.cryosparc.localhost.com', 'bash -c "eval $(/app/apps/rhel7/cryosparc/cryosparc2_worker/bin/cryosparcw env); timeout 30 python /app/apps/rhel7/cryosparc/cryosparc2_worker/cryosparc_compute/get_gpu_info.py"']' returned non-zero exit status 255.
2023-06-21 12:01:13,985 COMMAND.DATA dump_project INFO | Exporting project P1
2023-06-21 12:01:13,987 COMMAND.DATA dump_project INFO | Exported project P1 to /DATA01/cryosparc_hz/P1/project.json in 0.00s
2023-06-21 12:01:13,990 COMMAND.DATA dump_project INFO | Exporting project P2
2023-06-21 12:01:13,992 COMMAND.DATA dump_project INFO | Exported project P2 to /DATA01/cryosparc_hz/P48/project.json in 0.00s
2023-06-21 12:01:14,044 COMMAND.DATA dump_project INFO | Exporting project P3
2023-06-21 12:01:14,046 COMMAND.DATA dump_project INFO | Exported project P3 to /data2/cryosparc_home/P13/project.json in 0.00s
2023-06-21 12:01:14,073 COMMAND.DATA dump_project INFO | Exporting project P4
2023-06-21 12:01:14,075 COMMAND.DATA dump_project INFO | Exported project P4 to /data2/cryosparc_home/P25/project.json in 0.00s
2023-06-21 12:01:14,082 COMMAND.DATA dump_project INFO | Exporting project P5
2023-06-21 12:01:14,124 COMMAND.DATA dump_project INFO | Exported project P5 to /DATA01/cryosparc_hz/P48/project.json in 0.04s
2023-06-21 12:06:03,419 COMMAND.MAIN start INFO | === EXITED ===
2023-06-21 12:06:04,467 COMMAND.MAIN start INFO | === STARTED ===
2023-06-21 12:06:04,469 COMMAND.BG_WORKER background_worker INFO | === STARTED ===
2023-06-21 12:06:04,469 COMMAND.CORE run INFO | === STARTED TASKS WORKER ===
* Serving Flask app "command_core" (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
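
For what it's worth, the traceback ends with ssh returning exit status 255. As far as I know, ssh uses 255 for its own failures (connection, host key, or authentication) and otherwise passes through the exit status of the remote command, so the GPU query script may never have run at all. Below is a minimal sketch of how that could be checked from the master node. The function names explain_ssh_exit and check_worker_ssh are my own, hypothetical helpers and not part of CryoSPARC. BatchMode=yes and ConnectTimeout are standard OpenSSH options, chosen here so the check fails quickly instead of waiting for a password prompt.

```python
import subprocess


def explain_ssh_exit(returncode: int) -> str:
    """Interpret the exit status of an `ssh host command` call.

    ssh returns the remote command's exit status, or 255 if ssh itself
    hit an error (connection, host key, or authentication).
    """
    if returncode == 0:
        return "ok: SSH connected and the remote command succeeded"
    if returncode == 255:
        return "ssh error: ssh itself failed (connection, host key, or authentication)"
    return f"remote error: SSH connected, but the remote command exited with {returncode}"


def check_worker_ssh(target: str, run=subprocess.run) -> str:
    """Run a trivial remote command over SSH and classify the result.

    `target` is a user@host string. `run` is injectable (hypothetical
    convenience) so the logic can be exercised without a real connection.
    """
    cmd = [
        "ssh",
        "-o", "BatchMode=yes",       # never prompt for a password
        "-o", "ConnectTimeout=10",   # give up quickly if the host is unreachable
        target,
        "true",                      # remote no-op that exits 0 on success
    ]
    result = run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return explain_ssh_exit(result.returncode)
```

Calling check_worker_ssh("cryosparc@worker.cryosparc.localhost.com") as the user that runs the master should report an ssh error if passwordless SSH to the worker hostname in the log is not working. That is only my guess at the cause, not a confirmed diagnosis.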