Update worker to 4.4.0 failed

Hello,

On a single-node install:

Now updating worker nodes.

===================================================

All workers:

sbiotest01.mskcc.org sparc@sbiotest01.mskcc.org


Updating worker sbiotest01.mskcc.org

Remote update

scp ./cryosparc_worker.tar.gz sparc@sbiotest01.mskcc.org:/home/sparc/cryosparc_worker

Authentication failed.

lost connection

Failed to update sbiotest01.mskcc.org! Skipping…

Manual update:
cp ./cryosparc_worker.tar.gz /home/sparc/cryosparc_worker
bin/cryosparcw update
Updating… checking versions
Current version v4.2.1 - New version v4.0.3
This was followed by a master/worker version mismatch. Please advise.

Thank you,

Yehuda

The cp command is sensitive to the current working directory; you may have copied an old cryosparc_worker.tar.gz package. You may want to try the following instead, assuming that:

  • your master processes are using /home/sparc/cryosparc_master/
  • /home/sparc/cryosparc_master/cryosparc_worker.tar.gz exists and is not older than Nov 8, 2023

cd /home/sparc/cryosparc_worker/
rm cryosparc_worker.tar.gz
ln -s ../cryosparc_master/cryosparc_worker.tar.gz
./bin/cryosparcw update
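
Before running the update, it may be worth confirming the second assumption above (that the package exists and is recent enough). The following is only a minimal sketch: `tarball_is_recent` is a hypothetical helper, not a CryoSPARC command, and it assumes GNU coreutils `date` (with the `-r FILE` and `-d DATE` options). The path and cutoff date are the ones mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of CryoSPARC): succeeds only if the file
# at $1 exists and was last modified on or after the date in $2 (YYYY-MM-DD).
# Assumes GNU coreutils `date`, which supports -r FILE and -d DATE.
tarball_is_recent() {
    pkg=$1
    cutoff=$2
    [ -f "$pkg" ] || return 1
    pkg_epoch=$(date -r "$pkg" +%s) || return 1
    cutoff_epoch=$(date -d "$cutoff" +%s) || return 1
    [ "$pkg_epoch" -ge "$cutoff_epoch" ]
}

# Path and date taken from the assumptions listed above.
master_pkg=/home/sparc/cryosparc_master/cryosparc_worker.tar.gz
if tarball_is_recent "$master_pkg" 2023-11-08; then
    echo "package looks current: $master_pkg"
else
    echo "package missing or older than the cutoff: $master_pkg"
fi
```

The modification time is only a rough proxy for the package version, so a passing check does not guarantee that the master and worker versions will match.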

Thank you. I figured that out; however, I now get:
Failed to complete GPU benchmark on GPU 0: cuda failure (driver API): cuMemAlloc(&plan_cache.plans[idx].workspace, plan_cache.plans[idx].worksz)
→ CUDA_ERROR_OUT_OF_MEMORY out of memory
So I am back to 4.3.1…

Thanks for reporting this. Please can you

  • post the model of the GPU where the error occurred
  • post a screenshot (of the Event Log) or a snippet of the job log (Metadata|Log) that shows additional lines around the error
  • email us the error report for the benchmark job?

You may also want to hold off on downgrading, as we are investigating a similar error that we have encountered in the benchmark job. Did you encounter problems with any non-benchmark, GPU-enabled jobs on v4.4?
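
For the GPU model requested above, one possible way to collect it is with `nvidia-smi`, which ships with the NVIDIA driver. This is a sketch rather than an official CryoSPARC procedure; `gpu_summary` is a hypothetical wrapper name, and the memory columns are included only because the reported error is an out-of-memory failure.

```shell
#!/bin/sh
# Hypothetical wrapper: prints one CSV row per GPU with its index, model,
# total and used memory, and driver version. Falls back to a message if
# nvidia-smi is missing or cannot query the GPUs.
gpu_summary() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=index,name,memory.total,memory.used,driver_version --format=csv \
            || echo "nvidia-smi is installed but could not query the GPUs"
    else
        echo "nvidia-smi not found on this host"
    fi
}

gpu_summary
```

Running this on the worker host while no jobs are active would also show whether another process is already holding memory on GPU 0.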