When running the GPU test to validate our install, we get an error that we traced back to newer versions of nvidia-smi having dropped or changed some of the options the test specifies. For example, --querygpu is now --query-gpu. Driver package 530.30.02 looks like it should work, but 535.54.03, which came out on June 13th, appears to have dropped some of these options and causes the GPU tests to fail. In the output below, nvidia-smi rejects the field clocks_throttle_reasons.sw_power_cap as "not a valid field to query". Has anyone else had this experience? Is there any chance it could be “fixed” in a future patch/release? Thanks
--------------------
[CPU: 208.9 MB]
Obtaining GPU info via `nvidia-smi`...
[CPU: 209.0 MB]
Traceback (most recent call last):
  File "/scratch/cluster_scratch/cryosparc/ncif-wolin-cryosparc/cryosparc_worker/cryosparc_compute/jobs/instance_testing/nvidia_smi_util.py", line 41, in run_nvidia_smi_query
    memory_use_info = output_to_list(subprocess.check_output(
  File "/scratch/cluster_scratch/cryosparc/ncif-wolin-cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/scratch/cluster_scratch/cryosparc/ncif-wolin-cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['nvidia-smi', '--query-gpu=name,pci.bus_id,driver_version,persistence_mode,power.limit,clocks_throttle_reasons.sw_power_cap,clocks_throttle_reasons.hw_slowdown,compute_mode,pcie.link.gen.max,pcie.link.gen.current,temperature.gpu,utilization.gpu,utilization.memory', '--format=csv,noheader,nounits']' returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 96, in cryosparc_compute.run.main
  File "/scratch/cluster_scratch/cryosparc/ncif-wolin-cryosparc/cryosparc_worker/cryosparc_compute/jobs/instance_testing/run.py", line 96, in run_gpu_job
    gpu_info = nvidia_smi_util.run_nvidia_smi_query(
  File "/scratch/cluster_scratch/cryosparc/ncif-wolin-cryosparc/cryosparc_worker/cryosparc_compute/jobs/instance_testing/nvidia_smi_util.py", line 44, in run_nvidia_smi_query
    raise RuntimeError(
RuntimeError: command '['nvidia-smi', '--query-gpu=name,pci.bus_id,driver_version,persistence_mode,power.limit,clocks_throttle_reasons.sw_power_cap,clocks_throttle_reasons.hw_slowdown,compute_mode,pcie.link.gen.max,pcie.link.gen.current,temperature.gpu,utilization.gpu,utilization.memory', '--format=csv,noheader,nounits']' returned with error (code 2): b'Field "clocks_throttle_reasons.sw_power_cap" is not a valid field to query.\n\n'
---------------------------------------------------
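
To check which of the requested fields this driver actually rejects, something like the Python sketch below might help. It is not CryoSPARC's code and not a proposed patch. The names query_supported_fields, run_smi and INVALID_FIELD_RE are made up for illustration, and the sketch assumes nvidia-smi keeps reporting a rejected field with the same 'Field "..." is not a valid field to query.' message seen in the log above. It retries the query and drops one rejected field per attempt.

```python
import re
import subprocess
from typing import Callable, List, Sequence, Tuple

# Matches the message seen in the log above (assumed to be stable across drivers).
INVALID_FIELD_RE = re.compile(r'Field "([^"]+)" is not a valid field to query')

RunResult = Tuple[int, str, str]  # (exit code, stdout, stderr)


def run_smi(args: Sequence[str]) -> RunResult:
    """Run nvidia-smi with the given arguments and capture its output."""
    proc = subprocess.run(
        ["nvidia-smi", *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


def query_supported_fields(
    fields: Sequence[str],
    runner: Callable[[Sequence[str]], RunResult] = run_smi,
) -> Tuple[List[str], List[str], List[List[str]]]:
    """Query the fields, dropping any that nvidia-smi reports as invalid.

    Returns (kept_fields, dropped_fields, rows), one row per GPU.
    """
    remaining = list(fields)
    dropped: List[str] = []
    while remaining:
        code, out, err = runner(
            ["--query-gpu=" + ",".join(remaining), "--format=csv,noheader,nounits"]
        )
        if code == 0:
            # Naive CSV split; assumes no value contains a comma.
            rows = [
                [cell.strip() for cell in line.split(",")]
                for line in out.splitlines()
                if line.strip()
            ]
            return remaining, dropped, rows
        match = INVALID_FIELD_RE.search(out + err)
        if match is None or match.group(1) not in remaining:
            # Some other failure: surface it instead of looping forever.
            raise RuntimeError(
                "nvidia-smi failed (code %d): %s" % (code, (out + err).strip())
            )
        remaining.remove(match.group(1))
        dropped.append(match.group(1))
    raise RuntimeError("nvidia-smi rejected every requested field")
```

Calling query_supported_fields with the field list from the failing command would return the fields the driver accepted, the ones it rejected, and the per-GPU values. Running `nvidia-smi --help-query-gpu` on the same node lists the field names that the installed driver accepts, which is a quicker way to see whether a field was renamed or removed.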