Hi,
I updated the NVIDIA drivers and screwed everything up.
I'm on Ubuntu 22.
CryoSPARC was working fine until the driver update.
I've tried many combinations of driver and CUDA versions.
The problem first showed up as a crash in a 3D Variability job:
“Call to cuInit results in CUDA_ERROR_NO_DEVICE (100)”
I have seen various suggestions online. The cryosparcw gpulist check fails
with the same error. I have updated CryoSPARC to 4.71 and still
get this error.
I have purged NVIDIA and CUDA and reinstalled, and I'm going crazy.
hi,
I think I found something. One of the updates changed
/etc/modprobe.d/virtualgl.conf to set
NVreg_DeviceFileMode=0660
so that the /dev/nvidia* device files were no longer readable by everyone.
No matter what I did to them, the permissions got reset.
Changing that setting to
NVreg_DeviceFileMode=0666
lets every user read and write them. The cryosparcw gpulist test
now passes, and my job hasn't crashed yet.
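
For anyone hitting the same thing, here is a rough Python sketch that looks through the modprobe configuration for NVreg_DeviceFileMode settings, so you can see which file is pinning the device permissions. The function name find_device_file_modes and the idea of scanning every *.conf file are my own assumptions, not something from CryoSPARC or NVIDIA tooling, and the exact layout of the options line may differ on your system.

```python
import glob
import os
import re

# Matches e.g. "NVreg_DeviceFileMode=0660" anywhere on an options line.
MODE_RE = re.compile(r"NVreg_DeviceFileMode\s*=\s*([0-7]+)")


def find_device_file_modes(conf_dir="/etc/modprobe.d"):
    """Return (path, line_number, mode) for each NVreg_DeviceFileMode setting.

    Hypothetical helper: it only reads files and changes nothing.
    """
    hits = []
    for path in sorted(glob.glob(os.path.join(conf_dir, "*.conf"))):
        try:
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    # Ignore commented-out lines.
                    if line.lstrip().startswith("#"):
                        continue
                    match = MODE_RE.search(line)
                    if match:
                        hits.append((path, lineno, match.group(1)))
        except OSError:
            # Unreadable file: skip it rather than abort the scan.
            continue
    return hits


if __name__ == "__main__":
    for path, lineno, mode in find_device_file_modes():
        print(f"{path}:{lineno}: NVreg_DeviceFileMode={mode}")
```

If it reports 0660 in virtualgl.conf (or anywhere else), that is the value to look at. Module options like this are normally only applied when the nvidia kernel module is loaded, so a changed value will probably not show up until the module is reloaded or the machine is rebooted.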
I don't know which update introduced it;
I tried many.
And no, I don't know if the change is still needed.
The permissions on
/dev/nvidia* now match those on another, similar server that
has no problems. That server has an older driver and a slightly older kernel.
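
To compare the device permissions between two machines, something like the following could be run on each one. It is only a sketch: world_rw and device_report are hypothetical names, and it assumes that "readable by everyone" means the other-read and other-write bits are both set, as they are with mode 0666.

```python
import glob
import os
import stat


def world_rw(mode):
    """True if 'other' users have both read and write permission."""
    needed = stat.S_IROTH | stat.S_IWOTH
    return (mode & needed) == needed


def device_report(pattern="/dev/nvidia*"):
    """Return (path, octal mode string, world_rw flag) for matching nodes."""
    report = []
    for path in sorted(glob.glob(pattern)):
        try:
            st = os.stat(path)
        except OSError:
            continue
        # Skip directories that the pattern may also match.
        if stat.S_ISDIR(st.st_mode):
            continue
        mode = stat.S_IMODE(st.st_mode)
        report.append((path, format(mode, "04o"), world_rw(mode)))
    return report


if __name__ == "__main__":
    for path, mode, ok in device_report():
        flag = "ok" if ok else "NOT accessible to all users"
        print(f"{mode} {path} ({flag})")
```

A node reported as 0660 is only accessible to its owner and group, so a user outside that group (for example the account CryoSPARC runs under) would not be able to open it.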