NVIDIA, CUDA, etc. update

hi,
I updated the NVIDIA drivers and screwed everything up.

I have Ubuntu 22, and
CryoSPARC was working fine until the driver update.
I tried many combinations of driver and CUDA versions.
The problem first showed up as a crash in a 3D Variability job:

“Call to cuInit results in CUDA_ERROR_NO_DEVICE (100)”

I have seen various suggestions online. The cryosparcw gpulist check fails
with the same error. I have updated CryoSPARC to 4.71 and still
get this error.

I have purged NVIDIA and CUDA and reinstalled them, and I am going crazy.

thanks
jpd
cryosparcw gpulist

hi,
I think I found something. The current update changed
/etc/modprobe.d/virtualgl.conf to set
NVreg_DeviceFileMode=0660

so that the /dev/nvidia* device files were no longer readable by all users.
No matter what you did, the permissions got reset.
Changing that setting to
NVreg_DeviceFileMode=0666

allows every user to read them. The cryosparcw gpulist test
now passes, and my job hasn't crashed yet.
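
For anyone hitting the same error, the check described above can be sketched as a small Python script. This is only a rough sketch, not anything shipped with CryoSPARC or the NVIDIA driver: the function names are made up for illustration, and it assumes the device nodes live under /dev/nvidia* and the module options under /etc/modprobe.d/*.conf, as on this system.

```python
import glob
import os
import re
import stat


def device_modes(pattern="/dev/nvidia*"):
    """Return {path: permission bits} for entries matching pattern, skipping directories."""
    modes = {}
    for path in sorted(glob.glob(pattern)):
        if os.path.isdir(path):
            continue  # only device nodes (or plain files) are of interest here
        modes[path] = stat.S_IMODE(os.stat(path).st_mode)
    return modes


def world_accessible(mode):
    """True if 'other' users have both read and write permission (the 0666 case)."""
    return (mode & 0o006) == 0o006


def find_device_file_mode(conf_glob="/etc/modprobe.d/*.conf"):
    """Return [(conf_file, value)] for every uncommented NVreg_DeviceFileMode setting."""
    hits = []
    for conf in sorted(glob.glob(conf_glob)):
        try:
            with open(conf, encoding="utf-8", errors="replace") as fh:
                for line in fh:
                    if line.lstrip().startswith("#"):
                        continue  # ignore commented-out options
                    match = re.search(r"NVreg_DeviceFileMode=(\w+)", line)
                    if match:
                        hits.append((conf, match.group(1)))
        except OSError:
            continue  # unreadable file, skip it
    return hits


if __name__ == "__main__":
    for path, mode in device_modes().items():
        label = "ok" if world_accessible(mode) else "restricted"
        print(f"{path}: {mode:04o} ({label})")
    for conf, value in find_device_file_mode():
        print(f"{conf}: NVreg_DeviceFileMode={value}")
```

Note that "restricted" only means the 'other' permission bits are not set. A user who belongs to the group that owns the device files could still get access with 0660, so treat the output as a hint rather than a verdict.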

thanks
jpd

Thanks @jpd for sharing your finding. Do you know how the /etc/modprobe.d/virtualgl.conf was installed on your system and whether it is still needed?

hi,

I don't know which update introduced it.
I tried many.
And no, I don't know if it's still needed.

Permissions on
/dev/nvidia* currently match those on another, similar server that
has no problems. That server has an older driver and a slightly older kernel.

/etc/modprobe.d/virtualgl.conf may not have been installed during an update, but may be part of another software package or program.
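
One way to narrow down where the file came from is to ask the package manager whether any installed package owns it. The sketch below wraps `dpkg -S`, which searches installed packages for a path on Debian/Ubuntu systems. The helper name `owning_package` is made up for illustration. If it returns None, the file is not tracked by dpkg, which would suggest it was written by a setup script or by hand rather than shipped inside a package.

```python
import subprocess


def owning_package(path):
    """Return dpkg's 'package: path' line for path, or None if no package owns it."""
    try:
        result = subprocess.run(
            ["dpkg", "-S", path],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return None  # dpkg is not available on this system
    if result.returncode != 0:
        return None  # dpkg found no package that owns this path
    return result.stdout.strip() or None


if __name__ == "__main__":
    print(owning_package("/etc/modprobe.d/virtualgl.conf"))
```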