Welcome to the forum, @bfocassio.
If your cluster is limiting a job’s access to GPU devices using cgroups, as I would recommend,
a nonzero CUDA_VISIBLE_DEVICES index in your script would refer to a (virtually) non-existing device, given the request for a single device.
In that case,
export CUDA_VISIBLE_DEVICES=0
might work, but it might be better to omit all CUDA_VISIBLE_DEVICES
definitions from the script template (see CUDA_ERROR_NO_DEVICE - but only when AF2 is running! - #9 by wtempel).
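
If you want to check this from inside a job, here is a minimal sketch of what "omitting the definition" could look like at the top of the script. The function name clear_cuda_visible_devices is just something I made up for illustration, and I am assuming that nvidia-smi is on the PATH of your compute nodes and that it only lists the devices the cgroup lets the job see; please verify both on your cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any template): drop an inherited
# CUDA_VISIBLE_DEVICES so that CUDA numbers the devices the job was
# actually granted, starting from 0.
clear_cuda_visible_devices() {
    # ${VAR+x} expands to "x" only if VAR is set, even if it is empty.
    if [ -n "${CUDA_VISIBLE_DEVICES+x}" ]; then
        echo "unsetting CUDA_VISIBLE_DEVICES (was '${CUDA_VISIBLE_DEVICES}')" >&2
        unset CUDA_VISIBLE_DEVICES
    fi
}

clear_cuda_visible_devices

# Assumption: nvidia-smi is installed on the node. With a single-GPU
# request, this would be expected to list exactly one device.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || echo "nvidia-smi could not list devices" >&2
fi
```

If the listing shows a single GPU, then index 0 is the only valid one inside that job, which is why leaving the variable unset should be the safer choice.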