Hi,
When I run Topaz train in v3.2, after running apparently normally for some time it terminates with the attached error. This is on CentOS 7, with RTX-3090 cards, CUDA 11.2, Topaz 0.2.4. Thoughts?
Cheers
Oli
Hi @olibclarke,
There seems to be a conflict between the latest Topaz and cryoSPARC v3.2. You could try installing a different Topaz version. Have a look at this thread…
cryoSPARC v.3.1.0 and Topaz
Best,
Juan
Thanks Juan - but this doesn’t seem to be the same error, and I am using the version of Topaz that is apparently ok per that thread… I suspect it may have something to do with these new cards, which require CUDA 11.1, but I'm not entirely sure.
Cheers
Oli
Hi Oli,
I just tried it on our cryoSPARC 3.2.0 with Topaz 0.2.3 and CUDA 10.1 + RTX 5000 and it worked fine. My guess is that Topaz and CUDA 11.1 are not compatible yet. I will open an issue on the Topaz GitHub. Thanks!
Best,
-Alex
Hi @olibclarke, @alexjamesnoble,
I just tested this on our machine with a 3090 on cryoSPARC v3.2, and I was able to get the job to complete successfully:
platform_release: "5.4.0-65-generic"
platform_version: "#73~18.04.1-Ubuntu SMP Tue Jan 19 09:02:24 UTC 2021"
platform_architecture: "x86_64"
name: "GeForce RTX 3090"
CUDA_version: "11.1.0" # this is the version of CUDA that pyCUDA was built with
nvidia-smi:
Thu Apr 8 15:15:40 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce RTX 3090 Off | 00000000:0A:00.0 Off | N/A |
| 0% 49C P8 18W / 350W | 2MiB / 24268MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 GeForce RTX 3090 Off | 00000000:42:00.0 Off | N/A |
| 0% 47C P8 23W / 350W | 2MiB / 24265MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
One change I made was to install PyTorch 1.8.1 with CUDA Toolkit 11.1:
conda activate topaz
conda install pytorch cudatoolkit=11.1 -c pytorch -c conda-forge
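To confirm that the reinstall took effect, a quick check along these lines may help when run inside the topaz environment. It is only a sketch: the helper names (`parse_cuda_version`, `cuda_build_supports_rtx30`) are made up for illustration, and the 11.1 threshold is an assumption taken from the discussion above (RTX 30-series cards needing a CUDA 11.1 or newer build), not something Topaz or cryoSPARC reports itself.

```python
import re

# Assumption from this thread: RTX 30-series cards need a PyTorch build
# compiled against CUDA 11.1 or newer.
MIN_CUDA_FOR_RTX30 = (11, 1)


def parse_cuda_version(text):
    """Turn a version string such as '11.1' or '11.1.0' into (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised CUDA version: {text!r}")
    return int(match.group(1)), int(match.group(2))


def cuda_build_supports_rtx30(cuda_version):
    """Return True if the given CUDA build version meets the assumed minimum."""
    return parse_cuda_version(cuda_version) >= MIN_CUDA_FOR_RTX30


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        built_with = torch.version.cuda  # None for CPU-only builds
        print("PyTorch", torch.__version__, "built with CUDA", built_with)
        print("GPU visible to PyTorch:", torch.cuda.is_available())
        if built_with is not None:
            print("Build new enough for RTX 30-series:",
                  cuda_build_supports_rtx30(built_with))
```

If the script reports a CUDA build older than 11.1, or a CPU-only build, the conda install above probably did not replace the PyTorch package that Topaz is actually using.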
Thanks @stephan, we will give this a go!
Thanks for checking, Stephan! We closed the Topaz issue. Can this be added to the cryoSPARC Topaz installation recommendations?
-Alex
Yes thanks @stephan - after doing that it works on our system too
Definitely, done!
Hi Stephan,
Does Topaz now work with cryoSPARC 3.2.0 without deactivating the cryoSPARC anaconda environment, as described here?
Cheers,
Dirk.