Hi,
When I run Topaz train in v3.2, after running apparently normally for some time it terminates with the attached error. This is on CentOS 7, with RTX-3090 cards, CUDA 11.2, Topaz 0.2.4. Thoughts?
Cheers
Oli
Hi @olibclarke,
There seems to be a conflict between the latest Topaz release and cryoSPARC v3.2. You could try installing a different Topaz version. Have a look at this thread…
cryoSPARC v.3.1.0 and Topaz
Best,
Juan
Thanks Juan, but this doesn’t seem to be the same error, and I am using the version of Topaz that is apparently OK per that thread… I suspect it may have something to do with these new cards, which require CUDA 11.1, but I'm not entirely sure.
Cheers
Oli
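The suspicion about the new cards can be checked against compute capabilities. A minimal sketch, assuming the RTX 3090 reports compute capability 8.6 and that the hand-written table below covers the cards in question; `MIN_TOOLKIT`, `parse_version` and `toolkit_supports` are hypothetical names used only for illustration, and the check only covers whether a toolkit release can natively target a card:

```python
# Minimum CUDA toolkit release that can natively target each compute capability.
# Hand-written for illustration; consult NVIDIA's documentation for other cards.
MIN_TOOLKIT = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere (A100)
    (8, 6): (11, 1),  # Ampere (e.g. RTX 3090)
}


def parse_version(text):
    """Turn '11.1' or '11.1.0' into a (major, minor) tuple."""
    parts = text.split(".")
    return int(parts[0]), int(parts[1])


def toolkit_supports(capability, toolkit_version):
    """Return True if the toolkit release can target the given capability."""
    return parse_version(toolkit_version) >= MIN_TOOLKIT[capability]


print(toolkit_supports((8, 6), "10.1"))  # 8.6 card, CUDA 10.1 build -> False
print(toolkit_supports((8, 6), "11.1"))  # 8.6 card, CUDA 11.1 build -> True
```

If this holds, a Topaz environment whose PyTorch was built against a CUDA toolkit older than 11.1 would be a plausible cause on a 3090, even when the driver itself is recent.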
Hi Oli,
I just tried it on our cryoSPARC 3.2.0 with Topaz 0.2.3 and CUDA 10.1 + RTX 5000 and it worked fine. My guess is that Topaz and CUDA 11.1 are not compatible yet. I will open an issue on the Topaz GitHub. Thanks!
Best,
-Alex
Hi @olibclarke, @alexjamesnoble,
I just tested this on our machine with a 3090 on cryoSPARC v3.2, and I was able to get the job to complete successfully:
platform_release: "5.4.0-65-generic"
platform_version: "#73~18.04.1-Ubuntu SMP Tue Jan 19 09:02:24 UTC 2021"
platform_architecture: "x86_64"
name: "GeForce RTX 3090"
CUDA_version: "11.1.0" # this is the version of CUDA that pyCUDA was built with
nvidia-smi:
Thu Apr  8 15:15:40 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    Off  | 00000000:0A:00.0 Off |                  N/A |
|  0%   49C    P8    18W / 350W |      2MiB / 24268MiB |      0%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 3090    Off  | 00000000:42:00.0 Off |                  N/A |
|  0%   47C    P8    23W / 350W |      2MiB / 24265MiB |      0%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
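The two CUDA numbers above answer different questions: `CUDA_version` is the toolkit PyCUDA was built with, while the `CUDA Version` in the nvidia-smi banner is the newest CUDA runtime the installed driver supports. A small sketch for pulling the driver-side values out of the banner, assuming the header layout shown above; `parse_smi_header` is a hypothetical helper, not part of any NVIDIA tool:

```python
import re


def parse_smi_header(line):
    """Extract driver and CUDA versions from the nvidia-smi banner line."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line
    )
    if match is None:
        return None  # line is not the banner, or the layout differs
    return {"driver": match.group(1), "cuda": match.group(2)}


header = "| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |"
print(parse_smi_header(header))  # {'driver': '460.32.03', 'cuda': '11.2'}
```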
One change I made was to install PyTorch 1.8.1 with CUDA Toolkit 11.1:
conda activate topaz
conda install pytorch cudatoolkit=11.1 -c pytorch -c conda-forge
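To confirm that the reinstall took effect, something like the following can be run with the Python interpreter of the `topaz` environment. It is only a sketch: `torch_cuda_report` is a hypothetical helper, and it returns an empty dict when PyTorch is not importable rather than raising.

```python
def torch_cuda_report():
    """Summarise what the active environment's PyTorch sees.

    Returns an empty dict if PyTorch is not installed in this environment.
    """
    try:
        import torch
    except ImportError:
        return {}
    report = {
        "torch": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        # Query the first visible GPU only.
        report["device"] = torch.cuda.get_device_name(0)
        report["capability"] = torch.cuda.get_device_capability(0)
    return report


print(torch_cuda_report())
```

On a setup like the one above, `built_with_cuda` would be expected to read 11.1 and `cuda_available` to be true; a CUDA 10.x value there would suggest the older PyTorch build is still the one being picked up.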
Thanks for checking, Stephan! We closed the Topaz issue. Can this be added to the cryoSPARC Topaz installation recommendations?
-Alex
Definitely, done!
Hi Stephan,
Does Topaz now work with cryoSPARC 3.2.0 without deactivating the cryoSPARC anaconda environment, as described here?
Cheers,
Dirk.