I tried running a Deep Picker training job and got the error below. Making this job work is not a high priority for me, but I thought I would raise the issue. It looks like a compatibility issue with TensorFlow. Is this something I need to update on my end, or is CryoSPARC 4.7.1-cuda12 using a version of TensorFlow that is not yet compatible with CUDA 12.8?
Traceback (most recent call last):
File "cryosparc_master/cryosparc_compute/run.py", line 129, in cryosparc_master.cryosparc_compute.run.main
File "cryosparc_master/cryosparc_compute/jobs/deep_picker/run_deep_picker.py", line 275, in cryosparc_master.cryosparc_compute.jobs.deep_picker.run_deep_picker.run_deep_picker_train
File "cryosparc_master/cryosparc_compute/jobs/deep_picker/train.py", line 56, in cryosparc_master.cryosparc_compute.jobs.deep_picker.train.train_picker
File "cryosparc_master/cryosparc_compute/jobs/deep_picker/train.py", line 118, in cryosparc_master.cryosparc_compute.jobs.deep_picker.train.train_picker
File "/home/turul_csparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 1510, in shuffle
return shuffle_op._shuffle( # pylint: disable=protected-access
File "/home/turul_csparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/tensorflow/python/data/ops/shuffle_op.py", line 32, in _shuffle
return _ShuffleDataset(
File "/home/turul_csparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/tensorflow/python/data/ops/shuffle_op.py", line 51, in __init__
self._seed, self._seed2 = random_seed.get_seed(seed)
File "/home/turul_csparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/tensorflow/python/data/util/random_seed.py", line 50, in get_seed
math_ops.equal(seed, 0), math_ops.equal(seed2, 0)),
File "/home/turul_csparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/turul_csparc/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 6002, in raise_from_not_ok_status
raise core._status_to_exception(e) from None # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InternalError: {{function_node __wrapped__Equal_device_/job:localhost/replica:0/task:0/device:GPU:0}} 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, 0, reinterpret_cast<CUstream>(stream), params, nullptr)' failed with 'CUDA_ERROR_INVALID_HANDLE' [Op:Equal] name:
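
In case it helps narrow this down, here is a minimal sketch of a check that could be run with the Python interpreter of the `cryosparc_worker_env` environment seen in the paths above. It prints the CUDA and cuDNN versions that the installed TensorFlow reports it was built against, lists the GPUs TensorFlow can see, and then attempts an `Equal` op on `GPU:0`, the op named in the error. The helper name `summarize_build` is made up for illustration, and the last step is only an assumption about how to exercise the failing op; it may or may not reproduce the failure outside the job.

```python
"""Report which CUDA version the installed TensorFlow was built against."""


def summarize_build(tf_version, build_info):
    """Format TensorFlow build details; missing keys are shown as 'unknown'.

    Hypothetical helper for this post, not part of TensorFlow or CryoSPARC.
    """
    cuda = build_info.get("cuda_version", "unknown")
    cudnn = build_info.get("cudnn_version", "unknown")
    return f"tensorflow={tf_version} built_for_cuda={cuda} cudnn={cudnn}"


def main():
    try:
        import tensorflow as tf
    except ImportError:
        print("tensorflow is not importable in this environment")
        return

    # Build-time CUDA/cuDNN versions as reported by TensorFlow itself.
    print(summarize_build(tf.__version__, dict(tf.sysconfig.get_build_info())))

    gpus = tf.config.list_physical_devices("GPU")
    print("visible GPUs:", gpus)

    if gpus:
        # Assumption: an int64 Equal placed on GPU:0 exercises the same
        # kernel launch that fails in the traceback.
        with tf.device("/GPU:0"):
            zero = tf.constant(0, dtype=tf.int64)
            print(tf.math.equal(zero, zero))


if __name__ == "__main__":
    main()
```

If the reported build CUDA version is older than what the GPU and driver expect, that would be consistent with the compatibility question above, but I have not confirmed that this is the cause.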