Error in DeepEMhancer

Hello,

I set the paths for my DeepEMhancer job, but every time I run it I get this error:

Subprocess exited with status -11 …

Can anyone help me figure out how to fix this?

Please post additional messages from the Event and Job logs.

I added a picture of it, thank you!

Please post the text (to facilitate searches of the forum) of the error message in the picture, and text of relevant entries in the job log.

Double-check the path(s). Inside /opt/miniconda3/envs/deepEMhancer_env/lib, are there three files for tightTarget, highRes, and wideTarget? Do you need a slash after lib? What happens if you provide a mask and set the mask type to 2 (areas outside the mask set to 0)? This will ignore the 3 “types” of mask generation above.
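For example, something along these lines (a rough, untested check; only deepEMhancer_tightTarget.hd5 is named later in this thread, so the wideTarget and highRes filenames are assumed to follow the same pattern) would show whether the three checkpoints are actually in that folder:

# Untested sketch: list the expected checkpoint files in the configured model folder.
# The wideTarget and highRes filenames are assumptions based on the tightTarget name.
import os

model_dir = "/opt/miniconda3/envs/deepEMhancer_env/lib"  # the folder set in the job
for name in ("deepEMhancer_tightTarget.hd5",
             "deepEMhancer_wideTarget.hd5",
             "deepEMhancer_highRes.hd5"):
    path = os.path.join(model_dir, name)
    print(path, "found" if os.path.isfile(path) else "MISSING")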

Thank you!
I now get this error every time I run it:
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 93, in cryosparc_compute.run.main
  File "/opt/cryosparc3/cryosparc_worker/cryosparc_compute/jobs/deepemhancer/run.py", line 158, in run_deepemhancer_wrapper
    assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 1 (/opt/miniconda3/envs/deepEMhancer_env/bin/deepemhancer -i /data/data_jylee/femr_202112_krios/P3/J349/map_half_A_filtered.mrc -i2 /data/data_jylee/femr_202112_krios/P3/J349/map_half_B_filtered.mrc -o /data/data_jylee/femr_202112_krios/P3/J349/J349_map_sharp.mrc -g 1 --deepLearningModelPath /home/jyleelab/Desktop/deepEMhancerModels/production_checkpoints/deepEMhancer_tightTarget.hd5 -p wideTarget)

To be sure, are you filling out both the path to the executable AND a different path to the folder containing the 3 models? wtempel wants you to go to the job log and post that text so they can see more information than just the error message.
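In the command above, --deepLearningModelPath points at the deepEMhancer_tightTarget.hd5 file itself while the preset is wideTarget; if the option is meant to take the checkpoints folder (as in the commands later in this thread), a rough check like this could catch that (untested sketch; the deepEMhancer_<preset>.hd5 naming is an assumption based on the tightTarget filename):

# Untested sketch: verify that the configured model path is a folder and that it
# contains a checkpoint for the selected preset.
import os
import sys

model_path = "/home/jyleelab/Desktop/deepEMhancerModels/production_checkpoints"  # folder, not a .hd5 file
preset = "wideTarget"  # value passed to -p

if not os.path.isdir(model_path):
    sys.exit("Model path is not a folder: " + model_path)

checkpoint = os.path.join(model_path, "deepEMhancer_" + preset + ".hd5")
if not os.path.isfile(checkpoint):
    sys.exit("No checkpoint for preset '" + preset + "' in " + model_path)

print("OK:", checkpoint)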

I filled in both paths, but unfortunately I still get this error. I will post the log tomorrow.

Does deepEMhancer run successfully with the same parameters outside of cryoSPARC?

Hi folks, I hope you don’t mind: though I’m not the OP, I’d like to resurrect this thread, share our job and error outputs, and ask for some advice:

License is valid.
Launching job on lane default target hawk …
Running job on master node hostname hawk
Job J172 Started
Master running v4.1.1+230110, worker running v4.1.1+230110
Working in directory: /data/liuchuan/cryosparc_projects/CS-bill-dnab/J172
Running on lane default
Resources allocated:
Worker: hawk
CPU : [0]
GPU : [0]
RAM : [0]
SSD : False
Importing job module for job type deepemhancer…
Job ready to run
Sanchez-Garcia, R., Gomez-Blanco, J., Cuervo, A. et al. DeepEMhancer: a deep learning solution for cryo-EM volume post-processing. Commun Biol 4, 874 (2021). doi:10.1038/s42003-021-02399-1
Structura Biotechnology Inc. and CryoSPARC do not license DeepEMhancer nor distribute DeepEMhancer binaries. Please ensure you have your own copy of DeepEMhancer licensed and installed under the terms of its Apache License v2.0, available for review at: deepEMhancer/LICENSE at master · rsanchezgarc/deepEMhancer · GitHub.
DeepEMhancer command:
/programs/x86_64-linux/deepemhancer/20220530_cu10/bin/deepemhancer -i /data/liuchuan/cryosparc_projects/CS-bill-dnab/J163/J163_005_volume_map_half_A.mrc -i2 /data/liuchuan/cryosparc_projects/CS-bill-dnab/J163/J163_005_volume_map_half_B.mrc -o /data/liuchuan/cryosparc_projects/CS-bill-dnab/J172/J172_map_sharp.mrc -g 0 --deepLearningModelPath /home/exx/.local/share/deepEMhancerModels/production_checkpoints -p tightTarget
Starting deepEMhancer process…
updating environment to select gpu: [0]
loading model /home/exx/.local/share/deepEMhancerModels/production_checkpoints/deepEMhancer_tightTarget.hd5 … DONE!
Automatic radial noise detected beyond 86.60254037844386 % of volume side
DONE!. Shape at 1 A/voxel after padding-> (352, 352, 352)
Using TensorFlow backend.
0%| | 0/361 [00:00<?, ?it/s]2023-05-23 09:58:35.095509: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
Neural net inference
Traceback (most recent call last):
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/bin/deepemhancer", line 11, in <module>
    sys.exit(commanLineFun())
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/deepEMhancer/exeDeepEMhancer.py", line 80, in commanLineFun
    main( **parseArgs() )
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/deepEMhancer/exeDeepEMhancer.py", line 73, in main
    voxel_size=boxSize, apply_postprocess_cleaning=cleaningStrengh)
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/deepEMhancer/applyProcessVol/processVol.py", line 186, in predict
    batch_y_pred= self.model.predict_on_batch(np.expand_dims(batch_x, axis=-1))
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/keras/engine/training.py", line 1274, in predict_on_batch
    outputs = self.predict_function(ins)
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Blas SGEMM launch failed : m=2097152, n=1, k=8
[[{{node conv3d_21/convolution}}]]
(1) Internal: Blas SGEMM launch failed : m=2097152, n=1, k=8
[[{{node conv3d_21/convolution}}]]
[[activation_10/Identity/_609]]
0 successful operations.
0 derived errors ignored.
0%| | 0/361 [17:01<?, ?it/s]

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 93, in cryosparc_compute.run.main
  File "/home/exx/cryosparc/cryosparc_worker/cryosparc_compute/jobs/deepemhancer/run.py", line 158, in run_deepemhancer_wrapper
    assert process.returncode == 0, f"Subprocess exited with status {process.returncode} ({str_command})"
AssertionError: Subprocess exited with status 1 (/programs/x86_64-linux/deepemhancer/20220530_cu10/bin/deepemhancer -i /data/liuchuan/cryosparc_projects/CS-bill-dnab/J163/J163_005_volume_map_half_A.mrc -i2 /data/liuchuan/cryosparc_projects/CS-bill-dnab/J163/J163_005_volume_map_half_B.mrc -o /data/liuchuan/cryosparc_projects/CS-bill-dnab/J172/J172_map_sharp.mrc -g 0 --deepLearningModelPath /home/exx/.local/share/deepEMhancerModels/production_checkpoints -p tightTarget)

@drichman What happens when you run the same deepemhancer command outside CryoSPARC?

@wtempel It hangs at 0% on the progress bar, with the GPU’s RAM allocated but no GPU activity and no system CPU/RAM usage, for about 20-30 minutes before giving this error output:

drichman@hawk:/data/liuchuan/cryosparc_projects/CS-bill-dnab/J172/test1_DER_2023-06-08$ /programs/x86_64-linux/deepemhancer/20220530_cu10/bin/deepemhancer -i /data/liuchuan/cryosparc_projects/CS-bill-dnab/J163/J163_005_volume_map_half_A.mrc -i2 /data/liuchuan/cryosparc_projects/CS-bill-dnab/J163/J163_005_volume_map_half_B.mrc -o /data/liuchuan/cryosparc_projects/CS-bill-dnab/J172/J172_map_sharp.mrc -g 0 --deepLearningModelPath /home/exx/.local/share/deepEMhancerModels/production_checkpoints -p tightTarget
updating environment to select gpu: [0]
Using TensorFlow backend.
loading model /home/exx/.local/share/deepEMhancerModels/production_checkpoints/deepEMhancer_tightTarget.hd5 … DONE!
Automatic radial noise detected beyond 86.60254037844386 % of volume side
DONE!. Shape at 1 A/voxel after padding-> (352, 352, 352)
Neural net inference
0%| | 0/361 [00:00<?, ?it/s]2023-06-08 12:19:51.912814: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/bin/deepemhancer", line 11, in <module>
    sys.exit(commanLineFun())
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/deepEMhancer/exeDeepEMhancer.py", line 80, in commanLineFun
    main( **parseArgs() )
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/deepEMhancer/exeDeepEMhancer.py", line 73, in main
    voxel_size=boxSize, apply_postprocess_cleaning=cleaningStrengh)
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/deepEMhancer/applyProcessVol/processVol.py", line 186, in predict
    batch_y_pred= self.model.predict_on_batch(np.expand_dims(batch_x, axis=-1))
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/keras/engine/training.py", line 1274, in predict_on_batch
    outputs = self.predict_function(ins)
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
    return self._call(inputs)
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
    fetched = self._callable_fn(*array_vals)
  File "/programs/x86_64-linux/deepemhancer/20220530_cu10/miniconda3/envs/deepEMhancer_env/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Blas SGEMM launch failed : m=2097152, n=1, k=8
[[{{node conv3d_21/convolution}}]]
(1) Internal: Blas SGEMM launch failed : m=2097152, n=1, k=8
[[{{node conv3d_21/convolution}}]]
[[activation_10/Identity/_609]]
0 successful operations.
0 derived errors ignored.
0%| | 0/361 [15:35<?, ?it/s]

Maybe other forum members have encountered this issue before and/or can comment here. You may also consider browsing existing issues or opening a new issue on the deepEMhancer site.

Does it work outside of CryoSPARC? That’s always the first test for any third party software.

Unfortunately it doesn’t run outside CryoSPARC either. My DeepEMhancer is installed through SBGrid, and I tried running the following command with both the bin path and the bin.capsules path. I also tried 1 and 2 GPUs.

/programs/x86_64-linux/deepemhancer/20220530_cu10/bin.capsules/deepemhancer -i /data/liuchuan/cryosparc_projects/CS-bill-dnab/J163/J163_005_volume_map_half_A.mrc -i2 /data/liuchuan/cryosparc_projects/CS-bill-dnab/J163/J163_005_volume_map_half_B.mrc -o /data/liuchuan/cryosparc_projects/CS-bill-dnab/J172/J172_map_sharp.mrc -g 1 --deepLearningModelPath /home/exx/.local/share/deepEMhancerModels/production_checkpoints -p tightTarget

The errors reported in the output are

tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED

and the root errors

tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Blas SGEMM launch failed : m=2097152, n=1, k=8
[[{{node replica_0/model_1/conv3d_21/convolution}}]]
(1) Internal: Blas SGEMM launch failed : m=2097152, n=1, k=8
[[{{node replica_0/model_1/conv3d_21/convolution}}]]
[[replica_0/model_1/conv3d_21/add/_813]]

(See above post for the complete output)

Please let me know if you have any insight. Then I’ll write to SBGrid for more help.

Do other CUDA processes run OK? I think you should talk to SBGrid.
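One way to narrow it down before contacting SBGrid might be to run a bare matrix multiply through TensorFlow in the same environment (an untested sketch, assuming the TF 1.x API that the tracebacks above show). If this also fails with CUBLAS_STATUS_EXECUTION_FAILED, the problem likely lies with the CUDA 10 build / driver / GPU combination rather than with deepEMhancer itself:

# Untested sketch, assuming TF 1.x as suggested by the tracebacks above:
# run a single SGEMM (the cuBLAS routine that fails during inference) on the GPU.
import numpy as np
import tensorflow as tf

a = tf.constant(np.random.rand(1024, 1024).astype(np.float32))
b = tf.constant(np.random.rand(1024, 1024).astype(np.float32))
c = tf.matmul(a, b)  # dispatched to cuBLAS SGEMM when a GPU is visible

with tf.Session() as sess:
    print("matmul OK, checksum:", float(sess.run(c).sum()))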