Hi all!
I am trying to run Patch Motion Correction on cryoSPARC v3.2.0 with CUDA 10.1. I am running it on a cluster GPU node (32 cores with HT, 3.2 GHz Xeon E5-2667 v4 CPU, GeForce GTX 1080 Ti GPU with 11 GB, 256 GB RAM), but I have also tried a CPU node (Dell PE M630, 24 cores with HT, 3.2 GHz Xeon E5-2667 v3, 64 GB RAM).
I get the following error:
License is valid.
Launching job on lane CPU-SGE target CPU-SGE ...
Launching job on cluster CPU-SGE
====================== Cluster submission script: ========================
==========================================================================
#!/bin/sh
#$ -V
#$ -N cryosparc_P1_J23
#$ -pe openmpi 1 -l dedicated=24 -A Cryosparc
#$ -e P1/J23/J23.err
#$ -o P1/J23/J23.out
#$ -cwd
#$ -S /bin/bash
export CUDA_VISIBLE_DEVICES=""
soft/cryosparc2/cryosparc_worker/bin/cryosparcw run --project P1 --job J23 --master_hostname hal.lmb.internal --master_command_core_port 39042 > P1/J23/job.log 2>&1
==========================================================================
==========================================================================
-------- Submission command:
qsub /P1/J23/queue_sub_script.sh
-------- Cluster Job ID:
submitted
-------- Queued on cluster at 2021-04-28 18:56:22.750650
Failed to check cluster job status! 1
[CPU: 68.1 MB] Project P1 Job J23 Started
[CPU: 68.1 MB] Master running v3.2.0+210413, worker running v3.2.0+210413
[CPU: 68.4 MB] Running on lane CPU-SGE
[CPU: 68.4 MB] Resources allocated:
[CPU: 68.4 MB] Worker: CPU-SGE
[CPU: 68.4 MB] CPU : [0, 1, 2, 3, 4, 5]
[CPU: 68.4 MB] GPU : [0]
[CPU: 68.4 MB] RAM : [0, 1]
[CPU: 68.4 MB] SSD : False
[CPU: 68.4 MB] --------------------------------------------------------------
[CPU: 68.4 MB] Importing job module for job type patch_motion_correction_multi...
[CPU: 206.5 MB] Job ready to run
[CPU: 206.5 MB] ***************************************************************
[CPU: 206.8 MB] Job will process this many movies: 1079
[CPU: 206.9 MB] parent process is 110258
[CPU: 163.3 MB] Calling CUDA init from 110289
[CPU: 209.7 MB] Outputting partial results now...
[CPU: 210.7 MB] Traceback (most recent call last):
File "cryosparc_worker/cryosparc_compute/run.py", line 84, in cryosparc_compute.run.main
File "cryosparc_worker/cryosparc_compute/jobs/motioncorrection/run_patch.py", line 402, in cryosparc_compute.jobs.motioncorrection.run_patch.run_patch_motion_correction_multi
AssertionError: Child process with PID 110289 has terminated unexpectedly!
The CUDA version is 10.1, but I have also tried 9.1 with the same result.
I have also tried running MotionCor2, which fails with this error:
[CPU: 85.9 MB] --------------------------------------------------------------
[CPU: 85.9 MB] Processed 0 of 1079 movies in 0.01s
[CPU: 85.9 MB] Raw movie filepath located at: J4/imported/FoilHole_5452167_Data_4501361_4501363_20210406_011954_Fractions.mrcs - creating MotionCor2 command string...
[CPU: 1.30 GB] Finished creating MotionCor2 command string in 14.84s
[CPU: 1.30 GB] Starting MotionCor2 process...
[CPU: 1.30 GB] Running MotionCor2 command: /public/EM/MOTIONCOR2/MotionCor2 -InMrc /P1/J4/imported/FoilHole_5452167_Data_4501361_4501363_20210406_011954_Fractions.mrcs -OutMrc /P1/J21/motioncorrected/012542470421558195703_FoilHole_5452167_Data_4501361_4501363_20210406_011954_Fractions_motioncor2_aligned.mrc -Patch 5.0 5.0 -Kv 300.0 -PixSize 1.52 -FmDose 0.06974359047718537 -Gpu 0 -GpuMemUsage 0.5 -LogFile /P1/J21/motioncor2_logs/0
[CPU: 1.30 GB] Running process 3251606
[CPU: 1.30 GB] ERROR motioncor2 failed to produce output file /P1/J21/motioncorrected/012542470421558195703_FoilHole_5452167_Data_4501361_4501363_20210406_011954_Fractions_motioncor2_aligned.mrc
[CPU: 1.30 GB] Finished MotionCor2 process in 0.33s
[CPU: 1.30 GB] Traceback (most recent call last):
File "cryosparc_worker/cryosparc_compute/run.py", line 84, in cryosparc_compute.run.main
File "soft/cryosparc2/cryosparc_worker/cryosparc_compute/jobs/motioncorrection/run_motioncor2.py", line 364, in run_motioncor2_wrapper
with open(output_path_abs) as mrc_file:
FileNotFoundError: [Errno 2] No such file or directory: '/P1/J21/motioncorrected/012542470421558195703_FoilHole_5452167_Data_4501361_4501363_20210406_011954_Fractions_motioncor2_aligned.mrc'
I am using the same workspace for all the jobs.
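In case it helps with diagnosis, here is a minimal sketch of a check that could be run on the compute node to see whether a GPU is visible to the job. The submission script above exports an empty `CUDA_VISIBLE_DEVICES`, which hides all GPUs from CUDA programs, while the job log shows a CUDA init call. The function name `report_gpu_visibility` is hypothetical and not part of cryoSPARC, and the sketch assumes `nvidia-smi` may or may not be installed on the node.

```shell
#!/bin/bash
# Hypothetical helper (not part of cryoSPARC): reports how the current value
# of CUDA_VISIBLE_DEVICES affects GPU visibility for CUDA programs started
# from this shell.
report_gpu_visibility() {
    if [ -z "${CUDA_VISIBLE_DEVICES+set}" ]; then
        # Variable is not defined at all
        echo "unset: all GPUs on the node are visible to CUDA"
    elif [ -z "${CUDA_VISIBLE_DEVICES}" ]; then
        # Variable is defined but empty, as in the submission script
        echo "empty: no GPUs are visible to CUDA"
    else
        echo "restricted: GPUs ${CUDA_VISIBLE_DEVICES} are visible to CUDA"
    fi
}

report_gpu_visibility

# List the physical GPUs if the NVIDIA driver tools are available.
# Note: nvidia-smi lists devices regardless of CUDA_VISIBLE_DEVICES.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L || echo "nvidia-smi failed to query the driver"
else
    echo "nvidia-smi not found on this node"
fi
```

Running this inside the same queue submission script (before the `cryosparcw run` line) would show what the job itself sees, which may differ from an interactive login on the node.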
Can anyone help me?
Thanks,
Irene