Refinement crashes during pre-processing

After starting refinement on two different datasets, I’m now getting the same error (appended below), which occurs during preprocessing. I have successfully run jobs on at least one of these datasets before. Puzzled; ideas welcome:

ERROR: No Heartbeat
This job was killed because a heartbeat was not received for 30 seconds.

The last 100 lines of the job standard output are printed below.

====== ENVIRONMENT ========
{‘CCP4_OPEN’: ‘UNKNOWN’, ‘ROSETTA3_DB’: ‘/home/user/software/rosetta_bin_linux_2016.46.59086_bundle/main/database’, ‘CCP4_MASTER’: ‘/home/user/software/ccp4’, ‘NODE_ENV’: ‘production’, ‘SHELL’: ‘/bin/bash’, ‘CRYOSPARC_BULK_DIR’: ‘/home/user/software/cryosparc/run/bulk’, ‘SUPERVISOR_ENABLED’: ‘1’, ‘TEST_METADATA’: ‘{}’, ‘MMCIFDIC’: ‘/home/user/software/ccp4/ccp4-7.0/lib/ccp4/cif_mmdic.lib’, ‘XDG_RUNTIME_DIR’: ‘/run/user/1000’, ‘PYTHONPATH’: ‘/home/user/software/EMAN2/lib:/home/user/software/EMAN2/bin:/home/user/software/localrec-1.2.0-beta.3/lib:’, ‘CRYOSPARC_REGISTER_DONE’: ‘true’, ‘CRYOSPARC_METEOR_BINDIR’: ‘’, ‘CETC’: ‘/home/user/software/ccp4/ccp4-7.0/etc’, ‘XDG_SESSION_ID’: ‘1672’, ‘CCP4_SCR’: ‘/tmp/user’, ‘CRYOSPARC_RESULTS_MIGRATION_DONE’: ‘true’, ‘warpdoc’: ‘/home/user/software/ccp4/arp_warp_7.6/manual’, ‘OMP_NUM_THREADS’: ‘1’, ‘CRYOSPARC_NODEJS_BINDIR’: ‘/home/user/software/cryosparc/nodejs/bin’, ‘SSH_TTY’: ‘/dev/pts/7’, ‘MAIL’: ‘/var/mail/user’, ‘SSH_CONNECTION’: ‘10.124.11.44 65049 156.111.6.127 22’, ‘MONGO_OPLOG_URL’: ‘mongodb://localhost:38001/local’, ‘CRYOSPARC_MASTER_HOSTNAME’: ‘narwhal’, ‘PHENIX_ROSETTA_PATH’: ‘/home/user/software/rosetta_bin_linux_2016.46.59086_bundle’, ‘LESSOPEN’: ‘| /usr/bin/lesspipe %s’, ‘LIBTBX_BUILD’: ‘’, ‘SUPERVISOR_GROUP_NAME’: ‘webapp’, ‘PROMPT_COMMAND’: ‘history -a’, ‘IMOD_QTLIBDIR’: ‘/usr/local/IMOD/qtlib’, ‘PORT’: ‘38000’, ‘SUPERVISOR_SERVER_URL’: ‘unix:///tmp/supervisor-5311641205534462182.sock’, ‘PHENIX’: ‘/home/user/software/phenix-dev-2666’, ‘QT_QPA_PLATFORMTHEME’: ‘appmenu-qt5’, ‘DISPLAY’: ‘localhost:10.0’, ‘EMAN2DIR’: ‘/home/user/software/EMAN2’, ‘CRYOSPARC_DEVELOP’: ‘false’, ‘BPARAM’: ‘/home/user/software/bsoft/parameters/’, ‘warpbin’: ‘/home/user/software/ccp4/arp_warp_7.6/bin/bin-x86_64-Linux’, ‘ROOT_URL’: ‘http://localhost:38000’, ‘CRYOSPARC_RAM_SLOTS’: ‘32’, ‘FOR_DISABLE_STACK_TRACE’: ‘1’, ‘SUPERVISOR_PROCESS_NAME’: ‘webapp’, ‘CRYOSPARC_HTTP_PORT’: ‘38000’, ‘_’: 
‘/home/user/software/cryosparc/anaconda2/bin/supervisord’, ‘LS_COLORS’: ‘rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:.tar=01;31:.tgz=01;31:.arc=01;31:.arj=01;31:.taz=01;31:.lha=01;31:.lz4=01;31:.lzh=01;31:.lzma=01;31:.tlz=01;31:.txz=01;31:.tzo=01;31:.t7z=01;31:.zip=01;31:.z=01;31:.Z=01;31:.dz=01;31:.gz=01;31:.lrz=01;31:.lz=01;31:.lzo=01;31:.xz=01;31:.bz2=01;31:.bz=01;31:.tbz=01;31:.tbz2=01;31:.tz=01;31:.deb=01;31:.rpm=01;31:.jar=01;31:.war=01;31:.ear=01;31:.sar=01;31:.rar=01;31:.alz=01;31:.ace=01;31:.zoo=01;31:.cpio=01;31:.7z=01;31:.rz=01;31:.cab=01;31:.jpg=01;35:.jpeg=01;35:.gif=01;35:.bmp=01;35:.pbm=01;35:.pgm=01;35:.ppm=01;35:.tga=01;35:.xbm=01;35:.xpm=01;35:.tif=01;35:.tiff=01;35:.png=01;35:.svg=01;35:.svgz=01;35:.mng=01;35:.pcx=01;35:.mov=01;35:.mpg=01;35:.mpeg=01;35:.m2v=01;35:.mkv=01;35:.webm=01;35:.ogm=01;35:.mp4=01;35:.m4v=01;35:.mp4v=01;35:.vob=01;35:.qt=01;35:.nuv=01;35:.wmv=01;35:.asf=01;35:.rm=01;35:.rmvb=01;35:.flc=01;35:.avi=01;35:.fli=01;35:.flv=01;35:.gl=01;35:.dl=01;35:.xcf=01;35:.xwd=01;35:.yuv=01;35:.cgm=01;35:.emf=01;35:.ogv=01;35:.ogx=01;35:.aac=00;36:.au=00;36:.flac=00;36:.m4a=00;36:.mid=00;36:.midi=00;36:.mka=00;36:.mp3=00;36:.mpc=00;36:.ogg=00;36:.ra=00;36:.wav=00;36:.oga=00;36:.opus=00;36:.spx=00;36:.xspf=00;36:’, ‘CRYOSPARC_CUDA_DEVS’: ‘1,0’, ‘IMOD_CALIB_DIR’: ‘/usr/local/ImodCalib’, ‘CRYOSPARC_LICENSE_ID’: ‘0345fe16-7af1-571d-b3ee-2dfb756ed79f’, ‘CBIN’: ‘/home/user/software/ccp4/ccp4-7.0/bin’, ‘CRYOSPARC_ROOT_DIR’: ‘/home/user/software/cryosparc’, ‘CRYOSPARC_UPLOAD_DIR’: ‘/home/user/software/cryosparc/run/bulk/uploads’, ‘IMOD_PLUGIN_DIR’: ‘/usr/local/IMOD/lib/imodplug’, ‘CRYOSPARC_CACHE_CUSHION’: ‘10240.0’, ‘CCP4_HELPDIR’: ‘/home/user/software/ccp4/ccp4-7.0/help/’, ‘CLIB’: ‘/home/user/software/ccp4/ccp4-7.0/lib’, ‘MATPLOTLIBRC’: ‘/home/user/software/EMAN2/extlib’, ‘HOME’: ‘/home/user’, ‘LD_LIBRARY_PATH’: 
‘/home/user/software/pydusa-1.15efatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
s-fftmpi-6/fftw_mpi/installation/lib:/home/user/software/bsoft/lib:/home/user/software/rubinstein/minimal_libraries:/usr/local/IMOD/lib:/usr/local/IMOD/lib:’, ‘PHENIX_VERSION’: ‘dev-2666’, ‘LANG’: ‘en_US.UTF-8’, ‘CCP4I_TOP’: ‘/home/user/software/ccp4/ccp4-7.0/share/ccp4i’, ‘CINCL’: ‘/home/user/software/ccp4/ccp4-7.0/include’, ‘CRANK’: ‘/home/user/software/ccp4/ccp4-7.0/share/ccp4i/crank’, ‘MKL_NUM_THREADS’: ‘1’, ‘CRYOSPARC_CODE_DIR’: ‘/home/user/software/cryosparc/cryosparc-compute’, ‘HTTP_FORWARDED_COUNT’: ‘1’, ‘BSOFT’: ‘/home/user/software/bsoft’, ‘CRYOSPARC_ANACONDA_BINDIR’: ‘/home/user/software/cryosparc/anaconda2/bin’, ‘CCP4I_TCLTK’: ‘/home/user/software/ccp4/ccp4-7.0/bin’, ‘CRYOSPARC_MONGO_PORT’: ‘38001’, ‘MONGO_URL’: ‘mongodb://localhost:38001/meteor’, ‘LESSCLOSE’: ‘/usr/bin/lesspipe %s %s’, ‘LIBTBX_TMPVAL’: ‘’, ‘IMOD_DIR’: ‘/usr/local/IMOD’, ‘CLIBD’: ‘/home/user/software/ccp4/ccp4-7.0/lib/data’, ‘ROSETTA3’: ‘/home/user/software/rosetta_bin_linux_2016.46.59086_bundle/main’, ‘USER’: ‘user’, ‘NUMEXPR_NUM_THREADS’: ‘1’, ‘SSH_CLIENT’: ‘10.124.11.44 65049 22’, ‘LOGNAME’: ‘user’, ‘CRYOSPARC_EXPERIMENTAL’: ‘true’, ‘PATH’: 
‘/home/user/software/cryosparc/mongodb/bin:/home/user/software/cryosparc/nodejs/bin:/home/user/software/cryosparc/anaconda2/bin:/home/user/bin:/home/user/.local/bin:/home/user/software/EMAN2/bin:/home/user/software/EMAN2/extlib/bin:/home/user/software/cryosparc/bin:/home/user/software/ccp4/arp_warp_7.6/bin/bin-x86_64-Linux:/home/user/software/ccp4/ccp4-7.0/etc:/home/user/software/ccp4/ccp4-7.0/bin:/home/user/software/phenix-dev-2666/build/bin:/home/user/software/bsoft/bin:/home/user/software/pyem:/home/user/software/mag_distortion_correct_1.0.0/bin:/home/user/software/summovie_1.0.2/bin:/usr/local/cuda-8.0/bin:/home/user/software/Chimera/bin:/home/user/software/localrec-1.2.0-beta.3/scripts:/home/user/software/localrec-1.2.0-beta.3:/home/user/software/scipion:/home/user/software/pat3dem/bin:/home/user/software/direx-0.7-rev406-linux/bin:/home/user/software/rosetta_bin_linux_2016.46.59086_bundle/main/source/bin:/home/user/software/Gctf_v1.06/bin:/usr/local/IMOD/bin:/home/user/software/motioncor2:/home/user/software/rubinstein/bin:/home/user/software/mag_distortion_estimate_1.0.1/bin:/home/user/software/unblur_1.0.2/bin:/home/user/software/ctffind4:/home/user/software/frealign_v9.11/bin:/home/user/software/Gautomatch_v0.53/bin:/home/user/software/Gctf_v0.50/bin:/home/user/software/relion2-beta/build/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin’, ‘CCP4’: ‘/home/user/software/ccp4/ccp4-7.0’, ‘METEOR_SETTINGS’: ‘{“public”:{“webinstance”:false, “instancename”:“narwhal”, “instancetype”:“academicbeta”}}’, ‘TERM’: ‘xterm’, ‘IMOD_JAVADIR’: ‘/usr/local/java’, ‘GFORTRAN_UNBUFFERED_PRECONNECTED’: ‘Y’, ‘CRYOSPARC_JOB_LOG_DIR’: ‘/home/user/software/cryosparc/run/sparcjobs’, ‘CRYOSPARC_SUPERVISOR_SOCK_FILE’: ‘/tmp/supervisor-5311641205534462182.sock’, ‘CRYOSPARC_INSTALL_TYPE’: ‘master’, ‘CRYOSPARC_MONGODB_BINDIR’: ‘/home/user/software/cryosparc/mongodb/bin’, ‘CHTML’: ‘/home/user/software/ccp4/ccp4-7.0/html’, ‘LIBTBX_OPATH’: 
‘’, ‘CLIBD_MON’: ‘/home/user/software/ccp4/ccp4-7.0/lib/data/monomers/’, ‘CEXAM’: ‘/home/user/software/ccp4/ccp4-7.0/examples’, ‘OLDPWD’: ‘/home/user/software/cryosparc’, ‘APP_ID’: ‘1ipv1tp12xzfzhba088c’, ‘SHLVL’: ‘2’, ‘PWD’: ‘/home/user/software/cryosparc/cryosparc-webapp/bundle/programs/server’}

362fd95749426e7abfc5e7b9bac54796ce247f9bc56ebcf5e34b735672259824
License Data: {“request_date”:“Tuesday February 28 2017”,“issued_date”:“Monday January 16 2017”,“expiry_date”:“Saturday April 1 2017”,“issued_to_inst”:“Columbia University”,“issued_to_name”:“oc2188@columbia.edu”,“license_type”:“academic_beta”,“version”:“all”,“valid”:true}
License Signature: (10365924453888071233479798693970338084142666169103566741106823764873174971188687206837612449950650491175899698689625249475981226730332626169114060891905579071853531492052147352468722582611815357462453781611604454531856666611627880459804393119521228830684849443722358966fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Not a git repository
To compare two paths outside a working tree:
usage: git diff [--no-index]
5341461320926136928850311286954112788908621591080259396037425169667631184537059270563269176237789255642241448841753758032235840928848

Hmm, this seems to be happening for all my jobs now, even ones that were working before… maybe a server-side issue, since the output mentions the license?

Oli

Ahaha, it’s a GPU issue. Whenever the job runs on the same GPU as RELION (which was using 7 GB of the card’s 12 GB of memory), I get this error. If I disable that GPU and use the other one, all is good. Interesting.
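
For anyone hitting the same thing, here is a rough sketch of how one might check which GPUs are already busy before launching a job. It parses the CSV output of `nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader,nounits` (values in MiB). The function names and the `min_free_mib` threshold are made up for illustration; I don't know how much free memory a given job actually needs, so treat the threshold as a guess to tune.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' rows (MiB) from nvidia-smi CSV output."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, used, total = [int(field.strip()) for field in line.split(",")]
        gpus.append({"index": index, "used": used,
                     "total": total, "free": total - used})
    return gpus


def gpus_with_headroom(gpus, min_free_mib):
    """Return indices of GPUs with at least min_free_mib free (assumed threshold)."""
    return [g["index"] for g in gpus if g["free"] >= min_free_mib]


def query_gpus():
    """Run nvidia-smi and return the parsed per-GPU memory usage."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=index,memory.used,memory.total",
        "--format=csv,noheader,nounits",
    ])
    return parse_gpu_memory(out.decode())
```

On a two-GPU machine, `gpus_with_headroom(query_gpus(), 8000)` would list only the cards with roughly 8 GB or more free, so a card that another program has already half filled would be left out.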

Hmm, still getting some weird errors with 0.3.6 that I haven’t had with previous betas - e.g. this, part way through an ab initio run with three classes:

ERROR: No Heartbeat
This job was killed because a heartbeat was not received for 30 seconds.

The last 100 lines of the job standard output are printed below.

<matplotlib.figure.Figure at 0x7f2283f76b90>
e[2K
e[2K
<matplotlib.figure.Figure at 0x7f22902ea310>
<matplotlib.figure.Figure at 0x7f228256b250>
<matplotlib.figure.Figure at 0x7f22902f1c90>
<matplotlib.figure.Figure at 0x7f2282224250>
<matplotlib.figure.Figure at 0x7f231845f250>
<matplotlib.figure.Figure at 0x7f2281ecd990>
<matplotlib.figure.Figure at 0x7f22990b5650>
<matplotlib.figure.Figure at 0x7f2281bf1c10>
[... the 11-line block above repeats ten more times, followed by one more <matplotlib.figure.Figure at 0x7f2283f76b90> line ...]