Hi,
I am getting different errors when running 2D classification in cryoSPARC v2.14.2. A snapshot is attached, along with a log file. I am not sure whether the underlying cause is the same. The machine has 64 GB of RAM and the box size is 384. Similar jobs with more particles ran normally earlier with the same parameters.
================= CRYOSPARCW ======= 2020-03-09 17:25:20.172376 =========
Project P22 Job J114
Master cryosparc.host.utmb.edu Port 39002
===========================================================================
========= monitor process now starting main process
MAINPROCESS PID 31292
========= monitor process now waiting for main process
MAIN PID 31292
class2D.run cryosparc2_compute.jobs.jobregister
/mnt/ape2/cryosparc/software/cryosparc/cryosparc2_worker-v2.14.2/deps/anaconda/lib/python2.7/site-packages/skcuda/cublas.py:284: UserWarning: creating CUBLAS context to get version number
warnings.warn('creating CUBLAS context to get version number')
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
***************************************************************
Running job J114 of type class_2D
Running job on hostname %s vds1-2.utmb.edu
Allocated Resources : {u'lane': u'vds12', u'target': {u'monitor_port': None, u'lane': u'vds12', u'name': u'vds1-2.utmb.edu', u'title': u'Worker node vds1-2.utmb.edu', u'resource_slots': {u'GPU': [0], u'RAM': [0, 1, 2, 3, 4, 5, 6, 7], u'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]}, u'hostname': u'vds1-2.utmb.edu', u'worker_bin_path': u'/mnt/ape2/cryosparc/software/cryosparc/cryosparc2_worker-v2.14.2/bin/cryosparcw', u'cache_path': u'/mnt/scratch/cryosparc_cache', u'cache_quota_mb': None, u'resource_fixed': {u'SSD': True}, u'cache_reserve_mb': 10000, u'type': u'node', u'ssh_str': u'cryosparc@vds1-2.utmb.edu', u'desc': None}, u'license': True, u'hostname': u'vds1-2.utmb.edu', u'slots': {u'GPU': [0], u'RAM': [0, 1, 2], u'CPU': [0, 1]}, u'fixed': {u'SSD': True}, u'lane_type': u'vds12', u'licenses_acquired': 1}
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
/mnt/ape2/cryosparc/software/cryosparc/cryosparc2_worker-v2.14.2/deps/anaconda/lib/python2.7/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
warnings.warn("Mean of empty slice.", RuntimeWarning)
/mnt/ape2/cryosparc/software/cryosparc/cryosparc2_worker-v2.14.2/deps/anaconda/lib/python2.7/site-packages/numpy/core/_methods.py:70: RuntimeWarning: invalid value encountered in true_divide
ret = ret.dtype.type(ret / rcount)
cryosparc2_compute/sigproc.py:771: RuntimeWarning: invalid value encountered in divide
frc[k, :copylen] = (AB / n.sqrt(AA*BB))[:copylen]
cryosparc2_compute/sigproc.py:838: RuntimeWarning: invalid value encountered in greater
crossings = n.where((fsc[:-1] > thresh) * (fsc[1:] < thresh))[0]
cryosparc2_compute/sigproc.py:838: RuntimeWarning: invalid value encountered in less
crossings = n.where((fsc[:-1] > thresh) * (fsc[1:] < thresh))[0]
========= sending heartbeat
========= sending heartbeat
cryosparc2_compute/util/logsumexp.py:40: RuntimeWarning: divide by zero encountered in log
return n.log(wa * n.exp(a - vmax) + wb * n.exp(b - vmax) ) + vmax
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
/mnt/ape2/cryosparc/software/cryosparc/cryosparc2_worker-v2.14.2/deps/anaconda/lib/python2.7/site-packages/matplotlib/pyplot.py:516: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`).
max_open_warning, RuntimeWarning)
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
cryosparc2_compute/sigproc.py:771: RuntimeWarning: divide by zero encountered in divide
frc[k, :copylen] = (AB / n.sqrt(AA*BB))[:copylen]
cryosparc2_compute/sigproc.py:846: RuntimeWarning: invalid value encountered in double_scalars
x = (thresh - fa) * (b-a) / (fb - fa) + a
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= main process now complete.
========= monitor process now complete.
The jobs fail toward the end, during one of the last iterations.
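In case it helps anyone reading the log above, here is a minimal Python sketch that collapses the repeated heartbeat lines and lists the warning lines, so the few informative entries are easier to spot. It is not part of cryoSPARC; the function names `summarize_log` and `warning_lines` are hypothetical, and it assumes the log has been saved as plain text.

```python
# Hypothetical helper for reading a pasted job log; not a cryoSPARC tool.
HEARTBEAT = "========= sending heartbeat"


def summarize_log(text):
    """Return the log lines with each run of heartbeat lines collapsed to a count."""
    out = []
    run = 0  # number of consecutive heartbeat lines seen so far
    for line in text.splitlines():
        if line.strip() == HEARTBEAT:
            run += 1
            continue
        if run:
            out.append("[%d heartbeat lines]" % run)
            run = 0
        out.append(line)
    if run:  # log ended on a heartbeat run
        out.append("[%d heartbeat lines]" % run)
    return out


def warning_lines(text):
    """Return only the lines that mention a Python warning category."""
    return [line for line in text.splitlines() if "Warning" in line]
```

Running `summarize_log` on the text of the log and printing the result gives a much shorter listing; `warning_lines` pulls out just the `UserWarning` and `RuntimeWarning` entries.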
Thanks,
Michael