Error restarting streaming 2D job (Live)

Hi,

I attempted to restart a streaming 2D job that had somehow ended up at “completed” rather than “waiting” status, and hence was not accepting any more particles. When I did so, I got the attached error and the job went to “failed” status.

Cheers
Oli

Hi @olibclarke,

In v2.16, the 2D Classification job’s input groups changed. A new 2D Classification job has to be run in order to be compatible with the new compute code, which removes “duplicate” particle coordinates from a single micrograph. To do so, navigate to the “2DClasses” tab in your Live session and hit the “Resume/Restart” button. This automatically builds a new job and connects the correct input groups so that the Streaming 2D Classification job can resume.

Hi @stephan - this entire run was started in 2.16 though? It did go away when I restarted, but I’m not sure why it would be picking up old inputs?

Hey @olibclarke,

If you want to restart a streaming 2D Classification job, you need to do it from the cryoSPARC Live UI; when you hit Resume/Restart, the server does some work to ensure the session is able to restart, and supplies the necessary parameters and inputs to successfully restart the job.

Yes - that’s what I did - but you’re saying I need to rerun because of the new version, but it was never run with any other version?

Hi @olibclarke,

Maybe this is related to the reason the job went to “completed” status instead of “waiting”. Usually command_rtp sets a Streaming 2D Classification job’s status to “completed” when a user hits the “Stop” button in the UI or the job’s status changes to “failed” or “killed” and the job has already signalled to the database that it was “ready” (the first 21 iterations were completed, and it was in the streaming stage). Is there any indication in the streamlog or the stdout of the job process (cryosparcm joblog...) that it somehow failed?
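
As a rough sketch of the rule described above: this is not cryoSPARC source code, the function name and trigger labels are hypothetical, and it assumes the “ready” condition applies only to the “failed”/“killed” case, which is one possible reading of the description.

```python
def marked_completed(trigger, signalled_ready):
    """Return True if, under the rule as read above, the job would end up "completed".

    trigger: "stop" (user hit Stop in the UI), "failed", or "killed"
             (hypothetical labels chosen for this sketch).
    signalled_ready: whether the job had already told the database it was "ready",
                     i.e. it had finished its initial iterations and was streaming.
    """
    if trigger == "stop":
        return True
    if trigger in ("failed", "killed"):
        # Assumption: the "ready" condition qualifies only this branch.
        return signalled_ready
    return False
```

Under this reading, a job that fails while already streaming would show up as “completed”, which is why the job log is worth checking for a hidden failure.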

Hi @stephan,

When I run cryosparcm joblog I get the attached error. Are there other arguments I need to provide?

Cheers
Oli

Hi @olibclarke,

cryosparcm joblog <project_uid> <job_uid>
e.g. cryosparcm joblog P12 J33

https://guide.cryosparc.com/setup-configuration-and-management/management-and-monitoring/cryosparcm#cryosparcm-joblog-px-jxx
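
If you are calling this from a script, a small wrapper can check the two arguments first and report the usage line instead of failing later. This is only a sketch: `build_joblog_command` is a hypothetical helper, and the `P<number>` / `J<number>` pattern is an assumption based on the example above.

```python
import re

USAGE = "usage: cryosparcm joblog <project_uid> <job_uid>  (e.g. cryosparcm joblog P12 J33)"


def build_joblog_command(args):
    """Validate [project_uid, job_uid] and return the argv list for cryosparcm joblog."""
    if len(args) != 2:
        raise ValueError(USAGE)
    project_uid, job_uid = args
    # Assumed UID shapes: "P" or "J" followed by digits, as in P12 and J33.
    if not re.fullmatch(r"P\d+", project_uid):
        raise ValueError("bad project uid %r; %s" % (project_uid, USAGE))
    if not re.fullmatch(r"J\d+", job_uid):
        raise ValueError("bad job uid %r; %s" % (job_uid, USAGE))
    return ["cryosparcm", "joblog", project_uid, job_uid]
```

The returned list can be handed to `subprocess.run(...)` on the master node; calling the helper with no arguments raises a `ValueError` carrying the usage line.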

Ok see attached! Maybe if cryosparcm joblog is run without arguments it should give a short message informing the user of required parameters though, rather than an assertion error?

Project P21 Job J21
Master ubuntu Port 39002
===========================================================================
========= monitor process now starting main process
MAINPROCESS PID 40063
========= monitor process now waiting for main process
MAIN PID 40063
class2D.run_streaming cryosparc2_compute.jobs.jobregister
/home/user/software/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/skcuda/cublas.py:284: UserWarning: creating CUBLAS context to get version number
  warnings.warn('creating CUBLAS context to get version number')
/home/user/software/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/multiprocessing/process.py:114: ComplexWarning: Casting complex values to real discards the imaginary part
  self._target(*self._args, **self._kwargs)
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
cryosparc2_compute/sigproc.py:771: RuntimeWarning: invalid value encountered in divide
  frc[k, :copylen] = (AB / n.sqrt(AA*BB))[:copylen]
cryosparc2_compute/sigproc.py:840: RuntimeWarning: invalid value encountered in greater
  crossings = n.where((fsc[:-1] > thresh) * (fsc[1:] < thresh))[0]
cryosparc2_compute/sigproc.py:840: RuntimeWarning: invalid value encountered in less
  crossings = n.where((fsc[:-1] > thresh) * (fsc[1:] < thresh))[0]
========= sending heartbeat
/home/user/software/cryosparc/cryosparc2_worker/deps/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py:6571: UserWarning: The 'normed' kwarg is deprecated, and has been replaced by the 'density' kwarg.
  warnings.warn("The 'normed' kwarg is deprecated, and has been "
========= sending heartbeat
***************************************************************
Running job  J21  of type  class_2D_streaming
Running job on hostname %s ubuntu
Allocated Resources :  {u'lane': u'default', u'target': {u'lane': u'default', u'name': u'ubuntu', u'title': u'Worker node ubuntu', u'resource_slots': {u'GPU': [0, 1, 2, 3], u'RAM': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47], u'CPU': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]}, u'hostname': u'ubuntu', u'worker_bin_path': u'/home/user/software/cryosparc/cryosparc2_worker/bin/cryosparcw', u'cache_path': u'/scratch/', u'cache_quota_mb': None, u'resource_fixed': {u'SSD': True}, u'gpus': [{u'mem': 11523260416, u'id': 0, u'name': u'GeForce RTX 2080 Ti'}, {u'mem': 11523260416, u'id': 1, u'name': u'GeForce RTX 2080 Ti'}, {u'mem': 11523260416, u'id': 2, u'name': u'GeForce RTX 2080 Ti'}, {u'mem': 11523260416, u'id': 3, u'name': u'GeForce RTX 2080 Ti'}], u'cache_reserve_mb': 10000, u'type': u'node', u'ssh_str': u'user@ubuntu', u'desc': None}, u'license': True, u'hostname': u'ubuntu', u'slots': {u'GPU': [0], u'RAM': [0, 1, 2], u'CPU': [0, 1]}, u'fixed': {u'SSD': True}, u'lane_type': u'default', u'licenses_acquired': 1}
**** handle exception rc
set status to failed
Traceback (most recent call last):
  File "cryosparc2_worker/cryosparc2_compute/run.py", line 85, in cryosparc2_compute.run.main
  File "cryosparc2_worker/cryosparc2_compute/jobs/class2D/run_streaming.py", line 273, in cryosparc2_compute.jobs.class2D.run_streaming.run_class_2D_streaming
  File "cryosparc2_compute/geometry.py", line 679, in remove_duplicate_particles
    assert error_field in particles.fields(), "Particle dataset must have field to minimize error"
AssertionError: Particle dataset must have field to minimize error
========= main process now complete.
========= monitor process now complete.
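
The assertion in the traceback suggests that duplicate removal needs a per-particle error value to decide which of two overlapping picks to keep. The sketch below illustrates that idea only; it is not the cryoSPARC implementation. The function name, the `micrograph`/`x`/`y` keys, and the greedy keep-lowest-error strategy are all assumptions made for illustration.

```python
import math


def dedupe_particles(particles, error_field, min_dist):
    """Keep, per micrograph, only picks at least min_dist apart, preferring low error.

    particles: list of dicts with hypothetical keys "micrograph", "x", "y",
               plus the field named by error_field.
    """
    # Mirror the check in the traceback: every particle must carry the error field.
    if any(error_field not in p for p in particles):
        raise ValueError("Particle dataset must have field %r to minimize error" % error_field)

    kept_by_mic = {}
    # Visit particles from lowest to highest error so the better pick wins a conflict.
    for p in sorted(particles, key=lambda p: p[error_field]):
        kept = kept_by_mic.setdefault(p["micrograph"], [])
        if all(math.hypot(p["x"] - q["x"], p["y"] - q["y"]) >= min_dist for q in kept):
            kept.append(p)
    return [p for kept in kept_by_mic.values() for p in kept]
```

With this framing, a particle dataset that lacks the error field cannot be deduplicated at all, which matches the failure seen in the log.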