Hi,
I have previously run Reference Based Motion Correction (RBMC) jobs without problems, but now they are stalling and I can't figure out why. The jobs that stall are running on subsets of a larger dataset for which RBMC previously completed successfully. They all run fine through the stage of estimating motion hyperparameters.
One job then stalls while estimating dose weights: it gets partway through and then prints only "sending heartbeat" messages, with no further progress. The other does the same at the motion-correct-particles step, getting about 90% of the way through before stalling.
I have tried killing and restarting the jobs, and so far the stall is reproducible. Thoughts? Here is part of the job log for the job that stalls at the motion-correct-particles step:
refmotion worker 0 (NVIDIA GeForce RTX 3090)
BFGS iterations: 55
scale (alpha): 0.092566
noise model (sigma2): 57.930584
TIME (s) SECTION
0.000080141 sanity
9.548886196 read movie
0.022416273 get gain, defects
0.026118669 read bg
0.019918833 read rigid
0.892744111 prep_movie
0.562088867 extract from frames
0.000181282 extract from refs
0.000000190 adj
0.000000120 bfactor
0.029171596 rigid motion correct
0.000290954 get noise, scale
0.025338349 optimize trajectory
0.067452653 shift_sum patches
0.028979984 ifft
0.000201022 unpad
0.000075901 fill out dataset
0.001446938 write output files
11.225392080 --- TOTAL ---
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909cfdd3a0> (size 3). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909cf7e4c0> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909cf7e460> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909ca97610> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909cbbe580> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909cfdda90> (size 3). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909cf7e460> (size 3). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f90b50aaca0> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909ca97730> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f90b506ae20> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909cbbe3d0> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909cfdd430> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f90ccdf49d0> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f90b48a1bb0> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909cfdda60> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909d5c63d0> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f909e8d3640> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
/home/exx/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.8/multiprocessing/process.py:108: UserWarning: Kernel function slice_volume called with very small array <cryosparc_compute.gpu.gpuarray.GPUArray object at 0x7f90b4b16a00> (size 2). Array will be passed to kernel as a pointer. Consider modifying the kernel to accept individual scalar arguments instead.
self._target(*self._args, **self._kwargs)
========= sending heartbeat at 2023-11-30 02:08:20.709730
========= sending heartbeat at 2023-11-30 02:08:30.727057
========= sending heartbeat at 2023-11-30 02:08:40.743080
========= sending heartbeat at 2023-11-30 02:08:50.759071
========= sending heartbeat at 2023-11-30 02:09:00.775034
========= sending heartbeat at 2023-11-30 02:09:10.790033
========= sending heartbeat at 2023-11-30 02:09:20.806038
========= sending heartbeat at 2023-11-30 02:09:30.822404
refmotion worker 2 (NVIDIA GeForce RTX 3090)
BFGS iterations: 148
scale (alpha): 0.081300
noise model (sigma2): 58.623901
TIME (s) SECTION
0.000079441 sanity
10.905865442 read movie
0.018212542 get gain, defects
0.023676769 read bg
0.000432995 read rigid
0.799730807 prep_movie
1.709923751 extract from frames
0.011128406 extract from refs
0.000042221 adj
0.000000030 bfactor
3.044188999 rigid motion correct
0.056932615 get noise, scale
69.468694558 optimize trajectory
0.875048076 shift_sum patches
0.009892401 ifft
0.012079587 unpad
0.000074771 fill out dataset
0.024842503 write output files
86.960845912 --- TOTAL ---
========= sending heartbeat at 2023-11-30 02:09:40.828734
========= sending heartbeat at 2023-11-30 02:09:50.854572
========= sending heartbeat at 2023-11-30 02:10:00.864866
========= sending heartbeat at 2023-11-30 02:10:10.882144
========= sending heartbeat at 2023-11-30 02:10:20.899354
========= sending heartbeat at 2023-11-30 02:10:30.910133
========= sending heartbeat at 2023-11-30 02:10:40.955937
========= sending heartbeat at 2023-11-30 02:10:50.972382
========= sending heartbeat at 2023-11-30 02:11:00.988478
========= sending heartbeat at 2023-11-30 02:11:10.996304
========= sending heartbeat at 2023-11-30 02:11:21.006176
========= sending heartbeat at 2023-11-30 02:11:31.022453
========= sending heartbeat at 2023-11-30 02:11:41.030690
========= sending heartbeat at 2023-11-30 02:11:51.046836
========= sending heartbeat at 2023-11-30 02:12:01.056730
========= sending heartbeat at 2023-11-30 02:12:11.067822
========= sending heartbeat at 2023-11-30 02:12:21.078090
========= sending heartbeat at 2023-11-30 02:12:31.094232
========= sending heartbeat at 2023-11-30 02:12:41.110630
========= sending heartbeat at 2023-11-30 02:12:51.127007
========= sending heartbeat at 2023-11-30 02:13:01.143423
========= sending heartbeat at 2023-11-30 02:13:11.159257
========= sending heartbeat at 2023-11-30 02:13:21.178199
========= sending heartbeat at 2023-11-30 02:13:31.194548
========= sending heartbeat at 2023-11-30 02:13:41.203154
========= sending heartbeat at 2023-11-30 02:13:51.219666
========= sending heartbeat at 2023-11-30 02:14:01.235715
========= sending heartbeat at 2023-11-30 02:14:11.245820
========= sending heartbeat at 2023-11-30 02:14:21.264506
========= sending heartbeat at 2023-11-30 02:14:31.280803
========= sending heartbeat at 2023-11-30 02:14:41.297289
========= sending heartbeat at 2023-11-30 02:14:51.313997
========= sending heartbeat at 2023-11-30 02:15:01.330050
========= sending heartbeat at 2023-11-30 02:15:11.346634
========= sending heartbeat at 2023-11-30 02:15:21.364130
========= sending heartbeat at 2023-11-30 02:15:31.372194
========= sending heartbeat at 2023-11-30 02:15:41.379104
========= sending heartbeat at 2023-11-30 02:15:51.395700
========= sending heartbeat at 2023-11-30 02:16:01.412183
========= sending heartbeat at 2023-11-30 02:16:11.428304
========= sending heartbeat at 2023-11-30 02:16:21.445758
========= sending heartbeat at 2023-11-30 02:16:31.463648
========= sending heartbeat at 2023-11-30 02:16:41.475053
========= sending heartbeat at 2023-11-30 02:16:51.481863
========= sending heartbeat at 2023-11-30 02:17:01.498655
========= sending heartbeat at 2023-11-30 02:17:11.515850
========= sending heartbeat at 2023-11-30 02:17:21.532620
========= sending heartbeat at 2023-11-30 02:17:31.549033
========= sending heartbeat at 2023-11-30 02:17:41.619394
========= sending heartbeat at 2023-11-30 02:17:51.627667
========= sending heartbeat at 2023-11-30 02:18:01.643631
========= sending heartbeat at 2023-11-30 02:18:11.661498
========= sending heartbeat at 2023-11-30 02:18:21.679504
========= sending heartbeat at 2023-11-30 02:18:31.695779
========= sending heartbeat at 2023-11-30 02:18:41.712237
========= sending heartbeat at 2023-11-30 02:18:51.728461
========= sending heartbeat at 2023-11-30 02:19:01.745011
========= sending heartbeat at 2023-11-30 02:19:11.763211
========= sending heartbeat at 2023-11-30 02:19:21.778779
========= sending heartbeat at 2023-11-30 02:19:31.795715
========= sending heartbeat at 2023-11-30 02:19:41.812123
========= sending heartbeat at 2023-11-30 02:19:51.828688
========= sending heartbeat at 2023-11-30 02:20:01.844835
========= sending heartbeat at 2023-11-30 02:20:11.863007
========= sending heartbeat at 2023-11-30 02:20:21.871119
========= sending heartbeat at 2023-11-30 02:20:31.878390
========= sending heartbeat at 2023-11-30 02:20:41.894805
========= sending heartbeat at 2023-11-30 02:20:51.905530
========= sending heartbeat at 2023-11-30 02:21:01.922219
========= sending heartbeat at 2023-11-30 02:21:12.015110
========= sending heartbeat at 2023-11-30 02:21:22.032972
========= sending heartbeat at 2023-11-30 02:21:32.049534
========= sending heartbeat at 2023-11-30 02:21:42.065438
========= sending heartbeat at 2023-11-30 02:21:52.081849
========= sending heartbeat at 2023-11-30 02:22:02.097618
========= sending heartbeat at 2023-11-30 02:22:12.117476
========= sending heartbeat at 2023-11-30 02:22:22.134201
========= sending heartbeat at 2023-11-30 02:22:32.151864
========= sending heartbeat at 2023-11-30 02:22:42.168232
========= sending heartbeat at 2023-11-30 02:22:52.184874
========= sending heartbeat at 2023-11-30 02:23:02.192862
========= sending heartbeat at 2023-11-30 02:23:12.200698
========= sending heartbeat at 2023-11-30 02:23:22.211809
========= sending heartbeat at 2023-11-30 02:23:32.223875
========= sending heartbeat at 2023-11-30 02:23:42.231270
========= sending heartbeat at 2023-11-30 02:23:52.247779
========= sending heartbeat at 2023-11-30 02:24:02.263889
========= sending heartbeat at 2023-11-30 02:24:12.280722
========= sending heartbeat at 2023-11-30 02:24:22.297260
========= sending heartbeat at 2023-11-30 02:24:32.308644
========= sending heartbeat at 2023-11-30 02:24:42.324481
========= sending heartbeat at 2023-11-30 02:24:52.340723
========= sending heartbeat at 2023-11-30 02:25:02.357111
========= sending heartbeat at 2023-11-30 02:25:12.364100
========= sending heartbeat at 2023-11-30 02:25:22.376419
========= sending heartbeat at 2023-11-30 02:25:32.385214
========= sending heartbeat at 2023-11-30 02:25:42.401021
========= sending heartbeat at 2023-11-30 02:25:52.413298
========= sending heartbeat at 2023-11-30 02:26:02.425600
========= sending heartbeat at 2023-11-30 02:26:12.437911
========= sending heartbeat at 2023-11-30 02:26:22.452014
========= sending heartbeat at 2023-11-30 02:26:32.469572
========= sending heartbeat at 2023-11-30 02:26:42.484982
========= sending heartbeat at 2023-11-30 02:26:52.501202
========= sending heartbeat at 2023-11-30 02:27:02.510597
========= sending heartbeat at 2023-11-30 02:27:12.518179
========= sending heartbeat at 2023-11-30 02:27:22.533824
========= sending heartbeat at 2023-11-30 02:27:32.549864
========= sending heartbeat at 2023-11-30 02:27:42.566476
========= sending heartbeat at 2023-11-30 02:27:52.583094
========= sending heartbeat at 2023-11-30 02:28:02.600788
========= sending heartbeat at 2023-11-30 02:28:12.618841
========= sending heartbeat at 2023-11-30 02:28:22.635472
========= sending heartbeat at 2023-11-30 02:28:32.644542
========= sending heartbeat at 2023-11-30 02:28:42.660543
========= sending heartbeat at 2023-11-30 02:28:52.676920
========= sending heartbeat at 2023-11-30 02:29:02.693204
========= sending heartbeat at 2023-11-30 02:29:12.710159
========= sending heartbeat at 2023-11-30 02:29:22.719698
========= sending heartbeat at 2023-11-30 02:29:32.726635
========= sending heartbeat at 2023-11-30 02:29:42.743262
========= sending heartbeat at 2023-11-30 02:29:52.759361
========= sending heartbeat at 2023-11-30 02:30:02.769073
========= sending heartbeat at 2023-11-30 02:30:12.786427
========= sending heartbeat at 2023-11-30 02:30:22.795118
========= sending heartbeat at 2023-11-30 02:30:32.810124
========= sending heartbeat at 2023-11-30 02:30:42.826373
========= sending heartbeat at 2023-11-30 02:30:52.842840
========= sending heartbeat at 2023-11-30 02:31:02.850101
========= sending heartbeat at 2023-11-30 02:31:12.867265
========= sending heartbeat at 2023-11-30 02:31:22.883070
========= sending heartbeat at 2023-11-30 02:31:32.892516
========= sending heartbeat at 2023-11-30 02:31:42.908552
========= sending heartbeat at 2023-11-30 02:31:52.924722
========= sending heartbeat at 2023-11-30 02:32:02.940708
========= sending heartbeat at 2023-11-30 02:32:12.957600
========= sending heartbeat at 2023-11-30 02:32:22.974048
========= sending heartbeat at 2023-11-30 02:32:32.989806
========= sending heartbeat at 2023-11-30 02:32:43.004049
========= sending heartbeat at 2023-11-30 02:32:53.019889
========= sending heartbeat at 2023-11-30 02:33:03.035913
========= sending heartbeat at 2023-11-30 02:33:13.053145
========= sending heartbeat at 2023-11-30 02:33:23.061011
========= sending heartbeat at 2023-11-30 02:33:33.077720
========= sending heartbeat at 2023-11-30 02:33:43.093810
========= sending heartbeat at 2023-11-30 02:33:53.110168
========= sending heartbeat at 2023-11-30 02:34:03.119291
========= sending heartbeat at 2023-11-30 02:34:13.136220
========= sending heartbeat at 2023-11-30 02:34:23.144144
========= sending heartbeat at 2023-11-30 02:34:33.161149
========= sending heartbeat at 2023-11-30 02:34:43.169490
========= sending heartbeat at 2023-11-30 02:34:53.186135
========= sending heartbeat at 2023-11-30 02:35:03.227365
========= sending heartbeat at 2023-11-30 02:35:13.242282
========= sending heartbeat at 2023-11-30 02:35:23.257962
========= sending heartbeat at 2023-11-30 02:35:33.274338
========= sending heartbeat at 2023-11-30 02:35:43.291171
========= sending heartbeat at 2023-11-30 02:35:53.307337
========= sending heartbeat at 2023-11-30 02:36:03.322874
========= sending heartbeat at 2023-11-30 02:36:13.339889
========= sending heartbeat at 2023-11-30 02:36:23.355875
========= sending heartbeat at 2023-11-30 02:36:33.372014
========= sending heartbeat at 2023-11-30 02:36:43.387926
========= sending heartbeat at 2023-11-30 02:36:53.404640
========= sending heartbeat at 2023-11-30 02:37:03.420912
========= sending heartbeat at 2023-11-30 02:37:13.488806
========= sending heartbeat at 2023-11-30 02:37:23.505033
========= sending heartbeat at 2023-11-30 02:37:33.521045
========= sending heartbeat at 2023-11-30 02:37:43.538700
========= sending heartbeat at 2023-11-30 02:37:53.555159
========= sending heartbeat at 2023-11-30 02:38:03.562826
========= sending heartbeat at 2023-11-30 02:38:13.572811
========= sending heartbeat at 2023-11-30 02:38:23.589276
========= sending heartbeat at 2023-11-30 02:38:33.605466
========= sending heartbeat at 2023-11-30 02:38:43.621732
========= sending heartbeat at 2023-11-30 02:38:53.638729
========= sending heartbeat at 2023-11-30 02:39:03.654393
========= sending heartbeat at 2023-11-30 02:39:13.664271
========= sending heartbeat at 2023-11-30 02:39:23.676765
========= sending heartbeat at 2023-11-30 02:39:33.692646
========= sending heartbeat at 2023-11-30 02:39:43.701849
========= sending heartbeat at 2023-11-30 02:39:53.718049
========= sending heartbeat at 2023-11-30 02:40:03.733888
========= sending heartbeat at 2023-11-30 02:40:13.752182
========= sending heartbeat at 2023-11-30 02:40:23.769323
========= sending heartbeat at 2023-11-30 02:40:33.785764
========= sending heartbeat at 2023-11-30 02:40:43.801994
========= sending heartbeat at 2023-11-30 02:40:53.817922
========= sending heartbeat at 2023-11-30 02:41:03.834314
========= sending heartbeat at 2023-11-30 02:41:13.852451
========= sending heartbeat at 2023-11-30 02:41:23.870078
========= sending heartbeat at 2023-11-30 02:41:33.886588
========= sending heartbeat at 2023-11-30 02:41:43.902030
========= sending heartbeat at 2023-11-30 02:41:53.918540
========= sending heartbeat at 2023-11-30 02:42:03.937762
========= sending heartbeat at 2023-11-30 02:42:13.955371
========= sending heartbeat at 2023-11-30 02:42:23.971753
========= sending heartbeat at 2023-11-30 02:42:33.982264
========= sending heartbeat at 2023-11-30 02:42:43.993904
========= sending heartbeat at 2023-11-30 02:42:54.001125
========= sending heartbeat at 2023-11-30 02:43:04.017046
========= sending heartbeat at 2023-11-30 02:43:14.034703
========= sending heartbeat at 2023-11-30 02:43:24.052110
========= sending heartbeat at 2023-11-30 02:43:34.068597
========= sending heartbeat at 2023-11-30 02:43:44.085131
========= sending heartbeat at 2023-11-30 02:43:54.094210
========= sending heartbeat at 2023-11-30 02:44:04.101911
========= sending heartbeat at 2023-11-30 02:44:14.119887
========= sending heartbeat at 2023-11-30 02:44:24.126573
========= sending heartbeat at 2023-11-30 02:44:34.144394
========= sending heartbeat at 2023-11-30 02:44:44.160685
========= sending heartbeat at 2023-11-30 02:44:54.176496
========= sending heartbeat at 2023-11-30 02:45:04.193222
========= sending heartbeat at 2023-11-30 02:45:14.210218
========= sending heartbeat at 2023-11-30 02:45:24.226400
========= sending heartbeat at 2023-11-30 02:45:34.262869
========= sending heartbeat at 2023-11-30 02:45:44.269621
========= sending heartbeat at 2023-11-30 02:45:54.286198
========= sending heartbeat at 2023-11-30 02:46:04.302525
========= sending heartbeat at 2023-11-30 02:46:14.327062
========= sending heartbeat at 2023-11-30 02:46:24.343392
========= sending heartbeat at 2023-11-30 02:46:34.359640
========= sending heartbeat at 2023-11-30 02:46:44.377235
========= sending heartbeat at 2023-11-30 02:46:54.394319
========= sending heartbeat at 2023-11-30 02:47:04.409789
========= sending heartbeat at 2023-11-30 02:47:14.418891
========= sending heartbeat at 2023-11-30 02:47:24.436102
========= sending heartbeat at 2023-11-30 02:47:34.453098
========= sending heartbeat at 2023-11-30 02:47:44.463801
========= sending heartbeat at 2023-11-30 02:47:54.479837
========= sending heartbeat at 2023-11-30 02:48:04.495696
========= sending heartbeat at 2023-11-30 02:48:14.507535
========= sending heartbeat at 2023-11-30 02:48:24.524084
========= sending heartbeat at 2023-11-30 02:48:34.539942
========= sending heartbeat at 2023-11-30 02:48:44.556711
========= sending heartbeat at 2023-11-30 02:48:54.573542
========= sending heartbeat at 2023-11-30 02:49:04.589948
========= sending heartbeat at 2023-11-30 02:49:14.608323
========= sending heartbeat at 2023-11-30 02:49:24.617432
========= sending heartbeat at 2023-11-30 02:49:34.629482
========= sending heartbeat at 2023-11-30 02:49:44.645741
========= sending heartbeat at 2023-11-30 02:49:54.662038
========= sending heartbeat at 2023-11-30 02:50:04.670918
========= sending heartbeat at 2023-11-30 02:50:14.683125
========= sending heartbeat at 2023-11-30 02:50:24.699002
========= sending heartbeat at 2023-11-30 02:50:34.715649
========= sending heartbeat at 2023-11-30 02:50:44.732429
========= sending heartbeat at 2023-11-30 02:50:54.748892
========= sending heartbeat at 2023-11-30 02:51:04.764786
========= sending heartbeat at 2023-11-30 02:51:14.782271
========= sending heartbeat at 2023-11-30 02:51:24.799104
========= sending heartbeat at 2023-11-30 02:51:34.807943
========= sending heartbeat at 2023-11-30 02:51:44.820245
========= sending heartbeat at 2023-11-30 02:51:54.832539
========= sending heartbeat at 2023-11-30 02:52:04.849527
========= sending heartbeat at 2023-11-30 02:52:14.866774
========= sending heartbeat at 2023-11-30 02:52:24.884279
========= sending heartbeat at 2023-11-30 02:52:34.901658
========= sending heartbeat at 2023-11-30 02:52:44.918425
========= sending heartbeat at 2023-11-30 02:52:54.934657
========= sending heartbeat at 2023-11-30 02:53:04.951256
========= sending heartbeat at 2023-11-30 02:53:14.968825
========= sending heartbeat at 2023-11-30 02:53:24.985807
========= sending heartbeat at 2023-11-30 02:53:35.003367
========= sending heartbeat at 2023-11-30 02:53:45.020645
========= sending heartbeat at 2023-11-30 02:53:55.037210
========= sending heartbeat at 2023-11-30 02:54:05.053552
========= sending heartbeat at 2023-11-30 02:54:15.071106
========= sending heartbeat at 2023-11-30 02:54:25.080337
========= sending heartbeat at 2023-11-30 02:54:35.097928
========= sending heartbeat at 2023-11-30 02:54:45.114631
========= sending heartbeat at 2023-11-30 02:54:55.130766
========= sending heartbeat at 2023-11-30 02:55:05.147124
========= sending heartbeat at 2023-11-30 02:55:15.165397
========= sending heartbeat at 2023-11-30 02:55:25.181817
========= sending heartbeat at 2023-11-30 02:55:35.199327
========= sending heartbeat at 2023-11-30 02:55:45.215233
========= sending heartbeat at 2023-11-30 02:55:55.231508
========= sending heartbeat at 2023-11-30 02:56:05.248105
========= sending heartbeat at 2023-11-30 02:56:15.265385
========= sending heartbeat at 2023-11-30 02:56:25.284301
========= sending heartbeat at 2023-11-30 02:56:35.300908
========= sending heartbeat at 2023-11-30 02:56:45.318799
========= sending heartbeat at 2023-11-30 02:56:55.336565
========= sending heartbeat at 2023-11-30 02:57:05.352568
========= sending heartbeat at 2023-11-30 02:57:15.370731
========= sending heartbeat at 2023-11-30 02:57:25.387486
========= sending heartbeat at 2023-11-30 02:57:35.404261
========= sending heartbeat at 2023-11-30 02:57:45.421174
========= sending heartbeat at 2023-11-30 02:57:55.438611
========= sending heartbeat at 2023-11-30 02:58:05.455126
========= sending heartbeat at 2023-11-30 02:58:15.473219
========= sending heartbeat at 2023-11-30 02:58:25.489842
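For anyone who wants to put a number on a stall like this, here is a minimal Python sketch that measures how long a job log has shown nothing but heartbeat lines at its end. The function name `trailing_heartbeat_span` and the regular expression are assumptions based only on the lines pasted above; this is not a CryoSPARC API.

```python
import re
from datetime import datetime

# Matches heartbeat lines of the form seen in the pasted log, e.g.
# "========= sending heartbeat at 2023-11-30 02:09:40.828734"
HEARTBEAT = re.compile(
    r"sending heartbeat at (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)"
)


def trailing_heartbeat_span(lines):
    """Return (count, seconds) for the unbroken run of heartbeat lines
    at the end of the log. Blank lines are ignored."""
    stamps = []
    for line in reversed(lines):
        line = line.strip()
        if not line:
            continue
        match = HEARTBEAT.search(line)
        if match is None:
            # First non-heartbeat line from the end: the run stops here.
            break
        stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    if not stamps:
        return 0, 0.0
    # stamps[0] is the newest heartbeat, stamps[-1] the oldest in the run.
    return len(stamps), (stamps[0] - stamps[-1]).total_seconds()
```

To use it, read the saved job log into a list of lines (for example with `open(path).read().splitlines()`, where `path` is wherever the log text was saved) and pass that list to the function. A long trailing run after the last worker timing table suggests that no worker has reported progress in that time.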
EDIT:
Here are my compute settings, in case they are helpful (nothing else is running on the system, which has two RTX 3090 GPUs).