Error: Read timed out in 2D classification

Environment

  • Version: v3.2.0+210713
  • CUDA v11.2
  • Ubuntu 20.04
  • Single workstation installation

I’m getting the following error when I try to run a 2D classification:

[CPU: 3.18 GB]   Using random seed of 856403283

[CPU: 3.18 GB]   Loading a ParticleStack with 4054692 items...

[CPU: 3.18 GB]    SSD cache : cache successfuly synced in_use
[CPU: 3.18 GB]    SSD cache : cache successfuly synced, found 15.38MB of files on SSD.
Traceback (most recent call last):
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/http/client.py", line 1369, in getresponse
    response.begin()
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/http/client.py", line 310, in begin
    version, status, reason = self._read_status()
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/http/client.py", line 271, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 727, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/urllib3/util/retry.py", line 410, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/urllib3/packages/six.py", line 735, in reraise
    raise value
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 428, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 336, in _raise_timeout
    self, url, "Read timed out. (read timeout=%s)" % timeout_value
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='XXXXX', port=61002): Read timed out. (read timeout=300)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "cryosparc_worker/cryosparc_compute/run.py", line 84, in cryosparc_compute.run.main
  File "cryosparc_worker/cryosparc_compute/jobs/class2D/run.py", line 56, in cryosparc_compute.jobs.class2D.run.run_class_2D
  File "/localapps/cryosparc/cryosparc_worker/cryosparc_compute/particles.py", line 81, in read_blobs
    u_blob_paths = cache.download_and_return_cache_paths(u_rel_paths)
  File "/localapps/cryosparc/cryosparc_worker/cryosparc_compute/jobs/cache.py", line 113, in download_and_return_cache_paths
    compressed_keys = rc.cli.cache_request_check(worker_hostname, rc._project_uid, rc._job_uid, com.compress_paths(rel_paths))
  File "/localapps/cryosparc/cryosparc_worker/cryosparc_compute/client.py", line 54, in func
    r = requests.post(self.url, data = json.dumps(data, cls=NumpyEncoder), headers = header, timeout=self.timeout)
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/requests/api.py", line 119, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/localapps/cryosparc/cryosparc_worker/deps/anaconda/envs/cryosparc_worker_env/lib/python3.7/site-packages/requests/adapters.py", line 529, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='XXXXX', port=61002): Read timed out. (read timeout=300)

I’ve already tried a brand new installation, and using subsets of particles in case the sheer number I was handling was the problem. Nothing seems to work. Other job types, such as particle stack import or ab-initio reconstruction, seem to work properly. Note that XXXXX replaces the name of my workstation!

Thanks a lot!

Hi @ngr,

Do you know how many micrographs are being processed in this case?
Is it possible that the master node where the cryoSPARC instance is running is being overloaded?
A short-term solution, if your master node isn’t being overloaded, is to use the Downsample Particles job to re-stack your particles (the current cache system will try to copy all 4M particles to the cache, even if some of those particles aren’t going to be processed). Note that this will duplicate data.
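For a rough sense of scale (assuming the particles are stored as 32-bit floats, as in a typical MRC particle stack): a stack of N particles with box size B occupies roughly N × B² × 4 bytes, so caching all ~4 million particles can mean copying several hundred gigabytes to the SSD, depending on the box size.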

A little over 17K micrographs. Sorry for the naive question, but overloaded how? How would I check?

I’ll try the downsampling shortcut and get back to you ASAP, thanks :)

Hi, did you solve this problem? I’m running into the same problem here. I have ~400K particles and 3.3 TB of SSD space. The problem started popping up yesterday; the same dataset and machine settings ran smoothly until then.
Thanks in advance!

Hey @hgxy15, I did not. Sorry for the late reply. I’m still processing the same dataset and, whenever I have to import particles that come from the 17K movies, the same thing happens. The only workaround I’ve found is Downsampling, as explained above.

Anyway, the downsampling job takes quite a while, probably due to our storage configuration. I am currently facing a similar problem again: although no other job is using the input particles, I get this message:

[CPU: 258.6 MB] SSD cache : requested files are locked for past 5063s, checking again in 5s.

The particles total 100K and were split from an Import job of over 4M particles that come, again, from the 17K movies. They should also fit on my 1 TB scratch disk if I did the math right (180 px box size).
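(Assuming 32-bit floats, the rough math: 100,000 × 180 × 180 × 4 bytes ≈ 13 GB, so the subset alone should fit comfortably. Per the note above, though, the cache may still copy over the full source files from the original 4M-particle import, which would be much larger.)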

@stephan, is there any way to solve this issue other than the downsampling job? (It takes quite a while for me!)

Thanks a lot!

@ngr The situation may be addressed by manipulating the master database more directly. The cryosparcm mongo command is documented in the guide.
For the relevant mongo commands (and some precautions), please see an older post on this forum.
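For reference, a minimal sketch of the kind of commands that post describes. This assumes the SSD cache state is tracked in a cache_files collection with a status field; the collection name and status values below are illustrative, so please verify them against the older post (and back up the database) before running anything:

cryosparcm mongo
> db.cache_files.count({ status: 'locked' })                                        // how many cache records appear stuck (hypothetical filter)
> db.cache_files.updateMany({ status: 'locked' }, { $set: { status: 'miss' } })     // hypothetical reset of stuck records

The older post remains the authoritative reference for which records to touch and which status to set.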