Web app crashes upon file I/O error

One of the servers hosting the distributed file system on which one of our projects is stored is having issues, so some files cannot be accessed. Previously this made the whole instance very unresponsive, so I asked the sysadmin to lower the file I/O timeout (I am not sure how this was done). As a result, the web app now crashes when a user tries to access an unavailable file.

From app.log:

[Download] P12 J534.volume.mask_fsc_auto /scratch/fhgfs/username/projectname/cryoSPARC/CS-project-name/J534/J534_008_volume_mask_fsc_auto.mrc
2024-03-20 14:43:29,253 ERROR | uncaughtException: Unknown system error -70: Unknown system error -70, read
2024-03-20 14:43:29,253 ERROR | Error: Unknown system error -70: Unknown system error -70, read

A file I/O error should not crash the web app.

Hi @daniel.s.d.larsson ,

Thanks for the post! With regard to the specific error you pointed out, it seems that our existing check for whether a file is available on disk didn’t error out on your machine. We’ve made a note to add an additional check so that the endpoint (and not the app itself) errors out if there’s an issue actually reading the file.
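
To illustrate the general idea (this is only a minimal sketch, not CryoSPARC's actual code), a Node.js download handler can attach an `error` listener to the read stream so that a failed read ends only that request. The names `ioErrorToResponse` and `sendFile`, and the choice of 404/503 status codes, are assumptions made for the example.

```typescript
import { createReadStream } from "node:fs";
import type { ServerResponse } from "node:http";

// Hypothetical helper: map a file I/O error to an HTTP status and message.
export function ioErrorToResponse(err: unknown): { status: number; message: string } {
  const code = (err as { code?: unknown } | null | undefined)?.code;
  if (code === "ENOENT") {
    return { status: 404, message: "File not found" };
  }
  // Anything else (timeouts, errno values Node cannot name, such as -70)
  // is treated here as the storage backend being temporarily unavailable.
  const detail = err instanceof Error ? err.message : String(err);
  return { status: 503, message: `File could not be read: ${detail}` };
}

// Hypothetical download handler: stream the file and report read failures
// on this response only.
export function sendFile(path: string, res: ServerResponse): void {
  const stream = createReadStream(path);
  // Without an 'error' listener, an error emitted by the stream is thrown
  // and surfaces as an uncaughtException for the whole process.
  stream.on("error", (err) => {
    const { status, message } = ioErrorToResponse(err);
    if (!res.headersSent) {
      res.writeHead(status, { "Content-Type": "text/plain" });
      res.end(message);
    } else {
      // Part of the file was already sent, so abort this response.
      res.destroy(err);
    }
  });
  stream.pipe(res);
}
```

With a listener like this in place, a read failure such as the `Unknown system error -70` above would produce an error response for that one download instead of taking down the server.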

We cannot guarantee you won’t run into any other issues if users try to interact with active projects in which their files aren’t accessible on disk. I’d recommend detaching projects if you foresee the disk being unavailable for an extended period of time. Alternatively, you can export individual jobs and move them to a project that is available for users to interact with.

- Suhail
