Data Management for Large Instances


#1

Hello,

I couldn’t find anything in the documentation, so I figured I would ask here: are there any best practices or recommendations for managing cryoSPARC data? I have a heavily used instance whose footprint has reached 50 TB quite quickly.

With v2, it is nice that the data are organized by user and project directories, but what is the best way to archive data as we use the program without breaking the UI or database references (in case they need to be restored later)? Is it really as simple as removing the directories, or is there existing or planned functionality in cryoSPARC v2 to assist with this?
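For context, the closest thing I have right now is a generic, tool-agnostic sketch like the one below (nothing cryoSPARC-specific; the paths and helper names are hypothetical), which bundles a project directory into a tarball and records a checksum before the original is removed. My concern is whether doing this outside the application breaks database references:

```python
import hashlib
import tarfile
from pathlib import Path

def archive_project(project_dir: str, dest_dir: str) -> Path:
    """Bundle a project directory into a gzipped tarball; return the archive path."""
    src = Path(project_dir)
    out = Path(dest_dir) / (src.name + ".tar.gz")
    with tarfile.open(out, "w:gz") as tar:
        # Keep the top-level directory name so a restore recreates the same layout.
        tar.add(src, arcname=src.name)
    return out

def checksum(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks so large archives fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
```

Restoring is the reverse (`tarfile.open(archive).extractall(dest)`), but that only brings back the files, not whatever the UI or database expects about them.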

Thanks,

Charles


#2

Hi @bowman,

We are building tools right now, along with the new modules coming to cryoSPARC, specifically for this use case. It’s tricky to manage archiving, re-importing, and sharing projects/jobs with other cryoSPARC instances while keeping all the links/metadata/paths intact, and to also handle schema changes in future versions. We will have tools to clear out intermediate results from jobs, as well as to completely archive a project into a bundle that can be resurrected at a later date.
