Download archive of 3D classes

Working from home so much these days, I’ve been downloading maps constantly instead of copy-pasting paths. It would be really great to be able to download a single archive of all the 3D classes from a heterogeneous refinement, with one click.

100%. Same for 3D-VA and ab initio jobs. Being able to download an archive of all outputs would be handy for most jobs, really, including homogeneous refinement jobs.

Would it generally be safe to mask the volumes (with some large padding extent)? After zipping, this saves a lot of space/bandwidth.

Depends. Not for the half maps, which one might want to use for further processing. If any are masked, it needs to be explicitly stated at the time of download to avoid confusion.

Same here, my hack is to make use of the structured folders and bash one-liners like:

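for j in J310 J314 J367 J412 J451
do
    # P(id) stands for the project ID; the URL is quoted so the shell does not trip over the parentheses
    wget "http://your.cs2.instance:39000/download_result_file/P(id)/${j}.volume.map_sharp"
done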

On Macs one can use curl:

for job in J869 J871 J810 J811 J812
do
  # P(id) stands for the project ID; quoting keeps the parentheses away from the shell
  curl -o "${job}.mrc" "http://yourcs2:39000/download_result_file/P(id)/${job}.volume.map_sharp"
done
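
Until a one-click archive exists, the loops above can be extended to bundle the maps locally. This is only a minimal sketch, assuming the same URL pattern as in the loops above. `CS_HOST`, `result_url` and `fetch_archive` are hypothetical names made up for illustration, and the host and project ID are placeholders you have to fill in yourself:

```shell
#!/usr/bin/env bash
# Sketch only: the host and port below are placeholders (assumptions).
CS_HOST="${CS_HOST:-http://your.cs2.instance:39000}"

# Build the download URL for one job's sharpened map,
# following the URL pattern used in the loops above.
result_url() {
    local project="$1" job="$2"
    printf '%s/download_result_file/%s/%s.volume.map_sharp\n' "$CS_HOST" "$project" "$job"
}

# Download the sharpened map of each job and pack them into one tarball.
# Usage: fetch_archive PROJECT OUT.tar.gz JOB [JOB...]
fetch_archive() {
    local project="$1" out="$2"
    shift 2
    local tmp job rc
    tmp="$(mktemp -d)" || return 1
    for job in "$@"; do
        # --fail makes curl return an error on HTTP failures instead of saving an error page
        if ! curl --fail --silent --show-error \
                -o "$tmp/${job}.mrc" "$(result_url "$project" "$job")"; then
            rm -rf "$tmp"
            return 1
        fi
    done
    tar -czf "$out" -C "$tmp" .
    rc=$?
    rm -rf "$tmp"
    return "$rc"
}
```

With `PROJECT` set to your project ID, `fetch_archive "$PROJECT" classes.tar.gz J310 J314 J367` would then leave a single `classes.tar.gz` in the current directory. The maps are not masked or otherwise modified, only bundled.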

For 3D classifications, command-line downsampling (e.g. in RELION) can save some bandwidth/download time (would be great to have a CLI for CS2, hint hint):

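# downscaling of cs2 maps
source relion  # site-specific: load RELION into your environment however your system does it
res=3  # new pixel size in Å for --rescale_angpix; size for a 560^3 box: 3 -> 26 MB, 4 -> 12 MB
for f in *.map_sharp *.map
do
    [ -e "$f" ] || continue  # skip a pattern that matched no files
    mv "$f" "${f}.mrc"       # give the map an .mrc extension
    relion_image_handler --i "${f}.mrc" --rescale_angpix "$res" --o "rescld_${res}_${f}.mrc"
done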
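
As a rough guide to the saving: the size of a map scales with the cube of the box edge, and rescaling to a coarser pixel size shrinks the box edge proportionally. A back-of-the-envelope sketch, assuming 4-byte (float32) voxels and ignoring the header. The helper name `estimate_mb` and the 1.0 Å original pixel size used in the example are assumptions, not values from the post:

```shell
# Estimate the size in MB of a cubic map after rescaling to a new pixel size.
# Usage: estimate_mb BOX ORIG_ANGPIX NEW_ANGPIX
estimate_mb() {
    awk -v box="$1" -v a="$2" -v b="$3" 'BEGIN {
        newbox = int(box * a / b + 0.5)        # rescaled box edge in voxels, rounded
        printf "%.1f\n", newbox^3 * 4 / 1e6    # 4 bytes per voxel (float32 assumed)
    }'
}
```

For example, `estimate_mb 560 1.0 3` prints `26.2` and `estimate_mb 560 1.0 4` prints `11.0`, which is in the same range as the sizes quoted in the comment above. The exact numbers depend on the real pixel size and on how the box size gets rounded.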

I wouldn’t want them to be masked, just a ZIP stream so they can be downloaded in one click. (After all, I would otherwise download them all one by one; this is really a number-of-clicks consideration rather than one of space or bandwidth.)

The issues with masking are 1) you might miss something (personally I would use this for 3D classes) and 2) it interferes with computing map sigma for comparing different maps.

Totally agree. One exception I would suggest is 3D variability display, where the volumes are used primarily for visualization and the downloads can be large if the volumes haven’t been downsampled. In that case compression would be an advantage.

Also, it would be handy if masks were compressed automatically for download; for masks you can really save bandwidth by compressing.
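
To check how much a particular mask or map would actually gain before transferring it, something like the following could be used. It is a generic sketch using gzip, not anything built into cryoSPARC, and `gzip_ratio` is a hypothetical helper name:

```shell
# Print original size, gzip-compressed size (both in bytes) and their ratio
# for one file, without writing the compressed copy to disk.
# Usage: gzip_ratio FILE
gzip_ratio() {
    local f="$1" raw gz
    raw=$(wc -c < "$f" | tr -d ' ') || return 1
    gz=$(gzip -c "$f" | wc -c | tr -d ' ')
    awk -v r="$raw" -v g="$gz" 'BEGIN { printf "%d %d %.1f\n", r, g, r / g }'
}
```

Running `gzip_ratio mask.mrc` on a downloaded mask and on an unmasked map from the same job shows quickly whether compressing is worth the effort for each of them.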

I wouldn’t default to masking, but making masked and compressed versions of each map available for download separately, and clearly marked as such, would be very helpful for those folks working remotely on slow connections (a lot of people right now!). Perhaps two archives, one masked, one not?

One caveat to compression is that in many cases the connection is already compressed, for example if you use SSH forwarding and add -C, or if you use a common VPN like Pulse Secure.

Sure, but not if you are downloading from the cryoSPARC web interface, as far as I can tell. The speeds I see don’t look like a compressed connection, at least over the VPN which we have to use (Cisco AnyConnect).

(To be clear, AnyConnect can be configured for compression, but only as an admin; this is often out of user control at academic institutions.)

Aha! Actually, even on my VPN there is a way to enable this: use the VPN, but rather than just accessing the host directly (which is what I have been doing), use SSH port forwarding to localhost with -C, e.g.:

ssh -N -f -L localhost:39000:localhost:39000 user@hostname -C
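
If you open this tunnel often, the command above can be wrapped in a small shell function. A minimal sketch, where `cs_tunnel` is a hypothetical name and the default port 39000 is simply taken from the command above:

```shell
# Open a compressed SSH tunnel to the cryoSPARC web port in the background.
# Usage: cs_tunnel USER@HOST [PORT]
cs_tunnel() {
    local target="$1" port="${2:-39000}"
    # -C: compress traffic, -N: no remote command, -f: go to background,
    # -L: forward the local port to the same port on the remote host
    ssh -C -N -f -L "localhost:${port}:localhost:${port}" "$target"
}
```

After `cs_tunnel user@hostname`, the web interface should be reachable at http://localhost:39000 and downloads go through the compressed tunnel.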

Thanks @DanielAsarnow this makes my life so much better for large maps - downloading a 512MB map in 7s instead of 4 minutes :slight_smile:

@apunjani maybe worth adding this to the guide to working from home? https://cryosparc.com/blog/remote-access I suspect I’m not the only one who didn’t know about this!

Cheers
Oli

Nice! If your workstation and home PC are reasonably new, you might also get a slight boost from AES-NI with -o Ciphers=aes128-gcm@openssh.com.

Hi @olibclarke, @DanielAsarnow,

Thanks for these super helpful tips! I’ve updated the guide to include these (Appendix B & C)
