Advice to resolve long, flexible structure

AnokhiShah · December 20, 2022, 2:54pm

Hi,

I was hoping for some advice for pushing the processing of the structure that I am currently working on. We have a model for the structure that should help with illustrating my questions:

The protein is made up of the C-terminal central domain and the N-terminal coiled-coil regions. I have been able to resolve the C-terminal domain to 2.7 Å. Resolving the full-length protein has been difficult because the N-terminal domain dissociates at the air-water interface, so only a low percentage of particles have the N-terminal domain intact. The other ‘problem’ is that the N-terminal domain is likely flexible, so the density becomes more diffuse the further it is from the C-terminal domain (see 2D classes).

http://130.88.90.132:39000/api/files/62b47f1a08a0a541a132650f

So far, 3D refinement has only let me resolve the map below, at low resolution.

I have seen and tried the approach in this post: Resolving flexible protein about local refinement. It improved the resolution of a small part of the N-terminal region to 4.2 Å.

What I need help with is how to push the processing further to resolve more of the N-terminal coiled coil. I think what I have achieved so far is limited by box size and that I should be able to go further, but I am not sure what to try next. I would appreciate any suggestions and am happy to clarify anything that isn’t clear.

Thanks in advance for any help.

Kind Regards,
Anokhi

schiracha · December 20, 2022, 3:27pm

Hi Anokhi,

If you have already tried flexible refinement, 3D variability, and non-uniform refinement, then you may have to take a more focused approach from the start. There’s a lot of information on these methods already, so I won’t go into detail. If you haven’t done that, I would start with those features first. A larger box size may also be helpful, but if you’re worried about particle overlap or memory and processing times, you could reconstruct independent regions. A difficulty with proteins like this is that a mask focused on the more flexible region may in fact exclude a lot of the data, depending on the level of flexibility.

I would personally try solving the flexible region independently, maybe even as two separate structures, if you think you can see it okay in your micrographs.

I would begin by manually picking about 100-500 particles that are just the filamentous region, make templates, and then run the template picker on about 10% of my data to pick a lot of those particles (and probably a bunch of junk). I would then use 2D classification on this small subset to make a better SET of templates and use that for template picking on the whole data set. I would use a relatively large box for the first round of picking and the first iteration of making templates, so that I bias my picks to force the globular region off to the side, and then extract with a smaller box size after picking from the whole data set. There are some tricks you can do to focus your particle selection on the filament.
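The "about 10% of my data" step can be done however your software selects exposures. As a minimal sketch outside any particular package, assuming you simply have a list of micrograph file names (the names and the `pick_subset` helper below are hypothetical), a reproducible random subset could be drawn like this:

```python
import random


def pick_subset(micrographs, fraction=0.1, seed=0):
    """Return a reproducible random subset of micrograph names.

    `fraction` is the share of the data set to keep (0.1 = about 10%).
    A fixed `seed` makes the same subset come back on every run.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    n_keep = max(1, round(len(micrographs) * fraction))
    rng = random.Random(seed)
    return sorted(rng.sample(list(micrographs), n_keep))


# Hypothetical file names, only for illustration.
all_mics = [f"mic_{i:04d}.mrc" for i in range(200)]
subset = pick_subset(all_mics, fraction=0.1, seed=42)
print(len(subset), "of", len(all_mics), "micrographs selected")
```

A random draw, rather than the first 10% of the session, avoids biasing the templates toward whatever ice or grid region was imaged first.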

After that, again use several iterations of 2D classification to pick out the filamentous particles. I would keep “re-center” densities off during all of this 2D classification because it will simply pull your bulky globular region back into the classes. You may start with several million particles this way and reduce down to a few tens to hundreds of thousands, but it generally goes relatively quickly (except the extraction part), and you end up with a large set of particles that can be refined without a huge data set and without the alignment being dominated by the bulky region.

This is one method; there are other approaches, and I’m not familiar with your data, so no promises this will work. Hopefully you get some other responses and can use (or devise) the method that is suited to your system and data and gets you the structure you need.

Best,

Randall

I would also keep re-centering off during ab-initio, and I would widen my mask and lower my mask threshold to somewhere between 0.1 and 0.05 during reconstruction. Play with some of these parameters.

user123 · December 21, 2022, 6:16pm

Hi Anokhi, perhaps you could try re-extracting with a much larger box (maybe 2-3x) and with significant binning (say to 4 Å per pixel), then run refinement and 3DVA or 3DFlex. Make sure your mask includes generous regions of the flexible N-terminal region. If you re-extract with 2x box size and 4x binning, can you get any low-resolution density for the flexible region?
You could also try signal subtraction with a mask on the stable core and see if you can get any 2D classes or ab initio + refinement of the very flexible N-terminal region.
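To make the box-size and binning arithmetic concrete, here is a minimal sketch. The 512-pixel box is the one mentioned elsewhere in this thread; the 1.0 Å/pixel raw pixel size is purely an assumption for illustration, and `binned_extraction` is a hypothetical helper, not part of any package.

```python
def binned_extraction(box_px, pixel_size_A, box_scale, bin_factor):
    """Work out the numbers for a larger, binned re-extraction.

    box_px        current extraction box in pixels
    pixel_size_A  raw pixel size in Angstrom per pixel
    box_scale     how much larger the new box should be (e.g. 2)
    bin_factor    binning (Fourier-crop) factor (e.g. 4)
    """
    extract_box = int(round(box_px * box_scale))
    cropped_box = int(round(extract_box / bin_factor))
    cropped_box += cropped_box % 2  # keep the cropped box even
    binned_pixel_size = extract_box * pixel_size_A / cropped_box
    return {
        "extract_box_px": extract_box,
        "cropped_box_px": cropped_box,
        "binned_pixel_size_A": binned_pixel_size,
        # Nyquist: the best resolution the binned data can represent.
        "nyquist_A": 2 * binned_pixel_size,
        # Physical width of the box, unchanged by binning.
        "field_of_view_A": extract_box * pixel_size_A,
    }


# Assumed raw pixel size of 1.0 A/pixel; 2x box, 4x binning.
params = binned_extraction(box_px=512, pixel_size_A=1.0, box_scale=2, bin_factor=4)
for name, value in params.items():
    print(f"{name}: {value}")
```

Under these assumptions the particles cover twice the physical width but are stored in a 256-pixel box, with resolution capped at 8 Å. That is coarse, but enough to check whether any density for the flexible region appears at all.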

AnokhiShah · December 22, 2022, 1:54pm

Hi Randall,

Thanks very much for the detailed reply.

I have tried to use flexible refinement and 3DVA, neither of which my Linux box had sufficient compute for, even with binning. I agree that these are good avenues to explore. The box size I currently use is 512. I will try to increase this and see if it extends what I can observe. Again, the difficulty is with compute power.
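As a rough illustration of why box size drives the compute cost, the sketch below estimates the memory of a single cubic volume stored as 32-bit floats. This is only a scaling argument: real refinement jobs hold several such arrays plus particle stacks, so actual requirements are higher, and `volume_mib` is a hypothetical helper.

```python
def volume_mib(box_px, bytes_per_voxel=4):
    """Memory of one cubic volume in MiB (default: 32-bit floats)."""
    return box_px ** 3 * bytes_per_voxel / 2 ** 20


for box in (256, 512, 1024):
    print(f"box {box:>4} px: {volume_mib(box):8.1f} MiB per volume")
```

Memory grows with the cube of the box size, so doubling the box from 512 to 1024 pixels costs 8x per volume, while binning a 1024-pixel box down to 256 pixels brings it below the current 512-pixel footprint.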

I will have a go at picking the N-terminal region exclusively, keeping recenter off. I think this could possibly work and is certainly worth a try.

Again, very appreciative of your time and help.

Kind Regards,
Anokhi

AnokhiShah · December 22, 2022, 1:55pm

Thanks for your reply. I will certainly give it a go with the substantial binning and see if my Linux box can handle the processing.