Ab-initio job shows different results on repetition

sameer · June 26, 2017, 6:45pm

Hi,

I have noticed that repeating the same ab initio job is giving me different results.
I’ve been running an ab initio job on 37K particles using an initial model (previously generated via refinement in cryoSPARC). The output is 6 classes. I ran the same job three times and noticed that the classes show structural differences and that the number of particles in each class also varies.

I was wondering if anyone else has experienced this, and whether it might be worth running jobs multiple times to get a more accurate estimate of what is in the particle stack?

I’ve attached screenshots showing the results of the original run (run1) and the two subsequent repetitions (run2 and run3).

Thanks for your time and help.

Best,
Sameer

olibclarke · June 26, 2017, 8:30pm

If you’re using an initial model, I would suggest trying the multi-refine experiment instead - I have had much better results with that than with ab initio jobs in cases where I have many classes.

Cheers

Oli

sameer · June 26, 2017, 9:09pm

Hi Oli,

Thanks for your reply. Could you please tell me the best way to enable the multi-refine experiments?

This is what I found in one of your earlier posts

There is an experimental “multi-refine” mode in cryosparc (which can be enabled by adding export CRYOSPARC_EXPERIMENTAL=true to the cryosparc config.sh) which I believe is akin to multi-reference classification, and I’m having a play with that at the moment, but I’m not exactly sure of how that works under the hood.

Is this still the best way to do it?

Thanks so much for your help.

Best,
Sameer

olibclarke · June 26, 2017, 9:16pm

Yes, that’s exactly right Sameer. Enable in config.sh and you should get another, multi-refine experiment appearing when you restart cryosparc. Then just initialize multirefine with your desired initial model (select your initial model as many times as the desired number of classes in the Experiment tab) and you should be good to go.
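
For reference, the change described above amounts to a single line. This is only a sketch of the fragment quoted earlier in the thread; it assumes config.sh is a shell script read when cryosparc starts (as the export syntax suggests), and its location depends on your installation.

```shell
# Line to add to cryosparc's config.sh (taken from the post quoted above).
# Restart cryosparc afterwards so the setting is picked up.
export CRYOSPARC_EXPERIMENTAL=true
```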

Depending on the aim of experiment (whether to classify conformational/compositional heterogeneity or to remove junk), it may be worthwhile initializing a couple of classes with random or junk density, as bad particles will tend to accumulate in these.

Cheers
Oli

sameer · June 26, 2017, 10:12pm

Hi Oli,

Thanks for your reply.
I added export CRYOSPARC_EXPERIMENTAL=true to the config.sh file and then restarted cryosparc, but I still don’t see a multi-refine option. Does this option appear under the New Experiment tab?
Also, could you tell me what cryosparc version you are using?

Thanks for the help.

Best,
Sameer

olibclarke · June 26, 2017, 10:46pm

Hi Sameer, I am using the latest - 0.4.1.

Yes, it should appear under the new experiment tab, assuming you have a dataset selected - see attached picture of what I see.

Cheers
Oli

sameer · June 26, 2017, 10:52pm

Hi Oli,

I don’t see this option. But I just realized I’m not running the latest version. I’ll update it and see if that works.

Thanks so much for your help.

Best,
Sameer

Marcus · June 27, 2017, 2:54am

This is actually expected behaviour. The SGD algorithm used in ab initio processing is a randomized algorithm that will potentially give different results when you use a different set of random numbers. Usually the differences aren’t huge, but particularly for 3D classification they can sometimes be significant.

If you want to get the exact same result, you need to set the random seed to a fixed number. By default though it will use a different random seed for each run. (Note that the random seed used is recorded in the output of the run in case you want it.)
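
To illustrate the point about seeds, here is a toy sketch in Python. It is not cryoSPARC's SGD and does not use any cryoSPARC API; `toy_sgd_classify` is a hypothetical stand-in (a tiny online k-means on 1D numbers) that only shows how a fixed seed makes a randomized procedure repeatable, while a different seed may lead to different classes and class sizes.

```python
import random

def toy_sgd_classify(data, n_classes, seed, n_iters=2000, lr=0.05):
    """Toy online k-means driven by a seeded RNG (NOT cryoSPARC's algorithm)."""
    rng = random.Random(seed)              # private RNG: the seed fully determines the run
    centers = rng.sample(data, n_classes)  # random initialisation of the class centers
    for _ in range(n_iters):
        x = rng.choice(data)               # draw one random sample per step
        k = min(range(n_classes), key=lambda j: abs(x - centers[j]))
        centers[k] += lr * (x - centers[k])  # stochastic update toward the sample
    # Assign every point to its nearest center and count the class sizes.
    counts = [0] * n_classes
    for x in data:
        k = min(range(n_classes), key=lambda j: abs(x - centers[j]))
        counts[k] += 1
    return centers, counts

# Synthetic "particle stack": three overlapping populations of 200 points each.
data_rng = random.Random(0)
data = [data_rng.gauss(mu, 1.0) for mu in (0.0, 3.0, 6.0) for _ in range(200)]

run_a = toy_sgd_classify(data, n_classes=6, seed=42)
run_b = toy_sgd_classify(data, n_classes=6, seed=42)  # same seed as run_a
run_c = toy_sgd_classify(data, n_classes=6, seed=7)   # different seed

print("same seed, identical result:", run_a == run_b)
print("class sizes, seed 42:", run_a[1])
print("class sizes, seed 7: ", run_c[1])
```

Asking for more classes than there are real populations (6 versus 3 here) is what leaves room for the split to depend on the seed.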

As a corollary to this, if you run ab initio 3D classification and don’t like the results that you get, it can sometimes be advantageous to simply run the same job again. It will use a different random seed and hence may produce different results.

Cheers,
Marcus

sameer · June 27, 2017, 10:06pm

Hi Marcus,

Thank you very much for your reply.
I was wondering if the probability of getting larger differences in ab initio results also depends on sample heterogeneity?

Thank you.

Best,
Sameer

Marcus · June 27, 2017, 11:48pm

Hi Sameer,

Yes, if you’re running with multiple classes and there is significant heterogeneity, you’re likely to see more run-to-run variability, although this is largely anecdotal. We haven’t tested this extensively.

On the other hand, if you’re running with a single class, the results are pretty stable regardless of the sample properties.
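
If you want to put a number on the run-to-run variability, one option is to compare the per-particle class assignments of two runs. The sketch below is an illustration only: it assumes you already have, for each particle, the class index from each run (how to export these is not covered here), and `best_agreement` is a hypothetical helper, not part of cryoSPARC. Because class numbering is arbitrary between runs, it tries every relabeling of the second run and keeps the best match, which is fine for a handful of classes (720 permutations for 6).

```python
from itertools import permutations

def best_agreement(labels_a, labels_b, n_classes):
    """Fraction of particles classified consistently, maximised over relabelings of run B."""
    # overlap[i][j] = number of particles in class i of run A and class j of run B
    overlap = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(labels_a, labels_b):
        overlap[a][b] += 1
    # Brute-force search over all ways of matching run A's classes to run B's.
    best = max(
        sum(overlap[i][perm[i]] for i in range(n_classes))
        for perm in permutations(range(n_classes))
    )
    return best / len(labels_a)

# Made-up class indices for 8 particles in two runs (illustration only).
run1 = [0, 0, 1, 1, 2, 2, 2, 1]
run2 = [2, 2, 0, 0, 1, 1, 0, 0]

score = best_agreement(run1, run2, n_classes=3)
print(score)  # 0.875: 7 of 8 particles agree once the classes are matched up
```

A score near 1 means the two runs found essentially the same partition; lower scores indicate that particles moved between classes.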

Cheers,
Marcus

sameer · June 29, 2017, 2:33pm

Hi Marcus,

Thank you very much for your reply.

Best,
Sameer