Retain particle data when importing star file

Until recently, importing a star file without the particle data path failed with an error. It’s now possible to import particles by supplying only a star file. There is a subtle warning, but it’s not obvious that all identifying data for the particles in the star file is stripped at this step.

I’m sure there’s a very small number of users who might benefit from having particles with no identifying information other than a stack index, but it seems more helpful to the majority to be able to connect particles to micrographs later on, either inside or outside of cryosparc.
At the very least, retaining X,Y coordinates would allow for a location-based intersect.

The “Reassign particles to micrographs” job type fails if a data path was not supplied at the original import step. Is there a reason that importing particles without specifying a data path should be a one-way path forward?

-Ryan

@RyanFeathers Could you please provide a minimal example star file that could be used to replicate the problem, and indicate which items are needed for identification? A publicly available example would be ideal. Otherwise, please send me a private message for alternative arrangements.

Hi @wtempel

It’s difficult to provide data to reproduce the problem, since it would also require a particle mrcs stack. Here is an example, though, of how the star files look for the scenario I’m describing.

Step 1: Import a particle stack by supplying the star file below as the “Particle meta path” but leaving the “Particle data path” input blank.

data_optics

loop_
_rlnOpticsGroupName #1
_rlnOpticsGroup #2
_rlnMicrographOriginalPixelSize #3
_rlnVoltage #4
_rlnSphericalAberration #5
_rlnAmplitudeContrast #6
_rlnImagePixelSize #7
_rlnImageSize #8
_rlnImageDimensionality #9
opticsGroup1 1 0.670000 200.000000 2.700000 0.100000 1.340000 360 2

data_particles

loop_
_rlnCoordinateX #1
_rlnCoordinateY #2
_rlnAutopickFigureOfMerit #3
_rlnClassNumber #4
_rlnAnglePsi #5
_rlnImageName #6
_rlnMicrographName #7
_rlnOpticsGroup #8
_rlnCtfMaxResolution #9
_rlnCtfFigureOfMerit #10
_rlnDefocusU #11
_rlnDefocusV #12
_rlnDefocusAngle #13
_rlnCtfBfactor #14
_rlnCtfScalefactor #15
_rlnPhaseShift #16
_rlnGroupNumber #17
_rlnAngleRot #18
_rlnAngleTilt #19
_rlnOriginXAngst #20
_rlnOriginYAngst #21
_rlnNormCorrection #22
_rlnRandomSubset #23
_rlnLogLikeliContribution #24
_rlnMaxValueProbDistribution #25
_rlnNrOfSignificantSamples #26
3858.000000 3296.000000 2.572302 1 19.798898 000001@Extract/jobXXX/Micrographs/0001.mrcs MotionCorr/jobXXX/Micrographs/0001.mrc 1 5.378000 0.074578 28444.599609 27340.589844 54.970001 0.000000 1.000000 0.000000 1 21.125019 110.769609 -0.20617 -0.08021 0.601192 1 1.183087e+05 0.248838 7

Step 2: Refine, sort, classify, remove particles in cryosparc.

Step 3: Determine you need to extract with a different box or that you want to move the particles out of cryosparc.

“Reassign particles to micrographs” fails with: AssertionError: Non-optional inputs from the following input groups and their slots are not connected: particles.location. Please connect all required inputs.

Exporting the particles with pyem produces:

data_optics

loop_
_rlnVoltage #1
_rlnImagePixelSize #2
_rlnSphericalAberration #3
_rlnAmplitudeContrast #4
_rlnOpticsGroup #5
_rlnImageSize #6
_rlnImageDimensionality #7
200.000000 1.340000 2.700000 0.100000 2 360 2

data_particles

loop_
_rlnImageName #1
_rlnAngleRot #2
_rlnAngleTilt #3
_rlnAnglePsi #4
_rlnOriginXAngst #5
_rlnOriginYAngst #6
_rlnDefocusU #7
_rlnDefocusV #8
_rlnDefocusAngle #9
_rlnPhaseShift #10
_rlnCtfBfactor #11
_rlnOpticsGroup #12
_rlnRandomSubset #13
_rlnClassNumber #14
000031@JXXX/imported/001831877752332628869_0001.mrcs -10.709635 103.421730 -168.153915 23.290876 19.220625 28444.599609 27340.589844 54.969997 0.000000 0.000000 2 2 1
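
The columns lost between import and export can be listed with a short Python sketch. `loop_labels` is a hypothetical helper (not part of pyem or cryosparc) that only understands the simple one-label-per-line layout used here; the two label lists are copied from the data_particles blocks of the imported and exported star files.

```python
# Labels of the data_particles block in the star file supplied at import.
IMPORTED = """
data_particles
loop_
_rlnCoordinateX #1
_rlnCoordinateY #2
_rlnAutopickFigureOfMerit #3
_rlnClassNumber #4
_rlnAnglePsi #5
_rlnImageName #6
_rlnMicrographName #7
_rlnOpticsGroup #8
_rlnCtfMaxResolution #9
_rlnCtfFigureOfMerit #10
_rlnDefocusU #11
_rlnDefocusV #12
_rlnDefocusAngle #13
_rlnCtfBfactor #14
_rlnCtfScalefactor #15
_rlnPhaseShift #16
_rlnGroupNumber #17
_rlnAngleRot #18
_rlnAngleTilt #19
_rlnOriginXAngst #20
_rlnOriginYAngst #21
_rlnNormCorrection #22
_rlnRandomSubset #23
_rlnLogLikeliContribution #24
_rlnMaxValueProbDistribution #25
_rlnNrOfSignificantSamples #26
"""

# Labels of the data_particles block in the star file exported with pyem.
EXPORTED = """
data_particles
loop_
_rlnImageName #1
_rlnAngleRot #2
_rlnAngleTilt #3
_rlnAnglePsi #4
_rlnOriginXAngst #5
_rlnOriginYAngst #6
_rlnDefocusU #7
_rlnDefocusV #8
_rlnDefocusAngle #9
_rlnPhaseShift #10
_rlnCtfBfactor #11
_rlnOpticsGroup #12
_rlnRandomSubset #13
_rlnClassNumber #14
"""


def loop_labels(star_text, block="data_particles"):
    """Return the column labels (without the leading underscore) of one data block."""
    labels = []
    in_block = False
    for raw in star_text.splitlines():
        line = raw.strip()
        if line.startswith("data_"):
            # Track which data block the following lines belong to.
            in_block = line == block
        elif in_block and line.startswith("_rln"):
            # "_rlnCoordinateX #1" -> "rlnCoordinateX"
            labels.append(line.split()[0].lstrip("_"))
    return labels


kept = set(loop_labels(EXPORTED))
# Preserve the order of the imported file so the output is easy to compare by eye.
dropped = [name for name in loop_labels(IMPORTED) if name not in kept]

print(len(dropped), "labels dropped:")
for name in dropped:
    print(" ", name)
```

For the two files in this example, the dropped labels include rlnCoordinateX, rlnCoordinateY, and rlnMicrographName, which are exactly the ones needed to get back to the micrographs.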

Ideally, the exported star file would have at least the same information as the imported. A satisfactory solution would be to retain the data for rlnCoordinateX, rlnCoordinateY, and rlnMicrographName, so that the particles can be re-extracted or reassigned to the specified micrographs in rlnMicrographName. At the very minimum, retaining X,Y coordinates would allow the particles to be extracted outside of cryosparc. This is the least ideal option since it would still require comparing the coordinates to the originally imported star file to fill in the missing rlnMicrographName.
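
To illustrate why coordinates alone are the least ideal option, here is a rough Python sketch of the lookup back to the original star file. It assumes the rows are already parsed into dicts keyed by label and that the coordinates come through export unchanged. The function names are hypothetical, and the 0002/0003 micrographs and the exported rows are invented for illustration; only the first original row comes from the star file above. Coordinates can collide between micrographs, so the sketch leaves the name unset when a match is missing or ambiguous.

```python
from collections import defaultdict


def coordinate_key(row):
    """Use integer-rounded X,Y as the lookup key (an assumption; picks are whole pixels here)."""
    return (round(float(row["rlnCoordinateX"])), round(float(row["rlnCoordinateY"])))


def build_coordinate_index(original_rows):
    """Map each (x, y) to the set of micrographs that have a particle at that position."""
    index = defaultdict(set)
    for row in original_rows:
        index[coordinate_key(row)].add(row["rlnMicrographName"])
    return index


def fill_micrograph_names(exported_rows, index):
    """Return copies of the exported rows with rlnMicrographName filled in where unambiguous."""
    filled = []
    for row in exported_rows:
        candidates = index.get(coordinate_key(row), set())
        # Exactly one candidate: safe to assign. Zero or several: leave as None.
        name = next(iter(candidates)) if len(candidates) == 1 else None
        filled.append({**row, "rlnMicrographName": name})
    return filled


# Rows from the originally imported star file (second and third rows are hypothetical).
original_rows = [
    {"rlnCoordinateX": "3858.000000", "rlnCoordinateY": "3296.000000",
     "rlnMicrographName": "MotionCorr/jobXXX/Micrographs/0001.mrc"},
    {"rlnCoordinateX": "1024.000000", "rlnCoordinateY": "2048.000000",
     "rlnMicrographName": "MotionCorr/jobXXX/Micrographs/0002.mrc"},
    {"rlnCoordinateX": "1024.000000", "rlnCoordinateY": "2048.000000",
     "rlnMicrographName": "MotionCorr/jobXXX/Micrographs/0003.mrc"},
]

# Hypothetical exported rows, as they would look if X,Y had been retained.
exported_rows = [
    {"rlnCoordinateX": "3858.000000", "rlnCoordinateY": "3296.000000"},
    {"rlnCoordinateX": "1024.000000", "rlnCoordinateY": "2048.000000"},  # ambiguous
    {"rlnCoordinateX": "500.000000", "rlnCoordinateY": "500.000000"},    # no match
]

filled = fill_micrograph_names(exported_rows, build_coordinate_index(original_rows))
for row in filled:
    print(row["rlnCoordinateX"], row["rlnCoordinateY"], row["rlnMicrographName"])
```

Only the first row gets a micrograph name back; the other two stay unset, which is the kind of gap that retaining rlnMicrographName directly would avoid.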

This scenario was not possible previously when cryosparc required an input for “Particle data path”. As it is now, if you didn’t supply that input, there is no way to proceed forward when you get to step 3.

The easy solution is to always provide the Particle data path, but the warning message is subtle. Users who miss or misunderstand the warning will learn a hard lesson.

@RyanFeathers Thank you for bringing this issue to our attention. We will implement a more prominent warning.
