Were the movies collected using SerialEM? Datatype 101 indicates that the MRC file stores 4-bit integers. It is non-standard according to MRC2014, but it was proposed as an extension to the standard because it would save a lot of storage with counting detectors.
I think MotionCor2 should be able to handle mode 101 movies. So if doing motion correction in cryoSPARC is not absolutely necessary right now, you can send the motion-corrected micrographs to cryoSPARC instead. If motion correction in cryoSPARC is desired, you can use the “newstack” program in IMOD to convert the movies to mode 0 (int8, signed), at the expense of generating a new dataset of twice the size. I think cryoSPARC can read mode 0 MRC directly.
For cryoSPARC developers, the SerialEM documentation says:
"For 4-bit data (mode 101), each byte contains a pair of pixels, and the first
pixel (the one with lower coordinate) is in the lower 4 bits, while the second
pixel is in the higher 4 bits. This fill order is the same for little-endian
and big-endian files. "
and that each row of pixels is padded to a whole number of bytes if the X-size is odd (which is not common):
“Each line has int((nx + 1) / 2) bytes, with 4-bits of padding at the end of each line when nx is odd.”
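The layout quoted above can be sketched in a few lines of Python. This is only a minimal illustration, not cryoSPARC or SerialEM code: the function names are made up here, the pixel values are assumed to be unsigned (0-15), and the input is assumed to be the raw pixel data with the MRC header already stripped.

```python
def bytes_per_row(nx):
    """Packed bytes per image row: int((nx + 1) / 2), as in the quoted text."""
    return (nx + 1) // 2


def unpack_mode101(packed, nx, ny):
    """Unpack ny rows of nx 4-bit pixels into lists of ints.

    Assumption: the 4-bit values are unsigned (0-15).
    """
    row_len = bytes_per_row(nx)
    if len(packed) < row_len * ny:
        raise ValueError("not enough data for the requested dimensions")
    rows = []
    for y in range(ny):
        row_bytes = packed[y * row_len:(y + 1) * row_len]
        pixels = []
        for b in row_bytes:
            pixels.append(b & 0x0F)  # first pixel: lower 4 bits
            pixels.append(b >> 4)    # second pixel: higher 4 bits
        rows.append(pixels[:nx])     # drop the 4-bit pad when nx is odd
    return rows


if __name__ == "__main__":
    # Two rows of three pixels each; every row occupies two bytes.
    print(unpack_mode101(bytes([0x21, 0x03, 0x10, 0x02]), 3, 2))
```

Because the fill order does not depend on endianness, the same loop should apply to both little-endian and big-endian files.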
Just an update in case someone else runs into this: when converting from mode 101, the output file needs to be written as 16-bit signed integers in order to be read correctly by cryoSPARC. This results in a file that is 4 times bigger than the original. When I tried the “standard” command
newstack input101.mrc output.mrc
This indeed resulted in a file that is twice as big, but when I tried to import said movie stack in cryosparc, the result was a bunch of noise.
Instead, the following command leads to a readable file (by both cryoSPARC and ImageJ):
newstack input101.mrc -mo 1 output.mrc
I would appreciate any update on this, or a pointer if there is something I am missing when converting the movies. As the motion correction details would significantly benefit my particles, is there any other way to import those motion paths, from MotionCor2 for example?
The newstack documentation says that if the input file is mode 101 (4-bit), the default output is mode 0 (8-bit signed integer). So getting the 2x sized MRC from the first command is expected. The second command forces generation of a mode 1 (16-bit signed int) MRC, which seems to be working (but is not ideal, sure).
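As a rough check of the 2x and 4x figures, the pixel data size for each mode can be estimated from the image dimensions. The sketch below is only an illustration: the function name and the dimensions in the example are hypothetical, and the fixed 1024-byte header and any extended header are ignored.

```python
def data_bytes(nx, ny, nz, mode):
    """Bytes of pixel data for an nx x ny x nz stack (headers not counted)."""
    if mode == 101:
        # Two pixels per byte, each row padded to a whole byte.
        return ((nx + 1) // 2) * ny * nz
    bytes_per_pixel = {0: 1, 1: 2}  # mode 0: int8, mode 1: int16
    return nx * ny * nz * bytes_per_pixel[mode]


if __name__ == "__main__":
    # Hypothetical movie dimensions, chosen only for illustration.
    nx, ny, nz = 7676, 7420, 40
    packed = data_bytes(nx, ny, nz, 101)
    for mode in (101, 0, 1):
        size = data_bytes(nx, ny, nz, mode)
        print(mode, size, size / packed)
```

For an even X-size the ratios relative to mode 101 come out as exactly 2 for mode 0 and 4 for mode 1.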
What is surprising is the noise image you get from command 1. Since you mentioned ImageJ later, I guess they also looked like pure noise when checked in ImageJ, or were they used for MotionCor2?
Anyway, I am curious about the noise MRC files. Can you send me a small chunk of a mode 101 file and its two derivatives from method 1 and method 2? You can use the Linux “head” command to copy the first 256k bytes of a file (hopefully not all zeros).
A quick look into the mode 0 file suggests that the numbers are all shifted by -128, and thus appear as negative numbers.
For example, in the mode 0 file, the first 4 pixels are:
0x80 0x81 0x80 0x80
And the rest of the numbers all start with “8”.
In binary, 0x80 is 10000000(bin), which is -128 as signed int8. The original number that it was converted from is probably a zero. Similarly, 0x81 is 10000001(bin), or -127 as signed int8, and its origin is probably a 1.
I guess newstack shifted the numbers in order to maximize the range of numbers it can save with signed int. Otherwise most of the 128 negative numbers usable in signed int8 would simply be a waste of space when recording counts.
In theory this should be OK if the programs reading the files know that the numbers are all shifted by -128.
But for programs that do not know that all these numbers are shifted by -128 from the original values, these numbers probably all look like readings of around -0.99, resulting from flipping and scaling.
If we humans did not know that these are counting data collected with particular considerations, originally saved as 4-bit integers, we would be quite confused too.
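The byte arithmetic here is easy to check in Python. This is a minimal sketch with made-up helper names, and it assumes the offset applied on writing is exactly 128 in two's complement, which is inferred from the bytes above rather than confirmed from the newstack source.

```python
import struct


def as_signed_int8(raw):
    """Interpret each byte as a two's-complement signed 8-bit integer."""
    return list(struct.unpack(f"{len(raw)}b", raw))


def undo_shift(raw, offset=128):
    """Recover the counts, assuming the writer subtracted `offset` (assumed 128)."""
    return [value + offset for value in as_signed_int8(raw)]


if __name__ == "__main__":
    first_pixels = bytes([0x80, 0x81, 0x80, 0x80])  # the four pixels quoted above
    print(list(first_pixels))            # the same bytes read as unsigned
    print(as_signed_int8(first_pixels))  # the same bytes read as signed int8
    print(undo_shift(first_pixels))      # counts after adding the offset back
```

A reader that treats the bytes as plain signed values without adding the offset back would see every pixel near the bottom of the int8 range, which would fit the noise-like images.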
I think there might be a simple fix: you can use the -byte 0 option to force newstack to write unsigned 8-bit numbers in mode 0 files:
newstack -by 0 t2.mrc byte.mrc
Because the counts should all be small positive numbers, forcing newstack to write unsigned numbers should have the same effect as writing signed but unshifted int8 into the mode 0 MRC. (If there are spurious negative numbers, damaging them by forcing positivity should not cause much harm anyway.)
My test with the small chunk of mode 0 file you sent me seems to have worked: all 0x80 are now 0x00 and 0x81 is now 0x01. I do not have a 4-bit MRC file, so please try it and let us know.
Zhijie
Some updates after Stav shared a chunk of a mode 101 file (thanks, Stav!):
In the mode 101 file, most pixels have readings of 0, 1 and 2. This is consistent with the serialEM description that when saving in mode 101 the range of counts is 0-15. In other words, the 4-bit integers are unsigned.
K2 super-resolution or K3 unbinned counting mode images are stored with pairs of 4-bit integers (maximum 15) in each byte. (The left pixel in a pair is in the low 4 bits and the right pixel is in the high 4 bits of a packed byte.)
This simplifies reading the mode 101 MRC files a lot (compared with the case where the 4-bit ints are signed): we can simply apply &0x0f to get the left pixel and >>4 to get the right pixel from each byte.
Glad I could help. So for now the only way to “read” the files by cryosparc is by converting them using the -byte 0 option, which would lead to a file that is twice as big as the original, am I right?
I collected a dataset on a Krios/K2 in super-resolution mode (0.55A/pix, 40 frames). The micrographs were collected with EPU - non-gain-corrected, normalized, 4-bit encoding. Each file was 1.5GB and would not load into cryoSPARC directly. To load them into cryoSPARC, I used the newstack -by 0 command. The resultant files were 1.2GB. It did not make sense to me why the file size would be smaller. I nevertheless loaded them into cryoSPARC and did patch motion correction, and the results were HORRENDOUS…
I am wondering if the 4-bit to 8-bit conversion messed up anything. It does not make sense to me why the file size would become smaller. Also, I tried to convert them to 16-bit as noted by @stavros in this thread (using the newstack -mo 1 option) and the resultant file size was 1.8GB - again this does not make sense, since I would expect it to be 2.4GB (2x the size of the file generated by newstack -by 0).
Do you have any knee-jerk thoughts or suggestions? I would really appreciate it if you could help me troubleshoot this.
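One way to start troubleshooting sizes like these is to look at what each file's header actually declares. The sketch below is only a suggestion, not a cryoSPARC or IMOD tool: the function name is made up, the header is assumed to be little-endian, and the field offsets (NX, NY, NZ and MODE in the first 16 bytes, NSYMBT at byte 92 of the 1024-byte header) follow the MRC2014 layout.

```python
import struct
import sys


def read_mrc_header(header):
    """Return dimensions, mode and extended-header size from the first 1024 bytes.

    Assumption: little-endian file with the MRC2014 header layout.
    """
    if len(header) < 1024:
        raise ValueError("an MRC header is 1024 bytes long")
    nx, ny, nz, mode = struct.unpack_from("<4i", header, 0)
    (nsymbt,) = struct.unpack_from("<i", header, 92)
    return {"nx": nx, "ny": ny, "nz": nz, "mode": mode, "nsymbt": nsymbt}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python this_script.py movie.mrc
    with open(sys.argv[1], "rb") as handle:
        print(read_mrc_header(handle.read(1024)))
```

Comparing the reported mode and dimensions of the original file and of the two converted files should show whether the inputs really are mode 101 and whether the frame count or frame size changed during conversion.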