Importing mrc movies

stavros · June 26, 2019, 10:08pm

Hello,

I’m trying to import some movie stacks and I get the following error:

Traceback (most recent call last):
  File "cryosparc2_master/cryosparc2_compute/run.py", line 78, in cryosparc2_compute.run.main (/home/installtest/deps_manage/cryosparc2_package/deploy/stage/cryosparc2_master/cryosparc2_compute/run.c:3954)
  File "cryosparc2_compute/jobs/imports/run.py", line 613, in run_import_movies_or_micrographs
    imgdata = mrc.read_mrc(abs_path)[1].sum(axis=0) * gainref
  File "cryosparc2_compute/blobio/mrc.py", line 114, in read_mrc
    data = read_mrc_data(file_obj, header, start_page, end_page, out)
  File "cryosparc2_compute/blobio/mrc.py", line 65, in read_mrc_data
    dtype = mrc_datatype_to_dtype(datatype)
  File "cryosparc2_compute/blobio/mrc.py", line 44, in mrc_datatype_to_dtype
    assert False,'Unsupported MRC datatype: {0}'.format(datatype)
AssertionError: Unsupported MRC datatype: 101

I don't understand why the mrc files are “not supported”.

Thanks, let me know if any further information is required.

ZhijieLi · July 1, 2019, 5:13pm

Hi,

Were the movies collected using SerialEM? Datatype 101 indicates that the MRC file stores 4-bit integers. It is non-standard according to MRC2014, but it was proposed as an extension to the standard because it would save a lot of storage with counting detectors.

https://bio3d.colorado.edu/imod/doc/mrc_format.txt
http://www.ccpem.ac.uk/mrc_format/mrc_proposals.php
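
For anyone who wants to check which mode a file uses before importing it, here is a minimal sketch. It relies only on the MRC header layout, where the first four 32-bit integers are NX, NY, NZ and MODE. The function names and the little-endian assumption are mine and are not part of cryoSPARC or IMOD.

```python
import struct

# A few common MRC modes; 101 is the 4-bit extension discussed in this thread.
MODE_NAMES = {
    0: "8-bit signed integer",
    1: "16-bit signed integer",
    2: "32-bit float",
    6: "16-bit unsigned integer",
    101: "4-bit integers, two pixels per byte",
}

def read_mrc_dims_and_mode(path):
    """Return (nx, ny, nz, mode) from the first 16 bytes of an MRC header.

    Assumption: the file is little-endian, which is the usual case.
    """
    with open(path, "rb") as f:
        raw = f.read(16)
    if len(raw) < 16:
        raise ValueError("file is too short to contain an MRC header")
    return struct.unpack("<4i", raw)

def describe_mode(mode):
    """Return a human-readable name for an MRC mode number."""
    return MODE_NAMES.get(mode, "unknown or unlisted mode {0}".format(mode))
```

Calling `describe_mode(read_mrc_dims_and_mode("movie.mrc")[3])` on a hypothetical file `movie.mrc` would report whether it is one of the mode 101 stacks that trigger the error above.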

I think MotionCor2 should be able to handle mode 101 movies. So if doing motion correction in cryoSPARC is not absolutely necessary right now, you can import the motion-corrected micrographs instead. If motion correction in cryoSPARC is desired, you can use the “newstack” program in IMOD to convert the movies to mode 0 (int8, signed), at the expense of generating a new dataset of double the size. I think cryoSPARC can read mode 0 MRC directly.

https://bio3d.colorado.edu/imod/doc/man/newstack.html

For cryoSPARC developers, the serialEM document mentioned:
"For 4-bit data (mode 101), each byte contains a pair of pixels, and the first
pixel (the one with lower coordinate) is in the lower 4 bits, while the second
pixel is in the higher 4 bits. This fill order is the same for little-endian
and big-endian files."

and that each row of pixels is padded to full bytes if the X-size is odd (which is not common):
“Each line has int((nx + 1) / 2) bytes, with 4-bits of padding at the end of each line when nx is odd.”
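
The row layout quoted above also fixes how large the pixel data should be in each mode. A small sketch of that arithmetic follows; the function name is hypothetical, and only the three modes discussed in this thread are covered.

```python
def mrc_data_bytes(nx, ny, nz, mode):
    """Size in bytes of the pixel data (header excluded) for a few MRC modes."""
    if mode == 101:
        # Two pixels per byte; a row with odd nx is padded to a whole byte.
        row_bytes = (nx + 1) // 2
    elif mode == 0:
        row_bytes = nx          # one byte per pixel
    elif mode == 1:
        row_bytes = 2 * nx      # two bytes per pixel
    else:
        raise ValueError("mode {0} is not handled in this sketch".format(mode))
    return row_bytes * ny * nz
```

For an even X-size, mode 0 takes exactly twice and mode 1 exactly four times the space of mode 101, so any conversion that gives a different ratio is worth a second look.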

Zhijie

stavros · July 1, 2019, 7:54pm

OK, that is how I am dealing with the issue at the moment, i.e. performing the motion correction outside of cryoSPARC.
Thanks,
S

stavros · July 24, 2019, 9:05pm

Just an update in case someone else runs into this: when converting from mode 101, the storage mode of the output file needs to be set to 16-bit signed integer in order for cryoSPARC to read it correctly. This results in a file that is 4 times bigger than the original. When I tried the “standard” command

newstack input101.mrc output.mrc

This indeed resulted in a file that is twice as big, but when I tried to import that movie stack into cryoSPARC, the result was a bunch of noise.
Instead, the following command leads to a readable file (by both cryoSPARC and ImageJ):

newstack input101.mrc -mo 1 output.mrc

I would appreciate any update on this, or a pointer if there is something I am missing when converting the movies. As the motion correction details would significantly benefit my particles, is there any other way to import those motion paths, from MotionCor2 for example?

Thanks.

ZhijieLi · July 24, 2019, 10:24pm

Hi stavros,

The newstack documentation says that if the input file is mode 101 (4-bit), the default output is mode 0 (8-bit signed integer). So getting the 2x-sized MRC from the first command is expected. The second command forces generation of a mode 1 (16-bit signed int) MRC, which seems to work (but is not ideal, sure).

What is surprising is the noise image you get from command 1. Since you mentioned ImageJ later, I guess they also looked like pure noise when checked in ImageJ, or were they used for motion correction?

Anyway, I am curious about the noise MRC files. Can you send me a small chunk of a mode 101 file and its two derivatives from method 1 and method 2? You can use the Linux “head” command to copy the first 256k bytes of a file (hopefully not all zeros):

head -c256k input.mrc > 256k_sample1.mrc

Zhijie

stavros · July 24, 2019, 10:32pm

What is surprising is the noise image you get from command 1. Since you mentioned imageJ later, I guess they also looked like pure noise when checked in imageJ?

Yes, basically they looked the same.

I have attached the two mode outputs here: Unique Download Link | WeTransfer

Hope this helps!

ZhijieLi · July 24, 2019, 11:34pm

A quick look into the Mode 0 file suggests that the numbers are all shifted by -128, and thus appear as negative numbers.

For example, in the Mode 0 file, the first 4 pixels are:

0x80 0x81 0x80 0x80

And the rest of the numbers all start with “8”.

In binary, 0x80 is 10000000(bin), which is -128 as signed int8. The original number it was converted from is probably a zero. Similarly, 0x81 is 10000001(bin), which is -127 as signed int8, and its origin is probably a 1.

I guess newstack shifted the numbers in order to maximize the range of numbers it can save with signed int. Otherwise the 128 negative numbers usable in signed int8 would simply be wasted space when recording counts.

In theory this should be OK if the programs reading the files know that the numbers are all shifted by -128.

But for programs that do not know that all these numbers are shifted by -128 from the original values, these numbers probably all look like readings of -0.99, resulting from flipping and scaling.
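
To make the signed versus unsigned reading concrete, here is a short sketch that interprets the four bytes quoted above both ways with NumPy. The idea that adding 128 recovers the original counts is an inference from these bytes, not something confirmed against the newstack source.

```python
import numpy as np

# First four pixel bytes of the mode 0 file, as quoted in this post.
raw = bytes([0x80, 0x81, 0x80, 0x80])

# The same bytes read as signed and as unsigned 8-bit integers.
as_signed = np.frombuffer(raw, dtype=np.int8)
as_unsigned = np.frombuffer(raw, dtype=np.uint8)

# Assumed fix: widen to int16 first so that adding 128 cannot overflow.
restored = as_signed.astype(np.int16) + 128

print(as_signed.tolist())    # [-128, -127, -128, -128]
print(as_unsigned.tolist())  # [128, 129, 128, 128]
print(restored.tolist())     # [0, 1, 0, 0]
```

A reader that treats the bytes as plain signed int8 sees large negative values everywhere, which would fit the noise-like images reported earlier in the thread.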

If we humans did not know that these are counting data collected with particular considerations, originally saved as 4-bit integers, we would be quite confused too.

ZhijieLi · July 25, 2019, 12:02am

The mode 1 (16-bit signed int) file looks more interesting.

It saves 0 as 0x0000, which is normal. But for the non-zero numbers here is the conversion table:

1: 0x0888 = 2184
2: 0x1111 = 4369 = 2184 x 2+1
3: 0x1999 = 6553 = 2184 x 3+1

Yes, it largely preserves the original numbers, but not quite exactly.
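
One possible reading of this table, offered as a guess that happens to fit the three observed values, is that the 4-bit range 0-15 was stretched linearly onto the positive int16 range 0-32767 and rounded. The sketch below checks that guess. It does not claim to be what newstack actually does.

```python
# Values observed in the mode 1 file, keyed by the presumed original count.
observed = {1: 0x0888, 2: 0x1111, 3: 0x1999}

def scaled(count):
    """Hypothetical scaling: map 0..15 linearly onto 0..32767 and round."""
    return round(count * 32767 / 15)

def unscaled(value):
    """Inverse of the hypothetical scaling, back to the 0..15 count."""
    return round(value * 15 / 32767)

for count, value in observed.items():
    print(count, value, scaled(count), unscaled(value))
```

If the guess holds, dividing by 32767/15 and rounding recovers the original counts, so the mode 1 file would carry the same information at four times the size.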

ZhijieLi · July 25, 2019, 12:55am

Hi Stav,

I think there might be a simple fix: you can use the -byte 0 option to force newstack to write unsigned 8-bit numbers in mode 0 files:

newstack -by 0 t2.mrc byte.mrc

Because the counts should all be small positive numbers, forcing unsigned output should have the same effect as writing signed but unshifted int8 into the mode 0 MRC. (If there are spurious negative numbers, damaging them by forcing positivity should not have much negative effect anyway.)

My test with the small chunk of the mode 0 file you sent me seems to have worked: all 0x80 are now 0x00 and all 0x81 are now 0x01. I do not have a 4-bit MRC file, so please try it and let us know.

Zhijie

Some updates after Stav shared a chunk of a mode 101 file (thanks, Stav!):

In the mode 101 file, most pixels have readings of 0, 1 and 2. This is consistent with the serialEM description that when saving in mode 101 the range of counts is 0-15. In other words, the 4-bit integers are unsigned.

K2 super-resolution or K3 unbinned counting mode images are stored with pairs of 4-bit integers (maximum 15) in each byte. (The left pixel in a pair is in the low 4 bits and the right pixel is in the high 4 bits of a packed byte.)

This simplifies reading mode 101 MRC files a lot (compared with the case where the 4-bit ints were signed): we can simply apply &0x0f to get the left pixel and >>4 to get the right pixel from each byte.
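
The masking and shifting described here can be sketched with NumPy as follows. This is only an illustration of the unpacking step under the layout quoted earlier in the thread (low nibble first, rows padded to whole bytes when the X-size is odd); the function name is hypothetical and this is not cryoSPARC's reader.

```python
import numpy as np

def unpack_mode101(packed, nx, ny, nz):
    """Unpack mode 101 pixel data (bytes) into a (nz, ny, nx) uint8 array."""
    row_bytes = (nx + 1) // 2  # rows are padded to whole bytes when nx is odd
    data = np.frombuffer(packed, dtype=np.uint8, count=row_bytes * ny * nz)
    data = data.reshape(nz, ny, row_bytes)

    out = np.empty((nz, ny, row_bytes * 2), dtype=np.uint8)
    out[..., 0::2] = data & 0x0F  # left pixel of each pair: low 4 bits
    out[..., 1::2] = data >> 4    # right pixel of each pair: high 4 bits
    return out[..., :nx]          # drop the padding nibble, if any
```

For example, the two bytes 0x21 0x03 with nx=4 unpack to the pixel row 1, 2, 3, 0. The result is uint8, so it takes the same space as a mode 0 file, but the expansion happens in memory and no converted copy has to be written to disk.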

Zhijie

stavros · July 29, 2019, 4:39pm

Glad I could help. So for now the only way for cryoSPARC to “read” the files is to convert them using the -byte 0 option, which leads to a file that is twice as big as the original, am I right?

schow · February 22, 2021, 2:34pm

Hi,

I collected a dataset on Krios/K2/Super-res (0.55A/pix, 40 frames). The micrographs were collected with EPU: non-gain-corrected, normalized, 4-bit encoding. Each file was 1.5GB and would not load into cryoSPARC directly. To load them into cryoSPARC, I used the newstack -by 0 command. The resulting files were 1.2GB. It did not make sense to me that the file size would be smaller. I nevertheless loaded them into cryoSPARC and ran patch motion correction, and the results were HORRENDOUS…

I am wondering if the 4-bit to 8-bit conversion messed anything up. It does not make sense to me why the file size would become smaller. Also, I tried to convert them to 16-bit as noted by @stavros in this thread (using the newstack -mo 1 option) and the resulting file size was 1.8GB. Again this does not make sense, since I would expect it to be 2.4GB (2x the size of the file generated by newstack -by 0).

Do you have any knee-jerk thoughts or suggestions? I would really appreciate it if you could help me troubleshoot this.

Thank you