1. Trash all the bad micrographs in the motioncorrected folder.
2. Subtract the gold beads from the *_aligned.mrc and *_aligned_doseweighted.mrc files.
3. Run a Patch CTF job on these gold-subtracted micrographs.
4. Run the Template Picker.
The weird thing is that I tested steps 1 to 4 on a dataset of ~1.5k micrographs and it went well. But for the dataset of ~3k micrographs, I get this error at step 4. What can I do next?
This again looks like one or more of the micrographs ended up corrupted, but this time it’s the full dose-weighted micrograph (some *_aligned_doseweighted.mrc file).
Can you again try navigating to the motioncorrected directory and running ls -lh to see whether any micrographs have a file size of 0, or a size that differs significantly from the others? You would then have to re-create these files to get the Template Picker to work (e.g., re-run the Motion Correction job).
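For a directory with thousands of micrographs, scanning the ls -lh output by eye is tedious. The check can be sketched in Python roughly as follows. This is only a minimal sketch: the function name find_suspect_files, the default file pattern, and the 10% tolerance are illustrative assumptions, not part of cryoSPARC.

```python
import statistics
from pathlib import Path


def find_suspect_files(directory, pattern="*_aligned_doseweighted.mrc", tolerance=0.1):
    """Return (path, size) pairs for files that are empty or whose size
    deviates from the median size by more than `tolerance` (a fraction).

    `pattern` and `tolerance` are illustrative defaults (assumptions).
    """
    files = sorted(Path(directory).glob(pattern))
    sizes = {f: f.stat().st_size for f in files}
    if not sizes:
        return []
    median = statistics.median(sizes.values())
    suspects = []
    for f, size in sizes.items():
        # Empty files are always suspect; otherwise compare against the median.
        if size == 0 or abs(size - median) > tolerance * median:
            suspects.append((f, size))
    return suspects
```

Calling find_suspect_files on the motioncorrected directory and printing the result gives a list of candidate files to inspect or re-create. Micrographs from the same detector are normally the same size, which is why the median is used as the reference here.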
Unfortunately, right now the Template Picker isn’t set up to skip missing files, so the whole job will fail if any exposure file is missing. You may be able to take them out if you first run the exposures through a “Manually Curate Exposures” job and reject the exposures with missing previews. Try that out and let me know how it goes; if it doesn’t work, we’ll try to find an alternative solution.
@CleoShen unfortunately there’s no quick way to pick out exposures without previews from the Curation job. You have to manually click through the list in interactive mode and see if the previews load, then manually accept or reject those. Alternatively, pick out the zero-size files from your file system and search for those in the interactive list.
I tried the way you suggested, but I have about 16K micrographs in total, and there is no quick way to locate the zero-size files in the interactive list other than opening each micrograph one by one. This solution is very time-consuming. Do I have other options, like deleting all the zero-size .mrc files and the corresponding .npy files, then running CTF again?
Okay, I’ve put together a script you can run to modify the results of the motion correction job and filter out zero-sized files. Here’s how to run it:
Paste the following code into a text editor and make the noted substitutions:
project_uid = '<PROJECT UID HERE>'
datafile = '<OUTPUT DATA FILEPATH HERE>'
passthrough_datafile = '<PASSTHROUGH OUTPUT DATAFILE HERE>'

import os
from shutil import copyfile
from cryosparc_compute.dataset import Dataset

# NOTE: `cli` is not defined in this script; it is assumed to be provided
# by the environment the script is run in.
project_dir = cli.get_project_dir_abs(project_uid)

# Back up the output dataset before modifying it
copyfile(datafile, f'{datafile}.bak')
d1 = Dataset().from_file(datafile)

# Collect the UIDs of exposures whose micrograph file exists and is non-empty
keep_uids = set()
for uid, path in zip(d1.data['uid'], d1.data['micrograph_blob/path']):
    full_path = os.path.join(project_dir, path)
    if os.path.exists(full_path) and os.stat(full_path).st_size > 0:
        keep_uids.add(uid)

print(f'Found {len(keep_uids)}/{len(d1)} exposures to keep ({len(d1) - len(keep_uids)} empty files)')

# Overwrite the output dataset with only the exposures to keep
newd1 = d1.subset_query(lambda item: item['uid'] in keep_uids)
newd1.to_file(datafile)

# Back up the passthrough dataset and apply the same filter to it
copyfile(passthrough_datafile, f'{passthrough_datafile}.bak')
d2 = Dataset().from_file(passthrough_datafile)
newd2 = d2.subset_query(lambda item: item['uid'] in keep_uids)
newd2.to_file(passthrough_datafile)

print('Done filtering.')
Replace <PROJECT UID HERE> with the ID of the project you are in, e.g., P3
Replace <OUTPUT DATA FILEPATH HERE> with the path to the micrograph_blob output of your motion correction job. Copy this value from the job’s “Outputs” tab, as indicated:
Hmm, maybe some of the micrographs aren’t zero-sized? Can you check the motioncorrected folder again and see if there are any files with a very small size, e.g., less than 100 bytes?
If you find any, try re-running the same script, but replace this line:
if os.path.exists(full_path) and os.stat(full_path).st_size > 0:
with this:
if os.path.exists(full_path) and os.stat(full_path).st_size > 100:
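A fixed byte threshold only catches files that are nearly empty. As a stricter alternative, the size declared in each file's MRC header can be compared with the actual size on disk. The following is only a sketch under stated assumptions: it assumes a little-endian MRC2014-style file with a 1024-byte header, the helper name mrc_looks_complete is hypothetical, and it is not part of the script above or of cryoSPARC.

```python
import os
import struct

HEADER_SIZE = 1024  # fixed size of the MRC main header, in bytes

# Bytes per voxel for common MRC modes; other modes are treated as unknown.
BYTES_PER_VOXEL = {0: 1, 1: 2, 2: 4, 3: 4, 4: 8, 6: 2, 12: 2}


def mrc_looks_complete(path):
    """Return True if the file is at least as large as its header declares,
    False if it is truncated, and None if the header cannot be interpreted.

    Assumes little-endian byte order (an assumption, not verified here).
    """
    size = os.path.getsize(path)
    if size < HEADER_SIZE:
        return False  # too small to even hold a header
    with open(path, "rb") as fh:
        header = fh.read(HEADER_SIZE)
    # Words 1-4 of the header: NX, NY, NZ, MODE
    nx, ny, nz, mode = struct.unpack_from("<4i", header, 0)
    # Word 24: NSYMBT, the size of the extended header in bytes
    (nsymbt,) = struct.unpack_from("<i", header, 92)
    if mode not in BYTES_PER_VOXEL or min(nx, ny, nz) <= 0 or nsymbt < 0:
        return None
    expected = HEADER_SIZE + nsymbt + nx * ny * nz * BYTES_PER_VOXEL[mode]
    return size >= expected
```

A file for which this returns False is truncated and would need to be re-created; a None result means the header did not match the assumptions above and the file should be inspected by other means.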
Oh, I might have made a mistake. Should I have trashed all the 0-size files before running the script you shared with me? I left the 0-size files in place for the last attempt, which produced the error shown above.
This wouldn’t be related; the code I sent does not affect the contents of the database. This error can happen if cryoSPARC didn’t restart properly and there are some artefacts left over from the previous processes. Can you send me the output of the following commands?
It looks like there’s something else running at the network port that cryoSPARC should be running on and this is preventing the database from starting up.
Can you confirm what your base port is? You should see it inside cryosparc_master/config.sh. Look for the line that starts with export CRYOSPARC_BASE_PORT=; the number that follows is the base port.
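Reading the base port and checking which nearby ports already accept connections can be sketched in Python as below. This is a minimal sketch: the function names are hypothetical, and the width of the checked range (10 ports starting at the base port) is an assumption rather than a documented guarantee. A port showing up as in use only means something is listening there; it does not say which process it is.

```python
import re
import socket


def read_base_port(config_path):
    """Return the integer following 'export CRYOSPARC_BASE_PORT=' in the
    given config.sh, or None if no such line is found."""
    pattern = re.compile(r'^\s*export\s+CRYOSPARC_BASE_PORT=["\']?(\d+)')
    with open(config_path) as fh:
        for line in fh:
            match = pattern.match(line)
            if match:
                return int(match.group(1))
    return None


def ports_in_use(base_port, count=10, host="127.0.0.1"):
    """Return the ports in [base_port, base_port + count) on `host` that
    accept a TCP connection. `count=10` is an assumed range width."""
    busy = []
    for port in range(base_port, base_port + count):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(0.5)
            # connect_ex returns 0 when something is listening on the port
            if sock.connect_ex((host, port)) == 0:
                busy.append(port)
    return busy
```

For example, ports_in_use(read_base_port("cryosparc_master/config.sh")) run while cryoSPARC is stopped would list any ports in that range that are still occupied.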
My CRYOSPARC_BASE_PORT is 39006, and I used 16711 as the local port to connect to cryoSPARC. I’ve tried all the ports below, but nothing is running on them. Is there any other solution I can try?