Relink existing micrograph to movie

Dear All,

I previously ran a typical workflow (Import Movies -> Patch Motion -> …), then moved the raw movies to another data server. Now I have come back and want to run Local Motion Correction. How can I re-import the movies from the new directory and link them to the existing Patch Motion job?

Thank you

Quickest way is symlinks.

If your import job is J1, then all the movies (and the gain ref/defect file) are symlinks in the J1/imported/ directory.

Either move them all aside (for safety/sanity checking) or delete them, then manually remake them pointing to the new location.
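
A minimal sketch of the delete-and-remake route, assuming the moved files kept their original names. The function name repoint_symlinks and its dry_run flag are made up for illustration. It uses only the Python standard library and does not touch the CryoSPARC database.

```python
import os
from pathlib import Path


def repoint_symlinks(imported_dir: Path, new_source_dir: Path, dry_run: bool = True) -> list:
    """Re-create each symlink in imported_dir so it targets new_source_dir.

    Assumes the raw files kept their original names after being moved.
    Returns a list of (link, new_target) pairs that were (or would be) changed.
    """
    changes = []
    # sorted() materializes the listing before any link is modified
    for link in sorted(imported_dir.iterdir()):
        if not link.is_symlink():
            continue  # leave regular files alone
        old_target = Path(os.readlink(link))
        new_target = new_source_dir / old_target.name
        if not new_target.exists():
            print(f"skipping {link.name}: {new_target} not found")
            continue
        changes.append((link, new_target))
        if not dry_run:
            link.unlink()
            link.symlink_to(new_target)
    return changes
```

Calling it with dry_run=True (the default) only lists what would change, which is worth checking before re-running with dry_run=False.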

Thanks, I should have asked earlier. While I was trying different approaches, I cleared my J1 job and lost all of its previous files. Is there a way to link my J2 job (Patch Motion) to a new Import Movies job?

To mess around with linking of jobs, cryosparc-tools would probably be recommended. I’ll leave @hsnyder or @wtempel to explain that better than I.

The UID prefix that CryoSPARC loves so much is a bit of a pain at times; it makes recovering from SNAFUs like this a bit harder than it otherwise would be.

There likely is, but would you alternatively be open to the idea, suggested by a team member, of

  1. repeating movie import and motion correction
  2. then reassigning your existing particles to the newly imported exposures?

Thanks, I actually ended up trying the approach you suggested. But it would be much faster if there were a feature that simply links two jobs.

Hi @parrot, I wrote a cryosparc-tools Python script that re-imports the movies and associates them with the motion-corrected micrographs.

Overview of what it does:

  1. Connect to CryoSPARC
  2. Find the relevant project
  3. Find the old Import Movies job, whose data is now missing
  4. Build a new External Job with the old movies as input and the updated movies with the new path as output
  5. Load the old movies dataset input
  6. Update the internal dataset paths to the new location
  7. Save the updated movies dataset as output

The code:

from pathlib import Path
from cryosparc.tools import CryoSPARC

cs = CryoSPARC(...)  # FILL IN CREDENTIALS HERE (https://tools.cryosparc.com/intro.html#usage)

new_movies_folder = Path("/path/to/movies")  # INSERT PATH TO IMPORTS
gain_ref_path = new_movies_folder / "gain.mrc"  # INSERT GAIN REF PATH
movies_glob = "*.tif"  # INSERT FILENAME PATTERN HERE

# FILL IN PROJECT/WORKSPACE/JOB DETAILS HERE:
project_uid = "P251"
workspace_uid = "W11"
old_import_movies_job_uid = "J1"

project = cs.find_project(project_uid)
old_import_movies_job = project.find_job(old_import_movies_job_uid)

# Create a new external job
new_import_job = project.create_external_job(workspace_uid, title="Re-imported movies with new path")

# Create an "old_movies" input and connect it to the originally imported movies
new_import_job.connect(
    "old_movies",
    old_import_movies_job.uid,
    "imported_movies",
    slots=["movie_blob", "gain_ref_blob"],
)
# Add an "updated_movies" output
new_import_job.add_output(
    type="exposure",
    name="updated_movies",
    slots=["movie_blob", "gain_ref_blob"],
    passthrough="old_movies",
)

with new_import_job.run():
    # Load the old input dataset
    movies_dset = new_import_job.load_input(
        "old_movies",
        slots=["movie_blob", "gain_ref_blob"],
    )
    # Create a new imports directory and symlink
    new_import_job.mkdir("imported")
    new_import_folder = Path(new_import_job.dir() / "imported")
    for i, mov in enumerate(sorted(new_movies_folder.glob(movies_glob))):
        new_imported_path = new_import_folder / mov.name
        new_imported_path.symlink_to(mov)
        movies_dset["movie_blob/path"][i] = f"{new_import_job.uid}/imported/{mov.name}"
    
    new_imported_gain_path = new_import_folder / gain_ref_path.name
    new_imported_gain_path.symlink_to(gain_ref_path)
    movies_dset["gain_ref_blob/path"][:] = f"{new_import_job.uid}/imported/{gain_ref_path.name}"

    new_import_job.save_output("updated_movies", movies_dset)

To run this, do the following on a machine with the new data folders available (e.g., CryoSPARC workstation or master node):

  1. Install cryosparc-tools as directed in the documentation
  2. Paste the code into a text editor and make the noted substitutions
  3. Save to update_imported_movies.py
  4. Run from the command line with python update_imported_movies.py

Running the script results in an “External” job in the relevant workspace with the movies as output.

In your Local Motion Correction job, connect the motion-corrected micrographs first, then substitute the movie_blob and gain_ref_blob low-level inputs from the new external job by dragging them into the “Movies” input group.

Hope that helps! Let me know if this works for you or if you have any trouble setting it up.

Edit: bug fix

Hi @nfrasser,

Thank you for your script! I ran it and got the output below. I had previously cleared my J1 job while trying different ways to solve the problem. Does your script need the old J1 job to be untouched?

/home/spuser/anaconda3/lib/python3.10/site-packages/cryosparc/job.py:380: UserWarning: *** CommandClient: (http://localhost:39003/load_job_input) HTTP Error 422 UNPROCESSABLE ENTITY; please check cryosparcm log command_vis for additional information.
Response from server: b'{"project_uid": "P7", "job_uid": "J392", "input_name": "old_movies", "slots": ["movie_blob", "gain_ref_blob"]}'
  with make_json_request(self.cs.vis, "/load_job_input", data=data) as response:
Traceback (most recent call last):
  File "/data/youwang/cryosparc_working/20230518_1014/CS-20230518-1014/update_imported_movies.py", line 45, in <module>
    movies_dset = new_import_job.load_input(
  File "/home/spuser/anaconda3/lib/python3.10/site-packages/cryosparc/job.py", line 380, in load_input
    with make_json_request(self.cs.vis, "/load_job_input", data=data) as response:
  File "/home/spuser/anaconda3/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/home/spuser/anaconda3/lib/python3.10/site-packages/cryosparc/command.py", line 225, in make_request
    raise CommandError(error_reason, url=url, code=code, data=resdata)
cryosparc.errors.CommandError: *** (http://localhost:39003/load_job_input, code 422) HTTP Error 422 UNPROCESSABLE ENTITY; please check cryosparcm log command_vis for additional information.
Response from server: b'{"project_uid": "P7", "job_uid": "J392", "input_name": "old_movies", "slots": ["movie_blob", "gain_ref_blob"]}'

@parrot, instead of J1, try using the job number of the job that produced the full-frame motion-corrected micrographs. It should give the same result.

I assume I should change this line to old_import_movies_job_uid = "J2". J2 is my Patch Motion job. Here is the output:

Traceback (most recent call last):
  File "/data/youwang/cryosparc_working/20230518_1014/CS-20230518-1014/update_imported_movies.py", line 29, in <module>
    new_import_job.connect(
  File "/home/spuser/anaconda3/lib/python3.10/site-packages/cryosparc/job.py", line 1289, in connect
    self.cs.vis.connect_external_job(  # type: ignore
  File "/home/spuser/anaconda3/lib/python3.10/site-packages/cryosparc/command.py", line 121, in func
    raise CommandError(
cryosparc.errors.CommandError: *** (http://localhost:39003, code 400) Encountered ServerError from JSONRPC function "connect_external_job" with params {'project_uid': 'P7', 'source_job_uid': 'J2', 'source_output': 'imported_movies', 'target_job_uid': 'J393', 'target_input': 'old_movies', 'slots': ['movie_blob', 'gain_ref_blob'], 'title': '', 'desc': ''}:
ServerError: Source job P7-J2 does not have output imported_movies
Traceback (most recent call last):
  File "/spshared/apps/cryosparc4/cryosparc_master/cryosparc_command/commandcommon.py", line 195, in wrapper
    res = func(*args, **kwargs)
  File "/spshared/apps/cryosparc4/cryosparc_master/cryosparc_command/commandcommon.py", line 264, in wrapper
    return func(*args, **kwargs)
  File "/spshared/apps/cryosparc4/cryosparc_master/cryosparc_command/command_vis/snowflake.py", line 212, in connect_external_job
    assert source_output_group, f"Source job {project_uid}-{source_job_uid} does not have output {source_output}"
AssertionError: Source job P7-J2 does not have output imported_movies

Ah my apologies, you also have to replace "imported_movies" in the new_import_job.connect() call with "micrographs" or whatever the output name was for your motion correction job. Example for Patch Motion Correction:
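
The example that followed here did not survive in the thread, so this is only a sketch of the change described above. It assumes the external job from the script is passed in as new_import_job and that the motion correction job's output group really is named "micrographs" (check the job's Outputs tab). The helper name connect_old_movies is made up for illustration; the connect() arguments mirror the call in the original script.

```python
def connect_old_movies(new_import_job, source_job_uid, source_output="micrographs"):
    """Connect the external job's "old_movies" input to a motion correction output.

    new_import_job: the external job created earlier in the script.
    source_job_uid: UID of the motion correction job, e.g. "J2".
    source_output: output group name on that job (assumed "micrographs").
    """
    new_import_job.connect(
        "old_movies",
        source_job_uid,
        source_output,
        slots=["movie_blob", "gain_ref_blob"],
    )
```

In the script this would replace the existing new_import_job.connect(...) call, e.g. connect_old_movies(new_import_job, "J2").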

Thanks. It seems to run through most of the script but ends with:

Traceback (most recent call last):
  File "/data/youwang/cryosparc_working/20230518_1014/CS-20230518-1014/update_imported_movies.py", line 54, in <module>
    new_imported_path.symlink_to(mov)
AttributeError: 'PurePosixPath' object has no attribute 'symlink_to'

Looks like I was using Python’s pathlib API incorrectly. Try changing these two lines:

        new_imported_path.symlink_to(mov)
    new_imported_gain_path.symlink_to(gain_ref_path)

to these

        Path(new_imported_path).symlink_to(mov)
    Path(new_imported_gain_path).symlink_to(gain_ref_path)

The imported Path type should have the symlink_to method.
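
For anyone curious why the wrapper helps, here is a small standard-library illustration. The path string is a made-up placeholder; nothing is created on disk.

```python
from pathlib import Path, PurePosixPath

# Pure paths only manipulate path strings; they have no filesystem methods.
pure = PurePosixPath("J393/imported/movie.tif")  # hypothetical path
print(hasattr(pure, "symlink_to"))        # False

# Wrapping in Path gives a concrete path, which can create symlinks.
concrete = Path(pure)
print(hasattr(concrete, "symlink_to"))    # True
```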

Thanks, it works. But when I then run Local Motion Correction, it still reports that it can’t find the file:
[libtiff error] TIFFOpen: /data/youwang/cryosparc_working/20230518_1014/CS-20230518-1014/J1/imported/014974633499939006110_20230518_1014_A020_G000_H003_D001.tif: No such file or directory

@parrot, I believe this means either that you didn’t specify the correct path to the new location of the movies (with the new_movies_folder variable), or that this path is not available on the worker where Local Motion Correction is running (e.g., the directory is not mounted). Edit: I noticed that the path still shows J1. That means the movies dataset was not updated correctly, so this last part does not apply.
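
One thing that can be checked offline: the script assigns new paths by position, so if the glob matches no files the loop never runs and every row keeps its old J1 path. Below is a hedged sketch of a check that matches rows by file name instead of by position. It assumes an imported name is either the raw file name or the raw file name preceded by "<prefix>_", as in the error above; plan_path_updates is a made-up name and is not part of cryosparc-tools.

```python
from pathlib import PurePosixPath


def plan_path_updates(old_paths, new_file_names, new_job_uid):
    """Map each old dataset path to a path under the new job, matched by file name.

    Assumes an old imported name is either the raw file name or the raw file
    name preceded by "<prefix>_". Rows without exactly one match are reported
    rather than guessed.
    """
    updated, unmatched = [], []
    for row, old_path in enumerate(old_paths):
        old_name = PurePosixPath(old_path).name
        matches = [
            name for name in new_file_names
            if old_name == name or old_name.endswith("_" + name)
        ]
        if len(matches) == 1:
            updated.append(f"{new_job_uid}/imported/{matches[0]}")
        else:
            updated.append(old_path)  # keep the old value so the problem stays visible
            unmatched.append(row)
    return updated, unmatched
```

In the script, old_paths would be the values of movies_dset["movie_blob/path"] and new_file_names the names returned by the glob. A non-empty unmatched list means the folder or pattern needs another look before saving the output.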

Should I just edit this part? Here is my code:

new_movies_folder = Path("/data3/raw_data/20230518_1014/raw")  # INSERT PATH TO IMPORTS
gain_ref_path = new_movies_folder / "20230518_1014_ref.mrc"  # INSERT GAIN REF PATH
movies_glob = "*.tif"  # INSERT FILENAME PATTERN HERE

This is tough to debug remotely. For the quickest resolution, you may want to fall back on re-doing Patch Motion Correction and then reassigning your existing particles to the newly motion-corrected exposures, as @wtempel suggested. The Reassign Particles to Micrographs job type’s prototypical use case is very similar to your own.

Thank you so much for the script! I just followed these steps and was able to re-link previously deleted movies.

Everything appears to work, and I can use the output of this external job for “local motion correction”.

However, the “reference based motion correction” didn’t work, with the following error.
Any suggestions as to why this failed? FYI, the movies are Falcon4 EER files.

Hi @sunch, welcome back to the forum.

I believe this happens when the actual frames of the movie that Reference Based Motion Correction is processing do not match the end frame recorded by an input Patch or Full-frame Motion Correction job. Did you by any chance connect motion-corrected micrographs instead of movies?

To fix this, you can try:

  • Re-run Patch Motion Correction, setting the “End frame” parameter to the minimum number of frames minus one
  • Modify the above script to also load the rigid_motion slot, and set the "rigid_motion/frame_end" column to (minimum number of frames minus one)
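
A rough sketch of the second option, under several assumptions: that "rigid_motion" has been added to the slots lists in the script's connect(), add_output() and load_input() calls, that the connected job actually exposes that slot, and that the per-movie frame counts have been obtained separately (not shown here). The helper set_frame_end is a made-up name. It works on any mapping of column names to mutable sequences; whether the dataset returned by load_input accepts this kind of assignment is inferred from the script above, which assigns into "movie_blob/path" the same way.

```python
def set_frame_end(columns, frame_counts):
    """Set every row of "rigid_motion/frame_end" to min(frame_counts) - 1.

    columns: mapping from column name to a mutable, indexable sequence.
    frame_counts: number of frames in each movie (obtained elsewhere).
    Returns the value that was written.
    """
    if not frame_counts:
        raise ValueError("frame_counts is empty")
    frame_end = min(frame_counts) - 1
    column = columns["rigid_motion/frame_end"]
    for row in range(len(column)):
        column[row] = frame_end
    return frame_end
```

In the script this would be called inside the with new_import_job.run(): block, before save_output.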

Hi @nfrasser! Thank you for your quick response!

For the “reference motion correction” job, I did use the motion-corrected micrographs as the input. In addition, I modified the movie and gain blob sub-inputs to point to the external job. With these input settings, the “reference motion correction” job got as far as the actual calculation. In comparison, when I used the movie output as the input, the job complained that necessary inputs were absent and failed earlier.

I will try to re-run the Patch Motion Correction, which may take some time given the constraints of GPUs.

Regarding your second suggestion, how do you load the rigid_motion slot in the external job? Does that mean I need to load both the “Import Movies” and the “Patch Motion Correction” jobs in the external job? Thanks!