Corrupted project json files after faulty reattaching

Hi,

I had a project on cryoSPARC instance 1. I detached the project from there and tried to attach it to another instance (instance 2). During the attachment on instance 2, the database filesystem filled up, the attachment did not complete, and cryoSPARC showed a connection error. I proceeded to clean up the filesystem to make space and tried to start cryoSPARC again. It started normally.

But when I opened my newly attached project, some of the last jobs in the last workspace were missing. Only half of the jobs were showing up. I waited for some time for it to load, but that didn’t help. I then tried to detach and reattach the project. I am guessing that was probably a bad idea, as the project.json, job_manifest.json and workspaces.json files got overwritten and now do not list the missing jobs either. However, those job directories and data do exist in the project directory. They are just not reflected in the json files, and hence cryoSPARC can’t see them.

I was wondering if there is any way to recreate these json files with all the information about these missing jobs.

Thank you
Adwaith

To avoid a similar problem in the future, continuously monitor the available space on the database storage.
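
As an illustration of such monitoring, here is a minimal sketch that reports the free fraction of the filesystem holding a given path. The function names and the 10% threshold are assumptions chosen for this example, not cryoSPARC settings, and the path you pass in would be wherever your database directory actually lives.

```python
import shutil


def free_fraction(path):
    """Return the fraction of free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def check_free_space(path, min_free_fraction=0.10):
    """Return (ok, fraction); ok is False when free space is below the threshold.

    The default threshold of 10% is an arbitrary example value.
    """
    fraction = free_fraction(path)
    return fraction >= min_free_fraction, fraction


# Example use (hypothetical path, replace with your database directory):
# ok, fraction = check_free_space("/path/to/cryosparc_database")
# if not ok:
#     print(f"WARNING: only {fraction:.1%} free on the database filesystem")
```

Such a check could be run periodically, for example from cron, with the warning routed to email or a monitoring system.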

There may be a way to recreate these files, but the following procedure has not been tested.

  1. Detach the project directory.
  2. Prepare and set aside a copy of the project directory, in case the following steps result in unintended and unwanted changes to the project directory.
  3. Post under this topic any questions you may have regarding the procedure below.
  4. Save the following script to a file build_manifest.py:
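    project_dir = "/path/to/unattached/copy/of/projectdir/"
    manifest_path = "/tmp/WpPwiAf5vi.job_manifest.json"
    failed_job_docs_path = "/tmp/WpPwiAf5vi.failed.json"
    
    import pathlib
    import json
    
    jobs = []             # uids of jobs whose job.json passed all checks
    failed_job_dirs = {}  # job directory name -> reason for exclusion
    
    for doc in pathlib.Path(project_dir).glob('J*/job.json'):
        job_dir = doc.parts[-2]
        try:
            # The directory name must look like J<number>.
            assert job_dir[1:].isdigit(), f"{doc} is not a valid job document path."
            # Opening and parsing happen inside the try block, so unreadable or
            # malformed files are recorded instead of aborting the script.
            with open(doc) as handle:
                data = json.load(handle)
            # The uid stored in job.json must match the directory name.
            assert job_dir == data['uid'], f"Job uid does not match directory name {job_dir}."
            jobs.append(data['uid'])
        except Exception as e:
            failed_job_dirs[job_dir] = str(e)
    
    # Write the synthetic job manifest.
    with open(manifest_path, 'w') as mhandle:
        json.dump({'jobs': sorted(jobs)}, mhandle, indent=4)
    
    # Write the failure report only if at least one job was excluded.
    if failed_job_dirs:
        with open(failed_job_docs_path, 'w') as fhandle:
            json.dump(failed_job_dirs, fhandle, indent=4)
    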
    
  5. In build_manifest.py, edit the project_dir definition on the first line to point to the unattached copy of the project directory.
  6. Run the command
    python3 build_manifest.py
  7. The command may take some time to complete. After completion, inspect (without modifying) the *.job_manifest.json and *.failed.json files created by the script inside the /tmp/ directory. *.job_manifest.json should include most job IDs in the project. *.failed.json, if present, should include records for a few job IDs along with the errors that led to those jobs’ exclusion from the job manifest. The script does not create *.failed.json if no job was excluded.
  8. Copy the *.job_manifest.json to the unattached project directory as a file named job_manifest.json.
  9. Create a “synthetic” workspaces.json file as described in Attach a project without workspace.json - #2 by wtempel, as applicable given the workspace_uids present in all the project’s jobs’ job.json files.
  10. After creating and placing inside the project directory the job_manifest.json and workspaces.json files, try again attaching the project.
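
To help with step 9, the workspace UIDs referenced by the project's jobs could be collected with a sketch like the one below. It assumes that each job.json is a JSON object with a `uid` string and a `workspace_uids` list, as implied above. The function name `collect_workspace_uids` is made up for this example, and the sketch only lists UIDs; it does not write workspaces.json.

```python
import json
import pathlib


def collect_workspace_uids(project_dir):
    """Map each workspace UID to the sorted list of job UIDs that reference it.

    Assumes every job.json is a JSON object with a 'workspace_uids' list.
    Job documents that cannot be read or parsed are skipped silently.
    """
    workspaces = {}
    for doc in pathlib.Path(project_dir).glob("J*/job.json"):
        try:
            with open(doc) as handle:
                data = json.load(handle)
        except (OSError, json.JSONDecodeError):
            continue
        if not isinstance(data, dict):
            continue
        # Fall back to the directory name if the document has no uid.
        job_uid = data.get("uid", doc.parent.name)
        for workspace_uid in data.get("workspace_uids", []):
            workspaces.setdefault(workspace_uid, []).append(job_uid)
    return {w: sorted(j) for w, j in sorted(workspaces.items())}


# Example use (hypothetical path):
# for workspace_uid, job_uids in collect_workspace_uids("/path/to/projectdir").items():
#     print(workspace_uid, len(job_uids), "jobs")
```

The keys of the returned dictionary are the workspaces that the synthetic workspaces.json would need to describe.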

Hi,

Thank you for the detailed solution. I tried the steps as suggested. I was able to create a job_manifest.json that included all the jobs in the directory. None of them showed any error. I also created a workspaces.json. I did not modify the project.json at all. But when I try to attach the project, no workspaces or jobs show up in it. However, all the directories still exist in the folder.

Best,
Adwaith

@Adwaith99 Could you please email us the following:

  1. workspaces.json file
  2. job_manifest.json file
  3. the tgz file created by the command
    cryosparcm snaplogs
  4. the project UID that was assigned during the project attachment attempt

@Adwaith99 Thanks for sending the information. The command_core log, which you may browse using the command

cryosparcm log command_core | less

indicated a problem with the supplied workspaces.json file:

2025-09-03 10:34:30,833 import_project_run   ERROR    |     workspaces_doc_data = load_workspaces_document(abs_path_export_project_dir)
[..]
2025-09-03 10:34:30,833 import_project_run   ERROR    | json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 13 column 5 (char 299)

The actual error is the trailing comma in the preceding line 12 of workspaces.json:

     12         "deleted": false,
     13     },

Removing the comma on line 12 (and, similarly, commas on lines 24, 36, 48, 60, 72) should make workspaces.json readable for the JSON decoder.
After a failed attachment attempt, job_manifest.json may be truncated. If truncation occurred (please check), the file may have to be replaced again by a “synthetic” job_manifest.json (see Corrupted project json files after faulty reattaching - #2 by wtempel) after detachment (if needed) and before a renewed import attempt.
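
Before a renewed import attempt, hand-edited files such as workspaces.json and job_manifest.json can be checked with Python's standard json module, which rejects trailing commas and truncated files alike. The following is a minimal sketch; the helper name `check_json` is an assumption for this example, and the exact wording and reported position of the error may differ between Python versions.

```python
import json


def check_json(path):
    """Return None if `path` parses as JSON, else a message locating the error."""
    try:
        with open(path) as handle:
            json.load(handle)
    except json.JSONDecodeError as e:
        return f"{path}: {e.msg} (line {e.lineno}, column {e.colno})"
    return None


# Example use (hypothetical paths):
# for name in ("workspaces.json", "job_manifest.json"):
#     problem = check_json(f"/path/to/projectdir/{name}")
#     print(problem or f"{name}: OK")
```

A file that passes this check is syntactically valid JSON only; the check does not confirm that the file contains the fields cryoSPARC expects.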

Hi @wtempel,

Thanks for the catch! All jobs seem to be importing properly, and it's behaving as expected now. Thanks again for the quick response and fix.

Best,
Adwaith