Particles cache issues

nfrasser · March 20, 2024, 4:02pm

Thanks @DerLorenz and @ebirn, we were able to find the issue on our end. These errors appear to happen when there the input particles are located in imported .mrc/.mrcs files, which CryoSPARC tracks as unique symbolic links but resolve to the same original file.

We were able to reproduce with an arrangement like this:

/
├── particles
│   ├── relion_particles_A.mrcs
│   ├── relion_particles_B.mrcs
│   ├── particles1.mrcs
│   ├── particles2.mrcs
│   └── particles3.mrcs
└── projects
    └── P1
        ├── J1
        │   ├── imported
        │   │   ├── particles1.mrcs -> ../../../../particles/particles1.mrcs
        │   │   ├── particles2.mrcs -> ../../../../particles/particles2.mrcs
        │   │   └── particles3.mrcs -> ../../../../particles/particles3.mrcs
        │   └── imported_particles.cs
        ├── J2
        │   ├── imported
        │   │   ├── particles1.mrcs -> ../../../../particles/particles1.mrcs
        │   │   ├── particles2.mrcs -> ../../../../particles/particles2.mrcs
        │   │   └── particles3.mrcs -> ../../../../particles/particles3.mrcs
        │   └── imported_particles.cs
        └── J3
            └── job.log

J1 imports relion_particles_A.star and J2 imports relion_particles_B.star, but both star files point to the particles located the same .mrcs stack files. Both imports are then given as input to J3. Though the particle locations differ, J3 gets the following list of files to cache:

J1/imported/particles1.mrcs
J1/imported/particles2.mrcs
J1/imported/particles3.mrcs
J2/imported/particles1.mrcs
J2/imported/particles2.mrcs
J2/imported/particles3.mrcs

The cache system sees these paths as unique and tries to copy all of them, but because they resolve to the same original files, it tries to copy them twice and runs one of the errors above.

Could you confirm that you have a similar arrangement? Note that J3 can be much further downstream from the original imported particles.

A workaround you could try until our fix is out is to restack the input particles before running the refinement jobs so that they don’t use the same path.

Thanks again and let me know how this works for you or if you have any questions!

PS: I assume that the particle locations in the import jobs are unique, though the paths are not? Careful that you don’t provide the same particle locations to refinement jobs to avoid biasing the results.