Reset file locks

Is there a way for either nonprivileged users, or users who are cryoSPARC admins, but not root users on the master/worker, to reset file locks? And is there a preferred method for root users / the cryosparc UNIX user to do so?

I have some data with spurious file locks that prevent caching on one of our cluster lanes but not another (both on the same SLURM queue, but with different node lists). It would be cool if we could fix them without disrupting everything else going on in cryoSPARC.

Please can you post CryoSPARC messages that pertain to the locks that you would like to reset.

The only messages I had were like this:

Do we really need these read locks? If caching is disabled, the job will start immediately regardless anyway.

The lock is intended to prevent modification of file sets that are also being written to cache by another job. If you suspect an inconsistency in CryoSPARC’s cache tracking, and you know that the cached (or to-be-cached) files are not currently being modified by any other running jobs, various interventions are possible. Unfortunately, the documented interventions assume no jobs are running, whereas you specified

Please can you elaborate more on

1. Do you suspect the reportedly locked files are not legitimately related to another active job?
2. Are these data being processed on both sets of nodes, but long-term locking only occurs on one set of nodes?
3. Are cache devices shared between nodes, but not between the sets of nodes?

Maybe I am confused about what’s going on. Are they actually write locks on the destination cache? Is the intended behavior for multiple jobs that writes proceed when the destinations are two different workers, and wait when the destination is the same?

Hi @DanielAsarnow,

CryoSPARC cache files are tracked in the database on the master node, so that workers can coordinate writing files to a shared cache space. When a job (on the same or a different worker node) starts writing a file to a shared cache space, other jobs that use the same file (with the same file path) will wait for that job to finish writing it, then access the cached copy. While jobs are waiting in this manner, a message such as

`SSD cache : requested files are locked for past 26306s, checking again in 5s`

may appear in the event log, showing that another job is currently writing that file to the SSD cache. Once the file has been written, the waiting job should continue as normal using the cached copy.
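The wait-and-poll behavior described above can be sketched roughly as follows. This is an illustrative model only, not CryoSPARC’s actual implementation: the `cache_locks` dictionary is a hypothetical stand-in for the lock state a worker would query from the master’s database, and the names are invented for the example.

```python
import time

# Hypothetical stand-in for the master database's cache-state tracking.
# In reality each worker would query the master node, not a local dict.
cache_locks = {"/scratch/cache/particles.mrc": "writing"}

def wait_for_cached_file(path, poll_interval=5, timeout=3600):
    """Block until `path` is no longer being written to the cache.

    Returns the total seconds spent waiting; raises TimeoutError if the
    lock is not released within `timeout` seconds.
    """
    waited = 0
    while cache_locks.get(path) == "writing":
        if waited >= timeout:
            raise TimeoutError(f"{path} still locked after {waited}s")
        # Mirrors the "checking again in 5s" message in the event log.
        time.sleep(poll_interval)
        waited += poll_interval
    return waited
```

A stale lock, in this picture, would be an entry stuck in the `"writing"` state even though no job is actually writing the file, which is why every waiting job keeps re-checking indefinitely.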
