Remove duplicates doesn't work

Hello,

After updating CryoSPARC to v4.1, Remove Duplicates doesn't work.
I even cloned a previously completed job and re-ran it, but that fails too.
Please see the error below from the failed Remove Duplicates job.

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 93, in cryosparc_compute.run.main
  File "/home/cryosparc_user/cryosparc/cryosparc_master/cryosparc_compute/jobs/utilities/run_remove_duplicates.py", line 96, in run
    use_neither=use_neither)
  File "/home/cryosparc_user/cryosparc/cryosparc_master/cryosparc_compute/geometry.py", line 1024, in remove_duplicate_particles
    reject_psets.append(pset.mask(~keep_mask))
  File "/home/cryosparc_user/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 1265, in mask
    return type(self)([(f, self[f][mask]) for f in self])
  File "/home/cryosparc_user/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 638, in __init__
    self.add_fields([entry[0] for entry in populate])
  File "/home/cryosparc_user/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 933, in add_fields
    ), f"Could not add {field} with dtype {dt}"
AssertionError: Could not add ('uid', '<u8') with dtype uint64
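A side note for readers parsing the assertion message: `'<u8'` is NumPy dtype notation for a little-endian unsigned 8-byte integer, i.e. exactly uint64, so the `uid` field and the requested dtype actually agree; the "invalid handle … wrong generation counter" lines in the job logs posted later in this thread indicate the underlying column allocation is what failed. A minimal stdlib sketch of what `'<u8'` denotes (the `struct` format `'<Q'` is the equivalent):

```python
import struct

# In the AssertionError, ('uid', '<u8') is a (field name, NumPy dtype) pair:
# '<' = little-endian, 'u' = unsigned integer, '8' = eight bytes -> uint64.
# The stdlib struct equivalent of the dtype string '<u8' is the format '<Q'.
uid = 562949953454080  # the handle value from the logs, used here as a sample
packed = struct.pack("<Q", uid)

assert len(packed) == 8                       # a uint64 occupies 8 bytes
assert struct.unpack("<Q", packed)[0] == uid  # round-trips without loss
```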

Hi @shinjw1887, could you please send me the contents of the job log in the Metadata tab?

Please also send me the .cs dataset file for the particles used in this job. You can get this from the Remove Duplicate Particles job’s parent job, in the Outputs tab as shown in the screenshot. Click the download button on any of the results that are not marked as “passthrough”.

Hi @nfrasser ,

I’m having the same error come up too when I try running an inspect picks job.

Job log currently looks like this:

================= CRYOSPARCW =======  2022-12-18 11:27:34.332588  =========
Project P37 Job J163
Master cbsukellogg.biohpc.cornell.edu Port 8015
===========================================================================
========= monitor process now starting main process
MAINPROCESS PID 3681526
MAIN PID 3681526
interactive.run_inspect_picks_v2 cryosparc_compute.jobs.jobregister
========= monitor process now waiting for main process
***************************************************************
INTERACTIVE JOB STARTED ===  2022-12-18 11:27:56.726596  ==========================
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
uid: invalid handle 562949953454080, wrong generation counter (given 1, expected 5) (errno 1: Operation not permitted)
add column: invalid handle 562949953454080, wrong generation counter (given 1, expected 5) (errno 1: Operation not permitted)
**** handle exception rc
set status to failed
========= main process now complete.
========= monitor process now complete.

Any other files I could send over to help figure out what’s wrong?

Error appears like so:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 93, in cryosparc_compute.run.main
  File "/local1/local/storage/software/cryosparc3/cryosparc_master/cryosparc_compute/jobs/interactive/run_inspect_picks_v2.py", line 171, in run
    mic_uid_to_particle_dset = particles_dset.split_by('location/micrograph_uid')
  File "/local1/local/storage/software/cryosparc3/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 1310, in split_by
    return {val: self.take(idx) for val, idx in idxs.items()}
  File "/local1/local/storage/software/cryosparc3/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 1310, in <dictcomp>
    return {val: self.take(idx) for val, idx in idxs.items()}
  File "/local1/local/storage/software/cryosparc3/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 1251, in take
    return type(self)([(f, self[f][indices]) for f in self])
  File "/local1/local/storage/software/cryosparc3/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 638, in __init__
    self.add_fields([entry[0] for entry in populate])
  File "/local1/local/storage/software/cryosparc3/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 933, in add_fields
    ), f"Could not add {field} with dtype {dt}"
AssertionError: Could not add ('uid', '<u8') with dtype uint64

Thank you for your suggestion, but I already deleted that job.
If the error occurs again, I will do as you suggested.

I also have the same error. In our case, we are trying to remove duplicate particles from a symmetry-expanded particle stack. It worked in older CryoSPARC versions. The error and the job log from the Metadata tab are pasted below:

Error message:
[CPU: 1.31 GB]
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 93, in cryosparc_compute.run.main
  File "/programs/cryosparc/cryosparc_master/cryosparc_compute/jobs/utilities/run_remove_duplicates.py", line 96, in run
    use_neither=use_neither)
  File "/programs/cryosparc/cryosparc_master/cryosparc_compute/geometry.py", line 1022, in remove_duplicate_particles
    keep_psets.append(pset.mask(keep_mask))
  File "/programs/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 1265, in mask
    return type(self)([(f, self[f][mask]) for f in self])
  File "/programs/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 638, in __init__
    self.add_fields([entry[0] for entry in populate])
  File "/programs/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 933, in add_fields
    ), f"Could not add {field} with dtype {dt}"
AssertionError: Could not add ('uid', '<u8') with dtype uint64

Log in metadata tab:
================= CRYOSPARCW ======= 2022-12-17 20:42:27.191375 =========
Project P7 Job J140
Master bioinform01 Port 39002

========= monitor process now starting main process
MAINPROCESS PID 2742128
MAIN PID 2742128
utilities.run_remove_duplicates cryosparc_compute.jobs.jobregister
========= monitor process now waiting for main process
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
========= sending heartbeat
uid: invalid handle 562949953454080, wrong generation counter (given 1, expected 3) (errno 1: Operation not permitted)
add column: invalid handle 562949953454080, wrong generation counter (given 1, expected 3) (errno 1: Operation not permitted)


**** handle exception rc
set status to failed
========= main process now complete.
========= monitor process now complete.

@mtmj @shinjw1887 @vinhbiochem How many micrographs were connected to jobs that displayed the “wrong generation counter” message?

Hi @wtempel, 12,943 micrographs were connected to the job that displayed the “wrong generation counter” message.

Patch 221221 for CryoSPARC v4.1.1 addresses the “wrong generation counter” issue. Please see the guide for patch instructions.

Hello,

I have the same problem even after updating to CryoSPARC v4.1.1.

I want to remove duplicate particles from a symmetry-expanded stack but keep getting the error:

Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 93, in cryosparc_compute.run.main
  File "/X/cryosparc/cryosparc2_master/cryosparc_compute/jobs/utilities/run_remove_duplicates.py", line 92, in run
    particles_kept, particles_rejected = geometry.remove_duplicate_particles(particles_dset,
  File "/X/cryosparc/cryosparc2_master/cryosparc_compute/geometry.py", line 1013, in remove_duplicate_particles
    reject_psets.append(pset.mask(~keep_mask))
  File "/X/cryosparc/cryosparc2_master/cryosparc_tools/cryosparc/dataset.py", line 1265, in mask
    return type(self)([(f, self[f][mask]) for f in self])
  File "/X/cryosparc/cryosparc2_master/cryosparc_tools/cryosparc/dataset.py", line 638, in __init__
    self.add_fields([entry[0] for entry in populate])
  File "/X/cryosparc/cryosparc2_master/cryosparc_tools/cryosparc/dataset.py", line 931, in add_fields
    assert self._data.addcol_scalar(
AssertionError: Could not add ('uid', '<u8') with dtype uint64

Thanks a lot for your help.

Updating to 4.1.1 may not be sufficient. Did you apply any patches?
You may apply the more recent patch (Patch 230104 is available for CryoSPARC v4.1.1).

I have applied the most recent patch, but my job (remove duplicates) still failed with the same error message.

@CL5678 Please confirm you have CryoSPARC v4.1.1 and either the 221221 or 230104 patch installed and post the content of the job.log file in the job’s directory.

I have the exact same problem!

Not only did the Inspect Picks job fail; the 2D Classification job also failed right before finishing its 20th iteration, with the same error message. I can’t proceed with any further processing.

Hi,
here is the log:
================= CRYOSPARCW ======= 2023-01-10 08:21:47.219626 =========
Project P139 Job J187
Master svlpcryosparc Port 60202

========= monitor process now starting main process
MAINPROCESS PID 3372514
========= monitor process now waiting for main process
MAIN PID 3372514
utilities.run_remove_duplicates cryosparc_compute.jobs.jobregister
========= sending heartbeat
uid: invalid handle 562949953454080, wrong generation counter (given 1, expected 7) (errno 1: Operation not permitted)
add column: invalid handle 562949953454080, wrong generation counter (given 1, expected 7) (errno 1: Operation not permitted)


**** handle exception rc
Traceback (most recent call last):
  File "cryosparc_master/cryosparc_compute/run.py", line 93, in cryosparc_compute.run.main
  File "/home/cl2/software/cryosparc2_hpc/cryosparc_master/cryosparc_compute/jobs/utilities/run_remove_duplicates.py", line 92, in run
    particles_kept, particles_rejected = geometry.remove_duplicate_particles(particles_dset,
  File "/home/cl2/software/cryosparc2_hpc/cryosparc_master/cryosparc_compute/geometry.py", line 1013, in remove_duplicate_particles
    reject_psets.append(pset.mask(~keep_mask))
  File "/home/cl2/software/cryosparc2_hpc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 1265, in mask
    return type(self)([(f, self[f][mask]) for f in self])
  File "/home/cl2/software/cryosparc2_hpc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 638, in __init__
    self.add_fields([entry[0] for entry in populate])
  File "/home/cl2/software/cryosparc2_hpc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 931, in add_fields
    assert self._data.addcol_scalar(
AssertionError: Could not add ('uid', '<u8') with dtype uint64
set status to failed
========= main process now complete.
========= monitor process now complete.

I am using the 230104 patch.

Thanks a lot

We are running into the same issue with the new CryoSPARC v4.1.1 upgrade. Our patch is 230104. We had no issues with this dataset prior to the upgrade. We run into this problem when we try to re-extract the particles in the upgraded CryoSPARC.

@SaifH @shinjw1887 @vinhbiochem @mtmj @LTP @CL5678 @wsxdyyd An additional patch, 230110, has been released for this issue. Patch 230110 is available for CryoSPARC v4.1.1.


@wtempel, thank you! This seems to have fixed the issues for now.