Project.save_external_result fails

Hi,

I have a dataset “ptcls_masked” like this:

Dataset([  # 4710 items, 77 fields
    ('uid', [ 123296055269007260 3622747081991072140 9910444828157109460 ... 10962604066298558505 760455200813430278 11276460963495182162]),
    ('components_mode_0/component', [0 0 0 ... 0 0 0]),
    ('components_mode_0/value', [ -8.704914 -3.221879 -11.984112 ... -23.04618 -12.7311325 -33.973885 ]),
    ('components_mode_1/component', [1 1 1 ... 1 1 1]),
    ('components_mode_1/value', [-4.290039 0.9124247 12.031804 ... 6.0515804 2.2032514 -91.72039 ]),
    ('components_mode_2/component', [2 2 2 ... 2 2 2]),
    ('components_mode_2/value', [14.849792 13.341529 -6.248942 ... -31.738754 52.94291 -37.237103]),
    ('blob/path', ['imports/jobs/J45_var_3D/J45_particles/J30/extract_small/004406204473697101034_14sep05c_00024sq_00003hl_00002es.frames_patch_aligned_doseweighted_particles.mrc' 'imports/jobs/J45_var_3D/J45_particles/J30/extract_small/004406204473697101034_14sep05c_00024sq_00003hl_00002es.frames_patch_aligned_doseweighted_particles.mrc' 'imports/jobs/J45_var_3D/J45_particles/J30/extract_small/004406204473697101034_14sep05c_00024sq_00003hl_00002es.frames_patch_aligned_doseweighted_particles.mrc' ... 'imports/jobs/J45_var_3D/J45_particles/J30/extract_small/011025076037205811538_14sep05c_c_00003gr_00014sq_00010hl_00002es.frames_patch_aligned_doseweighted_particles.mrc' 'imports/jobs/J45_var_3D/J45_particles/J30/extract_small/011025076037205811538_14sep05c_c_00003gr_00014sq_00010hl_00002es.frames_patch_aligned_doseweighted_particles.mrc' 'imports/jobs/J45_var_3D/J45_particles/J30/extract_small/011025076037205811538_14sep05c_c_00003gr_00014sq_00010hl_00002es.frames_patch_aligned_doseweighted_particles.mrc']),
    ('blob/idx', [0 2 3 ... 612 618 621]),
    ('blob/shape', [[256 256] [256 256] [256 256] ... [256 256] [256 256] [256 256]]),
    ('blob/psize_A', [1.150625 1.150625 1.150625 ... 1.150625 1.150625 1.150625]),
    ('blob/sign', [-1. -1. -1. ... -1. -1. -1.]),
    ('blob/import_sig', [0 0 0 ... 0 0 0]),
    ('ctf/type', ['spline' 'spline' 'spline' ... 'spline' 'spline' 'spline']),
    ('ctf/exp_group_id', [1 1 1 ... 1 1 1]),
    ('ctf/accel_kv', [300. 300. 300. ... 300. 300. 300.]),
    ('ctf/cs_mm', [2.7 2.7 2.7 ... 2.7 2.7 2.7]),
    ('ctf/amp_contrast', [0.1 0.1 0.1 ... 0.1 0.1 0.1]),
    ('ctf/df1_A', [12467.028 12539.958 12590.25 ... 17638.818 17551.236 17273.52 ]),
    ('ctf/df2_A', [12354.263 12427.192 12477.484 ... 17440.477 17352.895 17075.178]),
    ('ctf/df_angle_rad', [4.695779 4.695779 4.695779 ... -1.5039493 -1.5039493 -1.5039493]),
    ('ctf/phase_shift_rad', [0. 0. 0. ... 0. 0. 0.]),
    ('ctf/scale', [1. 1. 1. ... 1. 1. 1.]),
    ('ctf/scale_const', [1. 1. 1. ... 1. 1. 1.]),
    ('ctf/shift_A', [[-0.09577259 0.01277285] [-0.09577259 0.01277285] [-0.09577259 0.01277285] ... [-0.09577259 0.01277285] [-0.09577259 0.01277285] [-0.09577259 0.01277285]]),
    ('ctf/tilt_A', [[-4924.3154 1696.5303] [-4924.3154 1696.5303] [-4924.3154 1696.5303] ... [-4924.3154 1696.5303] [-4924.3154 1696.5303] [-4924.3154 1696.5303]]),
    ('ctf/trefoil_A', [[-3674.3704 1036.2023] [-3674.3704 1036.2023] [-3674.3704 1036.2023] ... [-3674.3704 1036.2023] [-3674.3704 1036.2023] [-3674.3704 1036.2023]]),
    ('ctf/tetra_A', [[0. 0. 0. 0.] [0. 0. 0. 0.] [0. 0. 0. 0.] ... [0. 0. 0. 0.] [0. 0. 0. 0.] [0. 0. 0. 0.]]),
    ('ctf/anisomag', [[0. 0. 0. 0.] [0. 0. 0. 0.] [0. 0. 0. 0.] ... [0. 0. 0. 0.] [0. 0. 0. 0.] [0. 0. 0. 0.]]),
    ('ctf/bfactor', [0. 0. 0. ... 0. 0. 0.]),
    ('alignments3D/split', [0 0 1 ... 0 1 1]),
    ('alignments3D/shift', [[-10.440625 3.940625] [-20.109375 -5.565625] [ 0.446875 -0.771875] ... [ 2.559375 -20.921875] [ -6.215625 -21.978125] [ 4.590625 -10.765625]]),
    ('alignments3D/pose', [[ 1.2026409 -1.5182027 1.7355897 ] [ 1.9459642 -1.6794899 -2.2755508 ] [ 0.03856866 3.2222362 0.05960611] ... [ 1.784677 1.1605661 -2.3947632 ] [-0.29803056 0.753842 -2.66825 ] [-1.0413538 -0.65566725 0.4943801 ]]),
    ('alignments3D/psize_A', [1.150625 1.150625 1.150625 ... 1.150625 1.150625 1.150625]),
    ('alignments3D/error', [13942.877 13976.445 14079.669 ... 18048.572 18114.234 19174.857]),
    ('alignments3D/error_min', [0. 0. 0. ... 0. 0. 0.]),
    ('alignments3D/resid_pow', [0. 0. 0. ... 162.48398 0. 0. ]),
    ('alignments3D/slice_pow', [120.11532 107.62646 115.94696 ... 105.33157 112.657646 96.94849 ]),
    ('alignments3D/image_pow', [14093.205 14150.566 14241.688 ... 18131.732 18318.223 19361.746]),
    ('alignments3D/cross_cor', [270.44336 281.74707 277.9668 ... 188.49219 316.64648 283.8379 ]),
    ('alignments3D/alpha', [1. 1. 1. ... 1. 1. 1.]),
    ('alignments3D/alpha_min', [1.1257654 1.3089118 1.1986809 ... 0.8947563 1.4053484 1.4638593]),
    ('alignments3D/weight', [0. 0. 0. ... 0. 0. 0.]),
    ('alignments3D/pose_ess', [0. 0. 0. ... 0. 0. 0.]),
    ('alignments3D/shift_ess', [0. 0. 0. ... 0. 0. 0.]),
    ('alignments3D/class_posterior', [1. 1. 1. ... 1. 1. 1.]),
    ('alignments3D/class', [0 0 0 ... 0 0 0]),
    ('alignments3D/class_ess', [1. 1. 1. ... 1. 1. 1.]),
    ('location/micrograph_uid', [4406204473697101034 4406204473697101034 4406204473697101034 ... 11025076037205811538 11025076037205811538 11025076037205811538]),
    ('location/exp_group_id', [0 0 0 ... 0 0 0]),
    ('location/micrograph_path', ['imports/jobs/J45_var_3D/J45_particles/J10/motioncorrected/004406204473697101034_14sep05c_00024sq_00003hl_00002es.frames_patch_aligned_doseweighted.mrc' 'imports/jobs/J45_var_3D/J45_particles/J10/motioncorrected/004406204473697101034_14sep05c_00024sq_00003hl_00002es.frames_patch_aligned_doseweighted.mrc' 'imports/jobs/J45_var_3D/J45_particles/J10/motioncorrected/004406204473697101034_14sep05c_00024sq_00003hl_00002es.frames_patch_aligned_doseweighted.mrc' ... 'imports/jobs/J45_var_3D/J45_particles/J10/motioncorrected/011025076037205811538_14sep05c_c_00003gr_00014sq_00010hl_00002es.frames_patch_aligned_doseweighted.mrc' 'imports/jobs/J45_var_3D/J45_particles/J10/motioncorrected/011025076037205811538_14sep05c_c_00003gr_00014sq_00010hl_00002es.frames_patch_aligned_doseweighted.mrc' 'imports/jobs/J45_var_3D/J45_particles/J10/motioncorrected/011025076037205811538_14sep05c_c_00003gr_00014sq_00010hl_00002es.frames_patch_aligned_doseweighted.mrc']),
    ('location/micrograph_shape', [[7676 7420] [7676 7420] [7676 7420] ... [7676 7420] [7676 7420] [7676 7420]]),
    ('location/center_x_frac', [0.8137931 0.27586207 0.41896552 ... 0.92068964 0.7844828 0.35517243]),
    ('location/center_y_frac', [0.44 0.92333335 0.165 ... 0.7416667 0.395 0.5733333]),
    ('location/min_dist_A', [100. 100. 100. ... 100. 100. 100.]),
    ('alignments2D/split', [0 0 0 ... 0 0 0]),
    ('alignments2D/shift', [[ -5.525 2.275] [-10.075 -2.275] [ -0.325 -0.325] ... [ 0.975 -11.375] [ -2.925 -10.725] [ 1.625 -2.925]]),
    ('alignments2D/pose', [0.5609987 3.7025914 4.952817 ... 2.9332218 0.5609987 0.52894163]),
    ('alignments2D/psize_A', [2.30125 2.30125 2.30125 ... 2.30125 2.30125 2.30125]),
    ('alignments2D/error', [3213.0518 3146.9412 3080.0981 ... 4130.372 4504.757 5044.806]),
    ('alignments2D/error_min', [3089.2969 3080.262 3003.7788 ... 4107.2856 4428.9077 4926.8877]),
    ('alignments2D/resid_pow', [3213.0518 3146.9412 3080.0981 ... 4130.372 4504.757 5044.806]),
    ('alignments2D/slice_pow', [20.662922 20.787586 24.322748 ... 14.416731 11.809891 14.937137]),
    ('alignments2D/image_pow', [3334.851 3242.1895 3190.5903 ... 4181.2764 4576.4253 5143.6807]),
    ('alignments2D/cross_cor', [142.46216 116.03589 134.81494 ... 65.3208 83.478516 113.81152 ]),
    ('alignments2D/alpha', [3.4472897 2.7909899 2.7713757 ... 2.2654512 3.534263 3.8096833]),
    ('alignments2D/alpha_min', [0. 0. 0. ... 0. 0. 0.]),
    ('alignments2D/weight', [0. 0. 0. ... 0. 0. 0.]),
    ('alignments2D/pose_ess', [0. 0. 0. ... 0. 0. 0.]),
    ('alignments2D/shift_ess', [0. 0. 0. ... 0. 0. 0.]),
    ('alignments2D/class_posterior', [1. 1. 1. ... 0.9970746 1. 1. ]),
    ('alignments2D/class', [10 10 31 ... 3 18 4]),
    ('alignments2D/class_ess', [1. 1. 1. ... 1.0058706 1. 1. ]),
    ('pick_stats/ncc_score', [0.56055355 0.5517136 0.5504824 ... 0.3312946 0.3294836 0.32737774]),
    ('pick_stats/power', [1122.9668 1201.025 1106.4856 ... 1088.6405 1463.577 1519.0277]),
    ('pick_stats/template_idx', [2 2 1 ... 2 1 2]),
    ('pick_stats/angle_rad', [5.4105206 5.323254 0.6981317 ... 4.0142574 5.148721 3.2288592]),
])

I wanted to push this dataset to a workspace, so I executed project.save_external_result like this:

project.save_external_result(
    workspace_uid="W1",
    dataset=ptcls_masked,
    type="particle",
    name="particles",
    title="Filtered particles"
)

This call failed with the following error:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/data/group1/x09663a/cryosparc/works/3dva/test/test02.ipynb Cell 21 line 1
----> 1 project.save_external_result(
      2     workspace_uid="W1",
      3     dataset=ptcls_masked,
      4     type="particle",
      5     name="particles",
      6     title="Filtered particles"
      7 )

File ~/apps/cryosparc-tools/cryosparc-tools/cryosparc/project.py:278, in Project.save_external_result(self, workspace_uid, dataset, type, name, slots, passthrough, title, desc)
    196 def save_external_result(
    197     self,
    198     workspace_uid: Optional[str],
   (...)
    205     desc: Optional[str] = None,
    206 ) -> str:
    207     """
    208     Save the given result dataset to the project. Specify at least the
    209     dataset to save and the type of data.
   (...)
    276         str: UID of created job where this output was saved
    277     """
--> 278     return self.cs.save_external_result(
    279         self.uid,
    280         workspace_uid,
    281         dataset=dataset,
    282         type=type,
    283         name=name,
    284         slots=slots,
    285         passthrough=passthrough,
    286         title=title,
    287         desc=desc,
    288     )

File ~/apps/cryosparc-tools/cryosparc-tools/cryosparc/tools.py:487, in CryoSPARC.save_external_result(self, project_uid, workspace_uid, dataset, type, name, slots, passthrough, title, desc)
    484 assert slot_names.intersection(prefixes) == slot_names, "Given dataset missing required slots"
    486 passthrough_str = ".".join(passthrough) if passthrough else None
--> 487 job_uid, output = self.vis.create_external_result(  # type: ignore
    488     project_uid=project_uid,
    489     workspace_uid=workspace_uid,
    490     type=type,
    491     name=name,
    492     slots=slots,
    493     passthrough=passthrough_str,
    494     user=self.user_id,
    495     title=title,
    496     desc=desc,
    497 )
    499 job = self.find_external_job(project_uid, job_uid)
    500 with job.run():

File ~/apps/cryosparc-tools/cryosparc-tools/cryosparc/command.py:119, in CommandClient._get_callable.<locals>.func(*args, **kwargs)
    114     raise CommandClient.Error(
    115         self, f'Did not receive a JSON response from method "{key}" with params {params}', url=self._url
    116     ) from err
    118 assert res, f'JSON response not received for method "{key}" with params {params}'
--> 119 assert "error" not in res, f'Error for "{key}" with params {params}:\n' + format_server_error(res["error"])
    120 return res["result"]

AssertionError: Error for "create_external_result" with params {'project_uid': 'P2', 'workspace_uid': 'W1', 'type': 'particle', 'name': 'particles', 'slots': ['components_mode_1', 'blob', 'components_mode_0', 'ctf', 'pick_stats', 'alignments2D', 'components_mode_2', 'location', 'alignments3D'], 'passthrough': None, 'user': '652e80d894f555b61abe3db0', 'title': 'Filtered particles', 'desc': None}:
ServerError: 'particle.components_mode_1'
Traceback (most recent call last):
  File "/home/x09663a/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 195, in wrapper
    res = func(*args, **kwargs)
  File "/home/x09663a/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 261, in wrapper
    return func(*args, **kwargs)
  File "/home/x09663a/cryosparc/cryosparc_master/cryosparc_command/command_vis/snowflake.py", line 339, in create_external_result
    output = add_external_job_output(
  File "/home/x09663a/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 186, in wrapper
    return func(*args, **kwargs)
  File "/home/x09663a/cryosparc/cryosparc_master/cryosparc_command/commandcommon.py", line 229, in wrapper
    return func(*args, **kwargs)
  File "/home/x09663a/cryosparc/cryosparc_master/cryosparc_command/command_vis/snowflake.py", line 175, in add_external_job_output
    builder.add_output_result(name=slot['prefix'], group_name=name, type=dt)
  File "/home/x09663a/cryosparc/cryosparc_master/cryosparc_compute/jobs/buildcommon.py", line 592, in add_output_result
    min_fields = com.known_result_fields[type]
KeyError: 'particle.components_mode_1'

I find it hard to understand what the problem is.
Am I simply using the save_external_result method incorrectly?

I appreciate your kind help.

Thank you in advance,
Kotaro

Hi Kotaro,

This error happens when CryoSPARC can’t infer the type of one of the results included in the dataset, in this case components_mode_1, which should have type particle.components. All components_mode_X results are dynamic: their names depend on the number of principal component modes specified in the parent job. Static results such as blob, by contrast, have a well-known type (particle.blob).
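
To illustrate why the lookup fails, here is a toy model of it. This is only a sketch inferred from the KeyError in the traceback, not CryoSPARC's actual code: the table `KNOWN_RESULT_FIELDS`, its contents, and the function `resolve_result_type` are hypothetical names made up for this example.

```python
# Hypothetical registry of result types; the real table lives on the server
# and its contents are NOT reproduced here.
KNOWN_RESULT_FIELDS = {
    "particle.blob": ["path", "idx"],
    "particle.ctf": ["type"],
    "particle.components": ["component", "value"],
}


def resolve_result_type(group_type, slot):
    """Return the registry key for a slot given as a plain string or a dict."""
    if isinstance(slot, str):
        # A plain string is used both as the prefix and as the data type.
        dtype = slot
    else:
        # A dict names the data type explicitly, independent of the prefix.
        dtype = slot["dtype"]
    key = f"{group_type}.{dtype}"
    if key not in KNOWN_RESULT_FIELDS:
        raise KeyError(key)
    return key
```

In this model, the plain string "components_mode_1" produces the unknown key "particle.components_mode_1", while the dict form with "dtype": "components" resolves to the known key "particle.components".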

To fix this issue, specify a slots argument with save_external_result:

project.save_external_result(
    workspace_uid="W1",
    dataset=ptcls_masked,
    type="particle",
    name="particles",
    title="Filtered particles",
    slots=[
        "location",
        "pick_stats",
        "ctf",
        "blob",
        "alignments2D",
        "alignments3D",
        {"dtype": "components", "prefix": "components_mode_0"},
        {"dtype": "components", "prefix": "components_mode_1"},
        {"dtype": "components", "prefix": "components_mode_2"},
    ]
)
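
If the number of modes varies between jobs, the slot list above could also be generated from the dataset's field names rather than typed by hand. The following is a minimal sketch under two assumptions: the helper `build_slots` is a hypothetical name introduced here, and every dynamic prefix follows the pattern components_mode_<N>.

```python
import re

# Assumption: dynamic 3DVA prefixes always look like components_mode_<N>.
COMPONENTS_PREFIX = re.compile(r"components_mode_\d+")


def build_slots(field_names):
    """Build a slots list from field names of the form '<prefix>/<field>'."""
    prefixes = []
    for field in field_names:
        if "/" not in field:
            continue  # skip fields without a prefix, such as 'uid'
        prefix = field.split("/", 1)[0]
        if prefix not in prefixes:
            prefixes.append(prefix)  # keep first-seen order, no duplicates

    slots = []
    for prefix in prefixes:
        if COMPONENTS_PREFIX.fullmatch(prefix):
            # Dynamic result: state the data type explicitly.
            slots.append({"dtype": "components", "prefix": prefix})
        else:
            # Static result: the prefix alone is enough.
            slots.append(prefix)
    return slots
```

The result can then be passed as the slots argument, for example slots=build_slots(ptcls_masked.fields()), assuming the dataset exposes its field names as a list of strings.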

Alternatively, if you have only filtered your ptcls_masked dataset and not changed any internal values, you can specify the passthrough argument with the output that this dataset was originally loaded from:

project.save_external_result(
    workspace_uid="W1",
    dataset=ptcls_masked,
    type="particle",
    name="particles",
    title="Filtered particles",
    slots=["location"],  # specify at least one slot
    passthrough=("J#", "particles"),  # specify original output here
)

This second strategy also has the benefit of correctly placing the resulting External Job in context within the workspace’s tree.

Let me know if you run into any trouble with that or have any further questions!

PS: I realize this is a very unclear error message; we plan to improve this in future versions of CryoSPARC.

Hi @nfrasser ,

Thank you for the explanation!

I tried both methods, and this time I got a different error.

With the former method:

workspace.save_external_result(
    dataset=ptcls_masked,
    type="particle",
    name="particles",
    title="3DVA filtered",
    slots=[
        "alignments2D",
        "alignments3D",
        "blob",
        "ctf",
        "location",
        "pick_stats",
        {"dtype": "components", "prefix": "components_mode_0"},
        {"dtype": "components", "prefix": "components_mode_1"},
        {"dtype": "components", "prefix": "components_mode_2"},
    ]
)

The output is:

---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
/data/group1/x09663a/cryosparc/works/3dva/test/test02.ipynb Cell 15 line 1
----> 1 workspace.save_external_result(
      2     dataset=ptcls_masked,
      3     type="particle",
      4     name="particles",
      5     title="3DVA filtered",
      6     slots=[
      7         "alignments2D",
      8         "alignments3D",
      9         "blob",
     10         "ctf",
     11         "location",
     12         "pick_stats",
     13         {"dtype": "components", "prefix": "components_mode_0"},
     14         {"dtype": "components", "prefix": "components_mode_1"},
     15         {"dtype": "components", "prefix": "components_mode_2"},
     16     ]
     17 )

File ~/apps/cryosparc-tools/cryosparc-tools/cryosparc/workspace.py:192, in Workspace.save_external_result(self, dataset, type, name, slots, passthrough, title, desc)
    123 def save_external_result(
    124     self,
    125     dataset: Dataset[R],
   (...)
    131     desc: Optional[str] = None,
    132 ) -> str:
    133     """
    134     Save the given result dataset to a workspace.
    135 
   (...)
    190         "J45"
    191     """
--> 192     return self.cs.save_external_result(
    193         self.project_uid,
    194         self.uid,
    195         dataset=dataset,
    196         type=type,
    197         name=name,
    198         slots=slots,
    199         passthrough=passthrough,
    200         title=title,
    201         desc=desc,
    202     )

File ~/apps/cryosparc-tools/cryosparc-tools/cryosparc/tools.py:501, in CryoSPARC.save_external_result(self, project_uid, workspace_uid, dataset, type, name, slots, passthrough, title, desc)
    499 job = self.find_external_job(project_uid, job_uid)
    500 with job.run():
--> 501     job.save_output(output, dataset)
    503 return job.uid

File ~/apps/cryosparc-tools/cryosparc-tools/cryosparc/job.py:1355, in ExternalJob.save_output(self, name, dataset, refresh)
   1334 """
   1335 Save output dataset to external job.
   1336 
   (...)
   1352 
   1353 """
   1354 url = f"/external/projects/{self.project_uid}/jobs/{self.uid}/outputs/{name}/dataset"
-> 1355 with make_request(self.cs.vis, url=url, data=dataset.stream()) as res:
   1356     result = res.read().decode()
   1357     assert res.status >= 200 and res.status < 400, f"Save output failed with message: {result}"

File ~/apps/cryosparc-tools/miniconda/envs/cryosparc-tools/lib/python3.10/contextlib.py:135, in _GeneratorContextManager.__enter__(self)
    133 del self.args, self.kwds, self.func
    134 try:
--> 135     return next(self.gen)
    136 except StopIteration:
    137     raise RuntimeError("generator didn't yield") from None

File ~/apps/cryosparc-tools/cryosparc-tools/cryosparc/command.py:210, in make_request(client, method, url, query, data, headers, _stacklevel)
    202         warn(
    203             f"*** {type(client).__name__}: command ({url}) "
    204             f"did not reply within timeout of {client._timeout} seconds, "
    205             f"attempt {attempt} of {MAX_ATTEMPTS}",
    206             stacklevel=_stacklevel,
    207         )
    208         attempt += 1
--> 210 raise CommandClient.Error(client, error_reason, url=url)

Error: *** CommandClient: (http://localhost:39003/external/projects/P2/jobs/J22/outputs/particles/dataset) HTTP Error 422 UNPROCESSABLE ENTITY; please check cryosparcm log command_vis for additional information.
Response from server: b'Invalid dataset stream'

Here is the diff of the command_vis log before and after the execution:

> 2023-10-24 11:26:14,199 upload_external_job_output INFO     | Received external job output P2.J24.particles
> 2023-10-24 11:26:14,200 upload_external_job_output ERROR    | Invalid dataset stream
> 2023-10-24 11:26:14,200 upload_external_job_output ERROR    | Traceback (most recent call last):
> 2023-10-24 11:26:14,200 upload_external_job_output ERROR    |   File "/home/x09663a/cryosparc/cryosparc_master/cryosparc_command/command_vis/snowflake.py", line 393, in upload_external_job_output
> 2023-10-24 11:26:14,200 upload_external_job_output ERROR    |     dset = Dataset.load(request.stream)
> 2023-10-24 11:26:14,200 upload_external_job_output ERROR    |   File "/home/x09663a/cryosparc/cryosparc_master/cryosparc_tools/cryosparc/dataset.py", line 513, in load
> 2023-10-24 11:26:14,200 upload_external_job_output ERROR    |     raise TypeError(f"Could not determine dataset format for file {file} (prefix is {prefix})")
> 2023-10-24 11:26:14,200 upload_external_job_output ERROR    | TypeError: Could not determine dataset format for file <werkzeug.serving.DechunkedInput object at 0x7f46bb9dfbb0> (prefix is b'\x95CSDAT')
> 2023-10-24 11:26:14,202 upload_external_job_output ERROR    | Invalid dataset stream

The same error was raised with the latter method (using passthrough).

The server communication itself seems fine:

cs.test_connection()
Connection succeeded to CryoSPARC command_core at http://localhost:39002
Connection succeeded to CryoSPARC command_vis at http://localhost:39003
Connection succeeded to CryoSPARC command_rtp at http://localhost:39005
True

The dataset “ptcls_masked” is just a subset of the “particles” output dataset of a 3DVA job, created with a boolean mask like ptcls_masked = ptcls.mask(selection_boolean_mask)

Difficult :melting_face:

I would appreciate your further help.

Kotaro

It looks like you are running a pre-release of CryoSPARC, is that correct? The latest published version of cryosparc-tools, v4.3.1, does not support CryoSPARC pre-releases. You can instead use the latest development version of cryosparc-tools.

If you installed tools via pip, you can install the development version like this:

pip uninstall cryosparc-tools
git clone --recursive https://github.com/cryoem-uoft/cryosparc-tools.git
cd cryosparc-tools
pip install -e .
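
Since a mismatch between the cryosparc-tools and CryoSPARC versions can produce errors like the one above, a quick check before uploading may help. This is only a sketch: the helpers `major_minor` and `versions_match` are hypothetical names, and treating "same major and minor version" as the compatibility criterion is an assumption of this example.

```python
def major_minor(version):
    """Extract (major, minor) from a version string such as 'v4.3.1'."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def versions_match(tools_version, cryosparc_version):
    """Assumed criterion: major and minor version numbers must be equal."""
    return major_minor(tools_version) == major_minor(cryosparc_version)
```

The installed tools version can be read with importlib.metadata.version("cryosparc-tools") from the standard library; the CryoSPARC version has to be supplied by hand in this sketch.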

Ah, I forgot to check out v4.3.1 of cryosparc-tools. I was using commit 65f5b0e on the develop branch… Sorry. I reinstalled v4.3.1 and now it works fine.
My CryoSPARC is the released version, v4.3.1.

Thank you very much for your help.

Great! Glad you got it working!