About Where We Can Read Our Data From

Continuing the discussion from Multi user installation?:

Hello, I’m a new user here, and new to everything from HPC clusters to CryoSPARC, so I’ll appreciate any and all help I can get.

I’ve been trying to wrap my head around how to connect the storage area on the HPC cluster that holds my data to the CryoSPARC master instance. If the path is somewhere on the HPC cluster, how do I make sure that CryoSPARC can access it?

The website of our HPC centre describing the server I’m using lists these limitations, which from my limited understanding might be a bit of a problem:

  • Specific tools are required to use the object storage. The object storage cannot be properly mounted for local disk-like usage. There are some tools that can do this, but they have their limitations. For example, svfs can be used to mount Swift as a file system, but it uses FUSE which is slow.

  • The data cannot be modified while it is in Allas. It must be downloaded to a server for processing, and the previous version replaced with a new one.

  • In the case of the Swift protocol, files larger than 5 GB are divided into smaller segments. Normally, this is done automatically during the upload.

It also lists these as the tools with which to use the server:

a-commands
swift
python-swiftclient
s3cmd (this is what was used to upload the data to the server in the first place)
Python with S3
Python with SWIFT
rclone
libs3
python-openstackclient
aws-cli
curl
wget

What I’d like to know is whether there’s a way to use one of those tools in the GUI of the web app to add our location on the server to the directories that CryoSPARC can access, because at the moment I have no idea how to connect our data to the master instance.

And then, from my understanding, both the master instance and the workers need to have access to the directory where our data is, so after that I’m not sure how to create a link between the server and the worker nodes on the HPC cluster, or whether I’m misunderstanding something.

Any help or redirection is appreciated because there’s so much to try and learn.
Thank you in advance!

Welcome to the forum @newbie.

Please be aware that in the years that have passed since the linked discussion, CryoSPARC has changed significantly. For current information, please refer to the CryoSPARC Guide.

The tools on the list that I recognize are command-line tools or libraries, and would likely not be controlled by a user through the CryoSPARC GUI; the sketch at the end of this post illustrates the kind of staging step they are typically used for instead.
It may be best to approach the staff who support your HPC cluster and storage and let them know that you would like to:

  1. run an application with this architecture and these prerequisites

  2. use certain components of the HPC facility, such as

    • raw data storage (read-only)
    • and/or storage for processing output
    • and/or (GPU) compute resources

    as applicable

Together with HPC support staff, you could then develop a plan that defines

  • functional integration between shared HPC and your own infrastructure
  • specific data management procedures

You and your IT support staff would be very welcome to ask any specific questions on this discussion forum as you develop that plan.
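
As a purely illustrative aside: since “Python with S3” is on your tool list, staging the raw data onto a cluster filesystem could look roughly like the sketch below, which copies every object in a bucket to a directory that both the master and the workers can read. The endpoint URL, bucket name, and destination path are placeholders I made up, and boto3 is just one way to talk to S3-compatible storage; your HPC support staff will know the correct endpoint, credential setup, and preferred transfer tool for your system.

```python
# Illustrative sketch only: stage raw data from S3-compatible object storage
# onto a shared cluster filesystem. The endpoint, bucket, and destination
# path are placeholders -- confirm the real values with your HPC support staff.
import os

import boto3  # one way to use the "Python with S3" option from the tool list

s3 = boto3.client(
    "s3",
    endpoint_url="https://object-store.example.org",  # placeholder S3 endpoint
    # credentials are usually picked up from environment variables or ~/.aws/credentials
)

bucket = "my-cryoem-raw-data"                    # placeholder bucket name
destination = "/scratch/project_12345/raw_data"  # placeholder path visible to master and workers

# Walk the bucket and download each object, preserving the object key as a relative path.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if key.endswith("/"):  # skip pseudo-directory markers
            continue
        local_path = os.path.join(destination, key)
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        s3.download_file(bucket, key, local_path)
        print(f"downloaded {key} -> {local_path}")
```

After a transfer along these lines, the location you would point CryoSPARC at (for example, in an import job) is the destination directory on the shared filesystem, not the bucket itself.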


Thank you for your detailed response!

I contacted the HPC cluster support staff and was directed to people who have used the same resources for CryoSPARC before. They kindly helped by instructing me on their setup, and I have the software up and running!

I am now facing some other issues and questions, and I might start a new thread for those.
