Upload Your Data

Uploading a dataset for submission to the SPARC Portal

This document is part of a series related to the Data Submission to SPARC Process.

YOU’RE FINALLY HERE! You’re finally ready…to upload your data.

And depending on the link you chose way back in Prepare Your Metadata, you’re going to do it one of two ways: with or without SODA. If you choose the latter, you will upload through Pennsieve.

If you have not yet registered with the appropriate group on Pennsieve, you will need to request access here. Remember, this must be done in the correct Pennsieve workspace.

Uploading data to Pennsieve is NOT considered publishing the data. You are uploading the data to a cloud data management platform that SPARC uses (Pennsieve) for curating and publishing datasets. The dataset/model is private at this point (unless the Curation team reserved a DOI for you), and you will have control over who can access it. You can share with the Curation team at any point (technically even before uploading data). It remains private until the dataset owner submits the dataset for review by the Curation team with the "Request to Publish" step, which must be completed before the dataset or model can be published on the SPARC Portal.


Checklist

But first, let’s check to make sure you’ve done everything up to this point:

  • Talked to curation team
  • Requested access to the appropriate Pennsieve workspace
  • Experimental protocol has been created on Protocols.io
  • All required metadata files have been completed
    • Temporary link to unpublished protocol has been added to dataset_description file
  • All folders/metadata files are named as set forth in the SDS file system
  • All subject & sample names are CONSISTENT across all references in the SDS
    • All human subjects have been de-identified
  • All data, metadata and associated files/info have been organized into the SDS file system
  • All experimental data has been organized by subject and sample in the Primary Folder
  • All required top-level folders include required manifest files
  • Dataset has been uploaded onto Pennsieve
  • Verify the completeness of the upload
  • Dataset has been submitted for review

Completed all the checked boxes? Continue onward!
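
The checklist item about manifest files can be spot-checked with a short script before you upload. The following is a minimal sketch, not an official SPARC or SODA tool. It assumes your dataset lives in a local folder and that a manifest is any file named "manifest" (with any extension) sitting directly inside a top-level folder. The function name folders_missing_manifest is hypothetical.

```python
from pathlib import Path


def folders_missing_manifest(dataset_root):
    """Return the names of top-level folders that contain no manifest file.

    Assumption: a manifest is any file directly inside the folder whose
    name, ignoring the extension, is "manifest" (for example manifest.xlsx).
    """
    root = Path(dataset_root)
    missing = []
    # Only look at folders directly under the dataset root.
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        has_manifest = any(
            f.is_file() and f.stem.lower() == "manifest"
            for f in folder.iterdir()
        )
        if not has_manifest:
            missing.append(folder.name)
    return missing
```

Calling folders_missing_manifest("path/to/your/dataset") returns a list of folder names to review; an empty list means every top-level folder has a manifest under the naming assumption above. It does not check the contents of the manifest files themselves.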


Time-Sensitive Uploads

Just a quick note. If your dataset is associated with a manuscript that has been submitted to a journal or is to be submitted soon, please provide manuscript information in the “dataset_description” metadata file, including the expected publication date. The SPARC Team will do their best to prioritize your dataset so that you can point your readers to your data on the SPARC Portal. However, please keep in mind that you are a lot more likely to meet your deadline if you start the process early! We suggest starting a month or two before you expect to need the link/DOI to your data/model.


Uploading Your Data… with SODA

You must officially submit the dataset to initiate the curation process. In SODA, this can be done in the Disseminate Dataset tab. Help pages and tutorials for SODA are available directly from SODA for SPARC.

We’re going to give you that link here in a second. But first – do us a favor and read The Next Step section below. You don’t need to read anything else but that. Because there is another step after this. And also because we want to thank you! So go read that real quick, then come back here…

Did you read it? We’ll take your word for it. Alright, are you ready to officially submit your dataset? Then click here, and don't forget to share your dataset with the Curation Team!


Uploading Your Data… with Pennsieve

SODA utilizes the Pennsieve Agent to provide guidance and organizational support during your upload. If you are interested in more integrated solutions for uploading data, such as scripting your uploads into a data pipeline, contact DAT-Core to discuss options. All datasets will still need to be organized into the SDS upon upload, and care is needed to make sure these organizational standards are met.

Reminder: if you have not yet registered with Pennsieve, you will need to request access here. The help pages and tutorials for the Pennsieve platform are directly available from the Pennsieve documentation. Don't forget to share your dataset with the Curation Team!

And please note: for publication on the SPARC Portal, all datasets MUST follow the SDS file system and include the appropriate metadata.
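
Once the upload has finished, the checklist asks you to verify that it is complete. One way to approach this is to compare the files in your local dataset folder against a listing of what ended up on Pennsieve. The sketch below is only an illustration under stated assumptions: it assumes you already have a list of uploaded file paths relative to the dataset root (how you obtain that listing is not covered here), and the function name compare_upload is hypothetical. It compares paths only, not file contents.

```python
from pathlib import Path


def compare_upload(local_root, uploaded_paths):
    """Compare local files against a list of uploaded relative paths.

    Returns (missing, unexpected): files present locally but absent from
    the uploaded listing, and files in the listing that are not present
    locally. Paths use forward slashes, relative to the dataset root.
    """
    root = Path(local_root)
    # Collect every local file as a POSIX-style path relative to the root.
    local = {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    }
    uploaded = set(uploaded_paths)
    return sorted(local - uploaded), sorted(uploaded - local)
```

If both returned lists are empty, the local folder and the uploaded listing contain the same file paths. Anything in the first list is worth re-uploading or investigating before you submit the dataset for review.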

Additional Resources

A list of additional resources you may find useful when uploading your data.


The Next Step

So you’ve officially submitted your dataset? The next step then is potentially the most important of all – GRAB A DRINK, KICK BACK, AND RELAX!

Now before we get ahead of ourselves, there are a couple more steps. You still need to officially publish your protocol. So once you do receive confirmation from us that your protocol is good to publish, then go back to the Publish Protocol step to see how to do just that.

You should also have shared your dataset with the Curation team so they can access it for review, and completed the "Request to Publish" step so they know you are ready for them to review it for publication. You can check the status of your dataset in Pennsieve at any time; it is shown in the top left corner of the Dataset screen.

And finally – we’d just like to say thank you so much. Your time, your effort, your research, your data are all greatly appreciated by us here at SPARC, and by the investigators who use the SPARC project to formulate and conduct their own research. We said it before, and we’ll say it again. You’re a giant!