The SPARC Dataset

A brief introduction to how data and models are presented on the SPARC Portal.

Every dataset and model is curated and published on the SPARC Portal under Data & Models with a corresponding DOI, license, and landing page. Learn how to navigate SPARC datasets and explore each of the four dataset categories that can be published through the SPARC Portal.

A SPARC dataset includes everything needed to understand, reuse, and reproduce a research study. It is made up of data files, supporting documents, and detailed descriptions (metadata), all provided by the research team and curated by the SPARC curation team. Associated analysis and visualization scripts, as well as computational models, can optionally be included.

In the simplest terms, a SPARC dataset reflects the data that supports the manuscript you are submitting, or the data collected for a grant milestone or specific aim. For more details, see the following section: SPARC Data Submission Guide.

What Should Be Included in Your Dataset?

Each dataset should include anything the researcher (PI) believes is necessary for someone else to reuse the data or repeat the work. This can include:

  • Raw data (the original measurements or recordings)
  • Experimental protocols (how the study was done)
  • Analysis code or workflows
  • Processed data (results after analysis)
  • Final results
  • Descriptions of the data and its structure
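
The list above can be treated as a simple self-check before submission. The following is a minimal sketch only: the component labels and the `DatasetChecklist` class are hypothetical names made up for illustration, not SPARC identifiers or part of any SPARC tool. Because the list describes what a dataset can include, a missing item is a prompt to reconsider, not an error.

```python
from dataclasses import dataclass, field

# Hypothetical labels mirroring the list above; these are NOT official SPARC names.
COMPONENTS = (
    "raw_data",
    "experimental_protocols",
    "analysis_code",
    "processed_data",
    "final_results",
    "data_descriptions",
)


@dataclass
class DatasetChecklist:
    """Tracks which of the listed components a dataset currently contains."""

    included: set = field(default_factory=set)

    def add(self, component: str) -> None:
        # Reject labels that are not in the illustrative list.
        if component not in COMPONENTS:
            raise ValueError(f"unknown component: {component}")
        self.included.add(component)

    def missing(self) -> list:
        # Preserve the order of the list above so the output is easy to scan.
        return [c for c in COMPONENTS if c not in self.included]


checklist = DatasetChecklist()
checklist.add("raw_data")
checklist.add("experimental_protocols")
print("Not yet included:", checklist.missing())
```

An empty result from `missing()` only means every listed item was marked as present; it says nothing about whether the dataset passes SPARC curation.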

When submitted, a dataset might be complete (no more data will be added), or it could be just one part of a larger study, with more data coming in future updates.