The SPARC dataset

A SPARC dataset includes everything needed to understand, reuse, and reproduce a research study. It is made up of data files, supporting documents, and detailed descriptions (metadata), all provided by the research team and curated by the SPARC curation team. Associated analysis and visualization scripts, as well as computational models, can optionally be included.

Put simply, a SPARC dataset reflects the data that supports the manuscript you are submitting, or the data collected for a grant milestone or specific aim. For more details, see the SPARC Data Submission Guide.

What Should Be Included in Your Dataset?

Each dataset should include anything the researcher (PI) believes is necessary for someone else to reuse the data or repeat the work. This can include:

  • Raw data (the original measurements or recordings)
  • Experimental protocols (how the study was done)
  • Analysis code or workflows
  • Processed data (results after analysis)
  • Final results
  • Descriptions of the data and its structure

When submitted, a dataset might be complete (no more data will be added), or it could be just one part of a larger study, with more data coming in future updates.


SPARC Dataset Revisions

Revisions are small updates within a version of the dataset. There are no separate pages or files between revisions.

SPARC Dataset Versions

Citing a SPARC Dataset

Understanding SPARC Dataset metrics