SPARC Dataset Structure (SDS)
The SDS is the required FAIR data sharing organizational and naming system for items shared on the SPARC Portal.
The SDS is a flexible, modular standard for organizing, describing, and sharing diverse biomedical datasets in alignment with the FAIR principles (Findable, Accessible, Interoperable, Reusable). It is designed to:
- Facilitate consistency in dataset organization and metadata, and promote reuse
- Accommodate a wide variety of experimental designs and research outcomes, e.g. imaging, omics, mapping, models, devices and simulations. To allow this, the SDS has a modular design.
- Support human and machine readability, which allows human curation, automated validation, and the development of other tools. Compliance is enforced by an automated validator.
Using this standard is one of the main ways SPARC promotes understanding of the experiments, processes, and datasets on the Portal.
Learn more about the rationale for SPARC's adherence to the FAIR standard for biomedical research data and the policy regarding the SPARC dataset structure in the publication SPARC Data Structure: Rationale and Design of a FAIR Standard for Biomedical Research Data.
Abstract
The NIH Common Fund’s Stimulating Peripheral Activity to Relieve Conditions (SPARC) initiative is a large-scale program that seeks to accelerate the development of therapeutic devices that modulate electrical activity in nerves to improve organ function. Integral to the SPARC program are the rich anatomical and functional datasets produced by investigators across the SPARC consortium that provide key details about organ-specific circuitry, including structural and functional connectivity, mapping of cell types and molecular profiling. These datasets are provided to the research community through an open data platform, the SPARC Portal. To ensure SPARC datasets are Findable, Accessible, Interoperable and Reusable (FAIR), they are all submitted to the SPARC portal following a standard scheme established by the SPARC Curation Team, called the SPARC Data Structure (SDS). Inspired by the Brain Imaging Data Structure (BIDS), the SDS has been designed to capture the large variety of data generated by SPARC investigators who are coming from all fields of biomedical research. Here we present the rationale and design of the SDS, including a description of the SPARC curation process and the automated tools for complying with the SDS, including the SDS validator and Software to Organize Data Automatically (SODA) for SPARC. The objective is to provide detailed guidelines for anyone desiring to comply with the SDS. Since the SDS is suitable for any type of biomedical research data, it can be adopted by any group desiring to follow the FAIR data principles for managing their data, even outside of the SPARC consortium. Finally, this manuscript provides a foundational framework that can be used by any organization desiring to either adapt the SDS to suit the specific needs of their data or simply desiring to design their own FAIR data sharing scheme from scratch.
Version 2.0 of the SDS was vetted and selected as an approved standard in December 2023 by the INCF, a collaborative network for open, FAIR, and citable neuroscience.
How will you use the SDS?
Investigators prepare their data for submission by organizing and naming files into folders and completing the corresponding SDS Templates. Investigator tools and automated validation support dataset compliance. The SPARC Curation Team works hand in hand with users to ensure a smooth process.
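As a rough illustration of what an automated layout check might look like, the sketch below reports which expected top-level items are missing from a dataset folder. It is not the SDS validator, and the file and folder names in it (`dataset_description.xlsx`, `subjects.xlsx`, `primary`) as well as the helper `check_layout` are assumptions chosen for the example; the SDS templates and Naming Requirements define the actual rules.

```python
from pathlib import Path

# Illustrative assumptions only: these names are placeholders for the example,
# not an authoritative list of what the SDS requires.
EXPECTED_FILES = ["dataset_description.xlsx", "subjects.xlsx"]
EXPECTED_FOLDERS = ["primary"]


def check_layout(root, files=EXPECTED_FILES, folders=EXPECTED_FOLDERS):
    """Return a sorted list of expected top-level items missing from root.

    Missing folders are reported with a trailing slash so they can be
    told apart from missing files.
    """
    root = Path(root)
    missing = [name for name in files if not (root / name).is_file()]
    missing += [name + "/" for name in folders if not (root / name).is_dir()]
    return sorted(missing)
```

An empty list from `check_layout("path/to/dataset")` would mean every item in the two assumed lists is present; it says nothing about whether the metadata inside the templates is complete or valid.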
The SPARC Curation Process is the workflow by which SPARC datasets are reviewed and validated using automated tools and human review to ensure the high caliber of SPARC, FAIR Data & Models.
SDS components include:
and follow Naming Requirements
SDS versions and templates
Get the latest SDS templates to enter your metadata for submission, and learn more about the evolution of the SDS in SDS Versions.
See how the SDS is used:
- in the data submission process, visit Organize Your Files
- on the Portal, visit Navigating SPARC Datasets