SDS Versions
Current Version of SDS
The current version of SDS is 3.0.2. The changelog describes the modifications from older versions, and all releases are available on GitHub or as a zip file download. Note that SDS is backward compatible. SDS version 2.1 is still supported; it can also be downloaded as a zip file, and a presentation provides a high-level overview of version 2.1. For an overview of how to use SDS when submitting your data or model, please see the relevant section of the Submission Walkthrough.
Overview of SDS 3
SDS 3 introduces significant changes and improvements to the SPARC Data Structure, enhancing the organization, validation, and curation of datasets. This version adds new metadata files, validation rules, and structural changes, tightens data organization rules, and improves metadata handling. These changes are intended to enhance data consistency, searchability, and reusability within the SPARC ecosystem and to support future interoperability with other data standards such as BIDS.
SDS 3 enforces stricter naming conventions, introduces new files for better tracking of specimens and sites, and expands the dataset description file with more detailed fields related to funding, device use, and dataset standards. Additionally, several validation rules ensure that data follows strict guidelines for naming, metadata consistency, and file organization.
Key Changes in SDS Version 3.0 from earlier versions
Key Changes for Users
- Stricter file and folder naming conventions.
- More detailed metadata requirements, especially for devices and funding.
- New options for describing relationships between datasets.
- Enhanced support for complex experimental designs with the new sites and specimens files.
- Improved code documentation capabilities.
- Better handling of auxiliary files related to the publication process.
New Features and Improvements
- The folder structure is no longer the only way to map subjects and samples to files. Now this can be handled using the manifest file.
- Data Dictionary Support:
- New data-dictionary-path column in the manifest.
- Support for specifying data dictionary type and description.
- Enhanced File Naming Rules:
- Stricter character restrictions for file and folder names.
- New validation checks for potentially problematic file names.
- Improved Data Modality Handling:
- New data-modality column in the manifest.
- File type restrictions enforced by modality.
- Better Cross-Dataset Referencing:
- New columns in the manifest for referencing files in other datasets.
- New relationTypes in dataset_description for dataset-to-dataset relations.
- Unicode Support: Limited support for Unicode characters in the letter and number categories.
- .dss File: New file to specify the data structure standard (SDS) version used.
- Artifact Validation: Enhanced validation rules for nested folders and entity relationships.
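To illustrate how file type restrictions by modality could work, here is a minimal sketch in Python. It is not the SDS validator. It assumes the manifest has been exported as CSV text with a filename column alongside the data-modality column named above, and the ALLOWED_EXTENSIONS mapping is purely hypothetical, since the actual list of file types per modality is not given here.

```python
import csv
import io
from pathlib import PurePosixPath

# HYPOTHETICAL mapping from modality to permitted file extensions.
# The real SDS rules are not reproduced here; these values are placeholders.
ALLOWED_EXTENSIONS = {
    "imaging": {".tif", ".tiff", ".jp2"},
    "electrophysiology": {".edf", ".nwb"},
}


def modality_mismatches(manifest_csv: str) -> list:
    """Return (filename, modality) pairs whose extension is not allowed.

    Assumes a "filename" column (an assumption of this sketch) and the
    "data-modality" column described in the list above.
    """
    mismatches = []
    for row in csv.DictReader(io.StringIO(manifest_csv)):
        modality = (row.get("data-modality") or "").strip()
        if not modality:
            continue  # the check only applies when a modality is provided
        allowed = ALLOWED_EXTENSIONS.get(modality)
        if allowed is None:
            continue  # unknown modality: this sketch has nothing to compare
        suffix = PurePosixPath(row["filename"]).suffix.lower()
        if suffix not in allowed:
            mismatches.append((row["filename"], modality))
    return mismatches


sample_manifest = (
    "filename,data-modality\n"
    "scan.tif,imaging\n"
    "notes.txt,imaging\n"
    "readme.md,\n"
)
print(modality_mismatches(sample_manifest))
```

Rows without a modality are skipped, which mirrors the idea that the restriction applies only when a modality is provided.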
Validation and Naming Conventions
SDS 3.0 introduces stricter validation rules for datasets, ensuring that files, folders, and metadata follow a standardized format:
- Sample and Subject IDs: Samples and subjects cannot share the same pool-id, and a given pool-id can only appear in one of the samples or subjects files.
- File Naming Rules:
- Only the following characters are allowed in file and folder names: [0-9A-Za-z,.-_ ].
- Certain special characters (@#$%^&*()+=/|"'~;:<>{}[]?) are no longer allowed, as are non-printing characters.
- Spaces are discouraged but allowed, except at the beginning or end of file names.
- Files and folders mapped to SDS entity IDs (e.g., folders named sam-1, sub01, etc.) must follow the more restrictive rule [A-Za-z0-9-].
- Unicode Restrictions: By default, SDS 3.0 excludes larger Unicode categories but provides an option to extend support for internal datasets with non-ASCII file names.
- File Type Validation by Modality: If a modality (e.g., imaging, electrophysiology) is provided, the system will validate that the file types match the expected format for that modality.
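The file naming rules above can be approximated with two regular expressions. The following Python sketch is an illustration under stated assumptions, not the official validator: the function name check_name and the is_entity_id flag are hypothetical, and the caller is assumed to know whether a name maps to an SDS entity ID. The character classes are copied from the rules, with the hyphen escaped so that it is treated as a literal character instead of a range.

```python
import re

# Character classes taken from the naming rules; "\-" keeps the hyphen literal.
GENERAL_NAME = re.compile(r"[0-9A-Za-z,.\-_ ]+")
ENTITY_NAME = re.compile(r"[A-Za-z0-9\-]+")


def check_name(name: str, is_entity_id: bool = False) -> list:
    """Return a list of problems for one file or folder name (empty if none).

    is_entity_id is a hypothetical flag: set it for names mapped to SDS
    entity IDs, which must follow the more restrictive rule.
    """
    if not name:
        return ["name is empty"]
    problems = []
    # Spaces are allowed inside a name but not at its beginning or end.
    if name != name.strip(" "):
        problems.append("leading or trailing space")
    pattern = ENTITY_NAME if is_entity_id else GENERAL_NAME
    if not pattern.fullmatch(name):
        # Report each offending character once, in sorted order.
        bad = sorted({ch for ch in name if not pattern.fullmatch(ch)})
        problems.append("disallowed characters: " + "".join(bad))
    return problems


for candidate, entity in [
    ("sub-1_recording.csv", False),
    ("data@home.csv", False),
    (" data.csv", False),
    ("sam-1", True),
    ("sub_01", True),
]:
    print(repr(candidate), entity, check_name(candidate, entity))
```

Interior spaces pass this check because the rules discourage them without forbidding them; a stricter project could report them as warnings.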
File System Changes
- New .dss File: The .dss file is added to indicate the data structure standard used by a dataset. It defines which validator to apply for different folder structures. This will not be relevant for most datasets.
- New sites File: This file provides metadata on specific sites (e.g., electrode locations), improving the granularity and consistency of metadata.
- Removed code_parameters File: Its functionality has moved to code_description.
SDS 2.1.0
slide 23: