Guidelines for Effective SPARC Dataset Titles and Descriptions
SPARC dataset titles should describe the contents and provide details to increase findability.
The title is restricted to 256 characters and is the first information a user sees when accessing your data. A good dataset title, like a good title for a research paper, should describe the contents and provide key details that enable effective retrieval by a search engine. Many search engines weight the title very highly when ranking results, so it is important to give the title careful thought.
- Do include information like anatomical region, the technique used, species, sex, and purpose.
- Do not include the lab name or grant number.
- Do not include abbreviations for entities like anatomical regions.
Good Example of Dataset Title: Influence of direct colon tissue electrical stimulation on colonic motility in anesthetized male Yucatan minipig - Larauche et al., 2020
Bad Example of Dataset Title: Jones’ lab ephys data from colon
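The title rules above can be partly checked before submission. The following is a minimal sketch, not an official SPARC tool: the function name `check_title` and both regular expressions are hypothetical heuristics chosen for illustration. Only the 256-character limit comes from these guidelines, and the rule about abbreviations still needs a human reader.

```python
import re

MAX_TITLE_LENGTH = 256  # limit stated in these guidelines

# Assumption: a standalone word "lab" or "laboratory" suggests a lab name.
LAB_NAME_PATTERN = re.compile(r"\blab(oratory)?\b", re.IGNORECASE)

# Assumption: grant numbers look like an NIH-style identifier, e.g. OT2OD023847
# (three-character activity code, two-letter institute code, six digits).
GRANT_NUMBER_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]{2}[ -]?[A-Z]{2}[ -]?\d{6}\b")


def check_title(title: str) -> list[str]:
    """Return a list of likely problems with a dataset title (empty if none found)."""
    problems = []
    if not title.strip():
        problems.append("title is empty")
    if len(title) > MAX_TITLE_LENGTH:
        problems.append(
            f"title is {len(title)} characters; the limit is {MAX_TITLE_LENGTH}"
        )
    if LAB_NAME_PATTERN.search(title):
        problems.append("title appears to mention a lab name")
    if GRANT_NUMBER_PATTERN.search(title):
        problems.append("title appears to contain a grant number")
    return problems


if __name__ == "__main__":
    for candidate in (
        "Influence of direct colon tissue electrical stimulation on colonic "
        "motility in anesthetized male Yucatan minipig - Larauche et al., 2020",
        "Jones' lab ephys data from colon",
    ):
        print(candidate[:40], "->", check_title(candidate) or "no problems found")
```

An empty result only means none of these heuristics fired; it does not show that the title is descriptive enough to be found by search.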
The subtitle is restricted to 256 characters (including spaces) and is displayed in the search results on the SPARC Portal. It should therefore provide additional details that would help someone browsing search results, e.g., the type of data and the methods used.
Naming datasets with multiple parts
If your dataset is part of a more extensive study that is being uploaded in batches, we recommend that you use the same title for each subsequent batch with Part 2 and Part 3 appended. The subtitle should indicate the distinguishing features of the particular batch. This will help alert the user that there are multiple parts to the dataset and also help the curators model the data correctly.
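As an illustration of the batch naming convention, the sketch below builds the title for each part from a shared base title. The helper name `batch_title` and the single-space separator before "Part N" are assumptions; the guidelines only say that "Part 2", "Part 3", and so on are appended to the same title.

```python
def batch_title(base_title: str, part: int, max_length: int = 256) -> str:
    """Build the title for one batch of a multi-part dataset.

    The first batch keeps the base title; later batches get "Part N" appended.
    The separator (a single space) is an assumption, not a SPARC requirement.
    """
    if part < 1:
        raise ValueError("part numbers start at 1")
    title = base_title if part == 1 else f"{base_title} Part {part}"
    if len(title) > max_length:
        raise ValueError(
            f"title is {len(title)} characters; the limit is {max_length}"
        )
    return title


if __name__ == "__main__":
    base = "Colonic motility in anesthetized male Yucatan minipig"
    for n in (1, 2, 3):
        print(batch_title(base, n))
```

Checking the length after appending the suffix catches base titles that fit the limit on their own but not once the part number is added.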
A good dataset description, like a good abstract for a scientific paper, gives users the important details about the dataset so they can make an informed decision about whether to use it or explore it further. We have adopted a structured abstract format for SPARC descriptions. Please use the following format:
STUDY PURPOSE: Why were the data obtained? What was the study designed to do?
DATA COLLECTED: What data are available? What form are they in? How were they obtained? How many conditions are represented? Is this a standalone dataset or part of a larger study? What modalities were collected? What can the data be used for?
PRIMARY CONCLUSION: What was learned from the study producing the data?
Tips: Think about whether your description would pass reviewers if you submitted it as an abstract for a manuscript. Would a reviewer accept a one-line explanation of what you did, e.g., testing from a kidney function monitoring device? Would you accept your description as a reviewer? The more detail you provide, the more useful the data will be. If no conclusions were drawn, enter "not applicable." Remember, however, that you are describing a dataset, not a scientific study. After reading the description, would a reader understand what they are looking at when browsing or downloading your data?
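The structured format can be checked mechanically. The sketch below splits a description into its three labeled sections and reports any that are missing or empty. The function names are hypothetical, and accepting "DATA COLLECTION" as a variant of "DATA COLLECTED" is an assumption made because both spellings appear in practice.

```python
import re

# Canonical label -> accepted spellings. The second spelling of the data
# section is an assumption, not a documented SPARC rule.
SECTION_LABELS = {
    "STUDY PURPOSE": ("STUDY PURPOSE",),
    "DATA COLLECTED": ("DATA COLLECTED", "DATA COLLECTION"),
    "PRIMARY CONCLUSION": ("PRIMARY CONCLUSION",),
}

_CANONICAL = {
    variant: label
    for label, variants in SECTION_LABELS.items()
    for variant in variants
}
_LABEL_PATTERN = re.compile(
    "(" + "|".join(re.escape(v) for v in _CANONICAL) + "):"
)


def parse_description(text: str) -> dict[str, str]:
    """Split a structured description into {canonical label: section text}."""
    sections = {}
    matches = list(_LABEL_PATTERN.finditer(text))
    for i, match in enumerate(matches):
        # A section runs from the end of its label to the start of the next label.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[_CANONICAL[match.group(1)]] = text[match.end():end].strip()
    return sections


def missing_sections(text: str) -> list[str]:
    """Return the canonical labels that are absent or have no content."""
    sections = parse_description(text)
    return [label for label in SECTION_LABELS if not sections.get(label)]


if __name__ == "__main__":
    draft = (
        "STUDY PURPOSE: Map intracardiac neurons in rat. "
        "DATA COLLECTED: qPCR profiles and section images. "
        "PRIMARY CONCLUSION: not applicable."
    )
    print(missing_sections(draft))
```

A check like this only confirms that the structure is present; whether each section contains enough detail remains a judgment call for the author and the curators.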
Good Example of Dataset Description from SPARC: (Achanta et al., 2020)
STUDY PURPOSE: This study aimed to create a comprehensive atlas of the cardiac intracardiac neurons in rats at a cellular level providing gene expression profiles of cardiac neurons on a single-cell resolution. We developed an approach to appreciate the 3D organization of the intracardiac neurons, ICN, while at the same time permitting single-cell transcriptomics and connectomics.
DATA COLLECTION: This dataset contains two components. 1. Cardiac neuron gene expression profiling from real-time quantitative PCR generated by the Fluidigm Dynamic Array: For the data presented in this investigation, data from four 96 well chips were combined into a single matrix and processed further in normalization. The transcriptomic data set consists of 23,254 data points from 151 samples of both individual neurons and neuron pools capturing entire clusters. The gene list and primers information is available in HB-ICN-4chiprun-design.xlsx file within the source folder; 2. Images of heart samples collected using the Arcturus Laser Capture Microdissection: Images are single sections from a female rat heart in which the 3D organization of the ICN is mapped. The heart was sectioned in the transverse plane going rostro-caudally between the base and apex.
PRIMARY CONCLUSION: Through serial cryostat sectioning of a cryopreserved heart with imaging of serially collected and stained sections, it is possible to reconstruct the 3D context and collect single neurons using laser microdissection. The transcriptional profiles of these isolated neurons can be determined down to single-cell resolution and mapped back into the 3D context generated by stacking the serial images.
Bad Example of Dataset Description from SPARC: Electrophysiology data collected by the Jones lab (Jones et al., 2018) from colon and SI.
The banner image is designed to entice visitors to find and explore the dataset. Like the graphical abstract required for many manuscript submissions, the banner image summarizes the dataset quickly and shows portal visitors what data types are available. A single image that concisely depicts the contents of the dataset helps portal visitors judge whether your curated data are relevant to their research interests based on factors such as species, organ, experimental methods, and data types.
The image will be displayed on the SPARC Portal in the global search result lists of all datasets and in the contents list when viewing data per organ, and it will serve as the representative image on the individual dataset page. Please avoid images that feature animals or graphic/bloody tissues. The banner image must be understandable without the dataset description.
Banner Image Specifications:
The source image can be any size, although the recommended size is 2048 x 2048 pixels at 300 DPI. The banner image is generated from the source image as a square (1:1 ratio) using the cropping tool in either SODA or Pennsieve, and is then uploaded to your dataset. Use the cropping handle controls to select the relevant area of the image. On the SPARC Portal, the banner image is scaled proportionally to fit, so ensure that the essential aspects of the figure remain visible when scaled to 128 pixels high x 128 pixels wide, which is used for the search results page, and to 370 pixels high x 128 pixels wide for the dataset page. For the best results, the banner image should not exceed 2048 x 2048 pixels.
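To get a feel for how the square crop and the downscaling interact, the arithmetic can be sketched as follows. These helpers are hypothetical and only illustrate the geometry; the actual crop is chosen interactively in SODA or Pennsieve, and a centered crop is just one possible choice. The box order (left, upper, right, lower) follows the convention used by common imaging libraries such as Pillow.

```python
def center_square_crop(width: int, height: int) -> tuple[int, int, int, int]:
    """Return the (left, upper, right, lower) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def scaled_length(length_px: float, crop_side: int, target_side: int = 128) -> float:
    """Size of a feature after the square crop is scaled down to target_side pixels."""
    return length_px * target_side / crop_side


if __name__ == "__main__":
    # A 3000 x 2048 source image cropped to a centered 2048 x 2048 square.
    box = center_square_crop(3000, 2048)
    print(box)  # (476, 0, 2524, 2048)

    # Text that is 48 pixels tall in that crop is only 3 pixels tall
    # in a 128 x 128 search-result thumbnail.
    print(scaled_length(48, 2048))  # 3.0
```

The second calculation shows why small labels rarely survive: at the 128-pixel thumbnail size, a 2048-pixel crop is reduced by a factor of 16.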
- This image will form the first impression of your dataset and will also be displayed next to the dataset description, so it should be aesthetically pleasing.
- Ensure the image accurately reflects the description given for the dataset.
- You can repurpose a figure from a manuscript if it is not under copyright by the journal (a good reason to publish open access).
- Do not use a lot of internal labeling, as there is no figure legend.
- Do not use abbreviations, symbols, and lettering that need explanation.
Good Example of Banner Image: (Achanta et al., 2020) This is a good example because the figure tells the story of the methods used and the data produced in the experiment.
Quality Control Checklist
- Does the title inform the user of key details about the data?
- Will it be helpful for users trying to search for the data?
- If your dataset is part of a larger study, did you adhere to the guidelines for naming datasets with multiple parts?
- Does the description provide key details about the experimental paradigm and/or techniques that produced the data?
- Does the description adhere to the SPARC description structure, including the formatting?
- Would it help a user understand whether the data are relevant to their use case?
- Does your image convey important details about your dataset?
- Is your image aesthetically pleasing?
- Does your image adhere to the specifications?
- Can your image be understood without a figure legend?
For help, reach out to the curation support team in the curation-support channel on the NIH SPARC Slack.