Using SPARC Datasets

This guide explains the basic SPARC dataset structure, how to interact with datasets, and how to access or download a dataset.

The key offering of the SPARC Portal is its collection of well-curated, high-impact data generated by SPARC-funded researchers. These datasets, along with other SPARC projects and computational simulations, can be found under the "Find Data" section of the SPARC Portal. The SPARC Portal shows only publicly available datasets, and new datasets are added regularly as research in the program progresses.

About SPARC datasets

You can see how SPARC datasets are formatted on the SPARC Dataset Format Overview page.

For more detailed information about how SPARC datasets are structured and how to navigate them, please see Navigating a SPARC Dataset.

For more detailed information about how to cite SPARC datasets, please see Instructions for Citing Datasets in Manuscripts.

Accessing SPARC datasets

You can access public datasets directly through the SPARC Portal. All datasets, regardless of size, can be accessed on Amazon's S3 service using your own AWS account. For step-by-step instructions on how to access and download data, please see the article on Accessing Public Datasets.

Datasets smaller than 5 GB can also be downloaded directly through the browser. Please note that the files will be compressed upon download. To access a public dataset, go to the Find Data tab and select a dataset. Then click the Get Dataset button for information on where to get that dataset.

Importantly, datasets larger than 5 GB can only be accessed through Amazon's S3 service using your own AWS account. Please see Accessing Public Datasets for detailed instructions.
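
The size rule described in this section can be summarized in a short sketch. This is only an illustration and not part of the SPARC Portal or any SPARC tooling: the function name access_options and the route labels are hypothetical, the 5 GB limit is assumed to mean 5 * 10**9 bytes (the guide does not say whether decimal or binary units apply), and a dataset of exactly 5 GB is assumed to fall on the S3-only side.

```python
# Hypothetical helper illustrating the guide's size rule; not a SPARC API.
# Assumption: the 5 GB limit is read as 5 * 10**9 bytes.
BROWSER_LIMIT_BYTES = 5 * 10**9


def access_options(size_bytes: int) -> list[str]:
    """Return the access routes the guide describes for a dataset of this size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Amazon S3 access (with your own AWS account) applies to every dataset.
    options = ["aws_s3"]
    # Browser download is offered only below the limit.
    # Assumption: a dataset of exactly 5 GB is treated as S3-only.
    if size_bytes < BROWSER_LIMIT_BYTES:
        options.insert(0, "browser_download")
    return options


if __name__ == "__main__":
    for size in (2 * 10**9, 12 * 10**9):
        print(size, access_options(size))
```

For the authoritative route for a given dataset, the Get Dataset button on the dataset's page remains the reference.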