Public Datasets Download and Access

The SPARC Portal allows access to public datasets through a browser or Amazon S3.

Users can access public datasets directly through the SPARC Portal. All datasets, regardless of size, can also be accessed on the AWS Open Data Registry. You may use your own AWS account, or obtain data without an account.

The SPARC Portal displays the most current version of a dataset and provides several ways to access and download files in any version of that dataset.

Navigating a SPARC Dataset

Downloading Full Datasets

Browser Download - Available only for Datasets under 5GB

For datasets and files under 5GB, SPARC provides users with a mechanism to download the dataset directly from the dataset landing page.

Clicking the Download full dataset button immediately starts the download in the web browser. Please note that the files will be compressed upon download. Datasets larger than 5GB can only be accessed through the Amazon Web Services Registry of Open Data (see below).
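As a rough illustration of the size rule above, the following sketch picks a download route from a dataset's size. It is not part of the SPARC Portal; the function name, the route labels, and the reading of "5GB" as 5 × 10^9 bytes are assumptions made for this example, as is sending a dataset of exactly 5GB to AWS.

```python
# Assumption: "5GB" is read as 5 * 10**9 bytes; the portal may count differently.
BROWSER_LIMIT_BYTES = 5 * 10**9


def download_route(size_bytes):
    """Return a hypothetical route label for a dataset of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Datasets under the limit can be downloaded in the browser;
    # anything else is assumed to go through the AWS Registry of Open Data.
    if size_bytes < BROWSER_LIMIT_BYTES:
        return "browser"
    return "aws-open-data"
```

For example, `download_route(2 * 10**9)` selects the browser route, while a 12GB dataset would be directed to AWS.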

Accessing Current Dataset Versions on Amazon Web Services (AWS) Open Data Registry

The most current version of every dataset is immediately available on the Amazon Web Services (AWS) S3 Open Data Registry. Note that the registry lists the entire SPARC Portal as a single AWS Open Data dataset, so to request a specific dataset you need to amend the access instructions slightly. Use the direct links within each dataset or model, as described below, to retrieve the dataset. For information about AWS API access, see this tutorial. To learn how to use the AWS command line interface (AWS CLI), view this site.


View the entire catalog of SPARC data on AWS on the SPARC Open Data Registry page. Step-by-step tutorials on how to create a free AWS account and how to access datasets within S3 are available below.
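For readers who script the AWS CLI route, the sketch below assembles an `aws s3 sync` command that targets a single dataset instead of the whole SPARC registry entry. The bucket name and prefix in the example are placeholders, not real values; take the actual location from the direct link on the dataset's page. The helper name `build_sync_command` is also hypothetical. The `--no-sign-request` flag is the AWS CLI option for making requests without AWS credentials.

```python
import shlex


def build_sync_command(bucket, prefix, destination, anonymous=True):
    """Return the argument list for an `aws s3 sync` call that copies one dataset."""
    # Normalize the prefix so the source always looks like s3://bucket/prefix/
    source = f"s3://{bucket}/{prefix.strip('/')}/"
    command = ["aws", "s3", "sync", source, destination]
    if anonymous:
        # Skip request signing, so no AWS account is needed.
        command.append("--no-sign-request")
    return command


if __name__ == "__main__":
    # Placeholder bucket and prefix: replace with the values shown on the dataset page.
    args = build_sync_command("example-sparc-bucket", "example-dataset/1", "./dataset")
    print(shlex.join(args))
```

The returned list can be passed to `subprocess.run` once the AWS CLI is installed. Pass `anonymous=False` to sign requests with your own AWS account instead.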

Accessing Older Versions of Full Datasets on Amazon Web Services (AWS) S3

The SPARC Portal's storage mechanism for published datasets is provided by DAT-Core's Pennsieve Platform, which ensures that published files are stored only once in the AWS cloud. Older versions of datasets smaller than 5GB and all versions of individual files are immediately available to access and download through the web browser. However, to access the full dataset of an older version, it must first be temporarily restored as a whole package before it can be accessed in its entirety on the AWS S3 Requester Pays service. To restore an older version of a dataset, navigate to the version that you would like to access and select Request Access.

After you submit your request, the Pennsieve Platform will get to work! Restoring an older version may take up to 24 hours, depending on the dataset's size.

Once the older version is ready, you will be notified via email from [email protected] with instructions on how to access it on AWS S3. Don't forget to check your spam folder!

📘

The older version of the dataset will be temporarily available for only 14 days.


Downloading Files from a Dataset

Users can also download individual files and folders within a dataset.

Individual File Downloads

To download a single file, click on the down arrow icon in the Action column of a dataset's Files tab. For files smaller than 5GB, the file will immediately begin to download to your local machine. Please note that the files will be compressed upon download. Files larger than 5GB are made available directly on Amazon’s S3 service (see above for details).

For some file types, such as TIFF, JPEG2000, or JSON, clicking on the file name will open a new page listing the file details. From there, you can also click the Download button in the top right.

Multiple File Downloads

To download multiple files and folders at a time, click the checkbox next to each file to be downloaded. Next, click the Download Selected Files and Folders button in the bottom left of the Files tab.

A confirmation screen will appear. Name the ZIP folder and click Download to initiate your immediate download.

📘

Single files will be downloaded and saved in their original file format. Files and/or folders downloaded together will be saved as a ZIP folder containing the selected files/folders.
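If you handle multi-file downloads in a script, the ZIP folder can be unpacked with Python's standard `zipfile` module. This is a minimal sketch; the helper name `extract_download` and the paths in the usage note are illustrative and not defined by the SPARC Portal.

```python
import zipfile
from pathlib import Path


def extract_download(zip_path, target_dir):
    """Unpack a downloaded ZIP archive and return the extracted file names, sorted."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        # Directory entries end with "/"; keep only the files.
        names = [name for name in archive.namelist() if not name.endswith("/")]
    return sorted(names)
```

For example, `extract_download("my-selection.zip", "sparc-files")` would unpack the archive into a `sparc-files` folder, using whatever name you gave the ZIP folder on the confirmation screen.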

No Download or Transfer Fees

All SPARC datasets, models, and simulations are now freely accessible with zero transfer costs, marking a significant milestone in open science and peripheral neuroscience research. Because SPARC is an accepted partner in the AWS Open Data Partnership, AWS covers the storage and egress fees for the public. Users can choose to transfer datasets to their own S3 bucket, or download the dataset without an AWS account. Read more about the AWS Registry of Open Data here.