Accessing Public Datasets

The SPARC Portal allows access to public datasets through a browser or Amazon S3.

Accessing Public Datasets on the SPARC Portal

The SPARC Portal presents users with the most current version of each public dataset and offers several ways to access files in any version of that dataset. To access a public dataset, click the Get Dataset button directly under the dataset's banner image or navigate to the Files tab.

Accessing Full Datasets

The options in this section are for users who want access to full public datasets on SPARC.

Browser Download - Available only for Datasets under 5GB

Datasets smaller than 5GB can be downloaded directly through the browser. Clicking the Download full dataset button will immediately start the download process through the web browser. Please note that the files will be compressed upon download. Datasets larger than 5GB can only be accessed from Amazon Web Services (AWS) S3 Requester Pays service (details below!).
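The size rule above can be sketched as a small check. This is an illustration only: the function name `access_method` and the returned labels are hypothetical, and the sketch assumes that 5GB means 5 × 10^9 bytes and that a dataset of exactly 5GB is handled like a larger one, neither of which the Portal specifies.

```python
# Hypothetical helper for illustration; not part of the SPARC Portal or any SPARC library.
# Assumption: "5GB" is read as 5 * 10**9 bytes.
BROWSER_LIMIT_BYTES = 5 * 10**9


def access_method(dataset_size_bytes: int) -> str:
    """Return which access route applies to a full dataset of the given size."""
    if dataset_size_bytes < 0:
        raise ValueError("dataset size cannot be negative")
    if dataset_size_bytes < BROWSER_LIMIT_BYTES:
        # Smaller than 5GB: direct (compressed) download through the browser.
        return "browser"
    # Otherwise: AWS S3 Requester Pays only.
    return "aws-s3-requester-pays"
```

For example, `access_method(2 * 10**9)` returns `"browser"`, while `access_method(12 * 10**9)` returns `"aws-s3-requester-pays"`.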

Accessing Current Dataset Versions on Amazon Web Services (AWS) S3

The most current version of every dataset is immediately available on the Amazon Web Services (AWS) S3 Requester Pays service. This means that any costs associated with downloading the data will be charged to your AWS account. Please note that this is the only way to access full datasets that are larger than 5GB.

For up-to-date transfer pricing, please visit the AWS Pricing documentation. Step-by-step tutorials on how to create a free AWS Account and how to access datasets within S3 are available below.
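As a rough sketch of what a Requester Pays download can look like in code, the example below uses the third-party `boto3` library, whose S3 client accepts a `RequestPayer` setting through `ExtraArgs`. The helper names are hypothetical, and the bucket, key, and destination values are placeholders supplied by the caller: the real bucket and file paths for a dataset are not given here and must come from the dataset's page or the Pennsieve documentation. Running the download requires configured AWS credentials, and the transfer is billed to that account.

```python
def requester_pays_download_args(bucket: str, key: str, destination: str) -> dict:
    """Build keyword arguments for boto3's S3 client `download_file` call."""
    return {
        "Bucket": bucket,          # placeholder: bucket that holds the dataset
        "Key": key,                # placeholder: path of the file inside the bucket
        "Filename": destination,   # local path to write the file to
        # Acknowledge that the requester (your AWS account) pays for the transfer.
        "ExtraArgs": {"RequestPayer": "requester"},
    }


def download_dataset_file(bucket: str, key: str, destination: str) -> None:
    """Download one file from a Requester Pays bucket (hypothetical helper)."""
    import boto3  # third-party; imported here so the helper above has no dependencies

    s3 = boto3.client("s3")
    s3.download_file(**requester_pays_download_args(bucket, key, destination))
```

Keeping the argument builder separate from the network call makes it easy to inspect what will be sent before any billable request is made.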

Accessing Older Versions of Full Datasets on Amazon Web Services (AWS) S3

The SPARC Portal's storage mechanism for published datasets is provided by DAT-Core's Pennsieve Platform. It ensures that published files are stored only once in the AWS cloud. Older versions of datasets smaller than 5GB, and all versions of individual files, are immediately available for users to access and download through the web browser. However, to access an older version of a full dataset in its entirety, that version must first be temporarily restored as a whole package before it can be accessed on the AWS S3 Requester Pays service. To restore an older version of a dataset, navigate to the version that you would like to access and select Request Access.

After you submit your request, the Pennsieve Platform will get to work! Restoring an older version may take up to 24 hours to complete, depending on the dataset's size.

Once the older version is ready, you will be notified via email from [email protected] with instructions on how to access it on AWS S3. Don't forget to check your spam folder!

📘

The older version of the dataset will be temporarily available for only 14 days.
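To keep track of the 14-day window, a short calculation like the one below may help. It is a sketch under an assumption: the window is taken to start at the moment the restored version becomes ready, which the text above does not state explicitly. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# From the note above: a restored older version stays available for 14 days.
RESTORE_WINDOW = timedelta(days=14)


def restore_expiry(ready_at: datetime) -> datetime:
    """Return when a restored version stops being available.

    Assumption: the 14 days are counted from `ready_at`, the time the
    restored version became ready.
    """
    return ready_at + RESTORE_WINDOW


def days_remaining(ready_at: datetime, now: datetime) -> int:
    """Return the whole days left in the window, never less than zero."""
    remaining = restore_expiry(ready_at) - now
    return max(remaining.days, 0)


if __name__ == "__main__":
    ready = datetime(2024, 1, 1, tzinfo=timezone.utc)  # example date only
    print(restore_expiry(ready).date())
    print(days_remaining(ready, datetime(2024, 1, 10, tzinfo=timezone.utc)))
```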

AWS Tutorials

Step-by-Step Tutorials are available here to learn how to create a free AWS account and download or transfer dataset files with S3:

Detailed information on how to set up your AWS account and how to download or transfer data can also be found using the following links:

Pennsieve provides documentation on how to download and copy public datasets and files directly from AWS S3 using the AWS Requester Pays service.

Downloading Individual Files from a Dataset

Users can also download individual files and folders within a dataset. To download a single file, click on the down arrow icon in the Action column of a dataset's Files tab. For files smaller than 5GB, the file will immediately begin to download to your local machine. Please note that the files will be compressed upon download. Files larger than 5GB are made available directly on Amazon’s S3 service (see above for details).

For some file types, such as TIFF, JPEG2000, or JSON, clicking on the file name will open a new page listing the file details. From there, you can also click the Download button in the top right.

Downloading Multiple Files from a Dataset

To download multiple files and folders at a time, click the checkbox next to each file to be downloaded. Next, click the Download Selected Files and Folders button in the bottom left of the Files tab.

A confirmation screen will appear. Name the ZIP folder and click Download to initiate your immediate download.

📘

Single files will be downloaded and saved in their original file format. Files and/or folders downloaded together will be saved as a ZIP folder containing the selected files/folders.
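Once a multi-file download has arrived as a ZIP folder, it can be unpacked with any archive tool. The sketch below shows one way to do it with Python's standard `zipfile` module; the function name and the paths passed to it are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_download(zip_path: str, target_dir: str) -> list:
    """Unpack a downloaded ZIP folder and return the names of its entries."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Extract every selected file and folder into the target directory.
        archive.extractall(target)
        return archive.namelist()
```

A call such as `extract_download("my-selection.zip", "my-selection")` would unpack the archive into a `my-selection` folder next to it, where `my-selection.zip` stands for whatever name was entered on the confirmation screen.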