SPARC Data Persistence Policy

Retention Period

Data submitted to the repository will be retained with high availability for a minimum of 10 years, or for the duration of available funding for the repository if that exceeds the 10-year period. The Pennsieve data platform has institutional backing from the University of Pennsylvania for its long-term support and maintenance. Deposited data are archived for long-term storage using AWS S3 services.

Metadata Persistence

If for any reason a dataset must be deprecated, the metadata describing that dataset will persist. The dataset DOI will continue to resolve to the dataset metadata, and a reason for the deprecation will be added to the landing page.

Succession Plans

If the repository closes, best efforts will be made to integrate all content, including the enriched metadata, into suitable alternative repositories, and DOIs will be updated to redirect users to the correct location of the data.

Data Integrity and Security

File integrity is validated using a checksum as part of the ingest process. Daily backups are created of all database tables containing metadata. Files are stored using AWS S3 storage, which provides 99.999999999% (11 nines) durability for file retention and integrity and 99.99% availability. Storage is designed to sustain data in the event of the loss of an entire Amazon S3 Availability Zone. All files are checked for viruses as part of data ingestion and are encrypted at rest.
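
As an illustration of the checksum validation described above, the following is a minimal sketch of how a depositor or auditor might confirm that a file matches a recorded checksum. The policy does not state which hash algorithm the ingest process uses; SHA-256 is assumed here, and the function names sha256_of_file and verify_checksum are hypothetical and not part of any SPARC or Pennsieve API.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    SHA-256 is an assumption; the policy does not name the algorithm.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_hex):
    """Return True if the file's digest matches the recorded checksum.

    Hypothetical helper: the comparison ignores letter case and
    surrounding whitespace in the recorded value.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A check of this kind could be run after download by comparing the result against the checksum recorded at deposit; a mismatch indicates that the file changed in transit or in storage.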