Data Backup & Preservation Terms

Harvard University Information Technology (HUIT), in collaboration with Harvard Library, the Institute for Quantitative Social Science (IQSS), and the Harvard-Smithsonian Center for Astrophysics (CfA), hosts Harvard's Dataverse Networks (IQSS and CfA) and maintains a full backup of all data and directories. This means that there is always a full, recent off-site copy of the data repository for Harvard's Dataverse Networks.


Backup Schedule

HUIT backs up all application and system files and databases nightly. These backups are stored off-site in Carlstadt, New Jersey, for 45 days.

All research data files in the repository are replicated hourly to a second off-site storage array at 1 Summer St, Boston, MA.

Since March 2013, HUIT has incorporated the data content of Harvard's Dataverse Networks into the DRS Storage Infrastructure. This arrangement uses the infrastructure's storage management software to create a tape copy of the data, which is stored at the Harvard Depository.

Policy and Procedures for Digital Archiving

Harvard University's policy for digital archiving is part of the institution's general mission to preserve all of its archival collections and to ensure their availability for current and future use. More specifically, this policy for preserving our digital data collections is meant to ensure continued access to born-digital and digitized data, to ensure their authenticity, and to maintain data quality using the best digital archival practices.

Harvard University (in particular, with support from IQSS and CfA) commits to best archival practice to ensure that all materials deposited in the archive remain available and usable. This includes: preserving previously deposited versions of materials; deaccessioning (removing) studies only when legally compelled; maintaining public access to the materials; regularly reviewing risks to materials; and reformatting materials as necessary, where possible, to avoid format obsolescence.

Preservation of Materials Deposited in Harvard's Dataverse Networks

Harvard University supports permanent bit-level preservation of all studies directly deposited in Harvard's Dataverse Networks.

In addition, all data deposited in the IQSS Dataverse Network and made available to the public is replicated by the Data-PASS partners for permanent preservation by the partnership.

Notwithstanding Harvard University's commitment to archiving and long-term access for all data published in the Harvard Dataverse Network, questions about finding and using data distributed by others in the Harvard Dataverse Network should in general be referred to the individual dataset owners. Because the Harvard Dataverse Network is self-curated, the owners or distributors of individual datasets control the selection of materials, documentation, access policies, and data user agreements for their datasets. However, the Harvard Dataverse Network takes data publication very seriously, encouraging good curation practices through metadata, proper documentation, and versioning to enable data discovery and reuse. When possible, it automatically extracts metadata from deposited datasets. Also, once a dataset is published (released), the repository guarantees archiving of and long-term access to that dataset, accompanied by its data user agreement. The dataset can only be unpublished (deaccessioned) under extreme circumstances, such as a legal requirement to destroy that dataset.