Writings

2014
Altman M, Crosas M. The Evolution of Data Citation: From Principles to Implementation (Forthcoming). IASSIST Quarterly. 2014.
Data citation is rapidly emerging as a key practice in support of data access, sharing, reuse, and of sound and reproducible scholarship. In this article we review the evolution of data citation standards and practices – to which Sue Dodd was an early contributor – and the core principles of data citation that have emerged through a collaborative synthesis. We then discuss an example of the current state of the practice, and identify the remaining implementation challenges.
2013
Gibbs E, Lin L, Quigley E, Tang R. Dataverse Usability Evaluation: Final Report. Boston: Simmons GSLIS Usability Lab; 2013. pp. 1-18.
Crosas M, Sweeney L, King G. Dataverse for Big Data. 2013.
This is the extended abstract of an upcoming paper.
Crosas M. A Data Sharing Story. Journal of eScience Librarianship. 2013;1(3):173-179.
From the early days of modern science through this century of Big Data, data sharing has enabled some of the greatest advances in science. In the digital age, technology can facilitate more effective and efficient data sharing and preservation practices, and provide incentives for making data easily accessible among researchers. At the Institute for Quantitative Social Science at Harvard University, we have developed open-source software, named the Dataverse Network, to share, cite, preserve, discover, and analyze data. We share here the project’s motivation, its growth and successes, and likely evolution.
2011
Christian T-M, Crabtree J, McGovern N, Altman M. Overview of SafeArchive: An Open-Source System for Automatic Policy-Based Collaborative Archival Replication. In iPres Vol. 02; 2011.
Altman M, Crabtree J. Using the SafeArchive System: TRAC-Based Auditing of LOCKSS. Proceedings of Archiving 2011. 2011:165-170.
Crosas M. The Dataverse Network: An Open-source Application for Sharing, Discovering and Preserving Data. D-Lib Magazine. 2011;17(1/2).
The Dataverse Network is an open-source application for publishing, referencing, extracting and analyzing research data. The main goal of the Dataverse Network is to solve the problems of data sharing by building technologies that enable institutions to reduce the burden on researchers and data publishers, and that give them incentives to share their data. By installing Dataverse Network software, an institution is able to host multiple individual virtual archives, called "dataverses", for scholars, research groups, or journals, providing a data publication framework that supports author recognition, persistent citation, data discovery and preservation. Dataverses require no hardware or software costs, nor maintenance or backups by the data owner, but still enable all web visibility and credit to devolve to the data owner.
2009
Altman M. Transformative Effects of NDIIPP, the case of the Henry A. Murray Archive. Library Trends. 2009;57(3):338-351.
Altman M, Adams M, Crabtree J, Donakowski D, Maynard M, Pienta A, Young C. Digital Preservation Through Archival Collaboration: The Data Preservation Alliance for the Social Sciences. The American Archivist. 2009;72(1):169-182.
Gutmann MP, Abrahamson M, Adams MO, Altman M, Arms C, Bollen K, Carlson M, Crabtree J, Donakowski D, King G, et al. From Preserving the Past to Preserving the Future: The Data-PASS Project and the Challenges of Preserving Digital Social Science Data. Library Trends. 2009;57:315–337.
Social science data are an unusual part of the past, present, and future of digital preservation. They are both an unqualified success, due to long-lived and sustainable archival organizations, and in need of further development because not all digital content is being preserved. This article is about the Data Preservation Alliance for the Social Sciences (Data-PASS), a project supported by the National Digital Information Infrastructure and Preservation Program (NDIIPP), which is a partnership of five major U.S. social science data archives. Broadly speaking, Data-PASS has the goal of ensuring that at-risk social science data are identified, acquired, and preserved, and that we have a future-oriented organization that could collaborate on those preservation tasks for the future. Throughout the life of the Data-PASS project we have worked to identify digital materials that have never been systematically archived, and to appraise and acquire them. As the project has progressed, however, it has increasingly turned its attention from identifying and acquiring legacy and at-risk social science data to identifying ongoing and future research projects that will produce data. This article is about the project’s history, with an emphasis on the issues that underlay the transition from looking backward to looking forward.
2008
Altman M. A Fingerprint Method for Verification of Scientific Data. Springer-Verlag; 2008.
2007
Altman M, King G. A Proposed Standard for the Scholarly Citation of Quantitative Data. D-Lib Magazine. 2007;13(3/4).
An essential aspect of science is a community of scholars cooperating and competing in the pursuit of common goals. A critical component of this community is the common language of and the universal standards for scholarly citation, credit attribution, and the location and retrieval of articles and books. We propose a similar universal standard for citing quantitative data that retains the advantages of print citations, adds other components made possible by, and needed due to, the digital form and systematic nature of quantitative data sets, and is consistent with most existing subfield-specific approaches. Although the digital library field includes numerous creative ideas, we limit ourselves to only those elements that appear ready for easy practical use by scientists, journal editors, publishers, librarians, and archivists.
King G. An Introduction to the Dataverse Network as an Infrastructure for Data Sharing. Sociological Methods and Research. 2007;36:173-199.
2003
King G. The Future of Replication. International Studies Perspectives. 2003;4:443–499.
Since the replication standard was proposed for political science research, more journals have required or encouraged authors to make data available, and more authors have shared their data. The calls for continuing this trend are more persistent than ever, and the agreement among journal editors in this Symposium continues this trend. In this article, I offer a vision of a possible future of the replication movement. The plan is to implement this vision via the Virtual Data Center project, which – by automating the process of finding, sharing, archiving, subsetting, converting, analyzing, and distributing data – may greatly facilitate adherence to the replication standard.
Altman M, Gill J, McDonald MP. Numerical Issues in Statistical Computing for the Social Scientist. Springer-Verlag; 2003.
2001
Altman M, Andreev L, Diggory M, King G, Kiskis D, Kolster E, Verba S. A Digital Library for the Dissemination and Replication of Quantitative Social Science Research. Social Science Computer Review. 2001;19:458-470.
The Virtual Data Center (VDC) software is an open-source digital library system for quantitative data. We discuss what the software does, and how it provides an infrastructure for the management and dissemination of distributed collections of quantitative data, and the replication of results derived from this data.
Altman M, Andreev L, Diggory M, King G, Kolster E, Krot M, Verba S, Kiskis D. An Introduction to the Virtual Data Center Project and Software. Proceedings of the First ACM+IEEE Joint Conference on Digital Libraries. 2001:203-204.
1995
King G. Replication, Replication. PS: Political Science and Politics. 1995;28:443–499.
Political science is a community enterprise, and the community of empirical political scientists needs access to the body of data necessary to replicate existing studies in order to understand, evaluate, and especially build on this work. Unfortunately, the norms we have in place now do not encourage, or in some cases even permit, this aim. Following are suggestions that would facilitate replication and are easy to implement – by teachers, students, dissertation writers, graduate programs, authors, reviewers, funding agencies, and journal and book editors.