Software
The Dataverse Network software is written for the Java Platform, Enterprise Edition (Java EE) 5, using current Java technologies, including Enterprise JavaBeans (EJB) 3 and JavaServer Faces. It runs on the GlassFish Application Server; refer to the Project GlassFish Community for development information. (We use PostgreSQL as our database, but you can easily use another database, such as Oracle or MySQL.) The data analysis component uses R and Zelig for statistical computing.
See the following pages for specific details about our software:
Features
Features provided by the DVN software include:
- Search study cataloging and variable metadata across the Dataverse Network, or within an individual dataverse, using our search engine, which is based on an Apache Lucene index server.
- Create studies to hold your data, documentation or any other type of files, and generate citations automatically using a Handle as a persistent identifier.
- Upload a data file (STATA or SPSS), generate a UNF and get online subsetting and analysis for your file.
- Create a dataverse to host studies, and collections and subcollections to organize those studies.
- Create collections of studies in other dataverses by selecting specific studies, collecting the results of a search, or linking to existing collections.
- Customize page layouts to brand a dataverse.
- Customize the Network page layouts and organize dataverses into groups.
- Assign user permissions to access, contribute, and manage studies and dataverses.
- Harvest studies from other Dataverse Networks or remote archives that support the Open Archives Initiative (OAI) protocol.
- Export studies in the Network to XML files in Dublin Core and DDI formats, or to USMARC formatted files.
- Support the Z39.50 standard to allow remote searches of our studies.
- Get web usage statistics for your Dataverse Network using the Google Analytics tool.
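To illustrate the harvesting feature listed above, here is a minimal sketch of how a client might build an OAI-PMH ListRecords request. It is not the DVN harvester code. The class name, method name, and the archive.example.org base URL are hypothetical; only the verb, metadataPrefix, from, and set request parameters and the oai_dc metadata prefix come from the OAI-PMH protocol itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds an OAI-PMH ListRecords request URL.
public class OaiHarvestRequest {

    // baseUrl is the remote archive's OAI endpoint (assumed, not a real DVN URL).
    // from and set are optional; pass null to leave them out of the request.
    public static String listRecordsUrl(String baseUrl, String metadataPrefix,
                                        String from, String set) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("verb", "ListRecords");
        params.put("metadataPrefix", metadataPrefix);
        if (from != null) {
            params.put("from", from);   // harvest only records changed since this date
        }
        if (set != null) {
            params.put("set", set);     // restrict the harvest to one OAI set
        }

        StringBuilder url = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(entry.getKey())
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    // URL-encodes a parameter value as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
    }

    public static void main(String[] args) {
        // Request Dublin Core records changed since the version 1.2 release date.
        System.out.println(listRecordsUrl(
                "http://archive.example.org/oai", "oai_dc", "2008-04-18", null));
    }
}
```

A real harvester would then fetch this URL, parse the returned XML, and repeat the request with the resumptionToken that the server supplies until the list is complete.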
Releases
The latest release is version 1.2 (April 18th, 2008), which requires Java 6, GlassFish v2, and PostgreSQL 8.1. To download the packages for this release, go to the official DVN SourceForge.net project site.
Features in this release include a forgotten-password option, FGDC metadata harvesting, and new Network Admin utilities. See Version 1.2 for full details. See the online Installers Guide for detailed information about how to install the software.
Version 1.3 is planned for release in July of this year. This release will include numerous improvements to workflow and functions:
New Subsetting and Analysis features:
- Improvements made to Analysis Results page
- Analysis can be saved to an R script file
- Analysis support provided for Zelig 3.2 models
New Study features:
- Improvements made to Study File page
- Files can be opened by clicking thumbnail image
- Templates available to create or edit studies
- Map tool provided to enter bounding box in study metadata
Search features:
- New analyzer indexes studies and data to improve search quality and speed
- Index utilities modified to remove index locks and re-index studies that failed indexing
- Fix made to "does not contain" searches in Advanced Search
New Network homepage features:
- Improvements made to homepage
- Sort or browse dataverses alphabetically
- View can be customized to filter and sort groups
New Dataverse features:
- On any web page, copy and paste code for link or button to a dataverse
- On any web page, copy and paste code for searching a dataverse
- Create and modify study templates for each dataverse
New integration with Nesstar to harvest metadata using their API
New technical features:
- DSB component rewritten in Java
- UNF logic rewritten in Java
- Custom remote authorization (first stage of Shibboleth implementation)
- Import/ingest of DDI category attribute on varFormat tag
Data Analysis
- Upload subsettable files in R format
- Provide optional weighting when data file is uploaded (Q3 2008)
- Support additional formats for downloads and uploads (Q3 2008 and on-going)
- Simplify statistical analysis UI and expand it for R functions and other packages (2009)
- Data visualization (Q3 2008 and on-going)
Studies and Collections
- Add User Comments to studies
- Download citation in BibTeX and other formats (Q3 2008)
- Re-visit and improve collections functionality (Q3 - Q4 2008)
- Group studies as 'downloadable' collections (Q4 2008)
- Support credit card payments for access to data sets (2009)
- Preview study edits (Q4 2008)
- Filter related studies and studies of interest to individual users (2009)
Search
- Search datasets spatially (maps) and chronologically (Q3 - Q4 2008)
- Search documentation files (pdf, txt) (Q3 2008)
- Search all dataverses from a dataverse (Q4 2008)
- More Options in Advanced Search (2009)
- Highlight search terms in results (2009)
Performance, Interoperability and other Projects
- Convert between Scholar and Basic dataverse types
- Integrate Open Journal Systems study file submission with a dataverse
- Account login through Secure Sockets Layer (SSL)
- Support LOCKSS to mirror DVN data (on-going 2008)
- Web services/APIs (R, Swivel, Many Eyes, etc.) (Q4 2008 and 2009)
- Remote authorization (Shibboleth) (on-going 2008)
- Integration with GenePattern (2008, 2009)
- Internationalization (2009)
- Simplify dataverse customization (Q3 2008)
- Citation of dataverses (Q4 2008)
- Support other global IDs (Q4 2008 - 2009)
- Site accessibility improvements (on-going, 2008)
- Simplify DVN installation (Q3 2008)
- Build a developer community (starting in Q3 2008, on-going)
Downloads
You do not need to download our Dataverse Network software to have your own dataverse (see Get Started: DVs for Scholars for details on creating your dataverse).
If you represent a university, archive, or other institution, you might be interested in installing a Dataverse Network at your own facility. Installing your own Network gives you full control of data storage and backups. It requires you to maintain a production server and a file system, and to upgrade the application as needed. In return, you can control access to all data archived on your network, as well as how data is archived and maintained there.
To download the Dataverse Network software, go to the Dataverse Network project at SourceForge.net and locate the current version's packages. When distribution packages are available, each package will include a README text file with more information about that version's contents.
For a description of requirements to support installation of the Dataverse Network software and step-by-step instructions about how to install the code, see our online Installers Guide.
Developers
We plan to open the DVN software to the development community within the second quarter of 2008. Until that time you can contribute to the Project by sending us your suggestions for review, and we will integrate any valid code into our base. Send your suggestions to dvn_support@help.hmdc.harvard.edu.
When we release the open source DVN software, you can contribute your code here.
For a description of the Dataverse Network software development environment and step-by-step instructions about how to install the development code, see our online Developers Guide.
Licenses
Legal Contract
Dataverse Network software is licensed under the Affero General Public License (a version of GPLv3). This license guarantees you the freedom to share, modify, and redistribute the program, and ensures that it remains free software for all users.
The license also guarantees that future versions of the software will remain free and owned by you and the community. Anyone who extends the software and distributes it, or uses it to provide a network-accessible service that you use, must make the source code for those extensions available to you (and give you a royalty-free license to use patents they have incorporated in their extensions, if any). Anyone sharing, modifying, or running the software also must make appropriate attribution to the Dataverse Network Project and its development team.
Social Contract
In return for the effort so many people put into the Dataverse Network Project, we would appreciate it if you would contribute back to this collective effort in one of these ways:
- If you like the Dataverse Network, please use it by citing data in publications according to our citation standards; sharing your research data through the Dataverse Network; establishing a dataverse on your web page (or installing your own Dataverse Network and allowing other Dataverse Networks to harvest metadata and, when feasible, data from your installation); or linking to our Project homepage. We would also appreciate it if you would cite the Dataverse Network Project and our publications in your reports and publications, and tell others about the Project. The more people who are involved in these ways, the more we are all able to improve the software for everyone.
- If you think of a useful new feature that could improve the Dataverse Network, please suggest improvements, send us code to implement them, or sponsor the development of specific features with grants or gifts.
- If you find a problem, please report the issue so that we can fix it (or even better, send us a patch).
- Commercial profit-making ventures are also welcome to use Dataverse Network software in any way consistent with our licensing requirements. In addition, our Project team can help provide training and technical assistance in support of your commercial and development efforts, when feasible and when we judge that such help improves the general applicability of, or promotes, the overall Project. Support for the Project via grants of funds, coding assistance, or temporary personnel detailed to our team makes it easier for us to assist commercial efforts.