Additional Information

These supplements support version 1.3 of the Dataverse Network software.

Supplemental information available about dataverses and Dataverse Networks includes the following:

Glossary

collection
A collection is a way to group or categorize a set of studies. A dataverse can have a tree of collections and sub-collections. A study can belong to multiple collections. Collections can be defined as a query or as an association of specific studies. When a collection is defined as a query, any new study added that satisfies that query (for example, author: Smith) is added automatically to the collection. The root collection of a dataverse by default contains all studies that are owned by that dataverse. It is defined as an association. A dataverse also can link to an entire collection tree from another dataverse.
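The difference between a query-based collection and an association-based collection can be sketched in a few lines of Python. This is only an illustrative model, not the Dataverse Network implementation: the `Study` and `Collection` classes, their fields, and the `members` method are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass(frozen=True)
class Study:
    # Hypothetical, minimal stand-in for a study record.
    study_id: str
    title: str
    author: str


@dataclass
class Collection:
    # Hypothetical model: a collection is defined EITHER by a query
    # (a predicate over studies) OR by an explicit set of study IDs.
    name: str
    query: Optional[Callable[[Study], bool]] = None
    study_ids: Set[str] = field(default_factory=set)

    def members(self, all_studies: List[Study]) -> List[Study]:
        if self.query is not None:
            # Query-based: membership is re-evaluated each time, so
            # newly added studies that match appear automatically.
            return [s for s in all_studies if self.query(s)]
        # Association-based: only explicitly listed studies belong.
        return [s for s in all_studies if s.study_id in self.study_ids]


studies = [
    Study("s1", "Voting Survey", "Smith"),
    Study("s2", "Census Extract", "Jones"),
]
by_smith = Collection("Smith studies", query=lambda s: s.author == "Smith")
hand_picked = Collection("Hand-picked", study_ids={"s2"})

# A new study by Smith is added after both collections were defined.
studies.append(Study("s3", "Panel Study", "Smith"))

smith_ids = [s.study_id for s in by_smith.members(studies)]
picked_ids = [s.study_id for s in hand_picked.members(studies)]
```

In this sketch the query-based collection picks up the new study without any change to the collection itself, while the association-based collection stays fixed until a study is explicitly added to it.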
customization
You can customize the following components of your dataverse: name and alias, banner and footer (to have the style of your website), homepage layout, and Contact Us e-mail address. You also can set additional fields to be displayed in Search results.
dataverse
A Dataverse Network can contain multiple dataverses. Each dataverse is a virtual archive or organizer of research data. It can contain data sets (studies) uploaded specifically to that dataverse, or data sets that belong to other dataverses. The data sets can be organized by collections and sub-collections. In addition to uploading your own studies and setting up your own collections, you can customize a dataverse in the following ways:

  • Modify the banner and footer.
  • Choose to display announcements or descriptions on the homepage.
  • Choose to display a subset of the most recent studies uploaded to the dataverse.
  • Set up a description of the dataverse in the About page.
  • Set up a Contact Us e-mail so users can send messages to the dataverse administrator.
restricted versus public
When a dataverse is first created, it is set as restricted. A restricted dataverse cannot be accessed by any users unless they are admins, curators, or contributors of that dataverse, or they are granted special permission to access it. When a user browses or searches a Dataverse Network, all studies and collections in a restricted dataverse are omitted from any view or search results.
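The access rule described above can be expressed as a small predicate. This is a minimal sketch under stated assumptions: the function name, the role strings, and the `has_special_permission` flag are hypothetical, and the sketch assumes that a dataverse that is not restricted is visible to everyone, which the definition above implies but does not state.

```python
from typing import Optional

# Roles that may access a restricted dataverse (role strings are hypothetical).
PRIVILEGED_ROLES = {"admin", "curator", "contributor"}


def can_access(restricted: bool,
               role: Optional[str] = None,
               has_special_permission: bool = False) -> bool:
    """Return True if a user may view the dataverse and its contents."""
    if not restricted:
        # Assumption: a public (non-restricted) dataverse is open to all users.
        return True
    # Restricted: only privileged roles or specially permitted users get in.
    return role in PRIVILEGED_ROLES or has_special_permission


anonymous_on_restricted = can_access(restricted=True)
curator_on_restricted = can_access(restricted=True, role="curator")
guest_with_permission = can_access(restricted=True, has_special_permission=True)
anonymous_on_public = can_access(restricted=False)
```

A search or browse view would apply the same check to each dataverse and leave out the studies and collections of any dataverse for which it returns False.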
study
A study is a logical grouping of one or more data sets. A study contains cataloging information. Only Title and ID information is required in the catalog; however, there are nearly 100 cataloging fields available to specify a study, including details for authors, producers and distributors, the scope of the study, the methodology used, and more. A study typically includes a set of electronic files. Some files might be documentation related to the study and other files might be data.
study fields
Study fields is another name for the citation fields that appear on a study's Cataloging Information page.
study files
The Files page of a study lists all electronic files associated with that study. These files are provided by the author or curator. The study might contain documentation files and data files. In the Dataverse Network, data files (sometimes called subsettable files) are files that you can subset and analyze online by using the Dataverse Network tools. You can differentiate a data file from other files because an analysis icon and the number of variables and categories are displayed next to the file. Other files might also contain data, but the Dataverse Network application does not recognize them as data (subsettable) files.
subsettable
The Dataverse Network currently treats STATA (.dta) and SPSS (.sav or .por) formatted data files as subsettable. When a file is subsettable, you can analyze it online or download a subset (selection) of the variables in the file. You then can recode the variables and apply descriptive statistics, or use any of the models provided by the Zelig statistical package. See Enter Catalog Information and Upload Study Files for more information.
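As a rough illustration of the file formats listed above, the following Python sketch classifies a file name by its extension. It is an assumption-laden simplification: the function name `is_subsettable` is hypothetical, the comparison is made case-insensitive for convenience, and the actual Dataverse Network software may identify formats by inspecting file contents rather than names.

```python
from pathlib import Path

# Extensions named in the definition above: STATA (.dta), SPSS (.sav, .por).
SUBSETTABLE_EXTENSIONS = {".dta", ".sav", ".por"}


def is_subsettable(filename: str) -> bool:
    """Guess from the extension alone whether a file would be subsettable."""
    # Assumption: extension matching ignores case.
    return Path(filename).suffix.lower() in SUBSETTABLE_EXTENSIONS


stata_file = is_subsettable("survey.dta")
spss_upper = is_subsettable("SURVEY.SAV")
codebook = is_subsettable("codebook.pdf")
```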

List of Metadata

The Dataverse Network metadata is compliant with the DDI schema version 2. The Cataloging Information fields associated with each study contain most of the fields in the study description section of the DDI. As a result, DVN metadata can be mapped easily to a DDI and exported into XML format for preservation and interoperability.

DVN data also is compliant with Simple Dublin Core (DC) requirements. For imports only, DVN data is compliant with the Content Standard for Digital Geospatial Metadata (CSDGM), Vers. 2 (FGDC-STD-001-1998) (FGDC).

Attached is a PDF file that defines and maps all DVN Cataloging Information fields. Information provided in the file includes the following:

Zelig Interface Schema

Zelig is statistical software for everyone: researchers, instructors, and students. It is a front-end and back-end for R (Zelig is written in R). The Zelig software:

Zelig is distributed under the GNU General Public License, Version 2. After installation, the source code is located in your R library directory. You can download a tarball of the latest Zelig source code from http://gking.harvard.edu/src/contrib/.

The Dataverse Network software uses Zelig to perform advanced statistical analysis functions. The current interface schema used by the DVN for Zelig processes is in the following location:
http://thedata.org/files/thedata/schema/ZeligInterfaceDefinition_1_1.xsd

Criteria for Model Availability

Three factors determine which Zelig models are available for analysis in the DVN: