Software Tools


PSPP is a free program for statistical analysis of sampled data, intended as a replacement for the proprietary program SPSS. PSPP can perform descriptive statistics, t-tests, linear regression, and non-parametric tests. Its back end is designed to perform analyses as fast as possible, regardless of the size of the input data. You can use PSPP through its graphical interface or through the more traditional syntax commands. PSPP reads SPSS files, is compatible with OpenOffice, and can support 1 billion data observations.

Additional Information:
Tags: graphics,statistics Contributor: Cost: Free

Protege is an open source ontology editor. An ontology is similar to a taxonomy in that it presents a controlled vocabulary for a given area of knowledge. However, the relationships between the different objects in an ontology can be far more complex and richly described.

Protege supports creating ontologies in both the Frames and Web Ontology Language (OWL) frameworks. It allows users to:

  • Import, edit and save existing ontologies written in OWL or RDF (Resource Description Framework).
  • Create new ontologies.
  • Save ontologies in several formats, including XML expressions of RDF and OWL.
  • Visualize ontologies in graphical form, showing the functional relationships between classes.
  • Populate ontologies with concrete instances of classes.
  • Execute reasoners that can perform inferences on an ontology (i.e. classify instances based on their properties).
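
To make the last two items concrete, here is a minimal Python sketch of populating a class hierarchy with an instance and then inferring its implied classes. It does not use Protege or OWL; the `Ontology` class, its methods, and the example names (`Organism`, `Plant`, `Tree`, `oak_01`) are all hypothetical, and the inference shown (following superclass links) is much simpler than what a real reasoner does.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology: a class hierarchy plus instances with asserted classes."""

    # Maps each class name to its direct superclasses.
    superclasses: dict[str, set[str]] = field(default_factory=dict)
    # Maps each instance name to the classes asserted for it.
    asserted_types: dict[str, set[str]] = field(default_factory=dict)

    def add_class(self, name: str, *parents: str) -> None:
        """Register a class and, optionally, its direct superclasses."""
        self.superclasses.setdefault(name, set()).update(parents)
        for parent in parents:
            self.superclasses.setdefault(parent, set())

    def add_instance(self, name: str, cls: str) -> None:
        """Assert that an instance belongs to a class."""
        self.asserted_types.setdefault(name, set()).add(cls)

    def ancestors(self, cls: str) -> set[str]:
        """Return every class reachable by following superclass links."""
        seen: set[str] = set()
        stack = list(self.superclasses.get(cls, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.superclasses.get(current, ()))
        return seen

    def inferred_types(self, instance: str) -> set[str]:
        """Asserted classes plus every class implied by the hierarchy."""
        result: set[str] = set()
        for cls in self.asserted_types.get(instance, ()):
            result.add(cls)
            result |= self.ancestors(cls)
        return result


# Hypothetical example data, not taken from any published ontology.
onto = Ontology()
onto.add_class("Organism")
onto.add_class("Plant", "Organism")
onto.add_class("Tree", "Plant")
onto.add_instance("oak_01", "Tree")
types = onto.inferred_types("oak_01")
```

Only `Tree` is asserted for `oak_01`; `Plant` and `Organism` are derived from the hierarchy, which is the kind of conclusion a reasoner draws automatically.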

Intended audience: Protege is designed for those in the field of ontology and knowledge modeling, since some degree of knowledge about the underlying axioms is nearly always required. Some plugins are available that shield a user from these axioms to some degree.

For working scientists, the most useful plugins and views will be those that present complex knowledge models graphically. Protege might also be helpful for those wishing to use the RDF/XML expression of Dublin Core to annotate their data with metadata. Numerous plugins have been written by other projects, and Protege has many thousands of registered users and a wiki.

Protege can be used to edit simpler vocabulary systems such as Simple Knowledge Organization System (SKOS), but generally, its power is overkill for this use.

Additional Information:

An ontology is a formal representation of concepts in a discipline, and the relationships between them. It is used to reason about the entities in that discipline. Ontologies are the structural frameworks for organizing information and, in the realm of data publication, are the basis of the "semantic web".
An introduction to ontology can be found at: http://protege.stanford.edu/publications/ontology_development/ontology10...

Tags: metadata,models,programming,XML Contributor: MO, RO Cost: Free

PRONOM is an online registry of technical information about file formats, maintained by The National Archives (UK). The PRONOM database contains information about the properties of over 600 file formats, and is used by repository managers to understand, document and manage file formats stored in repositories. Information in the database includes extensions associated with file types, software required to render files, version histories of file types, signature types and compression information.

In addition to searching the registry via the web interface, PRONOM provides two important services related to file type identification and metadata extraction. The DROID tool provides both command-line and graphical interfaces to the PRONOM registry, allowing for easy documentation of file types. The PRONOM Unique Identifier (PUID) scheme allows unambiguous reference to data in the PRONOM database.
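
As a rough illustration of signature-based file type identification, the following Python sketch matches the leading bytes of a file against a tiny lookup table. It does not call DROID or PRONOM; the `SIGNATURES` table and `identify` function are hypothetical, and the identifiers such as `example/png` are placeholders rather than real PUIDs.

```python
# Hypothetical mini-registry: (leading "magic" bytes, placeholder ID, format name).
# The placeholder IDs are NOT real PRONOM identifiers.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "example/png", "Portable Network Graphics"),
    (b"%PDF-", "example/pdf", "Portable Document Format"),
    (b"GIF89a", "example/gif", "Graphics Interchange Format 89a"),
]


def identify(data: bytes):
    """Return (placeholder ID, format name) for the first matching signature, or None."""
    for magic, ident, name in SIGNATURES:
        if data.startswith(magic):
            return ident, name
    return None


# Identify an in-memory sample; a real tool would read the first bytes of a file.
result = identify(b"%PDF-1.4\n%rest of file")
```

A real registry holds far richer signatures (offsets, byte patterns, version-specific rules), which is why tools query the registry instead of hard-coding a table like this one.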

Additional Information:
Tags: metadata,web services Contributor: PW, RO Cost: Free
Project Trident

Project Trident is a scientific workflow workbench that allows users to author workflows visually by using a catalog of existing activities and complete workflows. The workflow workbench provides a tiered library that hides the complexity of different workflow activities and services for ease of use. Trident supports analysis and visualization workflows; composing, running, and cataloging experiments as workflows; and capturing provenance information. Workflows can be scheduled over high-performance clusters or cloud computing resources.

Additional Information:
  • Yogesh Simmhan, Roger Barga, Catharine van Ingen, Ed Lazowska, Alex Szalay, "Building the Trident Scientific Workflow Workbench for Data Management in the Cloud," 2009 Third International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP), pp. 41-50, 2009
  • Roger Barga, Jared Jackson, Nelson Araujo, Dean Guo, Nitin Gautam, Yogesh Simmhan, "The Trident Scientific Workflow Workbench," 2008 Fourth IEEE International Conference on eScience, pp. 317-318, 2008
Tags: analyze,computing,GUI Contributor: Cost: Free

ProCite is a tool for managing references and creating citations in the user's preferred citation style.

ProCite features include:

  • Collecting and managing your references
  • Creating bibliographies
  • Formatting citations and bibliographies for many different journal styles
  • Sharing references with other users
Additional Information:
Tags: citation Contributor: RL, GW Cost: Cost-basis

PowerDesigner is a tool for creating business-process models, and conceptual, logical, and physical data models for database design, including relational and dimensional models. PowerDesigner can coordinate the business process model with the database design, ensuring that the process steps that create data have data representations in the logical model. PowerDesigner can create the actual database from the physical model, and create different physical implementations from a single logical model. PowerDesigner can also reverse-engineer existing databases into a model diagram. PowerDesigner works with many database management systems (DBMS). Major outputs from the tool include entity-relationship (ER) diagrams, impact analysis reports on design changes, and standard or custom reports on all objects in the design (tables, fields, relationships).

Additional Information:
Tags: database,models Contributor: ST, TB Cost: Cost-basis

The combination of PostgreSQL and PostGIS provides a robust database platform for the integrated management of geospatial data and the attributes associated with those data. The platform is supported by a large number of client applications, including GIS and mapping applications. PostgreSQL is an open source object-relational database server that implements the Structured Query Language (SQL) for database design, management, and use. PostGIS is an implementation of the Open Geospatial Consortium's "Simple Features Specification for SQL" standard, which defines data types and functions that may be implemented in a SQL database for the storage and management of geospatial data.
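
To give a feel for how geospatial types and functions appear in SQL, the following Python sketch builds a polygon in well-known text (WKT) form and a parameterized query that uses the PostGIS functions `ST_Contains` and `ST_GeomFromText`. It only constructs strings and does not connect to a database. The table `sample_sites`, its columns `site_name` and `geom`, the helper names, and the `%s` placeholder style (as used by some Python database drivers) are all assumptions made for illustration.

```python
def polygon_wkt(ring):
    """Format (lon, lat) pairs as a WKT polygon, closing the ring if needed."""
    points = list(ring)
    if points[0] != points[-1]:
        points.append(points[0])
    coords = ", ".join(f"{lon} {lat}" for lon, lat in points)
    return f"POLYGON(({coords}))"


def sites_within_query(ring, srid=4326):
    """Return (sql, params) selecting rows whose geometry lies inside the polygon.

    The table and column names are hypothetical; SRID 4326 is WGS 84 lon/lat.
    """
    sql = (
        "SELECT site_name FROM sample_sites "
        "WHERE ST_Contains(ST_GeomFromText(%s, %s), geom)"
    )
    return sql, (polygon_wkt(ring), srid)


# A unit square given as four corners; the helper closes the ring.
sql, params = sites_within_query([(0, 0), (1, 0), (1, 1), (0, 1)])
```

Passing the geometry as a parameter rather than formatting it into the SQL string leaves quoting to the database driver.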

Additional Information:
  • PostGIS: http://postgis.refractions.net/
  • OGC Simple Features Specification for SQL: http://www.opengeospatial.org/standards/sfs
  • Jens Basanow, Pascal Neis, Steffen Neubauer, Arne Schilling and Alexander Zipf (2008). Towards 3D Spatial Data Infrastructures (3D-SDI) based on open standards — experiences, results and future issues. Advances in 3D Geoinformation Systems. pp. 65-86
  • Sisi Zlatanova (2008). 3D Geometries in Spatial DBMS. Advances in 3D Geoinformation Systems. pp. 1-14
  • Open Source Community Page for PostGIS: http://postgis.refractions.net/
Tags: Contributor: Cost: Free
Polar Information Commons (PIC) Rights Badging Tool

The Polar Information Commons (PIC) Rights Badging Tool allows you to use the Creative Commons tools to create a graphic badge. This badge asserts that digital content is available in the Polar Information Commons (PIC) with minimal restrictions and in adherence with community guidelines or norms of behavior for ethical data sharing. Once created, a badge may be placed on a website describing your data set or within the use constraints field of its metadata.

Additional Information:

The Polar Information Commons (PIC) serves as an open, virtual repository for vital scientific data and information about the polar regions, and provides a shared, community-based cyber-infrastructure.

Tags: access,metadata Contributor: RD, GW Cost: Free

PLATO is a planning and decision support tool for preservation planning. It implements a preservation planning process and integrates services for content characterisation, preservation action, and automatic object comparison in a service-oriented architecture.

Additional Information:
Tags: data management plan,preserve Contributor: TH, KC Cost: Free
Platform LSF

Platform LSF is a workload manager designed for use in large, high-performance computing environments. This commercial tool can be used to schedule complex scientific workflows and manage very large (up to petaFLOP scale) compute resources. It provides application support across distributed and heterogeneous platforms.

Additional Information:
Tags: computing,workflow Contributor: SA, CS Cost: Cost-basis