
Software Tools


Taverna is an open source family of tools for designing and executing workflows, created by the myGrid project. Written in Java, the family consists of the Taverna Engine (the workhorse) and two components that sit on top of it: the Taverna Workbench (desktop client) and Taverna Server (remote workflow execution server).

Taverna allows for the automation of experimental methods through the use of a number of different services (such as Web services) from a very diverse set of domains – from biology, chemistry and medicine to music, meteorology and social sciences. Effectively, Taverna allows a scientist with limited computing background and limited technical resources and support to construct highly complex analyses over public and private data and computational resources.

Taverna Workbench 2.1.2 supports: copy/paste, shortcuts, undo/redo, and drag and drop; an animated workflow diagram; memory of added/removed services; secure Web services; secure access to resources on the Web; up-to-date R support; intermediate values during workflow runs; myExperiment integration; and Excel and CSV spreadsheets.

Additional Information:

D. Hull, K. Wolstencroft, R. Stevens, C. Goble, M. Pocock, P. Li, and T. Oinn, "Taverna: a tool for building and running workflows of services," Nucleic Acids Research, vol. 34, iss. Web Server issue, pp. 729-732, 2006.

T. Oinn, M. Greenwood, M. Addis, N. Alpdemir, J. Ferris, K. Glover, C. Goble, A. Goderis, D. Hull, D. Marvin, P. Li, P. Lord, M. Pocock, M. Senger, R. Stevens, A. Wipat, and C. Wroe, "Taverna: lessons in creating a workflow environment for the life sciences," Concurrency and Computation: Practice and Experience, vol. 18, iss. 10, pp. 1067-1100, 2006.

J. Sroka, J. Hidders, P. Missier, and C. Goble, "A formal semantics for the Taverna 2 workflow model," Journal of Computer and System Sciences, vol. 76, iss. 6, pp. 490-508, 2009.

J. Zhao, C. Goble, R. Stevens, and D. Turi, "Mining Taverna's semantic web of provenance," Concurrency and Computation: Practice and Experience, vol. 20, iss. 5, pp. 463-472, 2008.

T. Oinn, P. Li, D. Kell, C. Goble, A. Goderis, M. Greenwood, D. Hull, R. Stevens, D. Turi, and J. Zhao, "Taverna/myGrid: Aligning a Workflow System with the Life Sciences Community," in I. Taylor, D. Gannon, and M. Shields, Eds., Springer-Verlag London Ltd, 2006.

Tags: GUI,models,provenance Contributor: Cost: Free

Tableau supports the analysis of tabular data from spreadsheets and relational databases. The tool provides a visual interface that allows users to import data and interactively explore the data through visualizations. These visualizations are created through a graphical user interface that allows users to build queries by dragging and dropping attribute names from tables and spreadsheets.

Tableau also offers Tableau Public, a free version of the software whose visualizations can be published to the web.

Additional Information:
Tags: analyze,database,visualization Contributor: JF, DG Cost: Cost-basis
SVN - Subversion

SVN (an abbreviation for "Subversion") is an open source version control package of the Apache Software Foundation. Version control is a process whereby: 1) versions of a document are saved for later retrieval, even if the document is later deleted; 2) versions of a document may be compared for differences; 3) multiple authors may edit and build the document version chain, with software support for avoiding, managing, and resolving collisions; 4) catastrophic failure recovery mechanisms are in place to maintain document and version integrity across a wide class of possible threats.
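
The second point, comparing two versions of a document, can be illustrated with a minimal sketch. This uses Python's standard difflib module, not SVN itself; the function name compare_versions, the labels r1 and r2, and the sample data are hypothetical and chosen only for illustration. The output is in the unified diff format, which is also what the svn diff command produces.

```python
import difflib


def compare_versions(old_text, new_text, old_label="r1", new_label="r2"):
    """Return the unified diff between two versions of a text document.

    The labels are illustrative stand-ins for revision identifiers.
    """
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_label, tofile=new_label
        )
    )


# Two hypothetical versions of a small data file.
version_1 = "site,count\nA,10\nB,5\n"
version_2 = "site,count\nA,10\nB,7\n"

diff_lines = compare_versions(version_1, version_2)
print("".join(diff_lines))
```

Lines prefixed with "-" exist only in the older version and lines prefixed with "+" only in the newer one; identical versions yield an empty diff.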

Version control systems such as SVN differ significantly from the version comparison features common in word processors. SVN and similar systems (e.g., CVS, Git) are focused on programming-language source code and related text-based documents; they are not optimized for binary documents.

Version control systems' emphasis on audit trails and catastrophe recovery makes them common platforms for backups and managing data integrity. SVN is often implemented in a host/client architecture, whereby the document repository is physically distinct from the development environment. In this model, users install a separate SVN client or use a web client to interact with the system. Version control is a best practice for software development.

Additional Information:
Tags: programming,version control Contributor: DG, MO Cost: Free
STELLA (Systems Thinking for Education and Research)

STELLA (Systems Thinking for Education and Research; from isee Systems) is a modeling software package whose diagrams, charts, and animation help visual learners discover relationships between variables and simplify model building. STELLA handles time series, sensitivity, and simulation models well and has a drag-and-drop modeling interface. Users can download a free trial that includes a significant set of features.

Additional Information:
Tags: models,simulation,visualization Contributor: Cost: Cost-basis

STATISTICA is a proprietary analytical software package developed by StatSoft that includes data visualization, data analysis, data management, and data mining tools. It is primarily a graphical user interface (GUI) application.

Additional Information:
Tags: Contributor: TB, EL Cost: Cost-basis

Stata 11 is software for data management, statistics, and graphics. Stata offers point-and-click interaction and built-in help to guide users through tasks. Logs can be created and stored as repeatable scripts, so that data management and analysis are completely documented. Users can perform statistical analyses ranging from basic statistical summaries and linear regression models to multilevel mixed-effects modeling, generalized linear modeling, resampling and simulation, and many multivariate analyses. A graph editor allows users to produce figures based on the data and statistical models. Stata also includes its own programming language (Mata) for writing custom extensions. At this time, Stata 11 is the latest version.

Stata comes in four different application "packages", which vary in the size of dataset they can handle and in processing capacity. "Small Stata" is available only to educational purchasers, including students, and limits the number of variables and observations permitted in the dataset.

Many local and national user groups for Stata exist; they hold regular meetings and maintain online support communities.

Additional Information:
Tags: graphics,statistics,visualization Contributor: EL, SA Cost: Cost-basis

SQLite is a software library that implements a self-contained SQL database engine. SQLite can be used as a database underlying a website, or as a substitute for a Relational Database Management System (RDBMS).

SQLite supports atomic, consistent, isolated, and durable (ACID) transactions and is easy to set up and administer. A complete SQLite database is stored in a single disk file. SQLite supports large databases (terabytes in size). It offers a relatively simple application programming interface (API) and has no external dependencies. SQLite comes with a stand-alone command-line client that can be used to administer SQLite databases.
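
As a minimal sketch of the transaction behavior and single-file storage described above, the following uses the sqlite3 module from Python's standard library. The table name samples, the function name run_atomic_update, and the sample rows are hypothetical and exist only for illustration. A transaction that fails partway through is rolled back, so the earlier update in the same transaction is undone as well.

```python
import os
import sqlite3
import tempfile


def run_atomic_update(db_path):
    """Attempt a two-step transaction whose second step fails.

    Returns the table contents afterwards, which should be unchanged
    because the whole transaction is rolled back.
    """
    conn = sqlite3.connect(db_path)  # the entire database lives in this one file
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS samples "
            "(site TEXT PRIMARY KEY, n INTEGER NOT NULL)"
        )
        conn.execute("DELETE FROM samples")
        conn.executemany(
            "INSERT INTO samples VALUES (?, ?)", [("A", 10), ("B", 5)]
        )
        conn.commit()

        try:
            # Used as a context manager, the connection commits on success
            # and rolls back if an exception is raised inside the block.
            with conn:
                conn.execute("UPDATE samples SET n = n - 3 WHERE site = 'A'")
                # Duplicate primary key: raises sqlite3.IntegrityError.
                conn.execute("INSERT INTO samples VALUES ('B', 99)")
        except sqlite3.IntegrityError:
            pass  # the UPDATE above has been rolled back with the failed INSERT

        return conn.execute(
            "SELECT site, n FROM samples ORDER BY site"
        ).fetchall()
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.sqlite")
    rows_after = run_atomic_update(demo_path)
    file_created = os.path.isfile(demo_path)

print(rows_after, file_created)
```

The same database file could also be opened with the stand-alone command-line client mentioned above.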

An additional extension for SQLite called SpatiaLite exists for adding support for spatial data to SQLite databases. SpatiaLite is conformant with OpenGIS specifications. See the SpatiaLite tool entry for additional information.

SQLite implements most of the SQL Standard, and its source code is in the public domain. Ongoing development and maintenance of SQLite is sponsored by the SQLite Consortium, which includes Mozilla, Bloomberg, Oracle, Nokia, and Adobe. Because of its small size and low overhead, SQLite is suitable for use in mobile devices such as cellular phones.

Additional Information:

A free plugin called SQLite Manager for viewing and editing SQLite databases is available for Mozilla Firefox at https://addons.mozilla.org/en-US/firefox/addon/sqlite-manager-webext/.
The SpatiaLite extensions for SQLite can be accessed at http://www.gaia-gis.it/spatialite/.

Tags: Contributor: JH, SA Cost: Free
SQL Server

SQL Server is a relational database server produced by Microsoft, positioned as a high-performance database platform that is reliable, scalable, and easy to manage. Its primary query languages are T-SQL and ANSI SQL. Several editions of the server are available, which differ in the services they provide.

Additional Information:
Tags: database,server Contributor: Cost: Cost-basis

IBM SPSS Amos is a tool used for structural equation modeling. It features drag-and-drop drawing tools and produces graphics of final models for presentation.

Amos uses standard methods – including regression, factor analysis, correlation and analysis of variance. It can be used to create models to test hypotheses and confirm relationships amongst variables.

Additional Information:
Tags: analyze,models,statistics Contributor: RL, MG Cost: Cost-basis

SPSS is a desktop statistical software package centered on modeling and statistics. SPSS can access data from many different proprietary and open source data sources and has decent graphing and very good statistical modeling capabilities. One weakness (up to version 17) is the presentation quality of its graphs; other packages do a much better job of data presentation.

Additional Information:
Tags: analyze,models,statistics Contributor: Cost: Cost-basis