Metacat is a flexible, open source metadata catalog and data repository that targets scientific data, particularly from ecology and environmental science. Metacat accepts XML as a common syntax for representing the large number of metadata content standards that are relevant to ecology and other sciences. Thus, Metacat is a generic XML database that allows storage, query, and retrieval of arbitrary XML documents without prior knowledge of the XML schema.

Metacat is implemented as a Java servlet application that uses a relational database management system (RDBMS) to store XML and associated meta-level information. The recommended installation uses Apache Tomcat as the servlet container and PostgreSQL as the underlying RDBMS, although other configurations are possible. Metacat provides a rich client Application Programming Interface (API) with support for a variety of languages, including Java, Python, and Perl.
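To make the idea of schema-independent XML storage in a relational database more concrete, the following is a minimal sketch in Python. It is not Metacat's actual schema or code: the `nodes` table, the function names (`create_store`, `store_document`, `find_by_path`), and the use of SQLite in place of PostgreSQL are all assumptions chosen for illustration. The sketch shreds any well-formed XML document into generic rows (one per element or attribute), so documents with different structures can share one table and be queried by path without the schema being known in advance.

```python
import sqlite3
import xml.etree.ElementTree as ET


def create_store():
    """Create an in-memory SQLite database with one generic node table.

    Hypothetical layout for illustration only; not Metacat's real schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE nodes (
            node_id   INTEGER PRIMARY KEY AUTOINCREMENT,
            docid     TEXT NOT NULL,
            parent_id INTEGER,
            path      TEXT NOT NULL,
            name      TEXT NOT NULL,
            value     TEXT
        )
        """
    )
    return conn


def store_document(conn, docid, xml_text):
    """Shred an XML document into rows, one per element or attribute."""
    root = ET.fromstring(xml_text)
    sql = (
        "INSERT INTO nodes (docid, parent_id, path, name, value) "
        "VALUES (?, ?, ?, ?, ?)"
    )

    def insert(element, parent_id, parent_path):
        path = f"{parent_path}/{element.tag}"
        # Store trimmed element text; whitespace-only text becomes NULL.
        text = (element.text or "").strip() or None
        cur = conn.execute(sql, (docid, parent_id, path, element.tag, text))
        node_id = cur.lastrowid
        # Attributes are stored as child rows with an "@name" path step.
        for attr_name, attr_value in element.attrib.items():
            attr_path = f"{path}/@{attr_name}"
            conn.execute(sql, (docid, node_id, attr_path, attr_name, attr_value))
        for child in element:
            insert(child, node_id, path)

    insert(root, None, "")
    conn.commit()


def find_by_path(conn, path, needle):
    """Return docids with a node at `path` whose value contains `needle`."""
    rows = conn.execute(
        "SELECT DISTINCT docid FROM nodes "
        "WHERE path = ? AND value LIKE ? ORDER BY docid",
        (path, f"%{needle}%"),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = create_store()
    # Two documents with unrelated structures go into the same table.
    store_document(
        conn,
        "demo.1",
        "<dataset><title>Kelp forest survey</title>"
        "<creator id='a1'>Lee</creator></dataset>",
    )
    store_document(conn, "demo.2", "<record><name>Stream temperature</name></record>")
    print(find_by_path(conn, "/dataset/title", "Kelp"))
```

Because every document is reduced to the same generic rows, adding a new metadata standard in this sketch requires no table changes. The trade-off is that queries work on paths and string values rather than typed columns.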

Metacat is being used extensively throughout the world to manage environmental data. It is a key infrastructure component for the NCEAS data catalog, the Knowledge Network for Biocomplexity (KNB) data catalog, and for the DataONE system, among others.

Technical Expertise Required: 
Basic programming skills
Additional Information: 
  • Berkley, C., M. Jones, J. Bojilova, and D. Higgins, 2001. Metacat: A schema-independent XML database system. 13th Intl. Conference on Scientific and Statistical Database Management: 171.
  • Jones, M.B., C. Berkley, J. Bojilova, M. Schildhauer, 2001. Managing scientific metadata, IEEE Internet Computing 5(5): 59-68.
  • Metacat Administrator's Guide (http://knb.ecoinformatics.org/software/dist/MetacatAdministratorGuide.pdf)