2019 Interns

Audrey McCombs


Audrey McCombs is a co-major PhD student in Ecology and Statistics at Iowa State University. She holds a Master's degree in philosophy and an MFA in creative writing, and worked for many years in natural resources management before heading back to grad school. Her research interests involve applying complexity science to ecological systems, specifically the study of graph-theoretic characteristics of empirical ecological networks. Originally from the San Francisco Bay Area, she misses the ocean but enjoys the summer thunderstorms and fireflies of the Midwest.

Project Description:

With over 800,000 datasets accessible through programmatic interfaces, DataONE provides a rich corpus of machine-readable metadata that is also expressed as a linked open data (LOD) graph. The goal of this project is to explore the LOD graph of DataONE, provide a network analysis of the graph, and examine how the network differs from the content available through the traditional DataONE Application Programming Interface (API). For example: How interconnected are datasets and researchers? How many individual authors contributed to how many datasets? Can fields such as keywords be normalized to a small set of controlled vocabularies? How do network analysis measures differ by metadata standard, year of publication, or other facets?

Primary Mentor: Bryce Mecum
Secondary Mentor: Dave Vieglais

Yilin Xia


Yilin Xia is a second-year master's student in the School of Information Sciences at the University of Illinois at Urbana-Champaign (UIUC). He received his bachelor's degree in Information Management and Information System (Financial Intelligence) in 2018 from Southwestern University of Finance and Economics. His research interests lie primarily in the interdisciplinary fields of machine learning, data visualization, and data provenance. In his free time, Yilin enjoys cooking, traveling, and reading.

Project Description:

Data provenance is an important form of metadata that captures the lineage and processing history of data products resulting from data-driven analyses and workflows. Provenance information can increase the transparency, reproducibility, and reuse of data products. Recent years have seen considerable research and development efforts devoted to standards, tools, and applications that capture, store, query, and visualize provenance.
The goal of this project is to study the contemporary use of provenance in different stages of the data life cycle in order to answer questions such as: Who is creating or using provenance, and for what purposes? Is provenance capture and use already an ingrained best practice in some domains, or is it viewed as yet another “metadata chore” that scientists reluctantly deal with?
This project consists of two parts: (i) an “environmental scan” / survey of the research literature on data provenance with a focus on provenance tools and applications (possibly including some limited survey work), and (ii) a hands-on part whose goal is to use commonly mentioned tools in their prototypical settings. A key outcome is a report with findings and recommendations based on the literature survey and the intern’s own hands-on experiences.

Primary Mentor: Bertram Ludäscher
Secondary Mentors: Michael Gryk, Robert Sandusky

Paige Alfonzo


Paige Alfonzo is a Ph.D. candidate at the University of Denver studying Research Methods and Statistics. She received her M.S. in Library Information Science in 2010 from the University of North Texas. She is a researcher whose expertise in social media methods and analysis draws from a variety of fields including statistics, higher education, information science, media studies, communication studies, and literary criticism. Having been trained in a variety of areas related to technology, society, and culture, she currently considers herself an Internet studies scholar grounded within the humanities and social sciences.

Project Description:

As part of our transition to a sustainable future (https://www.dataone.org/future), DataONE seeks to develop a comprehensive understanding of the way in which the organization is discussed and referenced in the broader community. This information will support strategic communication and outreach planning and provide insights into future collaborations and partnerships to be pursued.

Scholarly communications are one method of assessing recognition, and DataONE maintains a database of articles published by DataONE as well as articles citing DataONE. However, many references are informal citations that exist on web pages, in blogs, and in other communications. Additionally, even within the published literature, there is variation in how and where DataONE is cited, resulting in some articles not being accurately indexed.

This project will undertake several activities. First, the current database of publications and citing articles will be reviewed and transferred to a public bibliographic manager such as Zotero, enabling community contributions to the library moving forward. In doing so, a thorough search of the literature will be conducted to ensure the library is up to date. Second, building on a previous DataONE project that explored ARL library citation of DataONE, this internship will investigate the incidence of DataONE citations and links on pages across the web. These will be categorized by factors such as type of page, type of mention, and where the link directs. Research of this type will augment current usage data to help us understand which products and services are valued by the community and to explore variation across stakeholder types.

Primary Mentor: Amber Budden
Secondary Mentor: Amanda Whitmire

Rhea Peddinti


Rhea Peddinti is in her third year pursuing a Bachelor of Science in Information and Decision Sciences from the University of Illinois at Chicago. Her research interests include information management, data analysis, and strategic marketing. Rhea enjoys graphic design, web design, and photography which she hopes to be able to integrate into her work as much as possible. In her free time, Rhea loves to bake, try out new restaurants, and spend time with her dog.

Project Description:

The DataONE Users Group (DUG) is the worldwide community of Earth observation data authors, users, and diverse stakeholders that make up the DataONE partnership communities. The primary function of the DUG has been to represent the needs and interests of these communities in the activities of DataONE. Members of the DUG include representatives of the member repositories, coordinating nodes, researchers, and other relevant groups (e.g. research networks, professional societies, libraries, academic institutions, data centers, etc.).

As DataONE moves towards a sustainable future (https://www.dataone.org/future), the user community will become increasingly important in contributing to a community-driven organizational structure and in advocating for DataONE products and services. To support this distributed advocacy, we seek to develop an outreach kit for the user group members. The outreach kit content will be grounded in the data compiled from interviews, surveys, and other sources during the previous year.

The intern will design a DataONE outreach kit in collaboration with the primary mentors, who are the current chairs of the DataONE Users Group. The intern will collaborate on the development of communication materials for each of the topics and products, in formats identified as valuable to the user community (e.g. PDF downloads, slide presentations, image directories, etc.). These materials will build on previous materials developed by the DataONE team and be consistent with current DataONE branding.

Primary Mentors: Robert Sandusky, Karl Benedict
Secondary Mentors: Amber Budden, Megan Mach

Saraneh Fitzgerald


Saraneh Fitzgerald is a recent graduate of the Clark University Geography program in Worcester, MA. Her research interests include melt ponding on Arctic sea ice, citizen science, tropical islands and atolls, and the effects of climate change on land-ocean boundaries. Her passion for open-source practices and community-driven data first developed during an undergraduate internship funded by the National Science Foundation in a project called “Citizen Science GIS”. She hopes to make more contributions to the open-source community throughout her career. Saraneh is a candidate for her Master’s degree in G.I.S., expected in Summer of 2019. Other activities and interests include running, snorkeling coral reefs, and nudibranchs.

Project Description:

About DataONE
DataONE supports synthesis research through enhanced search and discovery of Earth and environmental science data from across a network of integrated data repositories. Efficiencies for researchers include reduced time spent on data discovery, refined search functions that return more relevant results, and the ability to download data from multiple repositories. Researchers working in synthesis science or conducting systematic reviews or meta-analyses will benefit from using DataONE as a data search engine.

The problem
Over the past ten years, DataONE has focused both on making Earth and environmental data accessible and on highlighting the importance of strong data management skills for researchers. We have published data management education modules and led workshops to develop best practices. As a next step, we have moved many of these materials to a community-based platform to increase their use and usability by the research community. Education materials are now hosted on GitHub in our Data Management Skillbuilding Hub, and we want users to download, edit, and help keep them up to date.

The project
This project will develop technical tutorials, aimed at a non-technical audience, on submitting material to the Skillbuilding Hub infrastructure (in GitHub). These tutorials will be written in Markdown as pages within the Hub and will support the contribution of several different types of content. To ensure the longevity of these materials, this internship will also support development of a method for assigning a DOI to each specific education resource, in collaboration with the DataONE team. Usability of the tutorials and DOI creation will be assessed through a survey administered to the DataONE Users Group and other identified parties. Feedback will be incorporated into the finalized materials, to be launched on the Skillbuilding Hub.

Primary Mentor: Megan Mach
Secondary Mentor: Dave Vieglais