
William: Late-career plant taxonomist


Photo credit: http://bit.ly/1OuRJil
(CC BY-NC 2.0).
Picture is of Charles Stirton.
The person represented here is not affiliated with DataONE and use of their image does not reflect endorsement of DataONE services.

Name, age, and education: 

William is a plant taxonomist working in the University Herbarium at the University of Michigan. He is 68 and is looking forward to retiring after a long and productive career.

Life or career goals, fears, hopes, and attitudes: 

William is nearing retirement. He has an office full of paper, photographs, and pressed plant specimens that represent his life’s work. Unless he moves the contents of his office to the limited space in his basement at home after he retires, the university will throw them out. He does not want this to happen, but doesn’t know how to stop it. Some of his data have been published in monographs that are accessible in a few libraries around the world, but much of it is not associated with a publication. William would like to make the contents of his office, his life’s work, available to early-career taxonomists who could (potentially) put it to good use.

A day in the life: 

William no longer does much field work himself, but he has a wealth of data from his career: collections of species occurrences, measurements, and images (20,000–30,000 35 mm slides). Most of the data are in the many field notebooks he has amassed over his lifetime, which include daily notes about where he was and when, indications of pictures taken, collector’s numbers of the specimens he has collected, and descriptions of habitats visited, including comments on soils, local distribution, species abundance, and phenology of species not collected. His locality notes for each collected specimen are recorded on that specimen’s label. The rest of the comments in his notebooks are habitat descriptions and organism occurrence observations that are not tied to collected specimens. Little of the data has been digitized: the species data are in an Excel spreadsheet and some of the images are stored on a portable hard drive (500 GB); he has no system for annotating these other than folder and file names. He collaborates regularly with colleagues in Spain who are interested in the same species.

Reasons for using DataONE to share and to reuse data
Needs and expectations of DataONE tools: 

William would like to have help digitizing his data and a place to put it where it will be used and where he could get credit for it and could see how it is being used. There is a plant taxonomy web site where some of his colleagues have uploaded data, but functionality is limited and it is hard to find. He would prefer to register his data, so people would know about it, but he wants people to ask for permission to use it. This way he could prevent misuse or misinterpretation of his data.

Intellectual and physical skills that can be applied: 

William knows how to use a scanner, but is discouraged by the amount of time it would take to scan the contents of his office. On the other hand, if he only needed to invest a little time beyond the actual act of using the scanner itself in order to deposit his data at a DataONE member node, he would probably start chipping away at his collection.

Technical support available: 

William has little to no technical knowledge and little technical support. He is also unsure how to go about identifying possible sources of support for such activities.

Personal biases about data sharing and reuse (and data management more generally): 

William is generally suspicious of raw data sharing. He is more accustomed to publishing his findings in monographs. He is convinced that there is enormous value locked up in the contents of his office, but he is not sure who is best positioned to realize that value or how to do it.

William has obviously been a prolific collector; he has his own processes for data assurance that have worked over the years. Description has depended on whether there have been other demands for the data, e.g., for publication. He wants his data to be preserved for future use, but his motivation is limited by some suspicion of digitized data and by not knowing how to approach the large amount of work required to deposit the data in a repository.

DataONE has the potential to inspire William to begin the arduous task of processing the mountain of paper that is his office, making parts of it available through a data repository. When the graduate students who are working with him on his current data collection become aware of DataONE and the plant taxonomy data that has already been deposited there, they will start to understand the gaps in current knowledge that could be filled by digitizing, describing, and depositing some of William’s specimens and notes. With the help of the data librarian at the University of Michigan, they can help William map out a plan for evaluating and describing his collection of slides and field notes.

Comparison of current and DataONE-enabled practices:
Current data collection: 

Collects plant taxonomy data (field notes, photographs, specimen slides).

DataONE enabled data collection: 

No change.

Current data assurance: 

Validates data using own standards.

DataONE enabled assurance: 

Data could be assured using standard tools as part of a digitization project.

Current data description: 

Data has been described where published.

DataONE enabled description: 

Training: Graduate students working with William learn Specify v.6 to describe William’s specimens using Darwin Core.
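
As a rough illustration of what describing a specimen with Darwin Core could look like, the sketch below maps one transcribed specimen label to a handful of Darwin Core terms and writes it as a CSV row. The term names (occurrenceID, basisOfRecord, scientificName, recordedBy, recordNumber, eventDate, country, locality, habitat) come from the Darwin Core standard; everything else is an assumption made for the example: the input keys, the `label_to_dwc` and `to_csv` helpers, the `urn:example:` identifier scheme, and the label values are all hypothetical. It does not reproduce how Specify 6 stores or exports data.

```python
import csv
import io

# Darwin Core term names used as CSV columns. The terms are part of the
# standard; the selection and order here are just one possible choice.
DWC_FIELDS = [
    "occurrenceID", "basisOfRecord", "scientificName", "recordedBy",
    "recordNumber", "eventDate", "country", "locality", "habitat",
]


def label_to_dwc(label: dict) -> dict:
    """Map a transcribed specimen label (hypothetical keys) to Darwin Core terms."""
    return {
        # Hypothetical identifier scheme built from the collector's number.
        "occurrenceID": f"urn:example:specimen:{label['collector_number']}",
        "basisOfRecord": "PreservedSpecimen",
        "scientificName": label["species"],
        "recordedBy": label["collector"],
        "recordNumber": label["collector_number"],
        "eventDate": label["date"],  # ISO 8601 date, e.g. 1987-05-14
        "country": label["country"],
        "locality": label["locality"],
        "habitat": label.get("habitat", ""),  # habitat notes are optional
    }


def to_csv(records: list) -> str:
    """Serialize Darwin Core records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=DWC_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Invented label values, for illustration only.
example_label = {
    "species": "Psoralea pinnata L.",
    "collector": "William",
    "collector_number": "1234",
    "date": "1987-05-14",
    "country": "Spain",
    "locality": "Roadside slope, 5 km north of town",
    "habitat": "Sandy soil, locally abundant, in flower",
}

record = label_to_dwc(example_label)
print(to_csv([record]))
```

Keeping the collector's number in recordNumber preserves the link between a digitized record, the physical specimen, and the matching notebook entry.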

Current data preservation: 


DataONE enabled preservation: 
  • Data Preservation: Graduate students deposit data and metadata in the University of Michigan (UM) data repository
  • Data Preservation: Preservation functions of the UM repository are enhanced by acceptance as a DataONE Member Node
Current data discovery: 

Does not use other researchers’ data.

DataONE enabled discovery: 

Data Discovery, Access, Use and Dissemination: Other researchers discover and use William’s data through DataONE.

Current data integration: 

Does not use other researchers’ data.

DataONE enabled integration: 
  • Data Discovery, Access, Use and Dissemination: Other researchers discover William’s data and combine it with their own for more complete coverage of the region.
  • Citation: Combined datasets are published and William’s data is cited.

Based on Data Conservancy Pedro persona by Anne Thessen: Comments from David Patterson; revised by Kevin Crowston