Senckenberg Research Institute and Museum, Frankfurt
Botanical Garden and Botanical Museum Berlin-Dahlem
TDWG 2000: Digitising Biological Collections
Taxonomic Databases Working Group, 16th Annual Meeting
Senckenberg Museum, Frankfurt, Germany, November 10-12, 2000
* Department of Computer Science, University of Manchester, UK.
** Department of Botany, The Natural History Museum, London, UK.
[Poster presentation]
Floras hold vast resources of botanical data, locked in multiple overlapping natural-language texts. MultiFlora aims to provide proof of concept that, by applying "Information Extraction" techniques to parallel descriptions of a taxon and correlating the resulting partial datasets, we can derive a usefully complete and accurate description.
Initial hand analysis has produced three-dimensional data matrices (Flora × character × species) for five species of Ranunculus across six Floras. Variations in terminology, and in the use of mean values or ranges, are common, but genuine disagreements are rare. The GATE system (University of Sheffield) will be used to provide automatic Information Extraction (IE), and correlation heuristics will be implemented.
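As a rough illustration of the data structure and of what a correlation heuristic might look like, the sketch below stores a Flora × character × species matrix as nested dictionaries and merges parallel numeric values. It is not the MultiFlora implementation: the Flora names, the character name, the species label, all numbers, and the functions `as_range`, `overlaps` and `correlate` are hypothetical placeholders, and the "envelope range plus overlap check" rule is only one assumed heuristic among many possible ones.

```python
from typing import Dict, List, Optional, Tuple, Union

# A cell holds either a single mean value or a (min, max) range,
# mirroring the two reporting styles mentioned in the abstract.
Value = Union[float, Tuple[float, float]]
Range = Tuple[float, float]

# Hypothetical placeholder data, NOT taken from any real Flora:
# matrix[flora][character][species]
matrix: Dict[str, Dict[str, Dict[str, Value]]] = {
    "Flora A": {"petal_length_mm": {"species_1": (6.0, 10.0)}},
    "Flora B": {"petal_length_mm": {"species_1": 8.0}},
    "Flora C": {"petal_length_mm": {"species_1": (12.0, 15.0)}},
}


def as_range(value: Value) -> Range:
    """Treat a mean value as a zero-width range so all cells are comparable."""
    if isinstance(value, tuple):
        return (float(value[0]), float(value[1]))
    return (float(value), float(value))


def overlaps(a: Range, b: Range) -> bool:
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def correlate(
    data: Dict[str, Dict[str, Dict[str, Value]]],
    character: str,
    species: str,
) -> Optional[dict]:
    """Assumed heuristic: merge all Floras' values into one envelope range
    and flag any Flora whose value overlaps none of the others."""
    found: List[Tuple[str, Range]] = []
    for flora, characters in data.items():
        value = characters.get(character, {}).get(species)
        if value is not None:
            found.append((flora, as_range(value)))
    if not found:
        return None

    merged = (min(r[0] for _, r in found), max(r[1] for _, r in found))
    disagreeing = [
        flora
        for flora, r in found
        if len(found) > 1
        and not any(overlaps(r, other) for f, other in found if f != flora)
    ]
    return {"range": merged, "disagreeing": disagreeing}


result = correlate(matrix, "petal_length_mm", "species_1")
print(result)
```

With the placeholder values above, the mean from "Flora B" falls inside the range from "Flora A", so the two are treated as a difference in reporting style, while "Flora C" overlaps neither and is flagged as a candidate disagreement for manual review.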
This meeting was co-sponsored by the Committee on Data for Science and Technology (CODATA).
Page editor: W. Berendsohn, wgb@zedat.fu-berlin.de. © BGBM 2000.