Natural Substances in the Compositae: The Bohlmann Files

Introduction - Starting Point

Members of the plant family Compositae are renowned for containing a variety of physiologically active compounds [1]. For example, they are used as remedies such as chamomile, as insecticides such as pyrethrinoids, or as pharmaceuticals such as the anti-malaria agent artemisinin from Artemisia annua. In the course of their research on the chemistry of the Compositae at the Technical University of Berlin, Prof. Bohlmann and his assistant C. Zdero started in the early 1960s to compile a card index on these natural substances and on the taxa of the Compositae. Bohlmann entrusted the file to Dr. J. Jakupovic before his untimely death, and C. Zdero kept the card index up to date.

The files cover all natural substances occurring in Compositae, with some reference made to other families for compounds that are of particular chemotaxonomic relevance for the Compositae. Two card indexes existed: about 18,000 cards with structures plus taxon names and original literature references, and about 6,400 cards by taxon name, with all compounds found in the respective literature references. The data stem from literature revisions and from Bohlmann and Jakupovic's own work; about 95% of the data are published. The literature revision was considered complete (it was estimated that less than 2% of references may have been missing).

In 1994, Zdero and Jakupovic started to transfer the data to an ISIS/PC database, including a partial revision of literature references. Berendsohn, a botanist specialised in biodiversity informatics working at the Botanical Garden and Botanical Museum Berlin-Dahlem, joined the team in 1996 to assist with questions of database design, project execution, and botanical data. Data entry was carried out almost exclusively by C. Zdero.
The flat-file ISIS database consisted of the following attributes: (1) (chemical) structure, (2) (trivial) name of the compound (if present), (3) molecular weight and (4) formula, both calculated from (1), (5) taxon or taxa in which the compound was found, with (6) a reference to the literature, (7) the citation of the original publication of the compound, (8) revision note(s), and (9) other notes. Literature citations (6-7) were referenced by number and kept in a word-processor list of references. At the outset of the project, the ISIS database held data on about 5,000 taxa, including a total of 19,351 chemical structures. The ISISBase program allows searching on structures and partial structures as well as on the contents of the text fields. However, ISISBase's hierarchical data structure is rather inflexible when a database is to be extended or linked to external data. Therefore, the data were to be converted to a relational format. The software add-in Accord for Access was selected to manage the structure data because it allows searching on chemical substructures by drawing a diagram (unfortunately, this software proved not to be up to the task and had to be replaced at a later stage). In addition to the content of the nine fields held in the ISIS database, the relational data model was to include fields for a quality assessment of the assignment of a compound to a taxon name, as well as for the taxonomic status and synonymy of the names cited. The database was initially to be created using Microsoft Access 2000 and later to be upgraded to an SQL Server system.
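As an illustration only, the step from the nine flat-file attributes to a relational model might be sketched as follows. This is not the project's actual schema: all table and column names, the status and quality values, the placeholder citation, and the placeholder synonym name are assumptions made for this sketch, and SQLite stands in for Access or SQL Server. The idea shown is that compounds, taxa, and references become separate tables, and that each report of a compound in a taxon becomes a row that can carry its own quality assessment.

```python
import sqlite3

# Hypothetical schema; names are illustrative, not taken from the project.
SCHEMA = """
CREATE TABLE reference (
    reference_id INTEGER PRIMARY KEY,
    citation     TEXT NOT NULL
);
CREATE TABLE compound (
    compound_id           INTEGER PRIMARY KEY,
    structure             TEXT,     -- (1) e.g. a serialized structure diagram
    trivial_name          TEXT,     -- (2)
    molecular_weight      REAL,     -- (3)
    formula               TEXT,     -- (4)
    original_reference_id INTEGER REFERENCES reference(reference_id),  -- (7)
    revision_note         TEXT,     -- (8)
    other_notes           TEXT      -- (9)
);
CREATE TABLE taxon (
    taxon_id          INTEGER PRIMARY KEY,
    name              TEXT NOT NULL UNIQUE,
    status            TEXT,         -- taxonomic status (values assumed)
    accepted_taxon_id INTEGER REFERENCES taxon(taxon_id)  -- synonymy link
);
CREATE TABLE occurrence (           -- (5) + (6), plus quality assessment
    compound_id  INTEGER NOT NULL REFERENCES compound(compound_id),
    taxon_id     INTEGER NOT NULL REFERENCES taxon(taxon_id),
    reference_id INTEGER NOT NULL REFERENCES reference(reference_id),
    quality      TEXT,
    PRIMARY KEY (compound_id, taxon_id, reference_id)
);
"""


def build_demo_db():
    """Create an in-memory database with a few placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO reference VALUES (1, 'placeholder citation')")
    conn.execute(
        "INSERT INTO compound VALUES "
        "(1, NULL, 'artemisinin', 282.33, 'C15H22O5', 1, NULL, NULL)"
    )
    conn.execute(
        "INSERT INTO taxon VALUES (1, 'Artemisia annua', 'accepted', NULL)"
    )
    # Placeholder name standing in for a synonym cited in the literature.
    conn.execute(
        "INSERT INTO taxon VALUES (2, 'Placeholder synonym', 'synonym', 1)"
    )
    # The compound was reported under the synonym, not the accepted name.
    conn.execute("INSERT INTO occurrence VALUES (1, 2, 1, 'unchecked')")
    conn.commit()
    return conn


def compounds_for_accepted_taxon(conn, accepted_name):
    """List compound names reported under a taxon or any of its synonyms."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.trivial_name
        FROM occurrence AS o
        JOIN compound AS c ON c.compound_id = o.compound_id
        JOIN taxon AS t ON t.taxon_id = o.taxon_id
        JOIN taxon AS a
          ON a.taxon_id = COALESCE(t.accepted_taxon_id, t.taxon_id)
        WHERE a.name = ?
        ORDER BY c.trivial_name
        """,
        (accepted_name,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo_db()
    print(compounds_for_accepted_taxon(demo, "Artemisia annua"))
```

In this sketch, a report filed under a synonym is resolved to the accepted name at query time, which is one way the synonymy and quality fields described above could be put to use; a flat file with one record per structure has no natural place for either.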