Natural Substances in the Compositae: The Bohlmann Files

Results

The project database was initially set up in MS Access 2000, adapting the taxonomic module of the BoGART database system for the treatment of identification and nomenclature. Data from the flat-file ISIS database were parsed to separate and atomise plant names, synonyms, and literature data. Access forms were adapted to the task of nomenclatural editing. For checking and input of chemical data, the ISIS database was amended to hold additional information and to provide a better starting point for the final conversion.

The new relational database has the following features (see Data structure of the Bohlmann Files Database):

Chemical structures are now stored using the JChem software package from ChemAxon. Its structural search capabilities allow searches via the WWW and thus form the core of the web publishing component. After problems with the web server component of the software initially chosen for this purpose, the switch to the new package entailed a complete re-implementation of the web interface and the migration of the database from Access 2000 to MS SQL Server.

Search modes support queries on chemical structures, including substructure and structural-similarity searches, and on plant names. A chemical (sub)structure can be copied in graphic form from a chemical structure editor into the search field, assembled graphically with the integrated structure editor, or entered in a chemical text notation (SMILES, MOL file). The search result shows the complete structures of all secondary metabolites that contain the queried structure, together with their trivial names, chemical class, and the names of the plants from which they were isolated. Further data (voucher information, notes) may be sensitive, so their publication will have to be handled selectively.
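As an illustration of how a structural-similarity search can rank compounds, the following minimal sketch scores hypothetical fingerprints with the Tanimoto coefficient, a measure widely used for this purpose. The text does not state which metric or fingerprint type the JChem installation uses, so the metric, the function names, the threshold, and the sample fingerprints are all assumptions made for illustration, not the project's implementation.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modelled as the set of "on" bit positions.
# Real fingerprints are computed by chemistry software; these are hypothetical.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Return |a & b| / |a | b|, a value between 0.0 and 1.0."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints score 0.0.
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_search(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    threshold: float = 0.7,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best match first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    # Sort by descending score, then by name so that ties are deterministic.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Hypothetical library; the names and bit positions are placeholders.
LIBRARY: Dict[str, Fingerprint] = {
    "compound_a": frozenset({1, 2, 3, 4}),
    "compound_b": frozenset({1, 2, 3, 9}),
    "compound_c": frozenset({7, 8}),
}

if __name__ == "__main__":
    for name, score in similarity_search(frozenset({1, 2, 3, 4}), LIBRARY, 0.5):
        print(f"{name}: {score:.2f}")
```

A substructure search differs from this in kind: it tests whether the query graph is contained in each stored structure rather than computing a score, which is why it requires a cheminformatics package and is not sketched here.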
The search for plant names uses a text input field for names or parts thereof, or a selection from a list of higher name categories (tribes).

Fig.: Searching for a substructure in the database, query interface. A (sub)structure can be drawn or entered in textual form in the query form.

Fig.: Searching for a substructure in the database, result. The result consists of the secondary metabolites containing the structure and their associated data. Pressing the green button "Q" (query) leads back to the query interface for a new search on this specific structure. More detailed information can be accessed by pressing the blue button "D" (detail).

Plant names have undergone a revision process including spell checking, standardisation (including author citations), and a check against major web-based databases on plant names [2, 3]. They now reflect the recent classification of the Compositae [4]. The data conform to the TDWG standard on names [5], complete with source citations for taxonomic data. 6258 taxa from 839 genera represent all 17 tribes within the three subfamilies. Consistent synonymisation within the database (1018 synonyms) is achieved with the help of on-line databases [2, 3]. Literature citations for additional taxonomic information are kept in a separate reference list and are standardised according to the bibliographic citation standard of Willdenowia (a scientific journal published at the BGBM).

The chemical information was revised as well. Chemical structures were individually checked, verified against the literature where necessary, and annotated where appropriate. Records were completed according to the current literature and, where available, voucher information was added, such as collection number, place of deposit of the voucher specimen, and collecting place and date. Citations from the chemical literature are standardised in accordance with the rules of the Journal of Natural Products.
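The plant-name search and the synonymisation described in this section can be illustrated together. The sketch below assumes a simple in-memory list of name records in which each synonym points to its accepted name. The record layout, the function names, and the handful of sample entries are hypothetical and are not taken from the Bohlmann database schema or its data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TaxonName:
    """One name record (hypothetical layout, not the project's schema)."""
    name: str
    tribe: str
    accepted_name: Optional[str] = None  # None means the name is itself accepted


def resolve(record: TaxonName) -> str:
    """Map a synonym to its accepted name; accepted names map to themselves."""
    return record.accepted_name or record.name


def search_names(
    records: List[TaxonName],
    fragment: str = "",
    tribe: Optional[str] = None,
) -> List[str]:
    """Return accepted names matching a name fragment and/or a tribe.

    The fragment is matched case-insensitively anywhere in the name, so a
    search on a synonym still leads to the accepted name.
    """
    needle = fragment.strip().lower()
    hits = set()
    for record in records:
        if needle and needle not in record.name.lower():
            continue
        if tribe is not None and record.tribe.lower() != tribe.lower():
            continue
        hits.add(resolve(record))
    return sorted(hits)


# Illustrative sample entries only.
NAMES = [
    TaxonName("Artemisia annua", "Anthemideae"),
    TaxonName("Helianthus annuus", "Heliantheae"),
    TaxonName("Matricaria chamomilla", "Anthemideae"),
    TaxonName("Matricaria recutita", "Anthemideae", "Matricaria chamomilla"),
    TaxonName("Jacobaea vulgaris", "Senecioneae"),
    TaxonName("Senecio jacobaea", "Senecioneae", "Jacobaea vulgaris"),
]

if __name__ == "__main__":
    print(search_names(NAMES, fragment="recutita"))
    print(search_names(NAMES, tribe="Anthemideae"))
```

Returning accepted names rather than the matched strings keeps results consistent regardless of which synonym a user happens to type; a production interface would probably also show the synonym that was matched.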
By the end of 2001, the data were already considered reasonably complete for secondary metabolites found in the Compositae. The database comprised a total of 24,000 compounds, about 10,000 references from chemical publications, and 9,284 names. The exact number of taxa will be known once the synonymisation process has been completed. The web interface has been set up under the URL [https://bohlmann.bgbm.org/bohlmann/].