BSPP Presidential Meeting 1996
The incredible pace of change - Information Technology in support of plant pathology
Peter R Scott
CAB INTERNATIONAL, Wallingford OX10 8DE, UK.
Handling facts to produce information
Handling the information explosion in plant pathology
Data management: bibliographic; nomenclatural; molecular; etc.
Interpreting information to produce knowledge
Taxonomic Information Systems
Geographic Information Systems
Using knowledge to support decision making
Expert Systems for disease management; etc.
Using knowledge to make predictions
Passing on knowledge in education and training
Distance learning; etc.
Storing and disseminating information
Intranets, the Internet, World Wide Web
Molecular Plant Pathology Online
Opportunities for developing countries.
Development of computer-based systems in systematics
Peter H A Sneath
Department of Microbiology and Immunology, University of Leicester, Leicester LE1 9HN, UK.
Systematics Association keynote paper
An overview will be presented of the way computers have been applied to systematic biology. Before the development of digital computers, there were very few numerical applications in systematics. Systematics is a subject that requires the marshalling of numerous pieces of information, ranging from the estimation of resemblances between organisms to the preparation of regional checklists. Data processing based on information theory was therefore essential.
These developments first required a logical strategy. Most early workers viewed systematics as so complex an art that it would be infeasible to reduce it to a science with logical rules. Numerical taxonomy defined a logical progression: from organisms and their characters to measures of resemblance between organisms; and then to finding taxonomic structure by cluster analysis, ordination and phylogenetic trees. These steps led on to databases on organisms; and thence to numerical methods for identification.
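The progression described above - from character data to resemblance measures to cluster analysis - can be illustrated with a minimal sketch. The organisms, their five presence/absence characters, and the 0.8 similarity threshold are all invented for illustration; real numerical taxonomy uses far larger matrices and a choice of coefficients and linkage methods.

```python
# Sketch of the numerical-taxonomy progression: binary character data ->
# simple matching coefficients -> single-linkage agglomerative clustering.

def simple_matching(a, b):
    """Proportion of characters on which two organisms agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def single_linkage(organisms, threshold):
    """Repeatedly merge clusters whose closest members reach the threshold."""
    clusters = [{name} for name in organisms]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = max(simple_matching(organisms[a], organisms[b])
                          for a in clusters[i] for b in clusters[j])
                if sim >= threshold:
                    clusters[i] |= clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters

# Five hypothetical presence/absence characters per organism.
organisms = {
    "A": [1, 1, 0, 0, 1],
    "B": [1, 1, 0, 1, 1],   # differs from A in one character
    "C": [0, 0, 1, 1, 0],   # very different from both A and B
}
clusters = single_linkage(organisms, threshold=0.8)
```

With these data, A and B (similarity 0.8) fall into one cluster and C stands alone; ordination or tree-building would then operate on the same resemblance matrix.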
A growing field is the preparation of descriptions of taxa by computer methods, with extensions to ecology, pathology, geography and other areas. A good deal of computing is directed to molecular sequences, mainly for the study of phylogeny. But it is numerical identification that is especially relevant for the present meeting, which commemorates the 21st anniversary of a key symposium on this topic. Computing has made the early concepts practicable, and the field has diversified into sophisticated methods for diagnostic keys, polyclaves and taxonomic distance models. One particular benefit is the way in which, through probabilistic concepts, the reliability of identification systems can be assured.
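The probabilistic identification mentioned here can be sketched minimally as a Bayes-style comparison of an unknown isolate against a matrix of per-character probabilities, one row per taxon. The taxa, characters and probabilities below are invented; operational systems also apply identification-score thresholds before accepting a match.

```python
# Sketch of probabilistic identification: each taxon records the probability
# that a given character test is positive; the unknown's results are scored
# against every taxon and the likelihoods are normalised.

def identify(matrix, observed):
    """Return normalised identification scores for an unknown isolate."""
    scores = {}
    for taxon, probs in matrix.items():
        likelihood = 1.0
        for character, result in observed.items():
            p = probs[character]
            likelihood *= p if result else (1.0 - p)
        scores[taxon] = likelihood
    total = sum(scores.values())
    return {t: s / total for t, s in scores.items()}

# Invented probabilities for three characters in two hypothetical taxa.
matrix = {
    "Taxon X": {"spores": 0.95, "growth_25C": 0.90, "pigment": 0.05},
    "Taxon Y": {"spores": 0.10, "growth_25C": 0.80, "pigment": 0.90},
}
observed = {"spores": True, "growth_25C": True, "pigment": False}
scores = identify(matrix, observed)
```

The normalised score is what allows the reliability of an identification to be stated quantitatively rather than asserted.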
Handling the information explosion: the challenge of data management
John E Anderson
BIOSIS, 2100 Arch Street, Philadelphia, PA 19103, USA.
"Rise up, my fellow biologists, and throw off the bonds of techno-tyranny!" Information technology is a servant, at most a colleague, not a ruler or tyrant.
Most scientists have experienced the plight of working with a knowledgeable colleague who speaks a foreign language. Inordinate time and energy are expended on communication, thereby reducing the effort and knowledge available for the issue at hand. For such a collaboration to grow, both parties must work toward learning the language of the other; otherwise the imbalance of labour may become detrimental, even ending the collaboration. Regarding an information/knowledge database as a servant or colleague gives rise to a healthy assessment of effort expenditure. Why should computer databanks not come as much to us (learning the language of science) as we to them (e.g. we have all learned some of the language of technology)?
In part because of the continuing information explosion, science MUST transfer more of the burden of communication WITH information technology TO information technology. If we, as the ultimate life form, do not delegate more responsibility to our tools (computer knowledge banks), we'll not have time and energy to do what only scientists can do: SCIENCE.
What then can life science expect of its colleague, information technology? Life scientists can expect, and should demand:
- To communicate with information technology in a manner more akin to that of humans - "natural language".
- To reach the knowledge in real time, beyond the abstract of a publication, beyond the publication, to the data/information/knowledge from which the publication was written, and even to the author/scientist.
- To communicate via subject matter concepts with information technology, not merely the scientific "baby talk" of co-occurrence of a few terms which may or may not capture the concept.
The last of these three points is the subject of this paper. It is related to, and even dependent upon, the first two.
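The contrast between term co-occurrence and concept-based communication can be made concrete with a toy retrieval sketch: free-text query terms are first mapped onto a controlled vocabulary before matching. The thesaurus entries and the document are invented for illustration.

```python
# Sketch contrasting raw term co-occurrence with concept-based retrieval:
# query terms are mapped onto controlled-vocabulary concepts before matching.

THESAURUS = {
    "corn": "Zea mays",
    "maize": "Zea mays",
    "smut": "Ustilago",
}

def to_concepts(terms):
    """Map free-text terms onto controlled-vocabulary concepts."""
    return {THESAURUS.get(t.lower(), t.lower()) for t in terms}

def concept_match(query_terms, doc_terms):
    """True if every query concept occurs among the document's concepts."""
    return to_concepts(query_terms) <= to_concepts(doc_terms)

doc = ["maize", "smut"]
found_by_concept = concept_match(["corn", "smut"], doc)      # True
found_by_cooccurrence = set(["corn", "smut"]) <= set(doc)    # False
```

A query for "corn smut" finds a document indexed under "maize" only when both are resolved to the same concept; bare co-occurrence of terms misses it.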
Keeping pests in their place: international plant quarantine data bases
European and Mediterranean Plant Protection Organization, 1 rue Le Nôtre, 75016 Paris, France
The objective of plant quarantine is to prevent the spread of pests to areas where they do not occur. According to the International Plant Protection Convention (IPPC), phytosanitary measures are aimed principally at "quarantine pests", i.e. pests of known economic importance, known to be absent from the area to be protected. These measures will be the more effective if they address pests of known identity, whose geographical distribution is precisely known, together with the pathways (usually consignments of plants or plant products) by which they are likely to be moved through international trade. Moreover, official phytosanitary measures are non-tariff barriers to trade, and have therefore, for member countries of the World Trade Organization (formerly GATT), to be scientifically justified and commensurate with the risks involved. Today's plant quarantine can no longer be based on highly restrictive measures against ill-defined risks; it has on the contrary to be based on detailed and transparent information, with due consideration to the quality, reliability, timeliness and significance of data. It is not enough, for example, to record that according to a stated bibliographic reference or a collected specimen a pest occurred in a country at some date in the past. The identity of the pest, the circumstances of the find, and its relevance today can all be called into question. On this basis, dubious, poorly documented records may be included, provided they are appropriately assessed.
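The kind of assessed pest record the abstract argues for can be sketched as a simple data structure carrying provenance and a reliability grade alongside the bare occurrence. All field names, grade labels, and example records below are assumptions for illustration, not the structure of any actual EPPO database.

```python
# Sketch of a quarantine pest record with provenance and an assessed
# reliability grade; a bare "pest X occurred in country Y" is not enough.
from dataclasses import dataclass

@dataclass
class PestRecord:
    pest: str
    country: str
    year: int
    source: str          # bibliographic reference or collected specimen
    reliability: str     # e.g. "confirmed", "unconfirmed", "doubtful"

def usable_records(records, accepted=("confirmed", "unconfirmed")):
    """Keep only records whose assessed reliability meets the stated grades."""
    return [r for r in records if r.reliability in accepted]

records = [
    PestRecord("Pest A", "Country X", 1952, "Smith (1952)", "doubtful"),
    PestRecord("Pest A", "Country Y", 1994, "herbarium specimen", "confirmed"),
]
kept = usable_records(records)
```

Because the reliability grade is stored rather than implied, a doubtful 1952 literature record can be retained in the database for transparency yet excluded from a pest risk analysis.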
Informatic and experimental approaches to the elucidation of novel gene functions
Department of Biochemistry and Applied Molecular Biology, UMIST, PO Box 88, Manchester M60 1QD, UK.
The genome of the yeast, Saccharomyces cerevisiae, has now been completely sequenced. The picture of biology which is emerging from yeast and the other genome projects is that c. 50% of the genes identified are of completely unknown function. This is a sobering statistic and one must question why molecular genetics has failed to find these genes previously by its classical, "function first", approach. A high level of redundancy is revealed by the informatic analysis of the genome and this undoubtedly has a lot to do with our previous failure to find these genes. In many cases, redundancy may be more apparent than real. I think we must consider the possibility that the way we do our science may have placed constraints on the set of genes that we have been able to discover. Most molecular genetics experiments are designed to provide qualitative answers. It may be that this has caused us to miss many genes whose contribution is a more quantitative one. If this is true, it suggests that a systematic approach, involving both bioinformatics and 'wet' experiments, will be required in order to elucidate the function of all of these novel genes. Like genome sequencing, the systematic analysis of gene function is certain to provide interesting and unexpected findings which, we hope, will lead us to new biological truths.
Handling facts to produce information - Where from here?
Simon B Jones
CAB INTERNATIONAL, Wallingford OX10 8DE, UK.
The paper will review developments in database and information retrieval technology to highlight how they will impact biological research in the future. The likelihood of successful application of each technology will be assessed through comparisons with early adopters. The paper will focus on three key developments:
- Data warehousing, in which large collections of data from many sources are collected and managed together. Different approaches will be examined in which the data are pooled in a single database, or managed as separate, linked databases. The feasibility of "virtual" data warehouses with networked links between geographically separate sources will be explored.
- The application of "data mining" technology, sometimes in conjunction with data warehousing, to derive conclusions which are not anticipated by the database structure.
- New paradigms for database management, especially object-oriented systems, which allow greater flexibility for handling multiple and varying data types.
The emphasis in the paper will be on how to match the technology to the way in which scientific research work is carried out.
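The "virtual" data warehouse option - networked links between geographically separate sources, queried together - can be sketched minimally as a query fanned out over linked sources and merged on the fly. The source names, record layout, and example data below are invented for illustration.

```python
# Sketch of a "virtual" data warehouse: separate, linked sources are left
# in place and a federated query collects matching records from each,
# tagging every hit with its source of origin.

SOURCES = {
    "bibliographic": [{"pest": "Pest A", "ref": "Jones (1990)"}],
    "distribution":  [{"pest": "Pest A", "country": "Country X"},
                      {"pest": "Pest B", "country": "Country Y"}],
}

def federated_query(pest):
    """Collect matching records from every linked source, tagged by origin."""
    hits = []
    for name, records in SOURCES.items():
        for record in records:
            if record.get("pest") == pest:
                hits.append({"source": name, **record})
    return hits

results = federated_query("Pest A")
```

The pooled-database alternative would instead copy all records into one schema up front; the federated sketch trades query-time cost for leaving each source under its curator's control.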