BSPP Presidential Meeting 1996

Unlocking the Future: Information Technology in Plant Pathology

Parallel session 8B: Computer-based species identification: Novel developments

Development of neural networks for identification
Lynne Boddy*, Colin Morris** & Alexandra Morgan*
* School of Pure and Applied Biology, University of Wales, Cardiff, UK.
** Department of Computer Studies, University of Glamorgan, UK

Artificial neural networks (ANNs) were initially developed to mimic the brain, but have been used successfully in many fields, including pattern recognition. Unlike most identification approaches they are not rule-based, but rather 'learn' (are trained) from patterns presented to them. They can cope with incomplete, partially contradictory 'fuzzy' data, and hence are well suited to biological problems, including identification. A variety of ANN paradigms are suitable for identification, and the way in which some of these (multilayer perceptron, radial basis function and learning vector quantization) operate and the relative merits of each will be reviewed. Training and testing ANNs will be illustrated with data from the genus Pestalotiopsis (fungal pathogens of plants) and from marine phytoplankton. The ability to cope with missing data and novel data, and the pros and cons of using ANNs for identification will be discussed.
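
As an illustration of how one of the paradigms named above learns from examples rather than rules, the following is a minimal sketch of learning vector quantization (the basic LVQ1 update). It is not the authors' implementation: the two-character measurements, the taxon labels "A" and "B", the starting prototypes, the learning rate and the number of epochs are all invented for the example.

```python
import math

def nearest(prototypes, x):
    """Index of the prototype closest to x (Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda i: math.dist(prototypes[i][0], x))

def train_lvq1(samples, prototypes, rate=0.3, epochs=20):
    """LVQ1: move the winning prototype towards a training pattern of the
    same class, and away from a pattern of a different class."""
    protos = [(list(vec), label) for vec, label in prototypes]
    for _ in range(epochs):
        for x, label in samples:
            vec, proto_label = protos[nearest(protos, x)]
            sign = 1.0 if proto_label == label else -1.0
            for d in range(len(vec)):
                vec[d] += sign * rate * (x[d] - vec[d])
    return protos

def classify(protos, x):
    """Label of the prototype nearest to the unknown pattern x."""
    return protos[nearest(protos, x)][1]

# Hypothetical training patterns: two measured characters per specimen.
samples = [([1.0, 1.2], "A"), ([0.8, 1.0], "A"),
           ([3.0, 3.2], "B"), ([3.2, 2.9], "B")]
# Hypothetical starting prototypes, one per taxon.
initial = [([0.0, 0.0], "A"), ([4.0, 4.0], "B")]

trained = train_lvq1(samples, initial)
print(classify(trained, [0.9, 1.1]))
print(classify(trained, [3.1, 3.0]))
```

A practical network would use several prototypes per taxon, a learning rate that decays during training, and separate training and test sets.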


Automated analysis of insect sounds using TESPAR and expert systems - a new method for species identification
E D Chesmore, M D Swarbrick and O P Femminella
Environmental Electronics Research Group, Department of Electronic Engineering, University of Hull, Hull HU6 7RX, UK.

Sound in insects arises from several sources: signals generated deliberately for communication, as in the Orthoptera; sounds created by movement, such as flying; and sounds produced by eating. In many cases the acoustic signals are thought to be distinctive enough to identify an insect to group or even species level; this potential is the subject of the research.

The paper describes work carried out in 1995/6 as a final-year undergraduate project in the Electronic Engineering Department at Hull. The project investigated the analysis and recognition of British Orthoptera although the techniques developed are equally applicable to other insect groups (Hemiptera, Coleoptera, etc.) and other phyla. The project made extensive use of digital signal processing (DSP) and expert system techniques to achieve the following:

  • Extraction of information on the sound production mechanism, e.g. tooth impact rate, resonances, etc.
  • Compression of the information by "recognizing" structure such as syllables
  • Recognition of species and discrimination between call types.

The software package, written in 'C' under Windows, is called ISAR (Insect Sound Analysis and Recognition) and utilizes various DSP tools to analyse signals, passing the results to a multiple expert system, known as a 'blackboard system', for further analysis and recognition. Conventional frequency domain analysis and time domain characterization (e.g. average zero-crossing rate, average energy, amplitude density, autocorrelation) together with Time Encoded Signal Processing and Recognition (TESPAR) are applied to sampled sounds to extract a set of statistical biometric features. The features are then passed to the blackboard system for hierarchical determination of the sound, e.g. syllable, chirp/trill and phrase/song type. Hypotheses about these song units are posted to the blackboard for examination by knowledge sources (KS) which may in turn post further hypotheses. The top level of the blackboard is species.
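
The conventional time-domain features named above can be sketched in a few lines. This is an illustration only, not code from ISAR (which is written in C), and it does not implement TESPAR; the function names, the 8000 Hz sample rate and the synthetic 100 Hz tone standing in for a recorded call are all assumptions made for the example.

```python
import math

def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(signal, signal[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)

def average_energy(signal):
    """Mean of the squared sample values."""
    return sum(s * s for s in signal) / len(signal)

def autocorrelation(signal, lag):
    """Normalised autocorrelation at the given lag (1.0 at lag 0)."""
    mean = sum(signal) / len(signal)
    centred = [s - mean for s in signal]
    denom = sum(c * c for c in centred)
    num = sum(centred[i] * centred[i + lag]
              for i in range(len(centred) - lag))
    return num / denom

SAMPLE_RATE = 8000  # Hz; assumed for this example
# Synthetic tone, not real insect data: 100 Hz, 0.1 s, small phase offset.
signal = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE + 0.1)
          for n in range(800)]

zcr = zero_crossing_rate(signal)
print("zero-crossing rate:", zcr)
# A pure tone crosses zero twice per cycle, so this approximates its frequency.
print("frequency estimate (Hz):", zcr * SAMPLE_RATE / 2)
print("average energy:", average_energy(signal))
print("autocorrelation at one period:", autocorrelation(signal, 80))
```

Feature values of this kind would form part of the feature set passed on to the recognition stage.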

The paper will describe the techniques involved in the ISAR system, concentrating on TESPAR signal analysis, which has previously been used successfully for speech analysis and acoustic condition monitoring of machinery. The results of the analyses for Orthoptera will be presented; these indicate that species-level identification is feasible with low-cost computing. The applicability of this approach will be discussed, and a demonstration will be given if time permits.


Mixing elements from different identification systems
Paul Bridge
CABI International Mycological Institute, Bakeham Lane, Egham, Surrey TW20 9TY, UK.

The identification process, whether computer assisted or manual, must involve a final decision making step. This final step is arrived at either through a series of earlier decisions, as in keys and expert systems, or as a result of a numerical value in probability- and identification-score-based schemes. Many manual schemes such as some dichotomous keys cannot fail to give a final identification, whereas numerical systems generally employ some form of critical cut-off value in order to validate the final identification. Increasingly, computer-assisted identification systems will use combinations of elements from keys, probability scores or profile matching to make a final decision on the validity or likelihood of the final identification. This process may not only validate the identification but may also be able to provide insights into the reasons for a poor or unsatisfactory identification.

Any identification system will always be limited by the quality of its reference data, particularly in biology, where the reference data must be comprehensive enough to reflect the natural variability within a taxon. All identification systems are subject to this shortcoming, and numerical systems that rely on a critical cut-off level can be particularly sensitive. Different identification coefficients will react differently to aberrant or atypical data, and combining the results from more than one coefficient can result in a more "robust" scheme.

This paper demonstrates how results obtained from different numerical coefficients can be correlated to give further information on the validity of an identification. The relationship of the numerical schemes to both profile matching and key character approaches will also be considered. The combination of these different elements can be used to provide a single robust identification program which will be demonstrated with a frequency matrix of taxonomic characters for a defined group of fungi.
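
The idea of checking one coefficient against another can be sketched as follows. The abstract does not say which coefficients are used, so the two chosen here (a likelihood normalised across taxa, and the likelihood relative to the best attainable for the winning taxon) are assumptions, as are the frequency matrix, the taxon names and both cut-off values.

```python
def clamp(p, eps=0.01):
    """Keep frequencies away from 0 and 1 so one result cannot zero a score."""
    return min(max(p, eps), 1.0 - eps)

def likelihoods(matrix, unknown):
    """Likelihood of the unknown's results under each taxon's frequencies."""
    out = {}
    for taxon, freqs in matrix.items():
        like = 1.0
        for p, positive in zip(freqs, unknown):
            p = clamp(p)
            like *= p if positive else 1.0 - p
        out[taxon] = like
    return out

def identify(matrix, unknown, min_norm=0.95, min_modal=0.05):
    """Accept the best taxon only if both coefficients pass their cut-offs."""
    likes = likelihoods(matrix, unknown)
    best = max(likes, key=likes.get)
    # Coefficient 1: how strongly the best taxon is favoured over the others.
    norm = likes[best] / sum(likes.values())
    # Coefficient 2: how typical the unknown is of the best taxon.
    ideal = 1.0
    for p in matrix[best]:
        p = clamp(p)
        ideal *= max(p, 1.0 - p)
    modal = likes[best] / ideal
    return best, norm, modal, (norm >= min_norm and modal >= min_modal)

# Hypothetical frequency matrix: proportion of positive results per character.
matrix = {
    "taxon_A": [0.99, 0.99, 0.01, 0.01, 0.99],
    "taxon_B": [0.01, 0.01, 0.99, 0.99, 0.99],
    "taxon_C": [0.01, 0.99, 0.99, 0.01, 0.01],
}
typical = [True, True, False, False, True]    # matches taxon_A on every character
atypical = [True, True, False, False, False]  # differs from taxon_A on the last one

print(identify(matrix, typical))
print(identify(matrix, atypical))
```

With these invented figures the atypical profile is still assigned to taxon_A by the first coefficient alone, but the second coefficient falls below its cut-off, so the identification is flagged as unsatisfactory rather than accepted.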


The role of the user in computer-based species identification
GM Tardivel and DR Morse
Computing Laboratory, University of Kent at Canterbury, Canterbury, Kent CT2 7NZ, UK. 

Other papers at this conference will review many of the technologies which are used in computer-based species identification. These techniques range from the multi-access key to neural networks and expert systems. While their relative merits have been, and will continue to be, hotly debated from a technical viewpoint, it appears that rather less work has been done on comparing and evaluating their usability, efficiency and effectiveness from the user's perspective. This paper represents an attempt to redress that balance.

The paper describes a series of empirical evaluation experiments in which a paper dichotomous key to woodlice, several hypertext versions of that key and a multi-access key were compared. In each experiment, subjects were asked to identify one specimen of woodlouse. Each subject's ability to reach an identification, the accuracy of that identification and the time taken to reach it were recorded, as were their opinions of the key they had used. In one experiment, with second-year undergraduate zoology students, the students were also asked to record the confidence which they placed in the accuracy of their identification.

In the latter experiment, all students achieved an identification, of which 74% were correct. The success rate was similar for all three identification methods, with students using the paper key being slightly more accurate than those using the two computer-based approaches. The latter were slower than the paper key. In all media the correct identifications were on average 4 minutes slower than those which were incorrect. Students using the multi-access and paper keys had equal confidence in their identification but their confidence was more often misplaced using the paper key. Students were generally most confident of their identification using the hypertext key, although this medium was slightly less accurate.

The comments which have occurred most frequently in all our evaluation experiments have concerned usability and navigation issues. In particular, our volunteers have commented on the frequently encountered problem of finding their way round the paper key. In general, volunteers found both computer-based keys easy to use but the diagrams and colour plates in both keys were criticized, either because they weren't there, or because of the poor quality of some of the images. The former comment concerning usability was one of our original motivations for developing hypertext versions of paper keys. The latter comments could be addressed in future versions of the computer-based keys and do not pertain to all such keys.

Finally, an attempt has been made to discover where and how subjects went wrong in their identification and which decisions they found difficult, in order that the efficiency and accuracy of the identification process can be improved. Better software monitoring techniques are required in order to find out exactly what people are doing during the identification process. However, the experimental protocols and results described in this paper are a useful first step in putting the user back at the centre of the debate on the development of computer-based species identification tools.


OPEN FORUM: Biology and Information Technology: The Road Ahead

The IT revolution: how it may support or disturb biology
Peter Cochrane
British Telecom Laboratories, Martlesham Heath, Ipswich IP5 7RE, UK.

A member of the Artificial Life community recently pleaded that we should not anthropomorphise machines because they might not like it. Well, I feel as if I am being Silicomorphised by technology - and I don't like it either. Since before the start of the electronics revolution in 1915, we have been bending people into technology. Well, it really seems high time to start bending technology into people - it is supposed to help and not hinder us. But our technology has been limited, and only recently have we been able to start the process of humanising machines. If information access equates to power, then usability becomes a moral issue, and the haves and have-nots are all at risk. The question is - can we rise to the challenge - do we know enough about humans to realise better environments? Artificial life, or machines that think, provide the way ahead for human-machine interaction. They will create a technology interface for a far broader range of users, they will perform ever more complex tasks on our behalf, and they will enable us to live and work in our increasingly chaotic world.


THE BOOK OF THE CONFERENCE

A book, Information Technology, Plant Pathology and Biodiversity edited by Peter Scott, Paul Bridge, Peter Jeffries and David Morse, will be based on the Conference. It will be published in 1997 by CAB INTERNATIONAL, and will be available at a special price to those who attend the meeting and to Members of the British Society for Plant Pathology and the Systematics Association. For further information or to place an order, contact Lorraine Rogers, CAB INTERNATIONAL, Wallingford OX10 8DE, UK. Fax: +44 (0)1491 833508. E-mail: L.Rogers@cabi.org