The goal of the ICTVdB is to describe all viruses of animals (vertebrates, invertebrates, protozoa), plants (higher plants and algae), bacteria, fungi, and archaea from the family level down to strains and isolates. The lower levels of classification have important applications in medicine and agriculture, but also give insight into evolutionary trends. The database will thus benefit research and applications at all levels of expertise.
The DELTA system is capable of producing high-quality printed descriptions as well as descriptions in HyperText Markup Language (HTML) for display on the Web. DELTA data can include any amount of text to qualify or amplify the coded information, and this text can be carried through into the descriptions. Common features can be omitted from the data and the descriptions, while remaining available for identification and analysis. These attributes are exemplified in books such as Viruses of Plants in Australia (Büchen-Osmond et al., 1988) and The Grass Genera of the World (Watson & Dallwitz, 1992), which were generated automatically from DELTA databases.
The DELTA system is particularly useful in the international context envisaged by ICTV. For example, Intkey packages can be prepared in different spoken languages simply by translating the character list. Intkey itself is particularly easy to translate into other languages, as all of the program text (menus, commands, prompts, diagnostic messages, and help) is held in simple text files separate from the program files. English, French, German, Malay, Portuguese, and Spanish versions are currently available.
This type of numbering system allows the user to follow the path linking the various features of one genus or family, and permits the presentation of similarities between different groups at more than one level. The numbering system also gives an internal structure to the database that indicates the descriptors needed for completing the coding of a specific virus or data set. With this numbering system, the database is structured such that the same information need not be repeated between levels. When calling up a description of a strain, we can supply the pointers to the next higher level, where the full description of the species is stored, as illustrated in Table 2.
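The idea of storing each piece of information only once and following pointers to the next higher level can be sketched in a few lines of Python. This is a minimal sketch, not the DELTA or ICTVdB implementation: the record layout, the field names `parent` and `characters`, and the character values are all assumptions chosen for illustration.

```python
# Illustrative records only: the field names and character values are
# assumptions for this sketch, not the ICTVdB schema or its data.
RECORDS = {
    "Parvoviridae": {
        "parent": None,
        "characters": {"nucleic acid": "ssDNA"},
    },
    "Parvovirus": {
        "parent": "Parvoviridae",
        "characters": {"host group": "vertebrates"},
    },
    "Minute virus of mice": {
        "parent": "Parvovirus",
        "characters": {"natural host": "mouse"},
    },
}


def full_description(name, records=RECORDS):
    """Merge a taxon's own characters with those stored at higher levels."""
    chain = []
    while name is not None:
        chain.append(records[name])
        name = records[name]["parent"]  # pointer to the next higher level
    merged = {}
    # Apply the highest level first so that lower levels can refine it.
    for record in reversed(chain):
        merged.update(record["characters"])
    return merged


print(full_description("Minute virus of mice"))
```

Each level holds only the characters that are new at that level; the full description is assembled on request by walking up the chain of pointers.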
The numbers assigned to each virus also serve as locator numbers within the database and as an unchanging reference. The locator number is easily transformed into a file accession number by appending .htm to it. In the example of the family Parvoviridae, the family level becomes 00.050.htm; the genus level, that is the genus Parvovirus, 00.050.1.01.htm; and the type species Minute virus of mice, 00.050.1.01.001.htm. These accession numbers are used throughout the ICTVdB as file names to access the computer-generated virus descriptions. Thus they can also be used as pointers to link from other databases to a particular virus description at this server.
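The mapping from locator number to file name, and from a lower level to the levels above it, can be illustrated with a short Python sketch. The function names are hypothetical, and the number of dot-separated fields per level (two for the family, four for the genus, five for the species) is an assumption inferred only from the Parvoviridae example; other taxa may not follow this pattern.

```python
def file_name(locator):
    """Return the file accession number for a locator number."""
    return locator + ".htm"


# Assumed field counts per level, inferred from the Parvoviridae example only.
LEVEL_FIELDS = {"family": 2, "genus": 4, "species": 5}


def higher_levels(locator):
    """Return the locator numbers of the levels above the given locator."""
    fields = locator.split(".")
    result = {}
    for level, count in LEVEL_FIELDS.items():
        if count < len(fields):
            # A higher level is a shorter prefix of the same number.
            result[level] = ".".join(fields[:count])
    return result


species = "00.050.1.01.001"  # Minute virus of mice
print(file_name(species))
for level, locator in higher_levels(species).items():
    print(level, file_name(locator))
```

Because a higher level is simply a prefix of the lower-level number, a link from a species description to its genus or family description can be derived from the number alone, without a separate lookup table.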
Home Page - Introduction to the virus databases on-line. The user can best access the ICTVdB Web site through the Home Page, which also introduces other virus databases on-line that have been developed by researchers in the Molecular Evolution and Systematics Group, Research School of Biological Sciences, Australian National University. The information belonging to the ICTVdB project is marked with the ICTV logo, and is contained in about 800 files. Other databases on the Web that are relevant to the ICTVdB and have been used to establish links, for example to electron micrographs or genomic sequence data, are also listed on the Home Page.
Indexes. The ability to select records is a quintessential feature of a functional database. As shown in Figure 1, the ICTVdB Web site facilitates this by making data accessible both alphabetically, in the Index of Viruses (the complete lists of virus families, genera and species from the VIth ICTV Report), and deci-numerically, in the ICTVdB Index. For example, the Index of Viruses may be searched by species name or synonym in alphabetical order, by family name in alphabetical order, by host range, or by nucleic acid composition. As pointed out above, classification beyond family in the ICTVdB is numerically based following alphabetic listing. It is clear in Figure 1 that a variety of paths all lead to the current natural language translation of the ICTVdB descriptions.
The Index of Viruses is the main entry point to the Classification and Nomenclature section of the ICTVdB. It provides easy access to the records describing an individual virus, as well as to information on its relatives. In addition to the Index of Viruses there is an index to the virus descriptions that have been generated from DELTA records. This index is automatically regenerated and updated every time new information is added to the database and translated into hypertext.
Figure 1: ICTVdB on the Web. This diagram displays the various index files that guide the browser through the different parts of the ICTVdB to the virus descriptions.
Thus Figure 1 summarises the present functional basis of ICTVdB, which will be greatly expanded by future data input and by improved interoperability with other related databases. Already, in cases where the genomic sequence accession number for a virus species is listed, links to either GenBank (NCBI) or EMBL (EBI) have been established. It is planned to have mirror sites in Europe and North America, so that access time can be reduced for the user; the mirror sites will have links to the appropriate genome databanks. Figure 1 further indicates that it is planned to add a search engine to improve the speed and flexibility with which a particular virus can be retrieved, even if the user does not know the correct name.
Intkey. In future, Intkey will be available to interrogate the database on the Web. Using Intkey, a virus can be identified by comparing its attributes with stored descriptions of taxa. At present, the required data files and images must be downloaded to the user's PC (via ftp, gopher, or WWW), but a future version of Intkey will be able to access these files directly from the WWW. Images that are part of the database are used by Intkey. All images in the database can be accessed via hyperlinks from other databases. By the same token, images from other databases can be linked to the ICTVdB and will thus become accessible through the database, without physically becoming a part of it.
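Identification by comparing observed attributes with stored descriptions can be sketched as a simple elimination step. The Python below is only an illustration of that general idea under strong simplifying assumptions (one state per character, exact matching); it is not Intkey's algorithm, and the taxa, characters, and states are hypothetical placeholders.

```python
# Hypothetical stored descriptions; taxa, characters, and states are
# placeholders for this sketch, not records from the ICTVdB.
DESCRIPTIONS = {
    "taxon A": {"genome": "ssDNA", "envelope": "absent"},
    "taxon B": {"genome": "dsRNA", "envelope": "absent"},
    "taxon C": {"genome": "ssDNA"},  # envelope not recorded
}


def remaining_taxa(observed, descriptions=DESCRIPTIONS):
    """Return the taxa whose stored description does not contradict the observations."""
    matches = []
    for taxon, characters in descriptions.items():
        # A character missing from a stored description is treated as
        # unknown, so it never eliminates the taxon.
        if all(characters.get(ch, state) == state for ch, state in observed.items()):
            matches.append(taxon)
    return matches


print(remaining_taxa({"genome": "ssDNA"}))
print(remaining_taxa({"genome": "ssDNA", "envelope": "present"}))
```

Each additional observation can only shorten the list of remaining taxa, which is why an interactive key lets the user supply attributes in any order.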
The WWW format should now greatly facilitate the formerly cumbersome process of data acquisition by printed questionnaires posted to the potential suppliers of data. This method, used to construct the VIDE plant virus database (e.g. Büchen-Osmond et al., 1988), often failed to attract the enthusiastic cooperation of colleagues because of the sheer complexity of the questionnaires and the repetitive handling of data.
The first task will be to devise a new way of data acquisition. To this end, an electronic questionnaire based on standardised characters will be devised as the primary data sheet. It will contain a highly structured index of keywords and headings to the different sections within the character list, which can be collapsed or expanded. From previous experience we know that it helps greatly if the questionnaire already contains the available data, the expert recipient being invited to fill in the gaps and review existing data. It also helps if expert opinion is restricted to characters appropriate to the particular virus family. The data will be transformed into DELTA format and will go through the same reviewing process used for any description prepared for the ICTV Reports. Only after the reviewing committee is satisfied with the new submission will the coded description be placed permanently into the database. The Web accessibility of the virus descriptions at all stages of preparation will also facilitate the reviewing process by the Study Groups of the ICTV.
The second task is to maintain and coordinate data acquisition and entry. Data entries must be checked and regularly updated, to take account of developments in the field. However, the coordinator will not be able to keep up with all the latest movements in virus research. New findings must be provided to the ICTVdB from the virological community regularly. This is essential if the database is to provide the community with a reliable up-to-date source of information.
A third task is to use ICTVdB to generate future ICTV Reports on the Nomenclature and Classification of Viruses, now laboriously compiled by the ICTV Study Groups. It is envisaged that the future Reports will be generated from the DELTA database. In future, most of the descriptions in the database will be of species and strains, as a DELTA program can summarise the characteristics of all species of one genus, for example, thus generating an accurate summary that will reflect much more objectively the features of a genus. The description of genera and families in the future reports can be based on these summaries.
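How a genus summary might be derived from species descriptions can be sketched as follows. This is a minimal Python illustration of the general idea, not the DELTA program referred to above; the species names, characters, and states are hypothetical placeholders.

```python
from collections import defaultdict


def summarise(species_descriptions):
    """Collect, for each character, every state recorded among the species."""
    states = defaultdict(set)
    for characters in species_descriptions.values():
        for character, state in characters.items():
            states[character].add(state)
    # Sort the states so that the summary is reproducible.
    return {character: sorted(values) for character, values in states.items()}


# Hypothetical species of one genus; not data from the ICTVdB.
GENUS_SPECIES = {
    "species 1": {"genome": "ssDNA", "natural host": "mouse"},
    "species 2": {"genome": "ssDNA", "natural host": "rat"},
    "species 3": {"genome": "ssDNA"},
}

for character, values in summarise(GENUS_SPECIES).items():
    print(character, values)
```

A character with a single state across all species is a candidate feature of the genus as a whole, whereas a character with several states records the variation within it.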
Judging from recent comments on other microbial databases (Wertheim, 1995), the ICTVdB may already be one of the most advanced, interoperable databases in biology, in structural terms at least. A major effort is now required to complete the database, drawing on the expertise of the virological community as a whole. The future success of ICTVdB thus depends heavily on the help and goodwill of all virologists, and the continued enthusiasm of the experts in the ICTV Study Groups.
Büchen-Osmond C, Blaine L, Horzinek MC (1996). In: Data and Knowledge in a Changing World: The Quest for a Healthier Environment; Chambéry, 94 CODATA Conference, Ed PS Glaeser. CODATA, Paris 8 pp (in press).
Büchen-Osmond C (1995) http://life.anu.edu.au/viruses/welcome.htm
Dallwitz MJ (1980). Taxon 29, 41-46.
Dallwitz MJ, Paine TA, Zurcher EJ (1993). DELTA User's Guide: a general system for processing taxonomic descriptions. 4th edition, CSIRO, Canberra. 136 pp.
Murphy FA, Fauquet CM, Mayo MA, Jarvis AW, Ghabrial SA, Summers MD, Martelli GP, Bishop DHL (1995). Sixth Report of the International Committee on Taxonomy of Viruses. Springer Verlag, Wien, New York.
Watson L, Dallwitz MJ (1992). Grass Genera of the World. CABI, Wallingford 1083 pp.
Wertheim M (1995). Science 269, 1516.