International Journal of Medical Informatics (2005) 74, 267–277

Implementing health care systems using XML standards

Ralf Schweiger a,*, Matthias Brumhard b, Simon Hoelzer c, Joachim Dudeck a

a Institute for Medical Informatics, Justus-Liebig-University, Heinrich-Buff-Ring 44, 35392 Giessen, Germany
b University Hospital Pharmacy, Giessen, Germany
c H+ The Swiss Hospitals, Berne, Switzerland

Received 8 January 2004; accepted 16 April 2004

KEYWORDS XML; Topic Maps; Semantic Web
Summary Most healthcare data is narrative text and is often neither accessible nor easy to find at the clinical workstation. XML related standards (XML schema, XForms, XSL, Topic Maps, etc.) provide an infrastructure that might change the situation. Yet it is up to the application developers to combine the given standards and tools into a running system. The cost of development is often underestimated, which may explain the absence of comprehensive XML applications. Our goal is the clinical application of these standards. We have therefore implemented the idea of "plug-and-play XML", i.e. the development of new applications by means of XML standards. This paper reports our experience with this approach, using a clinical drug information system as an example. © 2004 Elsevier Ireland Ltd. All rights reserved.
1. Introduction

The German drug formulary ("Rote Liste") comprises about 10,000 different drugs (brand names). Only a subset of approximately 1000 drugs is actually prescribed at the University Hospital of Giessen. A local drug commission composed of the hospital's pharmacy and several clinics meets regularly and selects the drugs according to several criteria such as the price, the availability and the quality of a drug. Moreover, the commission records clinical experience, which is then published in the form of clinical drug guidelines. Minutes of the commission's meetings are published as well. The result of these activities is a local drug formulary ("Hausliste") that is part of the University Hospital's clinical network.

The Giessen drug formulary is substance oriented, i.e. the clinics can order drug substances in a pharmaceutical form, e.g. acetylsalicylic acid 100 mg 24 pills, and the hospital's pharmacy determines the corresponding drugs in stock, e.g. Aspirin®. Drug substances are classified into application groups such as analgesics (pain relievers), which are represented as text files (Fig. 1). A local drug information system provides the clinical users with an index of application groups, drug substances and available drug brand names. The drug index helps the clinical users to quickly find the item of interest.

However, in the Internet era people are used to multimedia and hyperlinks, which plain text cannot offer, so the plain text formulary has become outdated. User acceptance of the text based drug information system declined and clinics started to use different drug formularies. As a consequence, the pharmacy had to map many drugs that were not listed in the hospital's drug formulary. In cooperation with the Giessen Institute for Medical Informatics, which has been implementing XML standards since 1998, the pharmacy decided to base the hospital's formulary on an Internet platform.

Fig. 1 Text document describing drug information (Hausliste).

* Corresponding author. Tel.: +49 641 9941370; fax: +49 641 9941359. E-mail address: [email protected] . de (R. Schweiger)
1386-5056/$ - see front matter © 2004 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.ijmedinf.2004.04.019

2. Methods

2.1. Structuring drug information using XML
The Internet platform of the hospital's drug formulary must enable future applications, e.g. a clinical drug order entry system ("electronic drug store") that will run within a Web browser. An application developer must consequently be able to extract meaningful formulary items such as drug substances and pharmaceutical forms. On the other hand, the formulary comprises a variety of data: XML documents (drug application groups), HTML documents (minutes of the drug commission) and PDF documents (clinical drug guidelines). Heterogeneous data is difficult to structure along a single schema. XML [1] provides a way to combine structure and heterogeneity. The application developer is able to identify meaningful items within the text. At the same time XML documents can coexist with differently structured and formatted data such as HTML and PDF.

Fig. 2 shows one of several hundred files that have been semi-automatically converted from plain text to XML. The formulary contains application groups (e.g. thyroid hormones), drug substances (e.g. levothyroxin-natrium), drug brand names (e.g. l-Thyroxin Henning®) and pharmaceutical forms. Cross-references to other documents are marked up as elements of their own. If necessary, the author can also attach comments, hints and other textual narrations to the elements.

Fig. 2 XML document describing drug information.

Compared to the plain text formulary, the XML formulary opens up new possibilities of processing. In a first step, the presentation of the formulary has been improved using XSL style sheets [2] (Fig. 1 ⇒ Fig. 6). Beyond the visual improvements, the clinical user can easily jump to related documents using hyperlinks. An authoring tool automatically produces an index of drugs for clinical users, which the pharmacist had to update manually in the plain text formulary. More advanced application services such as the electronic drug store will be implemented step by step. The clinical user simply clicks on a pharmaceutical form and specifies the quantity. The order entry system automatically retrieves the corresponding drug substance and other items to complete the order. The application can also remember the profiles of single clinics or even individual users and send the electronically signed order to the pharmacy.
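The order completion described above can be illustrated with a minimal sketch in Python. It is not the authors' implementation. The element names hausliste, wirkstoff and artikel are taken from the XPath expression quoted in Section 2.2; the attributes gruppe, name and form, the sample values and the function complete_order are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# Hypothetical formulary fragment. Only the element names hausliste,
# wirkstoff and artikel occur in the paper; the attributes "gruppe",
# "name" and "form" and all values are assumptions of this sketch.
FORMULARY = """
<hausliste gruppe="thyroid hormones">
  <wirkstoff name="levothyroxin-natrium">
    <artikel name="l-Thyroxin Henning" form="50 ug 100 tablets"/>
    <artikel name="l-Thyroxin Henning" form="100 ug 100 tablets"/>
  </wirkstoff>
</hausliste>
"""


def complete_order(root, form, quantity):
    """Find the clicked pharmaceutical form and fill in the rest of the order."""
    for substance in root.iter("wirkstoff"):
        for article in substance.iter("artikel"):
            if article.get("form") == form:
                return {
                    "group": root.get("gruppe"),
                    "substance": substance.get("name"),
                    "brand": article.get("name"),
                    "form": form,
                    "quantity": quantity,
                }
    return None  # the form is not listed in this document


root = ET.fromstring(FORMULARY)
order = complete_order(root, "100 ug 100 tablets", 2)
print(order)
```

Because the substance is the parent of the article in the markup, the order can be completed by walking the document tree; no separate lookup table is needed.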
2.2. Maintaining drug information using XML forms

The conversion of the drug formulary from plain text to XML was the initial effort. The continuous update of the XML formulary, however, was more challenging. It is the pharmacist who has to update the formulary's content. Authors of this kind are usually not interested in technical details. We have therefore developed a reusable authoring tool that hides the XML source in Fig. 2 and runs within a Web browser. The overall concept behind our tools is "plug-and-play XML", i.e. the development of new applications by means of XML related standards [3]. The pharmacist selects, for example, the XML document in Fig. 2 and the authoring tool automatically generates a corresponding HTML form (Fig. 5) that allows the pharmacist to enter and change the data.

The XML document may be linked to a document type definition (DTD) or XML schema. When it comes to user interfaces, we also need to describe interactive elements such as human readable labels and form controls (text areas, select boxes, radio buttons, etc.). The XForms standard allows one to define interactive elements and relate them to XML elements [4]. The XML form in Fig. 3, for example, refers to an external XML schema "hausliste.xsd" that describes the XML structure of the hospital's drug formulary. The XML schema contains optional elements for a comment and for the drug store location (e.g. refrigerator) that do not occur in the instance document of Fig. 2. XML elements are addressed using the XPath standard [5]. The XPath expression "/hausliste/wirkstoff/artikel/lagerort", for example, identifies the lagerort elements in an XML document. A selection control of the XML form is used to define a choice of possible values and to attach the value list to this element in the drug formulary. Human readable labels of elements and values allow the storage format (tags, coded values) to be hidden from the user.

A key feature of our authoring tool is structural flexibility, i.e. the author may change the structure of the documents to meet the latest documentation requirements. The pharmacy has added, for example, a unique product number to the XML formulary that allows integration with another drug system. Another aspect of flexibility is the support of different XML standards. Our authoring tool works with XML documents, DTDs, XML schemas and XML forms [6]. In the simplest case the application developer only creates an empty XML document and the author (domain expert) can immediately start to enter and change structured data within a Web browser.
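The following Python sketch illustrates the general idea of deriving an HTML form from an XML document. It is a simplified illustration under stated assumptions, not the authoring tool itself: the LABELS and CHOICES tables stand in for what the XML form would provide, the name child element and the value "shelf" are invented for the example, and repeated elements would need indexed paths in a real tool.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical labels and value lists standing in for the XML form of Fig. 3.
LABELS = {
    "hausliste": "Drug formulary",
    "wirkstoff": "Drug substance",
    "artikel": "Drug brand name",
    "lagerort": "Storage location",
}
CHOICES = {"lagerort": ["shelf", "refrigerator"]}


def to_form(element, path=""):
    """Return HTML form lines for an element and all of its descendants."""
    path = f"{path}/{element.tag}"  # XPath-like address used as control name
    label = escape(LABELS.get(element.tag, element.tag))
    value = (element.text or "").strip()
    if element.tag in CHOICES:
        # Elements with a value list become select boxes.
        options = ""
        for choice in CHOICES[element.tag]:
            selected = " selected" if choice == value else ""
            options += f"<option{selected}>{escape(choice)}</option>"
        return [f'<label>{label} <select name="{path}">{options}</select></label>']
    if len(element) == 0:
        # Leaf elements become plain text inputs.
        return [f'<label>{label} <input name="{path}" value="{escape(value)}"></label>']
    # Elements with children become a labelled group of controls.
    lines = [f"<fieldset><legend>{label}</legend>"]
    for child in element:
        lines.extend(to_form(child, path))
    lines.append("</fieldset>")
    return lines


DOC = (
    "<hausliste><wirkstoff><artikel>"
    "<name>l-Thyroxin Henning</name><lagerort>refrigerator</lagerort>"
    "</artikel></wirkstoff></hausliste>"
)
form = to_form(ET.fromstring(DOC))
print("\n".join(form))
```

The control names are the element paths, so submitted values can be written back to the addressed elements, while the labels keep tags and coded values out of the user's sight.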
2.3. Relating and searching drug information using Topic Maps

The authoring tool is only one part of the drug information system. We also need a search engine that is able to index and search XML documents. In this context we posed the question: How can a search engine exploit XML related standards to improve search quality? We investigated several search parameters. The search should be precise (most relevant information first), complete (find all relevant information), tolerant with respect to query notations (incomplete search terms, spelling errors) and, above all, simple (the information is only a click or search term away) and fast despite a huge amount of data. We distinguish three levels of XML exploitation: exploitation of (1) XML documents, (2) XML schemas and (3) XML metadata.

(1) XML documents

XML markup subdivides a text document into meaningful topics (drug applications, drug substances, drug brand names, pharmaceutical forms) that can be perceived as separate resources. Due to the XML markup a search engine can reliably identify the topics in the text. A topic within a document is a more precise search result than the document as a whole. For this kind of exploitation the name and the meaning of the XML elements do not matter. An XML element with the content "pain reliever", for example, already establishes a meaningful relationship between the terms "pain" and "reliever", independent of what the tag means. Existing text matching methods try to relate the search terms using a proximity measure, i.e. the closer the search terms occur in the text the better. However, proximity is a less reliable relationship indicator than XML markup.

XML improves not only the precision of a search but also its robustness and completeness. Let us assume a user enters the character string "painreliever". The word fragments "pain" and "reliever" represent separate topics, which is obvious to a human but not to a machine. XML markup, however, enables a search engine to count the number of topics in which the word "pain" occurs. Such a statistical analysis of the topic space is language independent and allows a search engine to reliably parse the query into meaningful atoms that can be topically related with each other. The query "painreliever" is consequently equivalent to the query "pain reliever". The decomposition of a query into meaningful atoms also improves the completeness of a search because all possible notations of a query can be composed out of it.

Fig. 3 XML form relating labels and form controls with XML elements.

(2) XML schemas

The precision of a search can be further improved if the developer knows the XML schema, i.e. the meaning of the XML elements. Using XPath expressions the developer can exactly identify the XML elements that the engine is to search. A drug formulary may contain, for example, information about contraindications. If a physician enters a disease he or she usually wants to find those drugs that apply to the disease. The developer would consequently exclude contraindications from the search. At the very least, the search engine should not mix up indicated and contraindicated drugs.

(3) XML metadata

The International Organization for Standardization (ISO) provides a standard notation, called Topic Maps (ISO 13250:1999), for interchangeably defining topics and the relationships between topics. XML is used as a base notation for Topic Maps [7]. Fig. 4 shows a simple XML topic map that associates the substance "Levothyroxin-Natrium" with the drug "Berlthyrox". We could further specify the type of relationship, e.g. "contains", and the respective roles of the topics within the association. Furthermore, the topic "Levothyroxin-Natrium" has an occurrence in the XML document "12951.xml" (Fig. 2). Topics, documents and other resources are identified and referenced by Uniform Resource Identifiers (URIs), i.e. XML metadata may also describe X-ray images and other multimedia data. XML metadata are more expressive than thesauri, which usually establish synonymous relationships between two words or phrases.

Fig. 4 XML topic map relating drug, substance and document.

Topical relationships improve the completeness of a search. The drug "Berlthyrox", for example, is not listed in the hospital's drug formulary. The XML topic map in Fig. 4, however, enables a search engine to reason that the drug "Berlthyrox" contains the substance "Levothyroxin-Natrium", which is listed in the hospital's drug formulary. As a result, the clinical user will find drugs that are equivalent to "Berlthyrox". Topical relationships are important because physicians and nurses usually search for drugs and drug applications rather than substances. About 40 application groups (analgesics, antidiabetics, etc.) have therefore been described using XML metadata. In addition, users no longer need to ask the question "Where can I find the information?" because a search engine can mediate between user and information across different information resources (topical integration). The German drug formulary, for example, provides detailed information about the indication and administration of a drug that is not represented in the hospital's drug formulary. If a clinical user searches for a drug in the hospital's formulary he or she will also find related indications and administrations of the drug in the German formulary. From the user perspective the search becomes much simpler.

The Topic Maps standard allows the representation of sophisticated relationships between resources. We have developed an inference method that utilizes the given relationships. The inference method is subdivided into two steps. The "association step" finds a set of topics that relate the search terms meaningfully with each other. The subsequent "occurrence step" relates the topics to resources such as documents and images. The search method consequently finds meaningful topics rather than meaningless words, i.e. the linguistic intelligence of the search engine improves. We refer to the inference method as "topic matching".

The key features of our approach are pragmatism and flexibility. The relationships between the data are no longer fixed in the application logic. As a result, we can start with few structural requirements and establish new relationships between the data as they become available. New data relationships enable new ways of reasoning, i.e. the intelligence of the search engine grows continuously.
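The two steps of topic matching can be illustrated with a toy Python sketch built on the example of Fig. 4. This is an assumption-laden illustration and not the published algorithm (see [8] for details): the topic identifiers, the dictionary layout, the substring matching and the restriction to directly associated topics are all simplifications chosen for this example.

```python
from collections import defaultdict

# Toy topic map modelled on Fig. 4. The topic identifiers and the data
# layout are assumptions of this sketch, not the XTM notation itself.
TOPIC_NAMES = {
    "t-berlthyrox": "Berlthyrox",
    "t-levothyroxin": "Levothyroxin-Natrium",
}
ASSOCIATIONS = [("t-berlthyrox", "t-levothyroxin")]  # drug contains substance
OCCURRENCES = {"t-levothyroxin": ["12951.xml"]}


def build_graph(associations):
    """Index associations so that related topics can be looked up both ways."""
    graph = defaultdict(set)
    for first, second in associations:
        graph[first].add(second)
        graph[second].add(first)
    return graph


def association_step(terms, names, graph):
    """Return the topics that are related to every search term."""
    related = None
    for term in terms:
        reach = set()
        for topic, name in names.items():
            if term.lower() in name.lower():
                reach.add(topic)          # the topic named by the term
                reach |= graph[topic]     # plus directly associated topics
        related = reach if related is None else related & reach
    return related or set()


def occurrence_step(topics, occurrences):
    """Relate the matched topics to resources such as documents."""
    return sorted({res for topic in topics for res in occurrences.get(topic, [])})


def topic_match(query):
    graph = build_graph(ASSOCIATIONS)
    topics = association_step(query.split(), TOPIC_NAMES, graph)
    return occurrence_step(topics, OCCURRENCES)


print(topic_match("berlthyrox"))
```

In this toy map a query for "Berlthyrox" reaches the document of the substance through the association, although the drug itself has no occurrence in the formulary.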
3. Results

As described in the methods section, we have developed an authoring tool and a search engine for XML documents. These XML tools have been applied to medical resources (drug information, clinical guidelines, medical classification systems) and other application domains (libraries and law) with encouraging results. The key concept behind our tools is "plug-and-play XML", i.e. the development of new applications by means of XML related standards. We simply describe the structure of the XML documents by a reference document, DTD or XML schema and the author can immediately start to
Fig. 5 Pharmacist updates XML drug information using a Web browser.
enter and change XML structured data using a Web browser. The XForms and XSL standards are used to develop customized user interfaces. The meta standards RDF and Topic Maps make it possible to establish intelligent search pathways. In the simplest case the developer starts with an exemplary XML document, i.e. developers do not have to learn all XML related standards at once. Such a pragmatic approach turned out to be very successful in the implementation of XML applications and allows "specialty information systems" to be developed very quickly. There are countless "specialties" beyond drug information.

The original goal of our XML tools was the clinical application of XML, i.e. pharmacists, physicians and nurses are to create and search XML documents. Ease of development ("plug-and-play XML") is a means of getting rapid end user feedback. Our access statistics show that clinical users accept XML applications if they perceive an added value. Several interviews with physicians, nurses and pharmacists revealed that clinical users appreciate, most of all, the ease of information retrieval (Web browser, information is only a click or search term away). In addition, medical people are used to filling in forms, which allow the XML markup to be hidden.

Fig. 5 shows the authoring tool that runs in the University Hospital's pharmacy. An XML schema describes the structure of the XML documents and the XML form in Fig. 3 defines corresponding labels and form controls. The authoring tool enables the pharmacist to create, search (Suche), index (Index), save (Speichern) and render (Vorschau) XML structured drug information. The HTML form is dynamic, i.e. users can collapse (zuklappen), change (verändern), create (neu anlegen), copy (kopieren)
and empty (leeren) single XML elements and XML attributes.

In the section "Relating and searching drug information using Topic Maps" we use XML standards to improve several search parameters. The difficulty is to improve all the parameters at the same time. XML and Topic Maps provide a means of achieving this goal. Optimizing a search method requires considerable application feedback. It took us more than 2 years to fully satisfy the users' expectations. Fig. 6 shows the user interface of our clinical drug information system. The search engine is similar to a Google search, i.e. physicians, nurses and other clinical users enter search terms into a Web browser and the search engine retrieves the documents of interest. Clinical users are not interested in the technical details behind such an approach. Yet it is the XML topic map in Fig. 4 that enables the search engine to find the drug "Berlthyrox", which is not listed in the hospital's drug formulary. The search is robust even with incomplete and misspelled search terms ("levo berltx"). Search terms that directly occur in the document text are highlighted by the search engine. The screenshot also shows the rendition of the XML document in Fig. 2 using XSL and CSS style sheets.
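One ingredient of this tolerance towards query notations, the decomposition of a run-together query such as "painreliever" described in the methods section, can be sketched in Python as follows. The topic texts, the function names and the dynamic-programming split are assumptions made for illustration; the paper does not specify how the statistical analysis of the topic space is implemented, and the handling of misspelled terms is not covered here.

```python
from collections import Counter

# Hypothetical topic texts, standing in for the content of marked-up elements.
TOPICS = ["pain reliever", "thyroid hormones", "levothyroxin-natrium"]


def word_counts(topics):
    """Count in how many topics each word occurs."""
    counts = Counter()
    for text in topics:
        counts.update(set(text.lower().split()))
    return counts


def segment(query, counts):
    """Split a query into words known from the topics, preferring fewer pieces.

    Returns the query unchanged (as a single atom) if no complete split exists.
    """
    query = query.lower()
    best = {0: []}  # best[i] holds the best split of query[:i]
    for end in range(1, len(query) + 1):
        for start in range(end):
            piece = query[start:end]
            if start in best and piece in counts:
                candidate = best[start] + [piece]
                if end not in best or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best.get(len(query), [query])


counts = word_counts(TOPICS)
print(segment("painreliever", counts))
```

The split relies only on words observed inside topics, so it needs no dictionary of the query language.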
We had to overcome two main difficulties in implementing such an approach. First, the authoring tool has to cope with a variety of XML structures. Sequences of XML elements are easy to implement. The implementation of choices of XML elements with forms is more difficult. The markup of XML elements in textual narrations (mixed content model) can no longer be represented by forms and requires intelligent XML editors. Even more challenging was the number of topics and topical relationships a search engine has to deal with. The German drug formulary, for example, comprises about 10,000 documents and 500,000 topical relationships. In our experience, users get impatient if the search time exceeds a second. Search time is a crucial factor in achieving broad user acceptance. The search time in Fig. 6 is less than a millisecond, which is below the time resolution of the search engine. We have developed sophisticated algorithms that can search even large topic networks (1,000,000 topics and more) efficiently using a single computer. Some algorithmic details have been described in technical papers [8]. In addition, our search engine has been designed to distribute queries across many computers (grid computing), i.e. we are able to manage any number of topics and topical relationships (scalability).

Fig. 6 Physicians and nurses search for drug information using a Web browser.
4. Discussion

4.1. Approaching the Semantic Web

The search for specific information on the Web is often time consuming. The information of interest may exist on the Web but is deeply hidden in one of the countless Web pages. The ability to filter out unwanted information must keep pace with the ever-increasing quantity of information. Google, for example, prioritizes Web pages that are directly or indirectly cited from many places around the world. This citation importance or page rank corresponds well with people's subjective idea of the relevance of a page [9]. Another technique is able to identify topically related Web pages, so-called Web communities, by analyzing the link structure of the Web [10]. Web communities can be characterized by specific text features, which enable users to identify the communities. Apparently, the Web self-organizes to some extent. Links, i.e. meaningful relationships between Web resources, seem to be the key to improving search quality. However, most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully. XML related standards will change the situation [11,12].

Healthcare systems can be compared to the Web, which is a loosely organized system of distributed data. Developments of the Web consequently apply to healthcare systems. Semantic Web applications will not only link and render multimedia data. Documents such as clinical guidelines and scientific publications contain knowledge that only a human reader is able to identify. A significant part of this knowledge remains unused because it is difficult for clinical users to keep pace with the increasing amount of data. XML related standards, on the other hand, provide an infrastructure that enables a machine to capture and utilize the knowledge and to relieve the users with respect to information retrieval. Standard ontologies, standard schemas and standard vocabularies are another prerequisite for the Semantic Web.
Standard ontologies provide standard identifiers (e.g. URIs) for classes of topics and topic relationships that are used in a specific application field. A drug ontology, for example, could provide the classes "drug", "substance" and "contains" to establish typed relationships between drugs and substances. Topic Maps refer to standard identifiers as Public Subject Indicators (PSIs). A standard schema may describe a standard document structure. The summary of product characteristics (SPC) of the European Community, for example, defines XML elements that can be used to describe drug characteristics. Fig. 7 shows an SPC document that represents some characteristics of the drug "l-Thyroxin Henning®" (see also Fig. 2). The Anatomical Therapeutic Chemical (ATC) classification system of the WHO provides a global standard for classifying medical substances and is an example of a standard vocabulary. Many standards are already in place or under construction. The HL7/ANSI Clinical Document Architecture (CDA) is just another example [13].

It is not yet clear what the Semantic Web will look like. So-called search agents will be able to organize an appointment between patient and physician or even plan a trip including flight, hotel accommodation and recreational activities. Many Web services already exist today and it is doubtful that users will leave everything to search agents. Running applications are needed to illustrate the prospect of the Semantic Web to both developers and users. Many "ingredients" for the Semantic Web have already been developed. Yet it is still costly to "cook" running applications. The fact that developers have to learn a variety of specifications and tools before they can even start to develop is only one reason. Two problems are frequently encountered and underestimated when developing XML applications. (1) It is the pharmacist, physician, nurse or secretary who will have to manage the content. The implementation of the Semantic Web is therefore a question of simple user interfaces. (2) Let us assume that users (not developers) really create XML documents. The exploitation of the XML documents requires the development of specific and schema-aware applications, i.e. there is no immediate benefit that goes beyond the simple rendition of XML documents using XSL/CSS style sheets.

Standard models (ontologies, schemas, vocabularies) are the key to making independent applications communicate. However, such applications will not emerge unless we provide an infrastructure that enables users to populate and exploit XML related standards. The idea of "plug-and-play XML" has been applied to drug information and other application fields and turned out to be very successful. Our authoring tool and search engine allow us to quickly develop user interfaces and to immediately search the resulting XML documents in an intelligent way using a Web browser. Furthermore, Topic Maps and RDF (Resource Description Framework) can be used to establish application specific search pathways.
Fig. 7 SPC represented drug information.
4.2. Prospects and limitations of our approach

The pivotal problem of medical data is the absence of machine readable structures. The university hospital's drug formulary, for example, was represented as plain text with some implicit and only human readable structure. XML related standards (XML, Topic Maps, RDF) provide the means to structure the data in a machine readable way. The document-oriented view of XML corresponds well with the organization of healthcare data. We use XML to store medical documents. Using XML as an interchange format does not add structure to the healthcare system because XML messages are usually created from databases. In the end it is always the document author, i.e. the pharmacist, physician, nurse or secretary, who has to structure the documents. Standards bodies may define and relate topics using XML meta standards. Yet it is the author who has to relate the topics to his/her documents.

Up to now, XML has been used primarily by developers. We consequently need to offer user-friendly authoring tools to change the situation. Our authoring tool translates XML documents, DTDs, XML schemas and XML forms into HTML forms that can easily be filled in by the author using a Web browser. The tool can also retrieve data from other applications and fill the form with data. Several applications show that such an approach is well accepted by medical people. The authoring tool has to cope with a variety of XML structures. Fig. 8 shows a form that represents a pathology report. The XML schema of the pathology report defines the structure "Telecommunication", which is a choice of several XML elements. Forms do not allow XML elements to be marked within textual narrations (mixed content). In many cases, however, smart XML modeling can avoid mixed content models. In addition, we usually start with simple structures that evolve over time. The three-level architecture of the CDA, for example, also supports a structural evolution of clinical documents. Structural flexibility is another feature of our approach and turned out to be very useful. The pharmacy, for example, has identified new elements and values while editing the drug information.

Fig. 8 Representing a choice of XML elements (telecommunication) using forms.

Another success factor was the combination of the authoring tool with a search engine that allows users to immediately exploit the structure produced (cost-benefit ratio). The search engine does not depend on a specific structure or schema. By default the engine will search all XML elements. The application engineer may decide, however, to search only specific elements. Our method of "topic matching" utilizes the given structure to improve search quality. In most cases the information is only a search term and click away from the user. Medical people usually have little time for information retrieval and appreciate search facilities. The search engine logs failed queries, which can be used to discover new relationships between the data and to continuously improve the search. Moreover, the search engine acts as an integration engine that searches across distributed and differently structured information resources. Our drug application, for example, is a federation of the hospital's drug formulary, the German drug formulary, clinical drug guidelines and other resources.

The University of Giessen has been developing data dictionaries for decades. The weak point of the dictionary approach, however, was the central architecture and the cost of development. Decentralization and ease of development are further features of our search engine. We simply need to list the locations of the information resources and the search engine will index the documents and exploit the given structure. The overall approach allows textual and multimedia data to be structured quickly and "data cemeteries" to be put to use. We are currently developing an application that helps users to edit and find clinical guidelines for multiple sclerosis. Other applications will follow.

Experts might object that we have not used the SPC standard for the University Hospital's drug formulary. Making the given structure explicit seemed to be more pragmatic than introducing a new structure. Moreover, it is not very difficult to transform the local structure into a standard structure using XML tools. The need for transformation will probably not arise because the Hospital's data are not supposed to be processed by other applications. For clinical guidelines, on the other hand, we will consider existing standards.
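The difference between searching all XML elements by default and searching only specific elements can be illustrated with a short Python sketch. The document structure (drug, name, indication, contraindication), its purely illustrative content and the search function are assumptions made for this example; they do not reproduce the search engine or any real formulary entry.

```python
import xml.etree.ElementTree as ET

# Hypothetical drug entry; element names and content are illustrative only.
DOC = ET.fromstring(
    "<drug><name>Aspirin</name>"
    "<indication>pain, fever</indication>"
    "<contraindication>gastric ulcer, asthma</contraindication></drug>"
)


def search(root, term, paths=None):
    """Return (tag, text) pairs of elements whose text contains the term.

    All elements are searched unless a list of element paths is given.
    """
    if paths is None:
        elements = root.iter()  # default: search every element
    else:
        elements = [e for path in paths for e in root.findall(path)]
    term = term.lower()
    return [
        (e.tag, e.text.strip())
        for e in elements
        if e.text and term in e.text.lower()
    ]


print(search(DOC, "asthma"))                          # default search
print(search(DOC, "asthma", paths=["./indication"]))  # restricted search
```

Restricting the paths to the indication element keeps a disease query from returning drugs for which the disease is a contraindication, which is the situation discussed in the methods section.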
5. Conclusion

The absence of machine readable relationships between data (structure) is the key problem, not only in healthcare systems. XML related standards provide the means to change the situation. Text matching methods fail to fully exploit the given relationships and often produce inaccurate and incomplete search results. We have therefore developed a different search method called "topic matching" that relates the search terms meaningfully with each other. The search method requires a flexible model that is able to represent sophisticated relationships between topics, documents, images, services and other resources. The ISO standard Topic Maps turned out to be a good choice for a search model and allows us to represent typed relationships such as "is synonym of" or "is topically related to". However, structure will not proliferate unless we also provide tools that support the population of XML related standards and the development of XML-aware applications. Our approach of "plug-and-play XML" has been applied to drug information and other application fields with promising results.
References

[1] T. Bray, J. Paoli, C.M. Sperberg-McQueen, E. Maler (Eds.), Extensible Markup Language (XML), Version 1.0, 2nd ed., W3C Recommendation, October 6, 2000.
[2] J. Clark (Ed.), XSL Transformations (XSLT), Version 1.0, W3C Recommendation, November 16, 1999.
[3] R. Schweiger, S. Hoelzer, U. Altmann, J. Rieger, J. Dudeck, Plug-and-play XML: a health care perspective, J. Am. Med. Inform. Assoc. 9 (1) (2002) 37–48.
[4] M. Dubinko, L.L. Klotz, R. Merrick, T.V. Raman, XForms, Version 1.0, W3C Recommendation, October 14, 2003.
[5] J. Clark, S. DeRose (Eds.), XML Path Language (XPath), Version 1.0, W3C Recommendation, November 16, 1999.
[6] XML editor X2U, accessed on January 8, 2004. http://www.w3.org/XML/Schema-X2U.
[7] S. Pepper, G. Moore (Eds.), XML Topic Maps (XTM), Version 1.0, accessed on January 8, 2004. http://www.topicmaps.org/xtm/1.0/.
[8] R. Schweiger, S. Hoelzer, D. Rudolf, J. Rieger, J. Dudeck, Linking clinical data using XML Topic Maps, Artif. Intell. Med. 28 (2003) 105–115.
[9] S. Brin, L. Page, The anatomy of a large-scale hypertextual Web search engine, accessed on January 8, 2004. http://www.db.stanford.edu/~backrub/google.html.
[10] G.W. Flake, S. Lawrence, C.L. Giles, F.M. Coetzee, Self-organization and identification of Web communities, IEEE Comput. 35 (3) (2002) 66–71.
[11] J. Bosak, T. Bray, XML and the Second-generation Web, Scientific American, May 1999.
[12] T. Berners-Lee, J. Hendler, O. Lassila, The Semantic Web, Scientific American, May 2001.
[13] Clinical Document Architecture Framework Release 1.0, ANSI/HL7 CDA R1.0-2000, Health Level Seven, Inc.