COMPUTERS IN OTOLARYNGOLOGY
0030-6665/98 $8.00 + .00
SEARCHING THE MEDICAL LITERATURE
Michael G. Stewart, MD, MPH, and Aletta S. Moore, BA, Dip Lib
From The Bobby R. Alford Department of Otorhinolaryngology and Communicative Sciences, Baylor College of Medicine, Houston, Texas
OTOLARYNGOLOGIC CLINICS OF NORTH AMERICA, VOLUME 31, NUMBER 2, APRIL 1998
One of the many uses of a computer is to facilitate searching the medical literature. The number of physicians performing their own searches is increasing significantly, and several converging trends have contributed to this increase. First, the database represented by the compendium of published medical literature is enormous, and is not immediately accessible to most clinicians. Additionally, the volume of new information is increasing at a rapid pace. Further, practicing physicians are becoming more computer literate, and medical schools and residency training programs are teaching students and residents how to use computers in their practice. Concurrently, the older, cumbersome methods of searching these databases have evolved into easier "point and click" or menu-driven interfaces. Finally, the importance of "evidence-based medicine" is becoming more apparent.
In the past, literature searches have been used primarily for research and manuscript preparation. Today, however, computer-assisted literature searches are frequently used to assist in patient care and clinical problem solving. Users of literature searches have reported that they are extremely useful in several aspects of patient care, including choosing the most appropriate diagnostic test, properly diagnosing a medical condition, developing an appropriate treatment plan, and implementing a treatment plan. Users also have reported that findings from literature searches have resulted in improved patient outcomes overall, and in some cases have contributed to the implementation of life- or organ-saving treatments.⁹ In fact, in one study the use of on-line literature searches early in the course of hospitalization resulted in significantly lower hospital costs and charges, and shorter lengths of stay, than in matched control patients with the same diagnoses for whom literature searches were not performed. Even more impressive, the patients in the group where literature searches were performed had a higher average level of illness severity than patients in the control group, which would usually result in higher costs and a longer length of stay in the literature-search group.
CLINICIAN VERSUS LIBRARIAN SEARCHES
The quality of literature searches is traditionally assessed with two variables: recall and precision.¹⁰ Recall is the fraction of all relevant citations present in the literature that a search retrieves. Precision is the fraction of the citations retrieved by a search that are relevant. Searches performed by medical librarians, who are knowledgeable and trained in search techniques, are considered the "gold standard" for literature searches. However, studies have shown that after a relatively short period of training and practice, clinician searches can approach the quality of librarian searches.
AVAILABLE SOURCES OF LITERATURE SEARCHES
Searches through the National Library of Medicine
Medical Literature Analysis and Retrieval System (MEDLARS) is the information retrieval system compiled by the United States National Library of Medicine (NLM). MEDLARS consists of 40 different on-line databases; MEDLINE is one of the available databases. MEDLINE contains citations to approximately 8.6 million articles in over 3800 biomedical journals, beginning in 1966, and was accessed about 18,000 times per day in early 1997. Selected other on-line databases maintained by the NLM are listed in Table 1. The NLM can be accessed at http://www.nlm.nih.gov.
To facilitate searching, the NLM developed Grateful Med in 1986. Grateful Med is a popular, interactive software package that allows users to perform effective searches using MEDLINE and other databases. Studies of Grateful Med users have shown that MEDLINE has been the most widely used database.¹ Grateful Med is available in several formats, all of which are accessible through the NLM's home page on the Internet. Grateful Med for Windows may be downloaded at http://www.nlm.nih.gov/databases.gmwin.htm; there are also versions for DOS-based and Macintosh computers available at http://www.nlm.nih.gov/databases/gmorig.html. Beginning in 1996, on-line Internet access to Grateful Med was provided at http://igm.nlm.nih.gov through Internet Grateful Med. Although each of these software packages and the Internet access were free, registration was required and user fees were charged for most databases. These charges were based on connect time and the number of documents downloaded.
Table 1. SELECTED ON-LINE DATABASES MAINTAINED BY THE NATIONAL LIBRARY OF MEDICINE
AIDSLINE      Citations to the AIDS literature
AIDSDRUGS     Lists of substances being tested in AIDS-related clinical trials
CANCERLIT     Citations to the cancer literature
HealthSTAR    Citations to the health services technology, assessment, and research literature
HISTLINE      Citations to history of medicine literature
MEDLINE       Citations to the biomedical literature; in part to the International Nursing Index
PREMEDLINE    Citations to new biomedical literature before it enters MEDLINE
MeSH          Thesaurus of biomedical-related terms that occur in the medical literature
PDQ           (Physician Data Query) Current information on cancer treatment and clinical trials
SPACELINE     Citations to space-life sciences literature
Beginning June 26, 1997, the NLM began providing free, unlimited access to MEDLINE for all Americans over the World Wide Web. It has been estimated that more than one third of all Internet searches are performed for health-related reasons and that, with more than 10,000 health/medical web sites, the Internet is the largest source of medical information. The NLM believes that by providing free access to current, reliable medical information through MEDLINE, it can aid the public as well as physicians and other health care professionals.
The free access is available only through Internet Grateful Med, or a new search interface called PubMed. Users of the older Grateful Med products who access the NLM through telnet or a modem will continue to pay the same charges. Initially, Internet Grateful Med will continue to provide access to the same four databases: MEDLINE, AIDSLINE, HealthSTAR, and PREMEDLINE. PubMed provides access to MEDLINE and PREMEDLINE. In the future, the NLM intends to expand the number of databases included.
A feature that has been part of the Grateful Med family, Loansome Doc, will not initially be available through PubMed. All Grateful Med users can continue to use this service, which allows users to request complete copies of scientific articles for a fee. Because the articles are delivered to a library, not to your personal computer, you must make arrangements with a local medical library in order to use Loansome Doc. The NLM provides information on the location of participating medical libraries through its web site. The NLM anticipates adding a feature similar to Loansome Doc to PubMed, but some document delivery will be mediated through the original publishers.
PubMed, the newest means of access to MEDLINE from the NLM, is a development project of the National Center for Biotechnology Information at the NLM, in conjunction with publishers of biomedical literature.
It is available from the NLM homepage, or at http://www4.ncbi.nlm.nih.gov/PubMed/. The new search engine interface, Entrez, is flexible and powerful and should allow users to perform both simple and complex searches. Several new and useful features are introduced, including the "see related" and "neighbor" features, which allow extensive groups of articles to be retrieved using a relevance ranking algorithm. Links are provided from the citations to the participating publishers' full text servers.
It is anticipated that further changes will be made for accessing databases through the NLM. Internet Grateful Med and PubMed will be made to "talk to each other," and eventually the best features of the two systems will be merged into one.
Other Providers of Search Software and Access
Besides the NLM, other vendors license MEDLINE as well as other medical databases, and offer search interfaces either by dial-up through a modem or telnet, or through the Internet. Some of these are listed in Table 2. Some services are free, and some require membership in an organization. Several of these services are commercial ventures and charge user fees, which range from per-hour charges to yearly charges for unlimited use, or minimum monthly fees. Some also charge download fees based on the number of citations or abstracts downloaded. Complete copies of articles are available from some of these services, and are supplied by mail, fax, or download onto your computer.
The user should be aware that the free MEDLINE sites often have advertising banners on some of their screens. Additionally, the user should be aware that these services use a variety of front-end search programs, ranging from simple to sophisticated. Search results often will be dependent on the quality of the front-end software. Services also vary in the years of coverage of MEDLINE that are provided, as well as in the frequency of updates that are made to the database.
Each user will need to evaluate which access best suits their needs, by evaluating the services that each provides. Some of the factors to consider are currency, reliability, and management of the service. Users also should consider other features of the search interface that will increase the productivity of their searches. Ease of access and personal preferences also will play a role in each person's decision, as will the availability of training and support. Changes to all of these services will certainly take place as a result of the NLM's decision to provide free Internet access to MEDLINE, especially as the NLM's service evolves and improves.
Table 2. SELECTED SUPPLIERS OF FRONT-END SOFTWARE TO FACILITATE ON-LINE LITERATURE SEARCHES
Product                        Supplier
Grateful Med                   NLM http://igm.nlm.nih.gov
NLM's PubMed                   NLM http://www4.ncbi.nlm.nih.gov/PubMed/
Ovid Technology                http://www.ovid.com
Dialog                         http://www.krinfo.com/products/dialog
PaperChase                     http://enterprise.bih.harvard.edu/paperchase/
Avicenna                       http://www.avicenna.com
Community of Science, Inc.     http://muscat.gdb.org/repos/medl/
Healthgate                     http://www.healthgate.com
Helix                          http://www.helix.com
At the time of publication there were several sites on the Internet that provided a comparison of the different sources of MEDLINE information available: www.hsc.missouri.edu/library/docs/mla97.html, which is a poster presentation by Edwards et al, presented at the Medical Library Association's 1997 meeting; http://www.docnet.org.uk/drfelix, which is Dr. Felix's Free Medline page by Helga Perry, maintained by the Gloucestershire Royal Hospital Library; http://www.medmatrix.org/SPages/medline.stm, from Healthtel Corporation's Medical Matrix; and http://biomednet.com/cgi-bin/fulltext/deliver.pl?uid=bb75mb&picshow=no from Current Biology on-line.
Haynes et al. studied both the costs and the quality of searches provided by several commercially available on-line and CD-ROM software packages. The authors asked clinicians and librarians to perform literature searches on a set of clinical questions, and then ranked the searches performed with each software package using recall, precision, and cost per relevant citation. The authors found substantial differences in the performance of different software packages. Overall, PaperChase software performed best for clinician searches, and was least costly per relevant citation. PaperChase was developed by physicians at Beth Israel Hospital in Boston, and has been commercially available since 1984. One popular feature of PaperChase is its interactive nature; the program helps the user perform better searches by prompting for appropriate medical subject headings (MeSH) terms, and so forth.
Although MEDLINE is the largest and most widely used citation database, there are other databases available on-line.
For instance, EMBASE is a bibliography of medical subjects from 3500 journals, with an emphasis on drug information, and the Comprehensive Core Medical Library (CCML) contains the complete texts of 15 textbooks, 30 yearbooks, and 70 journals of general interest. Databases available at no charge from the National Cancer Institute include CancerNet, Cancer Information Service, and CancerFax. These are useful adjuncts to MEDLINE, and many physicians may find that their patients are increasingly aware of such information sources.
The quality of medical literature is based on the peer-review and editorial process. Articles take time to be published and, even once published, take more time to be indexed by databases such as MEDLINE. There is a time lapse of 2 to 7 months between publication and the time the citation reaches MEDLINE. Sometimes this lack of currency becomes an issue in the search for information. This problem may be solved partially by the inclusion of PREMEDLINE in such services as PubMed. Publishers of medical and scientific journals submit prepublication information to the database, allowing this information to be searched, although without MeSH terms. In addition, there are some databases, such as AIDSTRIALS and PDQ, that provide information about experimental treatments. Subscription to a service such as Current Contents allows the user to access the table of contents and citations for current issues of the standard medical and scientific journals.
CD-ROM Databases
The huge amount of data that can be stored on a single CD-ROM disc makes it an ideal medium for databases of medical literature citations. There are several commercially available CD-ROM databases, although most are fairly expensive for an individual user. In large groups or academic departments, if a computer network or a CD-ROM tower system (which enables several users to access the CD at once) is available, a MEDLINE collection on CD-ROM is one alternative for literature database searches. A complete file of MEDLINE references (back to 1966) may be purchased on CD-ROM, and there are several subscription services that regularly send new CD-ROMs with current citations to subscribers. A list of CD-ROM services is shown in Table 3.
Selective Dissemination of Information
Selective dissemination of information (SDI) helps to keep physicians abreast of new literature and is a service available through some MEDLINE providers. A search is formulated on the subject of primary interest to the physician. That search is stored and then rerun periodically, either manually or, in some systems, automatically, to alert the physician to new literature. Note that the new NLM service PubMed does not allow the user to store searches. New developments in "push" technology, however, will certainly allow physicians to be alerted to new literature in their areas of interest.
Another useful tool for the physician is ALERTS, available at no charge through the NLM web site. ALERTS notifies physicians about findings from selected NIH-funded clinical trials that are considered too urgent to wait for normal publication.
SEARCH TECHNIQUES
The search process will depend on the front-end software the user chooses. Each "gateway" has its own search interfaces, and the searcher should take advantage of all training materials that are provided to obtain the best possible search results. Most systems provide either a user manual or on-line or contextual help features. But there are also some general principles common to most systems that will help any searcher.
Table 3. SELECTED LIST OF CD-ROM LITERATURE DATABASES
EBSCO            http://www.ebsco.com/    Variety of databases available
Elsevier         http://www.elsevier.com/inca/homepage/sah/sp~embase/cdromsl.htm    Silver Platter MEDLINE and EMBASE
Silver Platter   http://www.silverplatter.com/    MEDLINE and subsets
Knight-Ridder    http://www.krinfo.com/    Dialog and OnDisc Medline
Free-Text Searches
Searching free text means looking for words that are used in either the title or abstract of a citation. To do a comprehensive search (one with high recall), the user must enter all possible synonyms of a term: for example, caustic ingestion, corrosive burns, caustic stenosis, esophageal burn, and so forth. Some systems also require the user to enter both singular and plural forms, and variant spellings of each word. Some MEDLINE search systems use a type of free-text searching called natural language processing (NLP). Using word stemming, the expansion of synonyms, and term weighting, the system tries to improve the search results. Although free-text searching is sometimes a useful approach for locating an article, it is used most often in conjunction with a controlled vocabulary, such as MeSH.
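The effect of synonyms on recall can be illustrated with a minimal sketch. The five citations below are invented for illustration and do not come from MEDLINE, and the function name free_text_search is hypothetical; the matching rule (a case-insensitive substring match on title and abstract) is an assumption and is far simpler than what any real search system does.

```python
# Hypothetical toy citations; real records would come from a MEDLINE provider.
CITATIONS = [
    {"id": 1, "title": "Caustic ingestion in children",
     "abstract": "A review of presentation and management."},
    {"id": 2, "title": "Management of corrosive burns of the esophagus",
     "abstract": "Early endoscopy guides treatment."},
    {"id": 3, "title": "Dilation of esophageal strictures",
     "abstract": "Outcomes in patients with caustic stenosis."},
    {"id": 4, "title": "Esophageal burn after lye exposure",
     "abstract": "Case report."},
    {"id": 5, "title": "Chronic sinusitis outcomes",
     "abstract": "A cohort study."},
]

# The synonyms listed in the text for the same clinical concept.
SYNONYMS = ["caustic ingestion", "corrosive burns",
            "caustic stenosis", "esophageal burn"]


def free_text_search(citations, terms):
    """Return ids of citations whose title or abstract contains any term."""
    wanted = [t.lower() for t in terms]
    hits = []
    for c in citations:
        text = (c["title"] + " " + c["abstract"]).lower()
        if any(t in text for t in wanted):
            hits.append(c["id"])
    return hits


# Citations 1 to 4 are the relevant ones in this toy collection.
RELEVANT = {1, 2, 3, 4}

narrow = free_text_search(CITATIONS, ["caustic ingestion"])
broad = free_text_search(CITATIONS, SYNONYMS)

# Recall = relevant citations retrieved / all relevant citations.
narrow_recall = len(RELEVANT & set(narrow)) / len(RELEVANT)
broad_recall = len(RELEVANT & set(broad)) / len(RELEVANT)
print(narrow, narrow_recall)
print(broad, broad_recall)
```

In this toy collection the single-term search finds one of the four relevant citations, while the search with all four synonyms finds all of them. A controlled vocabulary removes the need to guess every synonym in advance.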
MeSH Terms
The entire search process relies on an accurate formulation of the search question. To translate a question into a search query, the user should break down the question into concepts. MeSH terms are the "vocabulary" used to categorize concepts in MEDLINE and other MeSH-based databases (such as AIDSLINE, CANCERLIT, etc.). These MeSH terms constitute a thesaurus that contains all the concepts that appear in the literature; as significant new or modified concepts appear in the literature, new MeSH terms are created and added to the thesaurus. There were about 17,000 MeSH terms in 1994.¹⁰ All scientific articles are indexed into MEDLINE using, on average, 10 to 12 MeSH terms that describe the articles' content. Articles are indexed to the most specific MeSH terms that are applicable.
To make searching easier, articles also are classified by their "major concept" MeSH terms. When executing a search, if the searcher indicates only "major concept" articles, then only articles that address the MeSH term as their primary focus will be retrieved. Although major concept searching can significantly reduce the number of citations yielded by a search, in some instances a searcher may need every article that has even mentioned the concept of interest. In this case, major concept searching is not advised.
Some search interfaces have default automatic features that "map" search queries to the closest MeSH term. For example, the system would match the query "cancer" to the MeSH term "neoplasm." Users should be aware of whether their service does this, and know how to use both MeSH and free-text searching.
Associated with MeSH major concept terms is a list of approximately 80 topical subheadings (e.g., diagnosis, drug therapy, or etiology). Certain search interfaces allow users to limit their search using these subheadings, making the search more narrowly focused.
Because MeSH terms are organized into a hierarchical arrangement, sometimes called a tree structure, and because the articles are indexed to the most specific MeSH term possible, a very powerful search strategy is to "explode" the MeSH term. This means that all the underlying terms will be included. For example, by exploding the MeSH term "ear," all articles on inner, middle, and outer ear would be included. Some search programs automatically explode any MeSH term the user enters, although in other programs the user must specify this option. Again, the user needs to be aware which option is the default in any system.
There are MeSH terms for publication type, which indicate the type of article, rather than its concept (e.g., review, clinical trial, meta-analysis). Also, searches may be limited by language, geographic terms, age groups, gender, and a variety of other terms that are called "check-tags." Many systems also allow the searcher to specify what journals are to be searched, allowing the physician to find articles that are readily available. All of these features allow the search to be more narrowly focused.
Using the MeSH vocabulary requires some learning, because MeSH terms are not always intuitive. However, the effort put into familiarization with the MeSH vocabulary will pay off in improved search quality. Authors have compared literature searches performed using either free text or MeSH terms, and MeSH-based searches have higher recall and precision than free text-based searches.¹⁰ As stated before, many available software packages will guide the searcher into using only MeSH terms to improve search quality.
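The "explode" operation described above can be sketched as a walk down a tree. The tree and the index below are simplified, hypothetical stand-ins rather than the actual MeSH hierarchy or MEDLINE records, and the function names explode and search are illustrative only.

```python
# Simplified, hypothetical fragment of a MeSH-like tree (parent -> children).
TOY_TREE = {
    "Ear": ["Ear, External", "Ear, Middle", "Ear, Inner"],
    "Ear, Inner": ["Cochlea"],
}

# Hypothetical index: citation id -> headings assigned by the indexer.
# Each article is indexed to the most specific applicable heading.
TOY_INDEX = {
    101: {"Ear"},
    102: {"Ear, Middle"},
    103: {"Cochlea"},
    104: {"Sinusitis"},
}


def explode(term, tree):
    """Return the term together with every heading beneath it in the tree."""
    found = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in tree.get(current, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def search(headings, index):
    """Return sorted ids of citations indexed under any of the headings."""
    return sorted(cid for cid, assigned in index.items() if assigned & headings)


unexploded = search({"Ear"}, TOY_INDEX)
exploded = search(explode("Ear", TOY_TREE), TOY_INDEX)
print(unexploded)
print(exploded)
```

Without exploding, only the article indexed directly under "Ear" is found; exploding also retrieves the articles indexed under the more specific middle ear and cochlea headings. This is why it matters whether a given search program explodes terms by default.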
In addition, the National Library of Medicine has developed the Unified Medical Language System (UMLS), which contains a large set of links between MeSH terms and commonly used medical terms, such as those in the International Classification of Diseases (ICD) and Current Procedural Terminology (CPT) manuals.¹⁰ Information on UMLS and a browseable MeSH thesaurus both can be accessed through the National Library of Medicine's web site. Internet Grateful Med has links between the search functions and the MeSH megathesaurus, which provides information on the scope of each term and its related subheadings, as well as information on various check-tags that are available.
Despite training and familiarization with search techniques and terminology, there still may be instances when a clinician or researcher is unable to retrieve a satisfactory number of relevant articles after designing and performing the search themselves. In those circumstances, a local medical library is a valuable resource for assistance and guidance in performing literature searches.
Boolean Search Techniques
MEDLINE and most other databases allow the use of Boolean logic to improve search specificity. Once the search query has been broken down into concepts (usually MeSH terms) and each of these searched separately, these can then be connected with the following Boolean operators: AND, OR, or NOT. The operator AND retrieves only those articles in which both concepts (connected by the word AND) appear. The operator OR retrieves any article in which either of the two terms appears. By using NOT, you eliminate any article that contains the specified concept.
For example, if a clinician is searching for clinical trials in sinusitis, an appropriate starting point for the search would be "sinusitis" AND "clinical trial," both of which are MeSH subject headings. Or, if the clinician is searching for articles on sarcoidosis and sinusitis, but is concerned that all potentially relevant articles that address sarcoidosis might not be indexed under "sarcoidosis," the following strategy could be used.
Search #1: "sinusitis"
Search #2: "sarcoidosis" OR "granulomatous disease"
Search #3: "Search #1" AND "Search #2"
Although this search might retrieve too many nonrelevant articles (e.g., sinusitis and tuberculosis), which means low search precision, this strategy might yield some articles of interest that would not have been identified using only "sarcoidosis and sinusitis." If the clinician wished to rule out those articles on tuberculosis, he might add this search statement:
Search #4: "Search #3" NOT "tuberculosis."
The searcher should be cautious in the use of the Boolean operator NOT, especially in free-text searches. If the last search was a free-text search, using "NOT tuberculosis" would rule out any article that mentioned the term tuberculosis, even if it focused on sarcoidosis. Additionally, the searcher should be aware that different search packages require different methods of using Boolean operators. For example, PubMed requires that the operators be typed in uppercase. Most instruction manuals about literature searching and search software will assist the searcher in using Boolean strategy to improve their searches.
Clinical Queries
Clinical Queries is a new feature of PubMed that supplies a specialized search intended for clinicians. Clinical Queries has built-in "filters" that are based primarily upon the work of Haynes. The searcher can specify one of four study categories: therapy, diagnosis, etiology, or prognosis. The searcher also should indicate whether the search should be more sensitive (retrieving more relevant articles, but probably including some that are less relevant) or more specific (more narrowly focused, but probably omitting a few articles).
IMPROVING THE SEARCH STRATEGY
An effective literature search is an iterative process. Continuous evaluation of the search results is important. Several authors have described multistep techniques to improve the efficiency and effectiveness of literature searches. Although some of the steps are fairly self-evident, here are some helpful tips towards improving search strategy.
1. Carefully read on-line user manuals about the front-end software you are using. Each search system is different, and familiarizing oneself with the system used will significantly improve results. On-line systems also provide contextual help screens that should be used. Or, enroll in a short instructional course (given by a local medical library or medical society) on how to perform literature searches. One or two hours of instruction before beginning to perform searches can save dozens of future hours spent on inefficient searches.
2. Plan a search strategy before going on-line, but be willing to amend the strategy as search results are reviewed.
3. Familiarize oneself with the appropriate MeSH terms while planning the search.
4. Display the results of the searches as one goes, and review them for relevance. Display the MeSH terms for articles that are "on the mark" and use these MeSH terms with a "see related" or "neighbor" feature to find other similar articles.
5. It is often more useful to begin with a broader subject approach and then to narrow the search, rather than to try to widen a too narrowly focused search.
6. If almost nothing is found, be cautious. Try several approaches before coming to the conclusion that there is no literature on a topic.
MEDLINE's presence on the Internet means that the physician now has many options to choose from in searching the medical literature. It is too early to predict what kinds of changes will result from the NLM's decision to provide free access to MEDLINE, or how successful its new interface, PubMed, will be. Each searcher needs to be proactive in selecting the system that is best for them, and in acquiring adequate training that will enable them to achieve satisfactory search results.
References
1. Blecic DD: Comparison of fixed-fee Grateful Med database use and searching success rates given the continued availability of MEDLINE in other formats. Bull Med Libr Assoc 84:50?-512, 1996
2. Chambliss ML: Personal computer access to MEDLINE: An introduction. J Fam Pract 32:414-419, 1991
3. Evidence-Based Medicine Working Group: Evidence-based medicine: A new approach to teaching the practice of medicine. JAMA 268:2420-2425, 1992
4. Haynes RB, Ramsden MF, McKibbon KA, et al: On-line access to MEDLINE in clinical settings: Impact of user fees. Bull Med Libr Assoc 79:377-381, 1991
5. Haynes RB, Walker CJ, McKibbon KA, et al: Performances of 27 MEDLINE systems tested by searches with clinical questions. J Am Med Inform Assoc 1:275-295, 1994
6. Haynes RB, Wilczynski N, McKibbon KA, et al: Developing optimal search strategies for detecting clinically sound studies in MEDLINE. J Am Med Inform Assoc 1:447-458, 1994
7. Klein MS, Ross FV, Adams DL, et al: Effect of on-line literature searching on length of stay and patient care costs. Acad Med 69:489-495, 1994
8. Lindberg DAB, Siegel ER: On assessing the impact of medical information: Does MEDLINE make a difference? Methods Inf Med 30:239-240, 1991
9. Lindberg DAB, Siegel ER, Rapp BA, et al: Use of MEDLINE by physicians for clinical problem solving. JAMA 269:3126-3129, 1993
10. Lowe HJ, Barnett GO: Understanding and using the medical subject heading (MeSH) vocabulary to perform literature searches. JAMA 271:1103-1108, 1994
Address reprint requests to
Michael G. Stewart, MD, MPH
Bobby R. Alford Department of Otorhinolaryngology and Communicative Sciences
Baylor College of Medicine
One Baylor Plaza, SM-1727
Houston, TX 77030