Medical Intelligence At Your Fingertips

How Does Medical World Search Work?

Medical World Search has three components: the Web crawler, the indexer, and the query processor. The Web crawler seeks out medical sites on the World Wide Web, starting from some of the major entry points for clinical medicine, then retrieves their pages and stores them on Medical World Search's disk system. The indexer recognizes medical concepts in the pages retrieved by the Web crawler and generates a large index of all medical concepts and words in those pages; this index records the pages in which each concept and word appears. The query processor lets users specify their information needs and then attempts to match the query optimally to Web pages using that index. Results are ranked and returned to the user.
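As a rough illustration of an index that records which pages each word appears in, here is a minimal Python sketch of an inverted index with an all-words lookup. The function names, the tokenizer, and the sample pages are assumptions made for this example; the actual Medical World Search index format is not described here.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_word_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def lookup_all(index, query):
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical stored pages, standing in for what a crawler would retrieve.
sample_pages = {
    "a.html": "Heart attack symptoms",
    "b.html": "Heart disease and diet",
    "c.html": "Diet tips",
}
word_index = build_word_index(sample_pages)
print(sorted(lookup_all(word_index, "heart diet")))  # ['b.html']
```

The lookup touches only the index, never the stored page text, which is why the index is generated once, ahead of any query.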

The Web Crawler

Medical World Search's Web crawler uses a combination of automated retrieval of Web sites and manual selection to retrieve and store only Web pages that have valuable clinical information. This task is made easier by the ability to determine the presence of medical concepts in pages and rank the importance of the page accordingly.
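One way to picture how the presence of medical concepts could steer a crawl is the Python sketch below. It walks a small in-memory "web" breadth-first from a seed page and keeps only pages whose concept score reaches a threshold. The term list, the scoring rule, the threshold, and the in-memory web are all hypothetical; the real crawler also relies on manual selection, which is not modeled here.

```python
from collections import deque

# Hypothetical list of single-word medical terms used for scoring.
MEDICAL_TERMS = {"infarction", "angina", "diabetes"}


def concept_score(text):
    """Count the words in the text that are known medical terms."""
    words = text.lower().split()
    return sum(1 for w in words if w.strip(".,;:") in MEDICAL_TERMS)


def crawl(web, seeds, min_score=1):
    """Visit pages breadth-first from the seeds; keep those scoring >= min_score.

    `web` maps a URL to a (text, outgoing_links) pair and stands in for
    real HTTP retrieval.
    """
    seen = set(seeds)
    queue = deque(seeds)
    kept = {}
    while queue:
        url = queue.popleft()
        text, links = web.get(url, ("", []))
        score = concept_score(text)
        if score >= min_score:
            kept[url] = score
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return kept


toy_web = {
    "seed": ("Angina and infarction overview.", ["x", "y"]),
    "x": ("Holiday photos", []),
    "y": ("Diabetes care", ["seed"]),
}
print(crawl(toy_web, ["seed"]))  # {'seed': 2, 'y': 1}
```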

The Indexer

Medical World Search's indexer uses an optimized algorithm to recognize over 200,000 different medical concepts in Web pages. The index it builds contains nearly all words and medical concepts present in a Web page. Indexing by medical concepts allows the indexer to represent, for instance, heart attack and myocardial infarction as the same concept. These medical concepts are stored in the index together with full information about their relationships. For instance, heart attack is represented as a kind of heart disease, which makes it easy to search for all documents about heart diseases.
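The idea of indexing by concept rather than by word can be sketched in a few lines of Python. The concept identifiers, the tiny term table, and the naive substring matching below are assumptions for illustration only; they are not the indexer's actual algorithm or the identifiers it uses.

```python
from collections import defaultdict

# Hypothetical thesaurus fragment: surface term -> concept identifier.
TERM_TO_CONCEPT = {
    "heart attack": "C_MI",
    "myocardial infarction": "C_MI",
    "angina pectoris": "C_ANGINA",
}

# Hypothetical "is a" links: concept -> more general concept.
PARENT = {"C_MI": "C_HEART_DISEASE", "C_ANGINA": "C_HEART_DISEASE"}


def find_concepts(text):
    """Return the concepts whose terms occur in the text (naive substring match)."""
    lowered = text.lower()
    return {cid for term, cid in TERM_TO_CONCEPT.items() if term in lowered}


def build_concept_index(pages):
    """Map each concept identifier to the set of pages that mention it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for cid in find_concepts(text):
            index[cid].add(url)
    return index


def pages_about(index, concept_id):
    """Pages indexed under a concept or under any concept narrower than it."""
    result = set(index.get(concept_id, set()))
    for child, parent in PARENT.items():
        if parent == concept_id:
            result |= pages_about(index, child)
    return result


concept_pages = {
    "a.html": "Early treatment of heart attack",
    "b.html": "Myocardial infarction: risk factors",
    "c.html": "Managing angina pectoris",
}
concept_index = build_concept_index(concept_pages)
print(sorted(concept_index["C_MI"]))                           # ['a.html', 'b.html']
print(sorted(pages_about(concept_index, "C_HEART_DISEASE")))   # ['a.html', 'b.html', 'c.html']
```

Because both synonyms map to one identifier, pages a.html and b.html land under the same index entry even though they share no medical wording.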

The Query Processor and Results Ranking

The query processor is the portion of Medical World Search with which medical professionals interact directly. The user interface lets users specify a query as a combination of medical concepts and words, and Boolean queries can be formulated. By default, more specific terms are added to the search automatically, so that a query on heart diseases results in a search for heart attack, coronary artery disease, angina pectoris, and so forth. The user can choose not to add the more specific terms. When a user submits the query, its medical concepts and words are quickly looked up in the index generated previously. The result is a list of Web pages containing the medical concepts and words in the query. Medical World Search then ranks these pages in order of importance to the user's query. The ranking draws on knowledge about medical concepts and their relationships, as well as the number of times a medical concept appears in the page and the length of the page.
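The default expansion to more specific terms, and a ranking that takes occurrence counts and page length into account, might look roughly like the Python sketch below. The hierarchy fragment, the function names, and in particular the scoring formula (term occurrences divided by page length in words) are assumptions chosen for illustration; the actual ranking function is not given in this description.

```python
# Hypothetical fragment of a term hierarchy: term -> more specific terms.
NARROWER = {
    "heart diseases": ["heart attack", "coronary artery disease", "angina pectoris"],
}


def expand(term, include_specific=True):
    """Return the term, plus all more specific terms unless the user opts out."""
    terms = [term]
    if include_specific:
        for child in NARROWER.get(term, []):
            terms.extend(expand(child, True))
    return terms


def score(text, terms):
    """Assumed score: occurrences of the query terms divided by page length in words."""
    lowered = text.lower()
    words = lowered.split()
    if not words:
        return 0.0
    hits = sum(lowered.count(t) for t in terms)
    return hits / len(words)


def search(pages, term, include_specific=True):
    """Return URLs of matching pages, best score first (ties broken by URL)."""
    terms = expand(term, include_specific)
    scored = [(score(text, terms), url) for url, text in pages.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [url for s, url in ranked if s > 0]


query_pages = {
    "a": "heart attack heart attack care",
    "b": "angina pectoris and many other unrelated words here today",
    "c": "dermatology notes",
}
print(search(query_pages, "heart diseases"))                          # ['a', 'b']
print(search(query_pages, "heart diseases", include_specific=False))  # []
```

With expansion switched off, none of the sample pages match, since none contains the literal phrase "heart diseases"; this is the trade-off the default expansion is meant to avoid.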

The Medical Intelligence

Medical World Search has knowledge of over 500,000 medical terms, including relationships between these terms, such as synonyms, more specific or more general terms, and definitions. Users of Medical World Search can easily browse and search this knowledge base. Medical World Search incorporates the medical thesaurus developed by the National Library of Medicine (NLM) as part of its Unified Medical Language System (UMLS) project. The UMLS thesaurus integrates disparate medical vocabularies, such as the NLM's own Medical Subject Headings (MeSH), the International Classification of Diseases (ICD-9-CM), the Systematized Nomenclature of Medicine (SNOMED), and the Current Procedural Terminology (CPT).
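A knowledge base of this kind, with synonyms, broader and narrower terms, and definitions, could be modeled along the lines of the Python sketch below. The class names, fields, concept identifiers, and sample definition are hypothetical and do not reflect the UMLS data format or its real identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str                               # hypothetical identifier
    preferred_name: str
    synonyms: set = field(default_factory=set)
    broader: set = field(default_factory=set)     # ids of more general concepts
    definition: str = ""


class Thesaurus:
    def __init__(self):
        self.concepts = {}   # concept_id -> Concept
        self.by_term = {}    # lowercase term -> concept_id

    def add(self, concept):
        """Register a concept under its preferred name and all its synonyms."""
        self.concepts[concept.concept_id] = concept
        for term in {concept.preferred_name, *concept.synonyms}:
            self.by_term[term.lower()] = concept.concept_id

    def lookup(self, term):
        """Find the concept for a term or synonym, or None if unknown."""
        cid = self.by_term.get(term.lower())
        return self.concepts.get(cid) if cid else None

    def narrower(self, concept_id):
        """Ids of concepts that list concept_id as a broader concept."""
        return sorted(c.concept_id for c in self.concepts.values()
                      if concept_id in c.broader)


thesaurus = Thesaurus()
thesaurus.add(Concept("C_HEART_DISEASE", "heart diseases"))
thesaurus.add(Concept("C_MI", "myocardial infarction",
                      synonyms={"heart attack"},
                      broader={"C_HEART_DISEASE"},
                      definition="Illustrative placeholder definition."))

print(thesaurus.lookup("Heart Attack").preferred_name)  # myocardial infarction
print(thesaurus.narrower("C_HEART_DISEASE"))            # ['C_MI']
```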


All documents made publicly available on this server are Copyright © 1997 Medical World Search.
In addition, for materials from the Unified Medical Language System® of the National Library of Medicine, additional copyright restrictions apply. Use of this information service is subject to the disclaimer and the terms and conditions.