1. Background |
    The rapid growth of the literature in the MEDLINE database is of invaluable benefit to biomedical researchers.  On the other hand, the unfettered introduction of new abbreviations in the literature, such as gene or protein names, hinders efficient use of the database, because abbreviations in the biomedical literature are highly ambiguous:  one abbreviation may represent multiple expansions.  A support system that helps researchers identify abbreviations in the literature is therefore strongly needed. |
2. ALICE system |
    To extract abbreviations and their expansions from the biomedical literature, we propose an algorithm called ALICE (Abbreviation LIfter using Corpus-based Extraction).  ALICE consists of three phases:  the Inner Search (IS), the Outer Extraction (OE), and the Validity Judgment (VJ).  The IS phase searches for a candidate abbreviation and determines whether the candidate is in fact an abbreviation, the OE phase extracts its expansion, and the VJ phase judges the validity of the abbreviation-expansion pair. |
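The three-phase structure can be illustrated with a minimal sketch. This is not ALICE's actual implementation: the patterns, the stop-word set, and every function name below are assumptions made for illustration, and the real system relies on far more patterns and rules than the single "initial letters before a parenthesis" heuristic used here.

```python
import re

# Hypothetical stop words; ALICE's actual stop-word lists are not reproduced here.
STOP_WORDS = {"see", "fig", "table"}


def inner_search(text):
    """IS phase (simplified): yield parenthesized tokens that look like abbreviations."""
    for match in re.finditer(r"\(([^()\s]{2,10})\)", text):
        candidate = match.group(1)
        if candidate.lower() in STOP_WORDS:
            continue  # a stop word, not an abbreviation
        if not any(ch.isupper() for ch in candidate):
            continue  # assumption: an abbreviation has at least one capital letter
        yield candidate, match.start()


def outer_extraction(text, candidate, paren_pos):
    """OE phase (simplified): take as many words left of the parenthesis
    as the candidate has alphanumeric characters."""
    words = re.findall(r"[A-Za-z0-9]+", text[:paren_pos])
    n = sum(ch.isalnum() for ch in candidate)
    if len(words) < n:
        return None
    return " ".join(words[-n:])


def validity_judgment(candidate, expansion):
    """VJ phase (simplified): accept the pair only if the word initials
    spell the abbreviation."""
    letters = [ch.lower() for ch in candidate if ch.isalnum()]
    words = expansion.split()
    return len(words) == len(letters) and all(
        word[0].lower() == letter for word, letter in zip(words, letters)
    )


def extract_pairs(text):
    """Run the three phases in sequence and collect the accepted pairs."""
    pairs = []
    for candidate, pos in inner_search(text):
        expansion = outer_extraction(text, candidate, pos)
        if expansion is not None and validity_judgment(candidate, expansion):
            pairs.append((candidate, expansion))
    return pairs
```

For a sentence such as "We studied tumor necrosis factor (TNF) and results (see) in mice (Mus).", this sketch keeps only the pair TNF / tumor necrosis factor: "(see)" is rejected in the IS step as a stop word, and "(Mus)" is rejected in the VJ step because the preceding words do not spell it.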
3. Evaluation |
    Our algorithm overcomes various limitations of other algorithms by means of many carefully constructed patterns and rules and many stop-word lists, which are based on repeated examination of a vast amount of biomedical literature.  ALICE tries to recognize all patterns of abbreviations and to extract their expansions from the literature with high precision and recall.  It achieved 95% precision and 96% recall on randomly selected literature from the MEDLINE database.  This result helps in constructing a useful abbreviation dictionary, which in turn can lead to a new algorithm for retrieving literature from the MEDLINE database. |
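For readers unfamiliar with the two measures, the following sketch shows how precision and recall are conventionally computed for extracted abbreviation-expansion pairs against a hand-annotated gold standard. The function name and the toy data are assumptions for illustration only; they are not taken from the ALICE evaluation.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for a collection of extracted pairs.

    precision = correct extracted pairs / all extracted pairs
    recall    = correct extracted pairs / all gold-standard pairs
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data invented for this example (not results from the paper).
gold = {
    ("TNF", "tumor necrosis factor"),
    ("NO", "nitric oxide"),
    ("ER", "estrogen receptor"),
    ("IL", "interleukin"),
}
extracted = {
    ("TNF", "tumor necrosis factor"),
    ("NO", "nitric oxide"),
    ("ER", "estrogen receptor"),
    ("ER", "emergency room"),  # wrong expansion for this toy corpus
}
print(precision_recall(extracted, gold))
```

In this toy case three of the four extracted pairs are correct and three of the four gold pairs are found, so both values are 0.75.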
4. How to use |
   
ALICE accepts two types of request.
    <<< Notice >>>  Files larger than 1M byte (about 450-500 entries) will be discarded.
    <<< Notice >>>  IDs must be delimited by a single space (e.g., PMID1 PMID2 PMID3 ...).
    <<< Notice >>>  Please do not break lines. |
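A request that respects the notices above could be prepared with a small helper such as the following. This is only a sketch: the function names are hypothetical, and reading "1M byte" as 1,000,000 bytes is a conservative assumption, since the exact limit the server applies is not stated here.

```python
# Assumption: "1M byte" is read conservatively as 10**6 bytes.
MAX_BYTES = 1_000_000


def format_pmid_line(pmids):
    """Join PMIDs into one line separated by single spaces, with no line breaks."""
    cleaned = []
    for pmid in pmids:
        value = str(pmid).strip()  # drops stray spaces and newlines
        if not value.isdigit():
            raise ValueError(f"not a numeric PMID: {pmid!r}")
        cleaned.append(value)
    return " ".join(cleaned)


def fits_size_limit(data: bytes) -> bool:
    """Return True if the data does not exceed the assumed size limit."""
    return len(data) <= MAX_BYTES


# Placeholder IDs, used for illustration only.
print(format_pmid_line([" 123 ", 456, "789\n"]))
```

Stripping each ID before joining guarantees exactly one space between IDs even when the input was copied from a multi-line list.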
5. Publication |
   
ALICE: An Algorithm to Extract Abbreviations from MEDLINE.      Ao H, Takagi T.      J Am Med Inform Assoc 2005; 12: 576-586.      Preprint published May 19, 2005; doi:10.1197/jamia.M1757 |
6. Contact |
    We welcome comments and suggestions.  Please send them by e-mail to aohiroko@hgc.jp. |