J Am Med Inform Assoc: Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers

  • J Am Med Inform Assoc: Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers

    J Am Med Inform Assoc doi:10.1136/amiajnl-2013-001934

    Research and applications

    Influenza detection from emergency department reports using natural language processing and Bayesian network classifiers

    Ye Ye1,2,
    Fuchiang (Rich) Tsui1,2,
    Michael Wagner1,2,
    Jeremy U Espino1,
    Qi Li3

    Author Affiliations

    1Real-time Outbreak and Disease Surveillance Laboratory (RODS), Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
    2Intelligent Systems Program, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
    3Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, USA

    Correspondence to Dr Fuchiang (Rich) Tsui, Real-time Outbreak and Disease Surveillance Laboratory (RODS), Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd, 4th floor, Pittsburgh, PA 15206-3701, USA; tsui2@pitt.edu

    Received 16 April 2013
    Revised 25 September 2013
    Accepted 11 December 2013
    Published Online First 9 January 2014

    Abstract

    Objectives To evaluate factors affecting performance of influenza detection, including accuracy of natural language processing (NLP), discriminative ability of Bayesian network (BN) classifiers, and feature selection.

    Methods We derived a testing dataset of 124 influenza patients and 87 non-influenza (shigellosis) patients. To assess NLP finding-extraction performance, we measured the overall accuracy, recall, and precision of Topaz and MedLEE parsers for 31 influenza-related findings against a reference standard established by three physician reviewers. To elucidate the relative contribution of NLP and BN classifier to classification performance, we compared the discriminative ability of nine combinations of finding-extraction methods (expert, Topaz, and MedLEE) and classifiers (one human-parameterized BN and two machine-parameterized BNs). To assess the effects of feature selection, we conducted secondary analyses of discriminative ability using the most influential findings defined by their likelihood ratios.
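The finding-extraction measures and the likelihood-ratio ranking named in the Methods can be illustrated with a minimal sketch. It assumes each finding is reduced to a binary present/absent label, which is a simplification and may not match how the study coded its findings. The function names, the toy labels, and the counts are all hypothetical and are not taken from the paper.

```python
def extraction_metrics(reference, extracted):
    """Compare parser output with reference labels (True = finding present)."""
    pairs = list(zip(reference, extracted))
    tp = sum(1 for r, e in pairs if r and e)
    fp = sum(1 for r, e in pairs if not r and e)
    fn = sum(1 for r, e in pairs if r and not e)
    tn = sum(1 for r, e in pairs if not r and not e)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if tp + fn else float("nan"),
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
    }


def positive_likelihood_ratio(cases_with, cases_total, controls_with, controls_total):
    """LR+ = P(finding | case) / P(finding | control)."""
    p_case = cases_with / cases_total
    p_control = controls_with / controls_total
    if p_control == 0:
        return float("inf")
    return p_case / p_control


# Toy labels for one finding across eight reports (hypothetical).
reference = [True, True, False, True, False, False, True, False]
extracted = [True, False, False, True, True, False, True, False]
metrics = extraction_metrics(reference, extracted)

# Hypothetical counts: finding present in 60 of 100 cases, 10 of 50 controls.
lr = positive_likelihood_ratio(60, 100, 10, 50)
print(metrics, lr)
```

A finding whose likelihood ratio is far from 1 shifts the posterior more strongly, which is one plausible reading of "most influential" in the secondary analyses.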

    Results The overall accuracy of Topaz was significantly better than that of MedLEE (with post-processing) (0.78 vs 0.71, p<0.0001). Classifiers using human-annotated findings were superior to classifiers using Topaz/MedLEE-extracted findings (average area under the receiver operating characteristic curve (AUROC): 0.75 vs 0.68, p=0.0113), and machine-parameterized classifiers were superior to the human-parameterized classifier (average AUROC: 0.73 vs 0.66, p=0.0059). The classifiers using the 17 "most influential" findings were more accurate than classifiers using all 31 subject-matter expert-identified findings (average AUROC: 0.76 vs 0.70, p<0.05).
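For readers unfamiliar with AUROC, the measure used to compare the classifiers, here is a minimal sketch using the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name, labels, and scores are hypothetical; this is not the authors' evaluation code.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive-over-negative wins; a tie contributes half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs: 1 = influenza, 0 = non-influenza.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
result = auroc(labels, scores)
print(result)
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported differences (for example 0.75 vs 0.68) describe how much better one configuration ranks influenza cases above non-influenza cases.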

    Conclusions Using a three-component evaluation method we demonstrated how one could elucidate the relative contributions of components under an integrated framework. To improve classification performance, this study encourages researchers to improve NLP accuracy, use a machine-parameterized classifier, and apply feature selection methods.

