A large-scale quantitative analysis of latent factors and sentiment in online doctor reviews

  • Wallace, Byron C
  • Paul, Michael J
  • Sarkar, Urmimala
  • Trikalinos, Thomas A
  • Dredze, Mark
JAMIA - Journal of the American Medical Informatics Association 21(6):p 1098-1103, November 2014. | DOI: 10.1136/amiajnl-2014-002711

Online physician reviews are a massive and potentially rich source of information capturing patient sentiment regarding healthcare. We analyze a corpus comprising nearly 60 000 such reviews with a state-of-the-art probabilistic model of text. We describe a probabilistic generative model that captures latent sentiment across aspects of care (eg, interpersonal manner). We target specific aspects by leveraging a small set of manually annotated reviews. We perform regression analysis to assess whether model output improves correlation with state-level measures of healthcare. We report both qualitative and quantitative results. Model output correlates with state-level measures of healthcare quality, including patients' likelihood of visiting their primary care physician within 14 days of discharge (p=0.03), and including the model's output improves prediction of this outcome (p=0.10). We find similar results for healthcare expenditure. Generative models of text can recover important information from online physician reviews, facilitating large-scale analyses of such reviews.
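The regression comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the regression form or the test used, so everything here is an assumption: the code fits two nested ordinary least squares models on synthetic stand-in data (a baseline with a hypothetical `rating` predictor, and a fuller model that adds a hypothetical `sentiment` predictor standing in for model output) and computes the nested-model F statistic. The function names `ols_rss` and `nested_f_statistic` are illustrative only and do not come from the paper.

```python
import random

def ols_rss(X, y):
    """Fit y ~ X by ordinary least squares; return the residual sum of squares."""
    n, k = len(X), len(X[0])
    # Augmented normal-equation system [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((y[r] - sum(b * x for b, x in zip(beta, X[r]))) ** 2
               for r in range(n))

def nested_f_statistic(X_base, X_full, y):
    """F statistic testing whether the extra columns of X_full reduce error."""
    rss_base, rss_full = ols_rss(X_base, y), ols_rss(X_full, y)
    extra = len(X_full[0]) - len(X_base[0])   # number of added predictors
    dof = len(y) - len(X_full[0])             # residual degrees of freedom
    return ((rss_base - rss_full) / extra) / (rss_full / dof)

# Synthetic stand-in data for 50 hypothetical states (NOT the paper's data).
rng = random.Random(0)
rating = [rng.gauss(0, 1) for _ in range(50)]     # hypothetical mean star rating
sentiment = [rng.gauss(0, 1) for _ in range(50)]  # hypothetical model-derived sentiment
outcome = [0.5 * r + 0.8 * s + rng.gauss(0, 0.5)
           for r, s in zip(rating, sentiment)]

X_base = [[1.0, r] for r in rating]                        # intercept + rating
X_full = [[1.0, r, s] for r, s in zip(rating, sentiment)]  # adds sentiment
f_stat = nested_f_statistic(X_base, X_full, outcome)
print(f"nested-model F statistic: {f_stat:.2f}")
```

A large F statistic relative to the F distribution with (`extra`, `dof`) degrees of freedom would indicate that the added predictor improves the fit beyond the baseline; converting it to a p-value requires the F distribution's survival function, which is omitted here to keep the sketch dependency-free.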

Copyright © 2014 BMJ Publishing Group Ltd