Are Online Reviews of Physicians Biased Against Female Providers?
Proceedings of the 4th Machine Learning for Healthcare Conference, PMLR 106:406-423, 2019.
Patients increasingly seek out information regarding their healthcare online. Online reviews of caregivers in particular may influence from whom patients seek treatment. Are these sources biased against female providers? To address this question, we analyze a new dataset of online patient reviews of male and female healthcare providers with respect to numerical ratings and language use. We perform both regression analyses and (data-driven) qualitative analyses of language via neural embedding models induced over review texts. In both cases we account for provider specialty; to do so while learning embeddings, we explicitly induce specialty, sex, and rating embeddings from review meta-data via a ‘matched-sampling’ training regime. We find that female providers consistently receive less favorable numerical ratings overall, even after adjusting for specialty. To analyze language use in reviews of male versus female providers, we induce neural embeddings (distributed representations) of gender and qualitatively characterize the ‘distributional semantics’ they induce. We observe differences in language use; e.g., analysis of average vector similarities over repeated runs reveals that many of the words closest to the coordinates in embedding space associated with positive sentiment and female providers describe interpersonal characteristics (sweet, considerate, caring, personable, compassionate); such descriptors are not as similar to the point corresponding to positive sentiment regarding male providers. To facilitate research in this direction, we publicly release data, embeddings, and all code (including Jupyter notebooks) needed to reproduce our analyses and further explore the data: https://github.com/avi-jit/RateMDs.
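The nearest-word analysis described above can be sketched as follows. This is a minimal, hypothetical illustration (not the released model): it assumes learned word vectors and attribute embeddings for sex and rating, and ranks vocabulary words by cosine similarity to a query point formed by summing attribute embeddings. All vectors and names here are toy assumptions.

```python
import numpy as np

# Toy stand-ins for learned embeddings; in the actual analysis these
# would come from the model trained over review texts.
rng = np.random.default_rng(0)
dim = 50
vocab = ["caring", "sweet", "knowledgeable", "efficient", "compassionate"]
word_vecs = {w: rng.normal(size=dim) for w in vocab}

female_vec = rng.normal(size=dim)    # hypothetical attribute embedding: sex = female
positive_vec = rng.normal(size=dim)  # hypothetical attribute embedding: rating = positive

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest_words(query, word_vecs, k=3):
    """Return the k vocabulary words most similar to the query point."""
    sims = {w: cosine(query, v) for w, v in word_vecs.items()}
    return sorted(sims, key=sims.get, reverse=True)[:k]

# Query point: positive sentiment regarding female providers.
print(nearest_words(positive_vec + female_vec, word_vecs))
```

Averaging these similarity rankings over repeated training runs, as the abstract describes, would reduce sensitivity to any single embedding initialization.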