Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth

Jacob Whitehill, Anand Ramakrishnan
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6757-6765, 2019.

Abstract

Automatic machine learning-based detectors of various psychological and social phenomena (e.g., emotion, stress, engagement) have great potential to advance basic science. However, when a detector d is trained to approximate an existing measurement tool (e.g., a questionnaire, observation protocol), then care must be taken when interpreting measurements collected using d since they are one step further removed from the under- lying construct. We examine how the accuracy of d, as quantified by the correlation q of d’s out- puts with the ground-truth construct U, impacts the estimated correlation between U (e.g., stress) and some other phenomenon V (e.g., academic performance). In particular: (1) We show that if the true correlation between U and V is r, then the expected sample correlation, over all vectors T n whose correlation with U is q, is qr. (2) We derive a formula for the probability that the sample correlation (over n subjects) using d is positive given that the true correlation is negative (and vice-versa); this probability can be substantial (around 20 - 30%) for values of n and q that have been used in recent affective computing studies. (3) With the goal to reduce the variance of correlations estimated by an automatic detector, we show that training multiple neural networks d(1) , . . . , d(m) using different training architectures and hyperparameters for the same detection task provides only limited “coverage” of T^n.

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-whitehill19a, title = {Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth}, author = {Whitehill, Jacob and Ramakrishnan, Anand}, booktitle = {Proceedings of the 36th International Conference on Machine Learning}, pages = {6757--6765}, year = {2019}, editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan}, volume = {97}, series = {Proceedings of Machine Learning Research}, month = {09--15 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v97/whitehill19a/whitehill19a.pdf}, url = {https://proceedings.mlr.press/v97/whitehill19a.html}, abstract = {Automatic machine learning-based detectors of various psychological and social phenomena (e.g., emotion, stress, engagement) have great potential to advance basic science. However, when a detector d is trained to approximate an existing measurement tool (e.g., a questionnaire, observation protocol), then care must be taken when interpreting measurements collected using d since they are one step further removed from the under- lying construct. We examine how the accuracy of d, as quantified by the correlation q of d’s out- puts with the ground-truth construct U, impacts the estimated correlation between U (e.g., stress) and some other phenomenon V (e.g., academic performance). In particular: (1) We show that if the true correlation between U and V is r, then the expected sample correlation, over all vectors T n whose correlation with U is q, is qr. (2) We derive a formula for the probability that the sample correlation (over n subjects) using d is positive given that the true correlation is negative (and vice-versa); this probability can be substantial (around 20 - 30%) for values of n and q that have been used in recent affective computing studies. (3) With the goal to reduce the variance of correlations estimated by an automatic detector, we show that training multiple neural networks d(1) , . . . , d(m) using different training architectures and hyperparameters for the same detection task provides only limited “coverage” of T^n.} }
Endnote
%0 Conference Paper %T Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth %A Jacob Whitehill %A Anand Ramakrishnan %B Proceedings of the 36th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Ruslan Salakhutdinov %F pmlr-v97-whitehill19a %I PMLR %P 6757--6765 %U https://proceedings.mlr.press/v97/whitehill19a.html %V 97 %X Automatic machine learning-based detectors of various psychological and social phenomena (e.g., emotion, stress, engagement) have great potential to advance basic science. However, when a detector d is trained to approximate an existing measurement tool (e.g., a questionnaire, observation protocol), then care must be taken when interpreting measurements collected using d since they are one step further removed from the under- lying construct. We examine how the accuracy of d, as quantified by the correlation q of d’s out- puts with the ground-truth construct U, impacts the estimated correlation between U (e.g., stress) and some other phenomenon V (e.g., academic performance). In particular: (1) We show that if the true correlation between U and V is r, then the expected sample correlation, over all vectors T n whose correlation with U is q, is qr. (2) We derive a formula for the probability that the sample correlation (over n subjects) using d is positive given that the true correlation is negative (and vice-versa); this probability can be substantial (around 20 - 30%) for values of n and q that have been used in recent affective computing studies. (3) With the goal to reduce the variance of correlations estimated by an automatic detector, we show that training multiple neural networks d(1) , . . . , d(m) using different training architectures and hyperparameters for the same detection task provides only limited “coverage” of T^n.
APA
Whitehill, J. & Ramakrishnan, A.. (2019). Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:6757-6765 Available from https://proceedings.mlr.press/v97/whitehill19a.html.

Related Material