A maximum-mean-discrepancy goodness-of-fit test for censored data

Tamara Fernandez, Arthur Gretton
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:2966-2975, 2019.

Abstract

We introduce a kernel-based goodness-of-fit test for censored data, where observations may be missing in random time intervals: a common occurrence in clinical trials and industrial life-testing. The test statistic is straightforward to compute, as is the test threshold, and we establish consistency under the null. Unlike earlier approaches such as the Log-rank test, we make no assumptions as to how the data distribution might differ from the null, and our test has power against a very rich class of alternatives. In experiments, our test outperforms competing approaches for periodic and Weibull hazard functions (where risks are time-dependent), and does not show the failure modes of tests that rely on user-defined features. Moreover, in cases where classical tests are provably most powerful, our test performs almost as well, while being more general.
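
To make the setting concrete, the following is a minimal sketch of what censored data with a time-dependent (Weibull) hazard can look like. It is not the test proposed in the paper. It assumes right censoring with an independent exponential censoring time, which is a common convention but an assumption here, and the function names `weibull_hazard` and `simulate_right_censored` are hypothetical.

```python
import math
import random


def weibull_hazard(t, shape, scale=1.0):
    """Weibull hazard h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0.

    shape == 1 gives a constant hazard; shape > 1 gives a risk that grows with time.
    """
    return (shape / scale) * (t / scale) ** (shape - 1)


def simulate_right_censored(n, shape, scale=1.0, censor_rate=0.5, seed=0):
    """Draw n right-censored observations (illustrative assumption, not the paper's setup).

    Each observation is a pair (observed_time, event_observed), where
    observed_time = min(event_time, censoring_time) and event_observed is True
    when the event happened before censoring.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        event_time = rng.weibullvariate(scale, shape)  # arguments: scale, then shape
        censoring_time = rng.expovariate(censor_rate)  # assumed exponential censoring
        sample.append((min(event_time, censoring_time), event_time <= censoring_time))
    return sample


if __name__ == "__main__":
    data = simulate_right_censored(n=5, shape=2.0)
    for observed_time, event_observed in data:
        print(f"{observed_time:.3f}", "event" if event_observed else "censored")
```

A goodness-of-fit test in this setting only sees the pairs returned by `simulate_right_censored`, never the underlying event times of censored observations, which is why standard uncensored tests do not apply directly.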

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-fernandez19a,
  title = {A maximum-mean-discrepancy goodness-of-fit test for censored data},
  author = {Fernandez, Tamara and Gretton, Arthur},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages = {2966--2975},
  year = {2019},
  editor = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume = {89},
  series = {Proceedings of Machine Learning Research},
  month = {16--18 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v89/fernandez19a/fernandez19a.pdf},
  url = {https://proceedings.mlr.press/v89/fernandez19a.html},
  abstract = {We introduce a kernel-based goodness-of-fit test for censored data, where observations may be missing in random time intervals: a common occurrence in clinical trials and industrial life-testing. The test statistic is straightforward to compute, as is the test threshold, and we establish consistency under the null. Unlike earlier approaches such as the Log-rank test, we make no assumptions as to how the data distribution might differ from the null, and our test has power against a very rich class of alternatives. In experiments, our test outperforms competing approaches for periodic and Weibull hazard functions (where risks are time dependent), and does not show the failure modes of tests that rely on user defined features. Moreover, in cases where classical tests are provably most powerful, our test performs almost as well, while being more general.}
}
Endnote
%0 Conference Paper
%T A maximum-mean-discrepancy goodness-of-fit test for censored data
%A Tamara Fernandez
%A Arthur Gretton
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-fernandez19a
%I PMLR
%P 2966--2975
%U https://proceedings.mlr.press/v89/fernandez19a.html
%V 89
%X We introduce a kernel-based goodness-of-fit test for censored data, where observations may be missing in random time intervals: a common occurrence in clinical trials and industrial life-testing. The test statistic is straightforward to compute, as is the test threshold, and we establish consistency under the null. Unlike earlier approaches such as the Log-rank test, we make no assumptions as to how the data distribution might differ from the null, and our test has power against a very rich class of alternatives. In experiments, our test outperforms competing approaches for periodic and Weibull hazard functions (where risks are time dependent), and does not show the failure modes of tests that rely on user defined features. Moreover, in cases where classical tests are provably most powerful, our test performs almost as well, while being more general.
APA
Fernandez, T. & Gretton, A. (2019). A maximum-mean-discrepancy goodness-of-fit test for censored data. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:2966-2975. Available from https://proceedings.mlr.press/v89/fernandez19a.html.
