Towards a Cautious Modelling of Missing Data in Small Area Estimation

Julia Plass, Aziz Omar, Thomas Augustin
Proceedings of the Tenth International Symposium on Imprecise Probability: Theories and Applications, PMLR 62:253-264, 2017.

Abstract

In official statistics, the problem of sampling error is pushed to extremes when not only are results at the sub-population level required, which is the focus of Small Area Estimation (SAE), but missing data also arise. When the nonresponse is wrongly assumed to occur at random, the situation becomes even more dramatic, since this potentially leads to substantial bias. Even though some treatments consider both problems jointly, they all rely on strong assumptions about the missingness. For that reason, we aim to develop cautious versions of well-known estimators from SAE by exploiting the results of a recently suggested likelihood approach that is capable of adequately including tenable partial knowledge about the nonresponse behaviour. We generalize the synthetic estimator and propose a cautious version of the so-called LGREG-synthetic estimator in the context of design-based estimators. Then, we explain why the approach above does not directly extend to model-based estimators and proceed with some first studies investigating different missingness scenarios. All results are illustrated using the German General Social Survey 2014, together with area-specific auxiliary information from the German Federal Statistical Office’s data report.

Cite this Paper


BibTeX
@InProceedings{pmlr-v62-plass17a,
  title = {Towards a Cautious Modelling of Missing Data in Small Area Estimation},
  author = {Plass, Julia and Omar, Aziz and Augustin, Thomas},
  booktitle = {Proceedings of the Tenth International Symposium on Imprecise Probability: Theories and Applications},
  pages = {253--264},
  year = {2017},
  editor = {Antonucci, Alessandro and Corani, Giorgio and Couso, Inés and Destercke, Sébastien},
  volume = {62},
  series = {Proceedings of Machine Learning Research},
  month = {10--14 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v62/plass17a/plass17a.pdf},
  url = {https://proceedings.mlr.press/v62/plass17a.html},
  abstract = {In official statistics, the problem of sampling error is rushed to extremes when not only results on sub-population level are required, which is the focus of Small Area Estimation (SAE), but also missing data arise. When the nonresponse is wrongly assumed to occur at random, the situation becomes even more dramatic, since this potentially leads to a substantial bias. Even though there are some treatments jointly considering both problems, they are all reliant upon the guarantee of strong assumptions on the missingness. For that reason, we aim at developing cautious versions of well known estimators from SAE by exploiting the results from a recently suggested likelihood approach, capable of including tenable partial knowledge about the nonresponse behaviour in an adequate way. We generalize the synthetic estimator and propose a cautious version of the so-called LGREG-synthetic estimator in the context of design-based estimators. Then, we elaborate why the approach above does not directly extend to model-based estimators and proceed with some first studies investigating different missingness scenarios. All results are illustrated through the German General Social Survey 2014, also including area-specific auxiliary information from the German Federal Statistical Office’s data report.}
}
Endnote
%0 Conference Paper
%T Towards a Cautious Modelling of Missing Data in Small Area Estimation
%A Julia Plass
%A Aziz Omar
%A Thomas Augustin
%B Proceedings of the Tenth International Symposium on Imprecise Probability: Theories and Applications
%C Proceedings of Machine Learning Research
%D 2017
%E Alessandro Antonucci
%E Giorgio Corani
%E Inés Couso
%E Sébastien Destercke
%F pmlr-v62-plass17a
%I PMLR
%P 253--264
%U https://proceedings.mlr.press/v62/plass17a.html
%V 62
%X In official statistics, the problem of sampling error is rushed to extremes when not only results on sub-population level are required, which is the focus of Small Area Estimation (SAE), but also missing data arise. When the nonresponse is wrongly assumed to occur at random, the situation becomes even more dramatic, since this potentially leads to a substantial bias. Even though there are some treatments jointly considering both problems, they are all reliant upon the guarantee of strong assumptions on the missingness. For that reason, we aim at developing cautious versions of well known estimators from SAE by exploiting the results from a recently suggested likelihood approach, capable of including tenable partial knowledge about the nonresponse behaviour in an adequate way. We generalize the synthetic estimator and propose a cautious version of the so-called LGREG-synthetic estimator in the context of design-based estimators. Then, we elaborate why the approach above does not directly extend to model-based estimators and proceed with some first studies investigating different missingness scenarios. All results are illustrated through the German General Social Survey 2014, also including area-specific auxiliary information from the German Federal Statistical Office’s data report.
APA
Plass, J., Omar, A. & Augustin, T. (2017). Towards a Cautious Modelling of Missing Data in Small Area Estimation. Proceedings of the Tenth International Symposium on Imprecise Probability: Theories and Applications, in Proceedings of Machine Learning Research 62:253-264. Available from https://proceedings.mlr.press/v62/plass17a.html
