Modular Questionnaire Designs for Social Surveys: Statistical Modelling of Designed Missingness

Project Directors: Prof. Ph.D. Annelies G. Blom, Prof. Dr. Christof Wolf, Dr. Christian Bruch

Project Staff: Julian Beat Axenfeld

DFG-funded, 2017 – 2023

Research question/goal:

The project examined the usefulness of procedures for imputing the planned missing values that result from modular questionnaire designs (MQDs). The aim of the imputations was to obtain a complete dataset ready for secondary data analysis. Our research focussed on the application of MQDs in social science surveys, which are typically characterised by relatively small sample sizes, large numbers of variables, low correlations between variables/items, and categorical levels of measurement.
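To illustrate how planned missing values arise in such a design, the following is a minimal sketch, not the project's actual procedure. It assumes that items are allocated to modules at random, that every respondent receives a fixed number of randomly chosen modules, and that there is no core module answered by everyone. The function names, the 12-item questionnaire, and the simulated five-point answers are hypothetical.

```python
import numpy as np

def assign_items_to_modules(n_items, n_modules, rng):
    """Randomly allocate item indices to modules of (nearly) equal size."""
    shuffled = rng.permutation(n_items)
    return [np.sort(part) for part in np.array_split(shuffled, n_modules)]

def apply_modular_design(data, modules, modules_per_respondent, rng):
    """Keep only the items of the modules each respondent received; set the rest to NaN."""
    masked = np.full(data.shape, np.nan)
    n_modules = len(modules)
    for i in range(data.shape[0]):
        # Each respondent gets a random subset of modules (assumption: no core module).
        chosen = rng.choice(n_modules, size=modules_per_respondent, replace=False)
        for m in chosen:
            masked[i, modules[m]] = data[i, modules[m]]
    return masked

rng = np.random.default_rng(42)
# Hypothetical complete data: 200 respondents, 12 items on a five-point scale.
data = rng.integers(1, 6, size=(200, 12)).astype(float)
modules = assign_items_to_modules(n_items=12, n_modules=4, rng=rng)
observed = apply_modular_design(data, modules, modules_per_respondent=2, rng=rng)

# Share of values that are missing by design.
missing_share = np.isnan(observed).mean()
missing_per_respondent = np.isnan(observed).sum(axis=1)
```

With four equally sized modules and two modules per respondent, half of all values are missing by design; these are the values an imputation procedure would have to fill in.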

To achieve our research aims, we ran Monte Carlo simulations on datasets of the German Internet Panel (GIP) using high-performance computers of the federal state of Baden-Wuerttemberg (bwHPC).

Our research project produced several key results. First, allocating items on the same topic to the same module (on the assumption that these items are highly correlated) led to worse imputation-based estimates than either a random allocation or an allocation of same-topic items to different modules. Because most correlations in our data were small, we observed only a few differences between the latter two strategies. Second, we compared a series of imputation methods with regard to their ability to produce complete datasets that yield estimates of acceptable quality when MQDs are applied. For small samples and large numbers of variables, which are typical of social science surveys, we obtained good results with imputation methods that simplify the imputation models, for example procedures that reduce the number of predictors.
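One simple way to reduce the number of predictors can be sketched as follows. This is an illustrative assumption, not necessarily one of the methods evaluated in the project: for each variable to be imputed, only the few variables with the largest absolute correlation are kept as predictors, with correlations computed on pairwise complete cases. The function names and the simulated data are hypothetical.

```python
import numpy as np

def pairwise_correlation(x, y):
    """Pearson correlation using only cases where both variables are observed."""
    both = ~np.isnan(x) & ~np.isnan(y)
    if both.sum() < 3:
        return 0.0
    xs, ys = x[both], y[both]
    if xs.std() == 0 or ys.std() == 0:
        return 0.0
    return float(np.corrcoef(xs, ys)[0, 1])

def select_predictors(data, target, max_predictors):
    """Return the column indices most strongly correlated with the target column."""
    scores = []
    for j in range(data.shape[1]):
        if j == target:
            continue
        r = pairwise_correlation(data[:, target], data[:, j])
        scores.append((abs(r), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:max_predictors]]

rng = np.random.default_rng(1)
n = 500
common = rng.normal(size=n)
# Hypothetical data: columns 0 to 2 share a common factor, columns 3 to 5 are pure noise.
related = [common + 0.5 * rng.normal(size=n) for _ in range(3)]
noise = [rng.normal(size=n) for _ in range(3)]
data = np.column_stack(related + noise)
# Remove 30 percent of all values completely at random.
data[rng.random(data.shape) < 0.3] = np.nan

predictors_for_item_0 = select_predictors(data, target=0, max_predictors=2)
```

The selected columns would then be passed to the imputation model for item 0 in place of all available variables, which keeps the model small when the sample is small relative to the number of variables.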

Third, we examined item nonresponse, which can occur in addition to the planned missing values in MQDs. We showed that serious problems arise when the combined proportion of missing values from both sources is too large and when item nonresponse is missing not at random (MNAR). We therefore recommend reducing the number of planned missing values for items that are expected to produce high levels of item nonresponse.
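The interplay of the two sources of missingness can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a reproduction of the project's simulations: the variable is standard normal, half of its values are missing by design, and the hypothetical MNAR mechanism makes nonresponse more likely for higher values through a logistic function.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000
# Hypothetical survey variable with a true mean of zero.
values = rng.normal(loc=0.0, scale=1.0, size=n)

# Planned missingness: assigned by design, independent of the values.
planned_missing = rng.random(n) < 0.5

# Assumed MNAR item nonresponse: the probability grows with the value itself.
p_nonresponse = 1.0 / (1.0 + np.exp(-(values - 1.0)))
item_nonresponse = rng.random(n) < p_nonresponse

# A value is observed only if it is missing for neither reason.
combined_missing = planned_missing | item_nonresponse

def summarize(x, missing):
    """Return the share of missing values and the mean of the observed values."""
    return float(missing.mean()), float(x[~missing].mean())

share_planned, mean_planned = summarize(values, planned_missing)
share_combined, mean_combined = summarize(values, combined_missing)
```

In this setup the planned missingness alone leaves the observed mean close to the true value of zero, whereas adding MNAR nonresponse both raises the overall share of missing values and pulls the observed mean downward, because high values are observed less often.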


Publications

Journal Articles

  • Wiśniowski, Arkadiusz, Joseph W. Sakshaug, Diego Andres Perez Ruiz, Annelies G. Blom (2020): Integrating Probability and Nonprobability Samples for Survey Inference. Journal of Survey Statistics and Methodology, 8, 1, 120-147.

Presentations

  • Axenfeld, Julian B. (2022): Planned Missing Data in Social Surveys: Evaluating Strategies Regarding Their Design and Imputation. [8th bwHPC Symposium, virtual conference, 27/11/2022 - 27/11/2022].
  • Bruch, Christian (2018): Variance estimation under imputation using the rescaling bootstrap. [Joint Statistical Meetings, Vancouver, 27/07/2018 - 01/08/2018].

Thesis

  • Axenfeld, Julian B. (2023): Imputation of Missing Data from Split Questionnaire Designs in Social Surveys. Mannheim, University of Mannheim.