New Methods for Job and Occupation Classification

Project Directors: Prof. Dr. Frauke Kreuter
Project Staff: Antje Marlene Rosebrock, Knut Wenzig
DFG-funded, 2014 – 2021

Research question/goal:

Currently, most surveys use open-ended questions to ask participants about their occupation. The verbatim responses are coded afterwards into a classification with hundreds of categories and thousands of jobs, which is an error-prone, time-consuming, and costly task. When textual answers have a low level of detail, accurate coding may be impossible.
The project aimed to improve the measurement process using a novel instrument: during the survey, immediately after answering an initial open-ended question, respondents were asked a closed question about their occupation. A supervised machine learning algorithm was trained to suggest a short list of candidate job categories, from which respondents could select the most appropriate one. High usability standards can be ensured through careful design of the instrument’s layout, the interaction between interviewers and respondents, and the job descriptions used for communication.
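The suggestion step described above can be illustrated with a minimal sketch. It assumes a simple multinomial naive Bayes text classifier as a stand-in; the algorithm actually used in the project is not specified on this page. The training answers, the category labels, and the function names (`tokenize`, `train`, `suggest`) are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical training data: (verbatim answer, category label) pairs.
# The labels are invented placeholders, not codes from a real classification.
TRAINING = [
    ("nurse in a hospital", "nursing"),
    ("hospital nurse night shift", "nursing"),
    ("primary school teacher", "teaching"),
    ("teacher for maths at a high school", "teaching"),
    ("software developer", "software"),
    ("developer of web software", "software"),
]


def tokenize(text):
    """Lower-case the answer and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count tokens per category and answers per category."""
    token_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        tokens = tokenize(text)
        token_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return token_counts, label_counts, vocabulary


def suggest(answer, model, k=2):
    """Return the k most probable categories for a verbatim answer."""
    token_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    # Tokens never seen in training carry no information and are skipped.
    tokens = [t for t in tokenize(answer) if t in vocabulary]
    scores = {}
    for label, n in label_counts.items():
        # Laplace (add-one) smoothing avoids zero probabilities.
        denom = sum(token_counts[label].values()) + len(vocabulary)
        score = math.log(n / total)
        for t in tokens:
            score += math.log((token_counts[label][t] + 1) / denom)
        scores[label] = score
    return sorted(scores, key=scores.get, reverse=True)[:k]


model = train(TRAINING)
shortlist = suggest("I work as a nurse", model)
print(shortlist)
```

In an interview setting, the returned short list would be shown to the respondent as the answer options of the closed question. A real application would train on a large set of previously coded answers and use the categories of an official occupational classification.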
The new instrument has been tested in several population surveys, in which interviewers and respondents were found to be comfortable using it. We argue that data quality improves when respondents can self-select the most appropriate occupational category. However, a detailed analysis of data quality turned out to be complex and is left for future research.


Publications

Edited Books

  • Foster, Ian, Rayid Ghani, Ron S. Jarmin, Frauke Kreuter, Julia Lane (Eds.) (2017): Big Data and Social Science: A Practical Guide to Methods and Tools. 356. London, Chapman & Hall / CRC Press.

Journal Articles

  • Amaya, Ashley, Ruben L. Bach, Florian Keusch, Frauke Kreuter (2021): New Data Sources in Social Science Research: Things to Know Before Working With Reddit Data. Social Science Computer Review, 39, 4, 943–960.
  • Schierholz, Malte (2018): Eine Hilfsklassifikation mit Tätigkeitsbeschreibungen für Zwecke der Berufskodierung. AStA Wirtschafts- und Sozialstatistisches Archiv: Eine Zeitschrift der Deutschen Statistischen Gesellschaft, 12, 3-4, 285–298.
  • Schierholz, Malte, Miriam Gensicke, Nikolai Tschersich, Frauke Kreuter (2018): Occupation coding during the interview. Journal of the Royal Statistical Society: Series A (Statistics in Society), 181, 2, 379–407.

Presentations

  • Schierholz, Malte (2018): A comparison of automatic algorithms for occupation coding. [Statistische Woche, Linz, 10/09/2018 - 13/09/2018].
  • Schierholz, Malte (2018): A comparison of automatic algorithms for occupation coding. [European Conference on Data Analysis, Paderborn, 03/07/2018 - 05/07/2018].
  • Schierholz, Malte (2017): A New Auxiliary Classification with Job Activities for Occupation Coding. [7th Conference of the European Survey Research Association, Lisbon, 16/07/2017 - 20/07/2017].
  • Bethmann, Arne, Malte Schierholz, Knut Wenzig, Markus Zielonka (2014): Automatic Coding of Occupations: Using Machine Learning Algorithms for Occupation Coding in Several German Panel Surveys. [WAPOR 67th Annual Conference: Extensible Public Opinion, Nice, 03/09/2014 - 05/09/2014].
  • Schierholz, Malte, Arne Bethmann (2014): Automating Survey Coding for Occupation. [Joint Statistical Meetings 2014, Boston, Mass., 01/08/2014 - 06/08/2014].

Reports

  • Schierholz, Malte (2014): Automating survey coding for occupation. 10/2014, 65. Nürnberg, The Research Data Centre (FDZ) of the Federal Employment Agency in the Institute for Employment Research.

Thesis

  • Schierholz, Malte (2019): New Methods for Job and Occupation Classification. Mannheim, University of Mannheim.