The use of latent variable models in policy: a road fraught with peril?

20 August, 2019
Kobe, Japan

Choice modellers have long recognized that people’s attitudes and beliefs affect their decisions. However, self-reported measures of attitudes and beliefs are subject to measurement error, and they are likely correlated with unobserved factors influencing choice, leading to potential endogeneity bias. The integrated choice and latent variable model, also known as the hybrid choice model, allows for the inclusion of these measures indirectly through a latent variable. Variations of the model have been implemented in marketing, transport, environmental and health economics.

The basic premise of the model is that answers to Likert scale questions concerning attitudes and beliefs can be mapped to latent character traits. For example, responses to questions about nature conservation might reveal that you are latently a tree hugger. However, the latent variable is difficult to interpret, and it is unclear whether it measures what the researcher thinks it measures. Some authors argue that integrated choice and latent variable models should be accompanied by exploratory factor analysis to better understand which attitudinal questions correlate with which latent constructs. Nonetheless, this lack of interpretability raises questions about the appropriateness of using hybrid choice models to inform policy.
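
For readers unfamiliar with the model, its structure can be sketched as follows. This is a minimal, generic specification and not necessarily the one used in the talk; all notation is assumed here for illustration. Respondent $n$ has a latent variable $\alpha_n$ that depends on observed characteristics $z_n$, drives the ordered (Likert) response $I_{nk}$ to attitudinal question $k$ through thresholds $\tau_{k,j}$, and enters the utility $U_{nit}$ of alternative $i$ in choice task $t$ alongside the attributes $x_{nit}$.

```latex
\begin{align}
  % Structural equation: latent variable as a function of observed characteristics
  \alpha_n &= \gamma' z_n + \eta_n, \qquad \eta_n \sim N(0, 1) \\
  % Measurement equations: ordered response to attitudinal question k
  I_{nk}^{*} &= \zeta_k \alpha_n + \nu_{nk}, \qquad
  I_{nk} = j \ \text{ if } \ \tau_{k,j-1} < I_{nk}^{*} \le \tau_{k,j} \\
  % Choice model: the latent variable enters the utility of alternative i
  U_{nit} &= \beta' x_{nit} + \lambda_i \alpha_n + \varepsilon_{nit}
\end{align}
```

In this sketch the attitudinal responses never enter the utility function directly; they only help identify $\alpha_n$, which is how the model includes the self-reported measures indirectly.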

An inappropriate use of hybrid choice models is to inform policies that seek to influence choice by targeting the latent variable, given the possible endogenous relationship between the two. While this appears to be common in transport, researchers in other fields, e.g., environmental economics, instead seek to understand how latent character traits affect choices and behavior. As such, hybrid choice models could serve a purpose in identifying groups of people who are more susceptible to a certain policy measure.

A familiar aphorism among econometricians is that all models are wrong. To be of practical use, however, choice model results must be understandable to a non-technical audience. Choice modellers, like other econometricians, should heed the words attributed to Albert Einstein: “Everything should be made as simple as possible, but not simpler.” There is a need to be mindful of the proliferation of parameters and model complexity. While more comprehensive models fit the choice data well, there is a risk that they are tailored too closely to the sample data. This compromises the ability to generalise the model beyond the existing dataset and may be restrictive for policy makers. While end users will often want to establish the relationship between the dependent variable(s) and a relatively small number of key independent variables, increasing model complexity is justified if it produces reasonably more accurate results. The investigations in this paper follow from this latter argument and from the need to quantify “reasonably”. Are there conditions under which the benefit of identifying these latent segments of the population outweighs the computational cost of doing so? We put forth arguments for whether, and when, it is appropriate to use hybrid discrete choice models to inform policy, and we continue the discussion of whether the hybrid model exists out of academic curiosity or out of a real need to use unobservable latent variables to inform policy.

Danny Campbell
Professor of Economics
Erlend Dancke Sandorf
Marie Skłodowska-Curie Research Fellow