Making sense of existing and new evidence: simulation modelling

6.28  The outline logic model in Box 6.A at the beginning of the chapter is conceptually simple, but the examples presented in Box 6.B are for the most part quite involved, with each "step" in the logic itself implying a potentially large number of processes. For instance, the impact pathway for the health costs of air pollution involves complex physical, chemical, biological, technological and economic relationships between the emission of pollutants from electricity generation, meteorological conditions, human physiological responses to exposure to airborne pollutants, and individuals' attitudes towards changes in their respiratory health.

6.29  As suggested in Chapter 2, in cases such as these it might not be realistic to expect even a well-designed evaluation to detect, in a single study, the effect of one input - e.g. the amount of coal burned in a power station - on some ultimate outcome - e.g. individuals' health-related quality of life. This is because there are too many confounding factors and too much "noise" in the pathway for the effect of one variable on a "distant" outcome to be detected. In these circumstances, an evaluation of a "shorter" set of links in the logic chain is likely to have a better chance of producing a robust result (e.g. the effects of changes in air quality on reported respiratory health). However, there then remains the question of how the real relationship of interest (which might be the entire impact pathway) can be evaluated.

6.30  In other situations, reviews of the existing literature, using some of the techniques considered in this chapter, might reveal a substantial body of robust evidence covering particular aspects of the logic model in question, but little or no evidence relating to others. This might mean that an evaluation restricted in scope, focusing on these less well-evidenced areas, will be considered more robust and better value for money than one that attempts to cover the entire impact pathway. The issue is then how the results of this new study can be combined with existing evidence to answer the evaluation questions.

6.31  Simulation modelling is one way in which the results of different evaluations of separate parts of the impact pathway or logic of an intervention can be combined. Simulation models are most commonly constructed in spreadsheet-style software using quantitative data. This requires that the evidence relating to the different links in the logic model is expressed in quantitative terms (e.g. effect sizes). It also means that the evidence must relate to comparable "endpoints", or at least to endpoints which can be "translated" into comparable measures. Box 6.G illustrates this using the example training intervention introduced in Chapter 2.

Box 6.G: Constructing a simulation model for a hypothetical policy intervention

Chapter 2 presented a (hypothetical) example policy to recruit unemployed individuals onto a new training scheme which provides seminars to improve work skills, with the intention of reducing the costs of unemployment.

A simulation model of a (full) economic evaluation of this intervention might require quantitative evidence on the following links of the implied logic model:

1  measures of the resources used (costs) in delivering seminars;

2  effect of the seminar series on (net) seminar attendance;

3  effect of seminar attendance on participant skills;

4  effect of change in participant skills on subsequent employment and earnings trajectories; and

5  effect of changes in employment and earnings trajectories on quality of life and other relevant indicators (e.g. health status).

In this example, the endpoints of each stage in the logic model are the same, and hence are comparable, by construction. Evidence relating to each stage could therefore be linked in a simulation model with no need for "translation". However, if existing evidence relating to the fourth stage above were expressed in terms of (e.g.) formal qualifications, but the evidence on the third stage measured skills in terms of specific abilities (e.g. reading and writing), then some translation might be necessary to estimate the "qualification equivalents" of the skill levels resulting from the intervention.
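To illustrate, the sketch below chains the five stages in Box 6.G into a single calculation of the kind that a spreadsheet-style simulation model performs. It uses Python rather than a spreadsheet, and every figure in it - the cost per participant, attendance rate, skills gain, earnings uplift and quality-of-life conversion - is an assumption invented for illustration, not evidence.

```python
# A minimal sketch of a simulation model chaining the five stages of the
# hypothetical training intervention in Box 6.G. All parameter values are
# illustrative assumptions, not evidence.

cost_per_participant = 1_200.0    # stage 1: cost of delivering seminars (GBP)
net_attendance_rate = 0.70        # stage 2: proportion who actually attend
skills_gain_per_attendee = 8.0    # stage 3: gain on an assumed 0-100 skills index
earnings_per_skill_point = 150.0  # stage 4: annual earnings uplift per index point (GBP)
qol_per_1000_earnings = 0.02      # stage 5: quality-of-life units per GBP 1,000 earned

def simulate(participants: int) -> dict:
    """Chain the stage-level effect sizes into overall outcomes."""
    attendees = participants * net_attendance_rate
    skills_gain = attendees * skills_gain_per_attendee
    earnings_gain = skills_gain * earnings_per_skill_point
    qol_gain = (earnings_gain / 1_000.0) * qol_per_1000_earnings
    return {
        "total_cost": participants * cost_per_participant,
        "annual_earnings_gain": earnings_gain,
        "qol_gain": qol_gain,
    }

print(simulate(participants=500))
```

In a real model each parameter would come from the reviewed evidence for the corresponding link, and each would carry its own uncertainty (see paragraph 6.34).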

6.32  The example in Box 6.G suggests that some form of simulation modelling is likely to play a role in a large proportion of impact evaluations. For instance, where outcomes are expected to be materially affected over a number of years, some simulation of these effects might be necessary to ensure that evaluation evidence is obtained in a timely fashion. In addition, it might be difficult for a single evaluation study to detect the effect of past attendance on a short-term training course on lifetime earnings trajectories, again suggesting the need to simulate any such effects (assuming there is evidence to support them). Finally, any wide-ranging economic evaluation will almost certainly require a simulation model, not least because many economic outcomes can only be measured through dedicated research exercises. An example might be a survey of affected individuals to estimate the value of changes in health status, which evidence suggests is associated with pollution-related reductions in air quality.
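Projecting outcomes over a number of years typically means extending the chained calculation through time and discounting future effects to a present value. A minimal sketch, assuming a hypothetical year-one earnings uplift of GBP 840 per participant (the chained figure from the sketch above), an assumed persistence rate, a 20-year horizon and an illustrative 3.5 per cent discount rate:

```python
# Illustrative projection of a simulated annual earnings uplift over a
# multi-year horizon, discounted to a present value. All parameter values
# are assumptions for illustration.

annual_uplift = 840.0   # assumed year-one earnings gain per participant (GBP)
persistence = 0.90      # assumed fraction of the uplift surviving each year
horizon = 20            # number of years over which effects are simulated
discount_rate = 0.035   # illustrative discount rate

present_value = sum(
    annual_uplift * persistence**t / (1 + discount_rate)**t
    for t in range(horizon)
)
print(f"Simulated present value of the earnings gain: GBP {present_value:,.0f}")
```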

6.33  It is important to establish early in the design of any new evaluation research study whether a simulation-based approach to answering the evaluation questions will be appropriate and necessary. This is because the need to use endpoints which are either directly comparable or can be translated into comparable terms might influence the design of the study, data collection and so on. Selecting incompatible outcome measures at the study design stage might make it impossible to make the necessary linkages in the simulation model, because there is no satisfactory "translation". Issues related to data collection are discussed further in Chapter 7.
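In practice, a "translation" of the kind discussed in Box 6.G often amounts to an assumed mapping between measures. The sketch below translates hypothetical reading and writing scores into assumed "qualification equivalent" categories; the weighting and thresholds are invented for illustration and would need evidential support in a real evaluation.

```python
# Hypothetical translation of skill measures (specific abilities) into
# qualification equivalents, as discussed in Box 6.G. The weighting and
# thresholds below are illustrative assumptions only.

def qualification_equivalent(reading: float, writing: float) -> str:
    """Map ability scores (assumed 0-100 scales) to an assumed
    qualification-equivalent category."""
    composite = 0.5 * reading + 0.5 * writing  # assumed equal weighting
    if composite >= 70:
        return "Level 2 equivalent"
    if composite >= 50:
        return "Level 1 equivalent"
    return "Entry level equivalent"

print(qualification_equivalent(reading=62.0, writing=55.0))
```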

6.34  Note that simulation-based evaluations will always be subject to some uncertainty about the validity of the assumed links and the evidence underpinning them. With this approach, not all outcomes are measured directly, so the evaluation cannot "prove" that an impact was actually caused by the intervention in question. Where endpoints need to be translated to make them comparable, the translation will by necessity be based on assumption(s), and the validity of these assumptions will affect the reliability of the calculated impacts. In some cases, evidence relating to some links in the logic model might be relatively weak or even missing entirely, requiring stronger assumptions and introducing greater uncertainty. Many theory-based evaluations use significant amounts of qualitative evidence and assumptions to produce estimates of the impact of an intervention, and the uncertainty inherent in such information needs to be borne in mind when considering the reliability of the results.
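One way to make this uncertainty explicit is to propagate it through the model: each link's effect size is drawn repeatedly from a distribution whose spread reflects the assumed strength of the evidence for that link, and the resulting range of outcomes is reported alongside the central estimate. A minimal Monte Carlo sketch, reusing the stage parameters assumed earlier, with all distributions invented for illustration:

```python
# Sketch of propagating uncertainty through the chained model in Box 6.G.
# Each link's effect size is drawn from an illustrative distribution;
# wider spreads stand in for weaker evidence on that link.
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

def one_run() -> float:
    """One draw of the chained annual earnings gain per participant (GBP)."""
    attendance = random.gauss(0.70, 0.05)           # assumed well-evidenced: tight
    skills_gain = random.gauss(8.0, 1.0)            # assumed moderate evidence
    earnings_per_point = random.gauss(150.0, 50.0)  # assumed weak evidence: wide
    return attendance * skills_gain * earnings_per_point

runs = sorted(one_run() for _ in range(10_000))
print(f"Median earnings gain per participant: GBP {statistics.median(runs):,.0f}")
print(f"90% interval: GBP {runs[500]:,.0f} to GBP {runs[9500]:,.0f}")
```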