A comparison of the World Health Organisation's HEAT model results using a non-linear physical activity dose response function with results from the existing tool

Introduction: The WHO-Europe’s Health Economic Assessment Tool (HEAT) is used to estimate the costs and benefits of changes in walking and cycling. Due to data limitations the tool’s physical activity module assumes a linear dose-response relationship between physical activity and mortality. Methods: This study estimates baseline population physical activity distributions for 44 countries included in the HEAT. It then compares, for three different scenarios, the results generated by the current method, using a linear dose-response relationship, with results generated using a non-linear dose-response relationship. Results: The study finds that estimated deaths averted are relatively higher (lower) using the non-linear effect in countries with less (more) active populations. This difference is largest for interventions which affect the activity levels of the least active the most. Since more active populations, e.g. in Eastern Europe, also tend to have lower Value of a Statistical Life estimates, the net monetary benefit estimated in the scenarios is much higher in western Europe than in eastern Europe. Conclusions: Using a non-linear dose-response function results in materially different estimates where populations are particularly inactive or particularly active. Estimating baseline distributions is possible with limited additional data requirements, although the method has yet to be validated. Given the significant role of the physical activity module within the HEAT tool it is likely that in the evaluation of many interventions the monetary benefit estimates will be sensitive to the choice of the physical activity dose-response function.


Introduction
There is a growing recognition of the importance of considering health in all policies [1][2][3]. One example of successful integration of health impact in another policy domain is the World Health Organization's Health Economic Assessment Tool (HEAT), which has been widely used, primarily by transport planners, to estimate the health benefits associated with increased walking and cycling [4]. The success of the HEAT is in part due to its simplicity, requiring relatively few user inputs compared to other health economic models [5].
However, a limitation of the HEAT is that despite broad consensus that the relationship between physical activity and all-cause mortality is non-linear, such that the greatest health benefits from an extra unit of physical activity accrue in those who are least active [6-8], the HEAT assumes a linear relationship between physical activity and mortality. The HEAT methods and user guide states that "a linear relationship was chosen to avoid additional data requirements on baseline activity levels (which would be needed using a non-linear dose-response function)" [4]. There is, however, a recognition that improvements in data availability could allow for a non-linear relationship to be used in the future. The same report states that "An approach based on a non-linear relationship could be adopted as part of future updates of HEAT, when suitable data on the baseline level of physical activity in different populations are available to provide default values for HEAT" (p.9).
This study uses a method developed by Hafner et al. [9] to estimate the distribution of physical activity in 44 countries in the WHO European Region for which the HEAT applies. It then compares, for three hypothetical scenarios, the number of deaths averted and the monetary benefit when assuming a linear relationship, as done by the current HEAT model, and a non-linear relationship between physical activity and all-cause mortality. Although previous analysis has shown the importance of estimating changes in the distribution of physical activity, rather than categorizing activity levels [10], this is the first time that the effect of the shape of the dose response relationship has been analysed within a single health economic model, with all other structural assumptions held constant. Woodcock et al. (2013) [11] estimated the difference in the number of deaths averted between the ITHIM and HEAT tools when modelling all-cause mortality, and when modelling several diseases individually. Since the ITHIM model uses a non-linear power transformation, the difference between the ITHIM and HEAT does in part reflect differences associated with the dose-response function. However, there are other differences between the ITHIM and HEAT which make it impossible to isolate the effect of the shape of the dose-response relationship for physical activity on model outcomes. This study aims to isolate this effect, to investigate how sensitive the HEAT model is to the assumed dose response relationship.

Data and measures
This study uses data on the prevalence of insufficient physical activity in 44 HEAT countries from a publication [12], the self-reported non-occupational (leisure time and commuting) physical activity levels of a representative sample of the English population from the Health Survey for England 2015 [13], country-specific mortality rates for those aged 20-74 from the European Mortality Database [14], and value of a statistical life estimates from a systematic review [15]. It uses the linear dose-response relationship between physical activity and mortality from [7], as described in the HEAT methodology paper [4], and a non-linear dose-response relationship as described in [16]. A summary of data including sources can be found in Table 1.

Analysis
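We estimate the number of deaths averted per 100,000 and the net monetary benefit using both the non-linear dose-response method and the linear dose-response currently used by HEAT for 44 European countries in three scenarios:
- an additional 10 minutes of daily walking.
- all individuals with activity levels below the WHO physical activity guideline of 600 MET-mins per week increase their activity to meet the guideline.
- all individuals increase their physical activity level by 10%.
This analysis is not an attempt to estimate the probability, feasibility or costs of achieving the scenarios. For each scenario we assume that the outcome is achieved, and we estimate the benefits in terms of deaths averted per 100,000 and monetize these benefits using the VSL.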
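As a minimal sketch, the three scenarios (as they are described in the Results) could be applied to a vector of individual weekly MET-mins as follows. The function name `apply_scenario`, the example baseline values and the walking intensity `WALKING_MET` are assumptions made for illustration; they are not taken from the paper or from the HEAT.

```python
from typing import List

WHO_GUIDELINE_MET_MINS = 600.0  # weekly guideline threshold quoted in the Results
WALKING_MET = 3.5               # ASSUMPTION: illustrative walking intensity, not from the paper


def apply_scenario(met_mins: List[float], scenario: int) -> List[float]:
    """Return post-intervention weekly MET-mins for one of the three scenarios."""
    if scenario == 1:
        # An additional 10 minutes of walking per day, 7 days a week.
        extra = 10 * 7 * WALKING_MET
        return [m + extra for m in met_mins]
    if scenario == 2:
        # Anyone below the guideline is lifted exactly to it; others are unchanged.
        return [max(m, WHO_GUIDELINE_MET_MINS) for m in met_mins]
    if scenario == 3:
        # Everyone increases their activity level by 10%.
        return [m * 1.10 for m in met_mins]
    raise ValueError("scenario must be 1, 2 or 3")


baseline = [0.0, 300.0, 600.0, 2000.0]  # hypothetical individuals
for s in (1, 2, 3):
    print(s, apply_scenario(baseline, s))
```

Note that the first two scenarios change the activity of inactive individuals by at least as much as that of active individuals, whereas the third scales with baseline activity.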
Amendments from Version 1
There have been several updates:
1) The Abstract has been tidied to ensure that line-breaks are implemented correctly.
2) There are several adaptations to the text as per responses to reviewers, and one additional reference has been added (see tracked changes).
3) As a result of these changes based on reviewer requests, Figure 3 has been updated in the main body and a new figure and table have been added to the supplementary materials.
Any further responses from the reviewers can be found at the end of the article.

The current HEAT method using a linear dose response relationship
The current HEAT method requires the user to input pre-intervention and post-intervention physical activity levels, in terms of minutes of walking and cycling [4]. It estimates the relative risk associated with each activity level using Equation 1 below.
For a walking intervention the relative risk RR_lit is 0.89, the reference minutes of activity from the literature Mins_ref is 168 mins per week and the risk reduction cap RR_min is 0.7, such that every additional 10 minutes of weekly walking (Mins_local = 10) reduces relative risk by 0.65 percentage points, to a limit of 30 percentage points. The number of deaths averted DA is then calculated by multiplying the absolute difference in relative risk between intervention and comparator (RR_i − RR_c) by the country-specific mortality rate of the population aged 20-74 MR_c and the population affected, pop (Equation 2 below). This is then monetized in terms of monetary benefit (MB) in Equation 3 by multiplying the number of deaths averted by the country-specific value of a statistical life VSL_c.

The adapted method using a non-linear dose response relationship
The non-linear dose response method requires a baseline distribution of physical activity. We use weekly metabolic equivalent of task minutes (MET-minutes) from moderate and vigorous physical activity to summarize an individual's physical activity level in one number [17]. A distribution of weekly MET-mins for each country was imputed using a method from Hafner et al. (2019) [9]. This method combines estimates of prevalence of physical inactivity for each of the 44 countries with the distribution of physical activity in a generic distribution (we use the distribution derived from the Health Survey for England). Each percentile, n, of physical activity in the target country, c, distribution is calculated separately using the equation below.
The weekly MET-mins, p, for each country, c, at each percentile, n, is based on the prevalence of sufficient physical activity, x_c, in the country, c, compared to the prevalence of sufficient activity in the generic distribution.
Once relative risk is calculated, the deaths averted and monetary benefit are calculated using Equation 2 and Equation 3.
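The linear calculation for a walking intervention can be sketched in a few lines. This is a reconstruction from the parameter values quoted in the text (0.89, 168 minutes, a cap of 0.7) and from the verbal descriptions of Equations 1 to 3, not a copy of the published equations. The function names, the mortality rate and the VSL used in the example are hypothetical, and the mortality rate is assumed to be expressed as deaths per person per year.

```python
RR_LIT = 0.89     # relative risk for walking from the literature (quoted in the text)
MINS_REF = 168.0  # reference minutes of walking per week (quoted in the text)
RR_MIN = 0.70     # cap on the risk reduction (quoted in the text)


def relative_risk_linear(mins_local: float) -> float:
    """Linear scaling of the literature risk reduction, floored at RR_MIN."""
    rr = 1.0 - (1.0 - RR_LIT) * (mins_local / MINS_REF)
    return max(rr, RR_MIN)


def deaths_averted(rr_comparator: float, rr_intervention: float,
                   mortality_rate: float, pop: float) -> float:
    """Difference in relative risk times mortality rate times population affected."""
    return (rr_comparator - rr_intervention) * mortality_rate * pop


def monetary_benefit(deaths: float, vsl: float) -> float:
    """Deaths averted valued at the value of a statistical life."""
    return deaths * vsl


# Hypothetical example: 70 extra minutes of weekly walking per person.
MORTALITY_RATE = 0.004  # ASSUMPTION: illustrative deaths per person per year
VSL = 2_000_000.0       # ASSUMPTION: illustrative VSL in Euros
da = deaths_averted(relative_risk_linear(0.0), relative_risk_linear(70.0),
                    MORTALITY_RATE, 100_000)
print(da, monetary_benefit(da, VSL))
```

With these parameters each additional 10 minutes of weekly walking lowers relative risk by (1 − 0.89) × 10 / 168, which is about 0.65 percentage points, in agreement with the figure given in the text.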

Comparison
For each of the 44 countries included in the analysis, for each of the three scenarios, and for each of the four dose response functions, we calculated two metrics:
- the number of deaths averted per 100,000 persons aged 20-74.
- the monetary benefit associated with mortality reduction, using the HEAT VSL estimates for each country [15].
Comparisons of the number of deaths averted under different modelling methods are displayed using simple scatter plots with a 45-degree line of equality, and monetary benefit estimates are shown, in Euros, on choropleth maps of Europe.

Results
The estimated distributions of physical activity for each of the 44 countries in the analysis are provided in the supplementary material, and can also be found on this GitHub repository: https://github.com/RobertASmith/HEAT_DRF/blob/master/output/country_mets_dist.csv.
A comparison of the number of premature annual deaths averted per 100,000 people using the two different methods in each of the three scenarios for the 44 WHO European Region countries is shown in Figure 1 below. The estimates derived using the linear dose response method are shown on the x-axis and the non-linear dose response on the y axis. A 45-degree line of equality is plotted to aid comparison. The country points are labelled with ISO3 codes and shaded from black for low insufficient physical activity prevalence (IPAP) to blue for those with a high IPAP.
The figure shows that for the first scenario, an additional 10 minutes of daily walking, countries with particularly inactive (active) populations tend to have higher (lower) estimated deaths averted using the non-linear function compared to the linear function.
In the second scenario all individuals with activity levels below WHO physical activity guidelines of 600 MET-mins per week increase activity to meet guidelines. Here, the non-linear function results in higher deaths averted than the linear function in most countries, except for some with especially low prevalence of insufficient physical activity (e.g. Moldova and Belarus).
In the third scenario, in which all individuals increase their physical activity level by 10%, estimates derived using a non-linear function are much lower than using a linear function for all countries, regardless of the prevalence of insufficient physical activity. This is because those with low physical activity levels, who would benefit the most from increased physical activity according to a non-linear model, have low increases in MET-mins, while those who are highly active have high absolute increases in MET-mins but benefit little in terms of premature mortality reduction when using a non-linear model.
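A minimal sketch of this mechanism follows. It assumes a dose-response curve that is log-linear in power-transformed dose, RR(x) = RR_ref^((x / x_ref)^p). Only the power p = 0.375 is quoted in this paper. The functional form as written here, the reference risk `RR_REF`, the reference dose `X_REF` and the function names are assumptions chosen for illustration.

```python
P = 0.375      # power transformation quoted in the paper
RR_REF = 0.89  # ASSUMPTION: illustrative relative risk at the reference dose
X_REF = 600.0  # ASSUMPTION: hypothetical reference dose in weekly MET-mins


def relative_risk_nonlinear(met_mins: float) -> float:
    """ASSUMED form: log-linear in power-transformed weekly MET-mins."""
    return RR_REF ** ((met_mins / X_REF) ** P)


def gain_from_extra(baseline: float, extra: float) -> float:
    """Reduction in relative risk from adding `extra` MET-mins to `baseline`."""
    return relative_risk_nonlinear(baseline) - relative_risk_nonlinear(baseline + extra)


# The same absolute increase is worth more to a less active person ...
print(gain_from_extra(100.0, 100.0), gain_from_extra(2000.0, 100.0))

# ... but a 10% increase gives that person only a small absolute increase.
for baseline in (100.0, 2000.0):
    extra = 0.10 * baseline
    print(baseline, extra, gain_from_extra(baseline, extra))
```

Under a curve of this shape the marginal benefit of each additional MET-min falls as baseline activity rises, so a proportional increase concentrates the additional MET-mins among those who gain least from each of them.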
In order to allow for trade-offs in decision making between health and non-health outcomes, the HEAT tool monetises the deaths averted using the Value of a Statistical Life (VSL) [18], giving an estimate in terms of monetary benefit. Figure 2 below shows the monetary benefit associated with Scenario 1, using a log-linear dose response function with a power transformation of 0.375. The monetary benefits tend to be higher in countries with higher insufficient physical activity prevalence and higher VSL (e.g. Ireland, the UK and Luxembourg) and markedly lower in countries with lower VSL and/or lower physical inactivity prevalence, such as Ukraine and Moldova. This results in marked differences between Western and Eastern Europe.

Discussion
Increasing population physical activity is likely to yield large benefits in health, wellbeing & productivity worldwide [9]. However, trade-offs often exist between increasing population physical activity and achieving other health and non-health outcomes. It is therefore important to have a robust method to consider whether interventions that improve activity levels provide good value for money. The HEAT is an example of a tool, often used by transport planners, which allows users to estimate, and monetize, the benefits of increased walking and cycling [3]. In general, the estimates derived from the physical activity module of the tool have been shown to contribute the most to total monetary benefit (Mueller et al., 2015).
We describe an adaptation to the current HEAT physical activity module which applies a non-linear dose response relationship between physical activity and mortality risk to estimated country-specific baseline distributions of physical activity. The method is more sensitive to interventions which increase the activity levels of the least active, and less sensitive to interventions which increase the activity levels of the most active. This means that similar scenarios may yield less health benefit in more active countries. As noted in our previous work [5], since countries with higher GDP tend to have a higher Value of a Statistical Life [15] and higher prevalence of insufficient physical activity [12], the estimated net monetary benefit tends to be higher in western Europe than eastern Europe.
There are numerous limitations of this analysis. Firstly, the method used to estimate the baseline distributions of physical activity in each of the HEAT countries (from [9]) assumes that the shape of the physical activity distribution is relatively similar in every country. Comparing the distributions estimated by this method, and provided in the supplementary material, with more detailed datasets would help to validate the estimates of population physical activity distributions. It is likely that the method is reliable for similar countries (e.g. the UK and Germany) but may not be reliable where culture differs (e.g. the UK and Chad). However, it is unlikely that this would affect the main finding of this study, since large differences in the linear and non-linear functions exist when using the UK distribution which is based upon survey data. It is also worth noting that in this study, as in many other studies relying upon secondary data, the assumption is made implicitly that the same survey methods for physical activity are utilised in the estimation of the dose-response function, and for the purposes of calculating relative risk. Any differences in the survey methods will generate a bias in the estimation of relative risk.
We also note that the comparison between the linear dose response function currently used by the HEAT, and a non-linear function based on Woodcock et al. 2011 is a false dichotomy. It is likely that non-parametric regression techniques, such as spline regression will yield a dose response relationship that is more appropriate, avoiding implausibly large benefits for particularly inactive individuals, which is apparent at low levels of Weekly MET-mins in Figure 3. However, the authors are not aware of any such studies directly relating to population mortality to date, although it is likely new evidence will emerge.
A further limitation of this study is that we do not consider the usability of the tool, only show that a more conceptually valid method is possible. Since the tool is designed to be used by users with little to no public health, epidemiology, statistics and programming ability it is also important that the methods behind the tool are easy to explain, and the tool is simple to use. Increased complexity, in terms of more, or more detailed inputs, and a more difficult to explain model structure may make the tool less 'use-able', used here as a loose term which encompasses both technical feasibility of use and user understanding & confidence, and therefore less valuable. Further work to determine whether stakeholders understand the use of a non-linear dose response relationship on baseline and intervention distributions, and whether users can obtain intervention group physical activity distributions, will likely be a determining factor as to the feasibility of adapting the HEAT tool. Nevertheless, this paper demonstrates that the two approaches do result in substantial differences at the population level, and therefore where possible the non-linear dose response function should be used by researchers.
The trade-off between the 'usability' and 'accuracy' of health impact assessment tools (and public health economic models more generally) is one that needs further attention in the academic literature. Models and tools tend to be either high accuracy but low usability (for example, models created in high level programming languages with high computational demands and long runtimes) or low accuracy but high usability (including the HEAT physical activity modules). Understanding how to utilize new tools from data science to make models which are both highly accurate and usable would be a useful avenue of future research. Likewise, understanding how to incrementally improve the accuracy of highly usable models (like HEAT) without compromising usability would be a valuable endeavor.

Conclusions
We show that for the WHO European Region countries included in the HEAT tool, the estimates of deaths averted, and therefore monetary benefit, differ substantially depending on the dose response function used. The non-linear dose response function results in greater estimated benefits, relative to the linear dose response function, where increased physical activity accrues to those who are relatively inactive. It therefore results in greater benefits in countries with higher prevalence of physical inactivity, or for interventions which are targeted toward the least active. Developing tools which are both usable, in terms of data requirements and ease of explanation to users, and highly accurate is an important avenue for future research in health impact assessment and public health economics more widely.

RAND Europe, Cambridge, UK
This study aims to improve an existing WHO model by introducing a non-linear dose-response between physical activity and mortality. The existing model only assumes a linear relationship between physical activity and all-cause mortality which has been introduced for parsimonious reasons but nevertheless is based on a very strong (and non realistic) assumption. The study makes clear that it relies on the quality of the existing WHO HEAT model and does not try to challenge the simplicity of that model but tries to show that a non-linear dose-response in the relationship between activity levels and mortality leads to more accurate findings. Based on the HEAT model's framework, the study converts avoided deaths into monetary benefits based on the value of a statistical life.
In my view, this study makes a very important contribution by showing that a non-linear dose-response is very important when we try to consider the benefits of a more active population. Especially given the heterogeneous costs and benefits of aiming to improve the activity levels of the least active population, this is an important improvement of the existing model. The study references the relevant existing literature and discusses clearly the limitations of the study (and to some extent those of the HEAT model).
One element I would like to add is perhaps a discussion around the value of a statistical life. The study makes a reference to a study where the value of the statistical life is taken from, but the actual value chosen doesn't seem to be discussed in more detail. Also, is that value of statistical life uniformly applied to all countries or have they been adjusted for a country-specific context? Also, as the VSL is a somewhat contested metric, have the authors thought to apply sensitivity ranges for the value? I understand that was not the main intention of the study, but given the study findings are relevant, it would be great to add some sensitivity ranges.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound?

correspond to fewer years of life lost."
The authors compare a linear versus a power-transformed exposure. The estimation of the power transformation is taken from my paper (Woodcock et al. 2011). However, meta-analytic methods have moved on and I would now recommend cubic spline based approaches where available (e.g. Smith et al. 2016 [2]; n.b. this paper is only for diabetes). The advantage of this approach is that it allows a more flexibly shaped curve. The transformed curves produce implausibly large benefits at the lowest exposure levels (below 1 MET h/wk) and correspondingly produce smaller benefits beyond that. This means the transformed approach is overly sensitive to how many people are assumed to be doing zero activity. Given that PA survey design can lead to notably different results, including considerable variation in the proportion of the population who are active, care is needed. I support the authors' approach to modelling PA distributions. However, careful consideration is needed to harmonise responses to different questionnaires. Such harmonisation should also consider the questionnaires used as part of generating the dose response meta-analyses.
Walking and cycling vs broader measures of PA: When generating the PA distributions, if I understand correctly, these are not just walking or cycling, but a broader measure (probably non-occupational PA)? While I support the idea of modelling PA distributions, the user of a tool like HEAT would then need to know whose PA is being changed, that of the active or the inactive people. Alternatively, the tool would need to be able to estimate this in some way based on the type of intervention. In ITHIM we use stochastic matching based on demographics and other variables between PA and travel survey distributions, but additional approaches would be needed depending on the use case.