COVID-19 – exploring the implications of long-term condition type and extent of multimorbidity on years of life lost: a

Background: The COVID-19 pandemic is responsible for increasing deaths globally. Most estimates have focused on numbers of deaths, with little direct quantification of years of life lost (YLL) through COVID-19. As most people dying with COVID-19 are older with underlying long-term conditions (LTCs), some have speculated that YLL are low. We aim to estimate YLL attributable to COVID-19, before and after adjustment for number/type of LTCs.
Methods: We first estimated YLL from COVID-19 using standard WHO life tables, based on published age/sex data from COVID-19 deaths in Italy. We then used aggregate data on number/type of LTCs to model likely combinations of LTCs among people dying with COVID-19. From these, we used routine UK healthcare data to estimate life expectancy based on age/sex/different combinations of LTCs. We then calculated YLL based on age, sex and type of LTCs and multimorbidity count.
Results: Using the standard WHO life tables, YLL per COVID-19 death was 14 for men and 12 for women. After adjustment for number and type of LTCs, the mean YLL was slightly lower, but remained high (13 and 11 years for men and women, respectively). The number and type of LTCs led to wide variability in the estimated YLL at a given age (e.g. at ≥80 years, YLL was >10 years for people with 0 LTCs, and <3 years for people with ≥6).
Conclusions: Deaths from COVID-19 represent a substantial burden in terms of per-person YLL, more than a decade, even after adjusting for the typical number and type of LTCs found in people dying of COVID-19. The extent of multimorbidity heavily influences the estimated YLL at a given age. More comprehensive and standardised collection of data on LTCs is needed to better understand and quantify the global burden of COVID-19 and to guide policy-making and interventions.


Introduction
When severe, coronavirus disease 2019 (COVID-19) causes acute respiratory failure, often requiring mechanical ventilation 1 . Globally, as of 6th April 2020, more than 1,200,000 confirmed cases have been reported including 67,000 deaths 2 . In response to this threat, governments have introduced non-pharmaceutical interventions such as physical distancing and the delivery of health services has radically changed, with resources diverted towards the management of COVID-19 and away from their usual activities 3 . These measures have aimed to limit a surge in cases that risks overwhelming healthcare services 4 .
Since few health care systems could have responded adequately to the increased need for acute care without these changes, these decisions were in some ways inevitable. However, in the absence of a vaccine, as societies seek to "return to normal", decisions about the extent and nature of ongoing measures to limit spread of COVID-19 will be more difficult. These choices will require balancing the likely direct effects on mortality from COVID-19 against the likely indirect impacts on mortality for other conditions due, for example, to inadequate access to necessary services for many people with long-term conditions (LTCs), potential reluctance of the public to attend for acute events such as myocardial infarction, or impacts from forced unemployment, loss of income and social isolation. The indirect effects are likely to be complex, most will be downstream, and will require extensive research to be better understood. However, we need to capture the direct effects of COVID-19 as accurately as possible now, via currently available data and methodologies.
Currently, most reports of COVID-19 deaths have used raw counts 2 . This may give a distorted picture of the mortality burden, however, as it does not consider how long someone who died from COVID-19 might otherwise have been expected to live. As people dying from COVID-19 are predominantly older and have pre-existing LTCs 5-7 , some have speculated that many of these people would have soon died of other causes and that life expectancy may therefore not be greatly impacted 8,9 . While multimorbidity, the presence of multiple LTCs, is known to be associated with increased mortality 10 , people with multimorbidity nonetheless can be expected to live for many years 11 . Raw counts of deaths may therefore mislead policy-makers and the public, causing them to either over- or under-estimate the total impact of COVID-19 related deaths.
Within epidemiology, there is a standard measure used to account for this difficulty, the years of potential life lost (YLL) 12 . YLL can be expressed per-capita as the average number of years an individual would have been expected to live had they not died of a given cause. The conventional approach to YLL uses data on the age at which deaths occurred combined with typical life expectancy at a given age, to estimate a weighted average of the number of years lost. YLL is used to allow fair comparisons of the health impact of different policies, such as different measures to address the pandemic. However, given the controversial role of multimorbidity in COVID-19 deaths it is also important to calculate YLL additionally considering the effects of the presence of a single LTC or multimorbidity.
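The conventional calculation described above can be illustrated with a minimal sketch. The age bands, death counts and remaining life expectancies below are invented for illustration only and are not the WHO life table values or the Italian data used in this study; the function name `yll_per_death` is likewise hypothetical.

```python
# Illustrative sketch only: all numbers are made up, not taken from the study.

def yll_per_death(deaths_by_age_band, remaining_life_expectancy):
    """Weighted mean of remaining life expectancy, weighted by deaths per age band."""
    total_deaths = sum(deaths_by_age_band.values())
    if total_deaths == 0:
        raise ValueError("no deaths supplied")
    total_years = sum(
        n * remaining_life_expectancy[band]
        for band, n in deaths_by_age_band.items()
    )
    return total_years / total_deaths

# Hypothetical deaths per age band.
deaths = {"50-59": 10, "60-69": 20, "70-79": 40, "80+": 30}
# Hypothetical remaining life expectancy (years) for each band.
life_expectancy = {"50-59": 30.0, "60-69": 22.0, "70-79": 14.0, "80+": 8.0}

example_yll = yll_per_death(deaths, life_expectancy)
print(round(example_yll, 2))  # (300 + 440 + 560 + 240) / 100 = 15.4
```

The point of the weighting is that the per-death YLL depends only on the age distribution of deaths and the reference life table, which is why the choice of reference population matters.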
Therefore, we propose to quantify the burden of mortality related to COVID-19, both using the conventional age-based YLL measure, and YLL additionally accounting for type and number of underlying LTCs.

WHO standard YLL approach
The standard approach for calculating years of life lost is to apply the distribution of ages among those who died from a specific cause to a standard life-

Overview of modelling to accommodate long-term conditions and multimorbidity
The remainder of the methods describes our approach to estimating YLL accounting for number and type of underlying LTC, along with age and sex. Our modelling comprised three main components: (i) estimating the prevalence of, and correlations between, LTCs among people dying with COVID-19; (ii) modelling UK life expectancy based on age, sex, and each combination of these LTCs separately; and (iii) combining these models to calculate the estimated YLL per death with COVID-19. These are summarised by age-group, sex, and multimorbidity counts (that take into account different combinations of LTCs).
The data sources used for each of these stages of modelling are summarised in Figure 1.

Rapid review
To inform our estimates of number and type of LTCs, we performed a rapid review to identify data on underlying conditions for people dying with COVID-19. We searched the WHO repository of COVID-19 studies on 24th March 2020. To identify studies reporting data on LTCs among people who had died from COVID-19, we screened titles and abstracts of all epidemiological, clinical, case-series and review articles (n=1685). We identified and screened 77 potentially relevant full-text articles, of which four reported aggregate data on LTCs among people who had died of COVID-19. Three were small studies (32, 44, and 54 deaths, respectively) based in Wuhan, China 5-7 . However, the fourth was a comprehensive report from the Istituto Superiore di Sanità (ISS) (published each Tuesday and Wednesday) including data on 11 common LTCs (ischaemic heart disease, atrial fibrillation, heart failure, stroke, hypertension, diabetes, dementia, chronic obstructive pulmonary disease, active cancer in the past 5 years, chronic liver disease and chronic renal failure), as well as the number of patients who had 0, 1, 2 or ≥3 LTCs for 710 of the 6801 people who died with COVID-19 in Italy 14 . In view of the smaller sizes of the Chinese studies, and the greater dissimilarity of these populations with the UK relative to the Italian data, we opted not to include these in the analysis. The Italian data were used to construct a plausible scenario for the prevalence of combinations of LTCs among people who died from COVID-19 for the modelling presented here.

Long-term condition prevalence and correlation models.
This first stage of our modelling aimed to estimate the prevalence and correlation between specific LTCs among people dying with COVID-19.
We utilised aggregate data on COVID-19 deaths from the Istituto Superiore di Sanità in Italy. Since we were unable to obtain individual patient data (IPD) for the Italian case-series of deaths from COVID-19, we had to infer the joint prevalence of LTCs from the summarised information available, i.e. the marginal distribution of multimorbidity counts (the row sums, or total number of diseases for each patient, wherein counts of ≥3 LTCs were collapsed into the single category of 3+) and the marginal distributions of LTC frequency (the column sums, or the total number of patients with each LTC). To that end, we developed a Bayesian latent process model of disease prevalence and correlation and fitted it using Markov chain Monte Carlo (MCMC) to both elements in the published data. This analysis was applied jointly to the small number of deaths that had occurred in Scotland, primarily to aid convergence in Bayesian model fitting by providing some information about the correlation between LTCs 15 . The Scottish subset of the data contained a partial record of known LTCs for individual patients, but the multimorbidity count per patient, as well as the marginal frequency of each LTC, were missing (hence, modelled as latent). Bayesian priors for the correlations between diseases were specified with a tendency to zero (shrinkage). Numerical investigations indicated little sensitivity of convergence to the strength of shrinkage, so we opted for weak shrinkage as a precautionary approach. This model gave us the full matrix of correlations between every pair of LTCs at the level of individuals, therefore providing us with a complete dependence structure of LTCs present within the sample of COVID-19 deaths. In order to propagate uncertainty through the analysis, from this fitted model (effective sample size of MCMC 410) we simulated 10,000 notionally "typical" patients, with plausible combinations of LTCs (under the combined Italian and Scottish data).
To test the sensitivity of our findings to the estimated correlations, we also estimated the YLL under two opposite extremes: (i) that LTCs were independent and (ii) that LTCs were highly correlated. Unlike the Bayesian LTC model, these sensitivity analyses did not use the information on the multimorbidity counts from the ISS report, but only the proportion of patients with each of the eleven comorbidities. For the "independent" scenario we created 11 vectors comprising 1s and 0s (respectively with and without the long-term condition) corresponding in length to the number of patients. We then sampled from these vectors with replacement to obtain 10,000 simulated patients. For the "highly correlated" scenario we first sorted each vector, then combined them to form a 710 × 11 matrix, then sampled each row with replacement to obtain 10,000 simulated patients.
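The two sensitivity scenarios can be sketched as follows. This is a simplified illustration under stated assumptions rather than the code used in the analysis: it uses only three of the eleven LTCs (with the prevalences reported in the ISS data), and the helper names `build_indicator_vectors`, `simulate_independent` and `simulate_highly_correlated` are hypothetical.

```python
import random

def build_indicator_vectors(prevalence, n_patients):
    """One 0/1 vector per condition, with round(prevalence * n_patients) ones."""
    vectors = {}
    for name, p in prevalence.items():
        n_with = round(p * n_patients)
        vectors[name] = [1] * n_with + [0] * (n_patients - n_with)
    return vectors

def simulate_independent(vectors, n_sim, rng):
    """Draw each condition separately, so conditions are independent of each other."""
    names = list(vectors)
    return [
        {name: rng.choice(vectors[name]) for name in names}
        for _ in range(n_sim)
    ]

def simulate_highly_correlated(vectors, n_sim, rng):
    """Sort every vector the same way, stack them into rows, then resample whole rows."""
    names = list(vectors)
    sorted_columns = [sorted(vectors[name], reverse=True) for name in names]
    rows = list(zip(*sorted_columns))  # n_patients rows, one column per condition
    return [dict(zip(names, rng.choice(rows))) for _ in range(n_sim)]

# Three of the eleven LTCs, for brevity (an assumption of this sketch).
prevalence = {"hypertension": 0.73, "diabetes": 0.313, "chronic liver disease": 0.041}
rng = random.Random(2020)
vectors = build_indicator_vectors(prevalence, n_patients=710)
independent = simulate_independent(vectors, 10_000, rng)
correlated = simulate_highly_correlated(vectors, 10_000, rng)

def mean_count(patients):
    return sum(sum(p.values()) for p in patients) / len(patients)

print(mean_count(independent), mean_count(correlated))
```

Both scenarios preserve the marginal prevalence of each LTC; they differ only in how conditions cluster within patients. In the sorted scenario every patient with a rarer condition also has all of the more common ones, which is the most extreme clustering compatible with the margins.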
Age models. Next, we modelled the relationship between age and multimorbidity counts among people dying with COVID-19. We were unable to obtain direct estimates of the association between age and extent of multimorbidity among patients who had died from COVID-19. Therefore, we modelled two scenarios: independence between age and multimorbidity count (i.e. no correlation between age and multimorbidity count among people dying of COVID-19), and a positive association between age and multimorbidity count. To inform the latter, we examined data within SAIL for 145 patients who had influenza recorded as the cause of death on their death certificate in 2011. We found that for men, age increased by 4.7 years per unit increase in the number of LTCs until the count reached 6, after which there was no evidence of further increase. For women, the figure was 2.6. Therefore, we performed the modelling assuming that for COVID-19 the mean age increased by 5 years per unit increase in multimorbidity count across the range from 0 to 6 LTCs in men. We allowed for some degree of uncertainty around this estimate by sampling from a normal distribution, for which we arbitrarily chose a standard deviation of 0.5. We estimated this similarly for women, but using a mean increase in age of 3 years per unit increase in multimorbidity count. We incorporated this information in a model fitted to the summary age data provided in the Italian case report. We obtained 10,000 samples from the posterior distribution for inclusion in the YLL calculations. SAIL analyses were approved by SAIL Information Governance Review Panel (Project 0830). Approval for the use of individual patient data in the analysis was given by the NHS Public Health Scotland Caldicott officer.
Survival models. For patients aged 50 years or older at death, we estimated mortality according to age, sex and combinations of each LTC using the Secure Anonymised Information Linkage (SAIL) databank. SAIL is a repository of routinely collected healthcare data (including primary care, hospital episodes, and mortality data) from a representative sample covering approximately 70% of the population of Wales. From these data, we identified all participants aged over 49 years who were registered with a participating practice for the duration of 2011 (approximately 0.85 million people). This period was selected as electronic coding of diagnoses was well established, and it allowed >6 years of follow-up. Age and sex were extracted from primary care records. We also identified all LTCs for which we had information on COVID-19 deaths from Italy. LTCs were identified using a combination of primary care data (using Read diagnostic codes) and hospital episodes (using ICD-10 codes) 15 . In models stratified by sex we included all the LTCs as main effects as well as age-LTC interactions that improved the model fit in terms of the Akaike information criterion. In sensitivity analyses we also included two-way (comorbidity-comorbidity) and three-way (comorbidity-comorbidity-age) interaction terms for the four comorbidities with the largest effect measure estimates (COPD, heart failure, liver failure and dementia), requiring 12 additional parameters. To propagate uncertainty from the survival models we obtained 10,000 samples of the coefficient estimates by sampling from a multivariate normal distribution corresponding to the coefficients and variance-covariance matrix from the regression models.
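Sampling coefficients from a multivariate normal distribution, as described above, can be sketched as below. The two-coefficient mean vector and variance-covariance matrix are invented placeholders, not estimates from the SAIL models, and the function names are hypothetical. The sketch uses a Cholesky factor of the covariance matrix, which is one standard way to generate such draws.

```python
import math
import random

def cholesky(matrix):
    """Lower-triangular L with L @ L.T equal to a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def sample_coefficients(mean, cov, n_draws, rng):
    """Draw coefficient vectors as mean + L z, where z is standard normal."""
    lower = cholesky(cov)
    dim = len(mean)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        draws.append([
            mean[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i in range(dim)
        ])
    return draws

# Hypothetical point estimates and variance-covariance matrix (not study values).
coef_mean = [0.10, -4.0]
coef_cov = [[0.0004, -0.0010],
            [-0.0010, 0.0100]]

rng = random.Random(1)
coef_draws = sample_coefficients(coef_mean, coef_cov, 10_000, rng)
print(len(coef_draws), coef_draws[0])
```

Drawing the coefficients jointly, rather than one at a time, keeps the correlation between parameter estimates, so that each draw represents a coherent plausible survival model.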

Combination of comorbidity and mortality models.
In the final analysis, we combined 10,000 samples from all three sources: LTC combination models, age models and survival models. We used the rate and shape parameters with the cumulative distribution function implemented in the flexsurv package to calculate the survival probabilities at 3-month intervals from age 50 to 120 (to allow all curves to descend to zero). From these times and survival probabilities we estimated the mean survival, or life expectancy.
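The step from rate and shape parameters to a life expectancy can be sketched as follows. This sketch assumes a Gompertz survival function, with hazard rate × exp(shape × t) and time measured in years since age 50; that distributional choice, the parameter values and the function names are assumptions made for illustration and are not confirmed by the text above. Life expectancy is approximated by trapezoidal integration of the survival curve, conditional on being alive at the age of death, on a 3-month grid up to age 120.

```python
import math

def gompertz_survival(t, shape, rate):
    """S(t) = exp(-(rate / shape) * (exp(shape * t) - 1)), with t in years since age 50."""
    return math.exp(-(rate / shape) * math.expm1(shape * t))

def remaining_life_expectancy(age_at_death, shape, rate,
                              start_age=50.0, end_age=120.0, step=0.25):
    """Trapezoidal integral of the conditional survival curve from age_at_death to end_age."""
    n_steps = int(round((end_age - age_at_death) / step))
    ages = [age_at_death + i * step for i in range(n_steps + 1)]
    # Condition on survival to the age at death, so the curve starts at 1.
    s0 = gompertz_survival(age_at_death - start_age, shape, rate)
    cond = [gompertz_survival(a - start_age, shape, rate) / s0 for a in ages]
    return sum((cond[i] + cond[i + 1]) / 2 * step for i in range(n_steps))

# Hypothetical parameters for one sex and one combination of LTCs (not study values).
shape, rate = 0.10, 0.005
le_60 = remaining_life_expectancy(60, shape, rate)
le_80 = remaining_life_expectancy(80, shape, rate)
# A higher rate stands in for a heavier burden of LTCs in this sketch.
le_80_more_ltcs = remaining_life_expectancy(80, shape, rate * 3)
print(le_60, le_80, le_80_more_ltcs)
```

In this framing, the YLL for a simulated patient is the remaining life expectancy evaluated at that patient's age at death, using parameters that reflect their sex and set of LTCs.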
Bayesian models were written in the JAGS language 21 and implemented using runjags for R (version 2.0.4) 22 , survival models were fit using the flexsurv package in R (version 1.1.1) 20 , and for the final analysis the model-outputs were also combined in R (version 3.6.1). The 95% uncertainty intervals were obtained using empirical bootstrapping, with the number of samples in the mean equal to the effective sample size from the LTC correlation model. All code, data (except individual-level data for Scotland), intermediate outputs and diagnostic plots are provided on GitHub (https://github.com/dmcalli2/covid19_yll_final) 15 .
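One possible reading of the bootstrap step is sketched below: means of 410 values (the effective sample size reported for the LTC correlation model) are repeatedly drawn with replacement from the simulated YLL values, and the 2.5th and 97.5th percentiles of those means form the interval. The simulated YLL values here are random placeholders, and the exact resampling scheme, the number of bootstrap replicates and the function name are assumptions of this sketch, not a description of the published code.

```python
import random

def bootstrap_mean_interval(values, n_per_mean, n_boot, rng, level=0.95):
    """Percentile interval for the mean of n_per_mean values resampled with replacement."""
    means = []
    for _ in range(n_boot):
        sample = rng.choices(values, k=n_per_mean)
        means.append(sum(sample) / n_per_mean)
    means.sort()
    lower_index = int((1 - level) / 2 * (n_boot - 1))
    upper_index = int((1 + level) / 2 * (n_boot - 1))
    return means[lower_index], means[upper_index]

rng = random.Random(42)
# Placeholder YLL values for 10,000 simulated patients (not study output).
simulated_yll = [max(0.0, rng.gauss(12.0, 6.0)) for _ in range(10_000)]
overall_mean = sum(simulated_yll) / len(simulated_yll)

lower, upper = bootstrap_mean_interval(simulated_yll, n_per_mean=410, n_boot=2_000, rng=rng)
print(overall_mean, lower, upper)
```

Using 410 rather than 10,000 values per mean widens the interval, which guards against overstating precision when the simulated patients are derived from a much smaller number of effectively independent MCMC samples.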

WHO life tables
The proportion of men and women in 10-year age-bands was reported for the 6801 deaths included in the ISS case report. On applying the proportion in each age-band to the WHO Global Burden of Disease 2010 life tables for men, we found that the YLL was 14.4 per person using the whole cohort and 14 after excluding those aged under 50. For women, comparable figures were 12.2 and 11.8 years, respectively.

Comorbidity models
For 710 patients who had died with COVID-19 for whom information on LTCs was presented in the ISS report 14 , the proportion with each LTC was as follows: ischaemic heart disease 27.8%, atrial fibrillation 23.7%, heart failure 17.1%, stroke 11.3%, hypertension 73%, diabetes 31.3%, dementia 14.5%, chronic obstructive pulmonary disease 16.7%, active cancer in the past 5 years 17.3%, chronic liver disease 4.1%, chronic renal failure 22.2%. The ISS report also presented the proportion of patients who died with each of the following multimorbidity counts: 0 (2.1%), 1 (21.3%), 2 (25.9%) and ≥3 (50.7%). Using these data, alongside individual-level patient data for a small number of patients from Scotland to aid with model fitting, we were able to simulate a set of realistic notional patients with specific combinations of LTCs. The correlations between every pair of LTCs are shown in the appendix and the full posterior distributions from the modelling are available on GitHub (https://github.com/dmcalli2/covid19_yll_final) 15 .

Age models
Based on the proportions reported for each age-band, for men the mean age for the ISS deaths was 77.9 years when people aged less than 50 were excluded and 77.4 years overall. For women the figure was 81.1 for both. The models we fitted to these data, to smooth out the distribution and to make it easier to accommodate different scenarios for the association between age and multimorbidity counts, are shown in Figure 2; the distribution of age and multimorbidity counts for men and women are shown under the assumption that these are independent, and under the assumption that multimorbidity is associated with age.

Survival models
The coefficients for the survival models are shown in the supplementary appendix. Briefly, all LTCs other than hypertension were associated with increased mortality (in a model including 10 other LTCs), and for each LTC the association with mortality was attenuated as the baseline age increased. Figure 3 shows the survival curves applied to different age and combinations of LTCs, stratified by age-band and multimorbidity count. This figure shows how these associations and age relate to survival across the age range from 50 to 110 years old.

Years of life lost
For men the average YLL on adjusting for number and type of LTC as well as age was 13.1 (12.2-14.1). For women this value was 10.5 (9.7-11.3). The results were similar under the different assumptions for the age-multimorbidity association and in both sensitivity analyses, whether assuming strongly correlated or independent LTCs (Table 1). For comparison, the YLL based on age alone using the WHO tables was 14.0 and 11.8 for men and women, respectively.
Across the simulated patients there was substantial variation in YLL adjusted for multimorbidity count (Figure 4).
On stratifying the YLL estimates by sex, age and multimorbidity count (for the simulated patients) there were clear differences (Figure 5, Table 2), with the YLL ranging from around 2 years per person in men or women aged 80 with large numbers of LTCs, to around 35 years in younger people without any LTCs (Table 2). For most age-bands and most multimorbidity counts the YLL per person remained above 5. In sensitivity analyses including the survival models with additional comorbidity-comorbidity and comorbidity-comorbidity-age interaction terms, the YLL changed only minimally from that seen in the main analysis, despite these models having a better fit based on AIC than the model presented here. This was true for the overall YLL for each sex (13.1, 95% CI 12.2-14.0 and 10.5, 95% CI 9.7-11.3 for men and women, respectively) and on additionally stratifying on age and multimorbidity count (as shown in Table 2). For the latter comparison,

Figure 2. Modelled distribution of age in ISS population, assuming age is associated with comorbidity counts, and assuming age and comorbidity are independent. Coloured bars indicate the comorbidity count from zero (dark/blue) to 11 (light/yellow).

Each line represents the survival curve for a single simulated patient with a given set of LTCs. From light to dark (yellow to blue) they show decreasing multimorbidity counts. There are 10,000 lines, one for each notional patient. Lines run from the age at which each simulated patient died (survival probability = 1) to when they would have died under the model (survival probability = 0). Patients with the same age and total multimorbidity count will have a different survival curve if they have a different set of the 11 LTCs.

Summary of main findings
Using published data on people who have died from COVID-19 and survival models based on age and multimorbidity count in a general population in the United Kingdom, we estimated the burden (years of life lost) from COVID-19 related mortality. We make a number of important observations. First, using the WHO GBD 2010 life tables as the reference 13 , the estimated YLL was over a decade for COVID-19 deaths, with 14 YLL in men and 12 in women. As such, mortality from COVID-19 represents a substantial burden to individuals and is comparable to high burden LTCs such as ischaemic heart disease and chronic obstructive pulmonary disease. Second, estimating YLL from models using the prevalence of underlying LTCs based on patients dying from COVID-19 in Italy and age-, sex- and multimorbidity count-specific survival models in the UK did not drastically change the YLL: the estimates dropped to 13 and 11 years for men and women, respectively. Third, across most age and multimorbidity count strata the estimated YLL per person remained substantial and generally above 5 years. This means that even after accounting for multimorbidity count, most individuals lost considerably more than the "1-2 years" suggested by some commentators 23 , perhaps reflecting the high prevalence of multimorbidity in this population, especially in those over the age of 50 years 24,25 . Finally, whilst the YLL remained high across most age- and multimorbidity count strata, the presence of multimorbidity did indeed influence the magnitude of the YLL. For example, in the elderly, over the age of 80, the estimated YLL in people with no LTCs was 11 years, falling to less than two years with an increasing multimorbidity count.
YLL is a widely used metric to compare the relative impact of different causes of death and is used to guide policy-making and health service delivery and to prioritise interventions aimed at preventing deaths 26 . Using UK reports for approximate comparisons, the per-capita YLL for other conditions ranged from 8.2 for chronic obstructive pulmonary disease, through 11.6 for coronary heart disease and 13.1 for pneumonia, to 21.6 for asthma 27 . Therefore, against these benchmarks, mortality from COVID-19 represents a substantial burden to individuals.
The estimated YLL can vary substantially depending on the reference population chosen and the age distribution among those who die. Moreover, where attempts are made to account for underlying conditions in those who died, the accuracy will depend on the quality and completeness of data both for those deaths, and in the reference population used to obtain estimates of survival according to those underlying conditions. Nonetheless, although imperfect, we would argue that public health agencies should present estimates of YLL for COVID-19, alongside the more usual counts of deaths. We have already seen that if agencies do not do so, commentators can and will fill this vacuum, sometimes making substantial errors such as using life expectancy at birth to make inferences about the years of life lost by someone who has already lived into later life and thereby considerably underestimating the impact of the disease on individuals 23 .

Strengths and limitations
Our analysis is novel in that it adjusts YLL for the number and type of underlying LTCs. This is important as people with underlying multimorbidity are recognised to be more vulnerable to COVID-19. However, although we had data for eleven common and important LTCs, we did not have markers of underlying disease severity among those who died. Severity of the underlying LTC has considerable impact on life expectancy 28 . Moreover, we had no data for rarer severe LTCs, which may nonetheless be common among those who die from COVID-19 at younger ages. As such, the attenuation of YLL following adjustment for LTCs may be an underestimate. However, we think that this effect is unlikely to be substantial enough to reduce YLL to the orders of magnitude suggested by some commentators. Indeed, on stratifying by age and multimorbidity counts, we rarely found average YLLs of below three. Also, we were not able to adjust our estimates for other factors and exposures (such as socioeconomic status, occupation, smoking, health behaviours) which would have given a more accurate representation of life expectancy in the absence of COVID-19.
We did not have access to large quantities of individual-level data with which to estimate the prevalence of different combinations of LTCs. Therefore, we fitted a complex model (which was methodologically innovative and will be the subject of a separate publication) to estimate the joint probabilities, using the overall (marginal) estimates of each LTC, and the overall multimorbidity counts alongside a small amount of individual-level data from Scotland to help with model fitting. This model did not fully converge and had wide posteriors (indicating substantial uncertainty) for the correlation between LTCs. We nonetheless included the results of this model in our analysis because (i) it represents the best estimate for the joint probabilities given the available data and importantly, (ii) the results for overall YLL remained substantially similar in widely different sensitivity analyses assuming either that LTCs are highly correlated among people dying from COVID-19 or that they are entirely independent.
Finally, given the emergent nature of the coronavirus pandemic, this study was conducted rapidly and under pressure of time. We chose the best data for age, sex and prevalence of LTCs that were available to us at the time of our modelling, but better-quality individual-level data specific to individual countries will yield substantially more reliable estimates. We would suggest that each public health agency should produce country-specific estimates, using the same LTC definitions in those who died as in the reference population and ideally to an agreed international protocol. Our study has used complex state-of-the-art statistical modelling and inference techniques, which rely on expensive computer simulations. Given the time constraints, we had to find an acceptable trade-off between estimation accuracy and computation time. Therefore, we will continue to refine our work to improve the convergence of the numerical procedures, although we do not expect that our conclusions, either about the overall YLL per capita, or about the distribution of YLL within the population, will substantially change. We have also provided all our data (except individual-level data from the Scottish population, for which we provide a simulated substitute dataset) and code to allow others to check our modelling and correct any errors 15 .

Conclusion
Among patients dying of COVID-19, there appears to be a considerable burden in terms of years of life lost, commensurate with diseases such as coronary heart disease or pneumonia.
While media coverage of the pandemic has focused heavily on COVID-19 affecting people with 'underlying health conditions', adjustment for number and type of LTCs only modestly reduces the estimated YLL due to COVID-19 compared to estimates based only on age and sex. Public health agencies and governments should report on YLL, ideally adjusting for the presence of underlying LTCs, to allow the public and policy-makers to better understand the burden of this disease.

Data availability
All code, data (except individual-level data for Scotland), intermediate outputs and diagnostic plots are provided on GitHub: https://github.com/dmcalli2/covid19_yll_final.

Version 1
Reader Comment 03 May 2020, Other, USA, USA
Jason Blumberg
I'm perplexed by this study. How can it be assumed that the Covid victims would have lived the average life expectancy unless there's no or minimal standard deviation around that average? Wouldn't it be more compelling to compare to the minimum life expectancy of each cohort? Otherwise, you are implicitly assuming that the people who are dying are more or less representative of the average, which seems like a major assumption that, if untrue, would render your conclusions pretty useless. I hope I'm missing something here because it would seem far more intuitive to assume that people who are dying are the most vulnerable of their respective cohorts.

Reader Comment 02 May 2020, George Mason University, USA
David Bernstein
I see you have partially addressed this already, but this was going to be my comment: Two people who are coded with the same disease could be in vastly different circumstances? We know the virus has taken a huge toll on nursing homes. An 82 year old with heart disease who lives in a nursing home is not similarly-situated, life expectancy-wise, to an 82 year old who is otherwise doing well and is self-sufficient. The former would assumedly be much more likely to succumb to Covid-19 than the latter. Similarly, "otherwise-healthy" people who succumb to Covid-19 can be expected to, on average, be more likely to have an undiagnosed health issue than those who don't. Is that taken into account? If neither of these are taken into account, the effect on life expectancy must be reduced. Now, I see you've responded that this should NOT have a major effect on life expectancy. I don't see how you can be so confident. A *huge* percentage of deaths, wildly disproportionate, have been in nursing ("care") homes. This is an extremely unhealthy population. In the U.S., iirc, the average life expectancy for someone entering a nursing home is something like 18 months. You simply can't compare an otherwise healthy 82 year old with heart disease to someone whose heart disease so enfeebles him or her that they need to be in a nursing home.

Competing Interests: None.
Author Response 30 Apr 2020, University of Glasgow, Glasgow, UK

David McAllister
Thanks for your comment Martin Johnson. Please see this very rapid addendum we posted on our github