Rates of serious clinical outcomes in survivors of hospitalisation with COVID-19 in England: a descriptive cohort study within the OpenSAFELY platform

Background: Patients surviving hospitalisation for COVID-19 are thought to be at high risk of cardiometabolic and pulmonary complications, but quantification of that risk is limited. We aimed to describe the overall burden of these complications in people after discharge from hospital with COVID-19.
Methods: Working on behalf of NHS England, we used linked primary care records, death certificate and hospital data from the OpenSAFELY platform. We constructed three cohorts: patients discharged following hospitalisation with COVID-19, patients discharged following pre-pandemic hospitalisation with pneumonia, and a frequency-matched cohort from the general population in 2019. We studied seven outcomes: deep vein thrombosis (DVT), pulmonary embolism (PE), ischaemic stroke, myocardial infarction (MI), heart failure, acute kidney injury (AKI) and new type 2 diabetes mellitus (T2DM) diagnosis. Absolute rates were measured in each cohort and Fine and Gray models were used to estimate age/sex adjusted subdistribution hazard ratios comparing outcome risk between discharged COVID-19 patients and the two comparator cohorts.
Results: Amongst the population of 77,347 patients discharged following hospitalisation with COVID-19, rates for the majority of outcomes peaked in the first month post-discharge, then declined over the following four months. Patients in the COVID-19 population had markedly higher risk of all outcomes compared to matched controls from the 2019 general population. Across the whole study period, the risk of outcomes was more similar when comparing patients discharged with COVID-19 to those discharged with pneumonia in 2019, although COVID-19 patients had higher risk of T2DM (15.2 versus 37.2 [rate per 1,000 person-years for COVID-19 versus pneumonia, respectively]; SHR, 1.46 [95% CI: 1.31-1.63]).
Conclusions: Risk of cardiometabolic and pulmonary adverse outcomes is markedly raised following discharge from hospitalisation with COVID-19 compared to the general population. However, excess risks were similar to those seen following discharge post-pneumonia. Overall, this suggests a large additional burden on healthcare resources.

Introduction
Cardiometabolic and pulmonary complications, especially thrombotic events, have been described as a key feature of the severe acute phase of COVID-19. A recent systematic review estimated the risk of venous thromboembolism (VTE) to be ~15% in hospitalised COVID-19 patients, with higher risks observed in people admitted to intensive care (~30% 1-3 ). Underlying reasons for this higher risk are likely to be multifactorial, including immobility following illness/hospitalisation as well as the known association with infection in general, mediated through interactions with general inflammatory and other immune pathways 4 . The possibility that SARS-CoV-2 may directly trigger pulmonary thrombi via vascular damage and inflammatory effects in the lung has also been raised 5 .
As the COVID-19 pandemic has progressed, it is increasingly reported that some patients who recover from the acute disease phase go on to experience a range of post-recovery clinical problems. This post-acute COVID-19 syndrome is currently not well described or understood, with the UK National Institute for Health and Care Excellence stating that any body system could be affected, for an undetermined period of time 6 . Any such syndrome now needs to be defined and quantified so that patients and health services can know what outcomes may be expected, and plan accordingly 6,7 . It is also unclear whether COVID-19 is exceptional in its association with cardiometabolic events, or comparable to other respiratory pathogens, such as influenza and Streptococcus pneumoniae, which have well-described associations with acute cardiovascular events.
Work to date on cardiometabolic outcomes with COVID-19 has largely focused on risks during hospitalisation, for example, highlighting increased myocardial involvement and acute kidney injury complications 8-10 . However, there is a lack of evidence on how these risks evolve in survivors of severe COVID-19. We therefore measured the rates of cardiometabolic outcomes in people in England who were discharged from hospital following the acute phase of COVID-19. For context, we compared these rates with those seen prior to the pandemic in both the general population and amongst people discharged following hospitalisation for non-COVID-19 pneumonia.

Study design and data sources
We conducted an observational cohort study using electronic health record (EHR) data from primary care practices using TPP software. TPP provides the software to approximately 40% of English general practices and it is used to record all clinical information such as diagnoses, blood tests and prescriptions. This is additionally linked to Office for National Statistics (ONS) death registrations and Secondary Uses Service (SUS) data (containing hospital records) through OpenSAFELY. This is an open-source data analysis platform developed during the COVID-19 pandemic, on behalf of NHS England, to allow near real-time analysis of pseudonymised primary care records at scale, currently covering approximately 40% of the population in England, operating within the EHR vendor's highly secure data centre 11,12 . Details on Information Governance for the OpenSAFELY platform can be found in the Software and reproducibility section.

Population
We included all adults aged ≥18 years registered with a general practice for ≥1 year on the index date with information on age, sex, and socioeconomic status. From this source population we selected three cohorts: all patients hospitalised with COVID-19 (in the year starting 1st February 2020), a comparison cohort containing all patients hospitalised with non-COVID pneumonia across an equivalent period starting in 2019 (i.e. the year starting 1st February 2019) and a general population frequency matched cohort in 2019. The COVID-19 and pneumonia cohorts were selected as anyone hospitalised with an associated diagnostic code for COVID-19 or pneumonia respectively (referred to as the "index hospitalisation"). The general population cohort was formed by matching each patient in the COVID-19 cohort to up to five patients eligible on 1st February 2019 in TPP on age (within 1 year), sex and region defined by Sustainability and Transformation Partnership level (a more granular form of England NHS region). Matching was performed using a greedy matching algorithm, with no replacement (for more details, see GitHub).
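The matching step described above can be illustrated with a minimal sketch. This is not the authors' code (which is in their GitHub repository); the function name `greedy_match`, the record layout and the processing order are assumptions made for illustration only. It matches exactly on sex and region, applies a one-year age caliper, and never reuses a control.

```python
from collections import defaultdict


def greedy_match(cases, pool, max_controls=5, age_caliper=1):
    """Match each case to up to `max_controls` controls, without replacement.

    `cases` and `pool` are lists of dicts with the (hypothetical) keys
    "id", "age", "sex" and "region".
    """
    # Index the candidate pool by the exactly-matched variables.
    strata = defaultdict(list)
    for person in pool:
        strata[(person["sex"], person["region"])].append(person)

    used = set()   # controls already assigned; enforces "no replacement"
    matches = {}
    for case in cases:  # greedy: cases are served in the order given
        chosen = []
        for cand in strata[(case["sex"], case["region"])]:
            if len(chosen) == max_controls:
                break
            if cand["id"] in used:
                continue
            if abs(cand["age"] - case["age"]) <= age_caliper:
                chosen.append(cand["id"])
                used.add(cand["id"])
        matches[case["id"]] = chosen
    return matches
```

Because the procedure is greedy, an early case can consume controls that a later case would also have been eligible for; with a large general-population pool this rarely prevents a full set of matches.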
The study periods ran for one year, starting 1st February in either 2019 or 2020, depending on the population (as defined above). The follow-up period began on the discharge date of the index COVID-19 or pneumonia hospital stay or 1st February 2019 in the general population matched cohort. For each analysis, follow-up ended on the earliest of: the first recorded outcome event, the study end date, or the date of death of the patient. For the AKI outcome, we excluded patients who were receiving dialysis before the index date (defined as presence of a dialysis code or eGFR < 15ml/min). For diabetes, we excluded any patients who had a previous diabetes event, to ensure only incident diagnoses were measured.
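As a rough illustration of the censoring rule above (follow-up ends at the earliest of the first outcome, the study end date, or death), the sketch below computes the end of follow-up for one patient. The function name and arguments are hypothetical, and it assumes all supplied dates fall on or after the start of follow-up.

```python
from datetime import date


def follow_up(start, study_end, outcome_date=None, death_date=None):
    """Return (end_date, event_observed, person_days) for one patient.

    Follow-up ends at the earliest of the first outcome, death, or the
    study end date. Missing dates are passed as None.
    """
    end = min(d for d in (outcome_date, death_date, study_end) if d is not None)
    # The outcome counts as observed only if it is what ended follow-up
    # (this includes an outcome recorded on the date of death).
    event_observed = outcome_date is not None and outcome_date == end
    return end, event_observed, (end - start).days
```

For example, a patient discharged on 1 April 2020 with a first outcome on 1 May 2020 contributes 30 days of follow-up and one event.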
Outcomes were defined primarily as the presence of a diagnostic code for each of the respective outcomes, either in the general practice record, in hospital, or as a cause of death on a death certificate. For the primary analysis, we excluded records where the patient had a recent outcome recorded within three months before the index date (including if they had been recorded during the index hospitalisation). This was to prevent double counting of the same event, for example where a general practice (GP) updates the record of a patient, recording an event that occurred during a recent hospitalisation. Sensitivity analyses explored the effect of including these events (see Statistical Methods).

Statistical methods
We described the characteristics of the three patient cohorts formed: patients discharged from an admission with COVID-19, patients discharged from an admission with pneumonia and patients from the general population matched on age, sex and region. Additionally, for the hospital cohorts, we described the characteristics of the admitted populations to highlight potential differences in those surviving to discharge.
Rates were reported for each outcome, per 1,000 person years, initially for the whole follow-up time, then stratified into time windows: 0-29 days, 30-59 days, 60-89 days, 90-120 days and 120+ days post discharge for the COVID-19 and pneumonia cohorts, to determine how the rate of outcomes changed over time. Rates in all three patient cohorts were stratified by age, sex and ethnicity.
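The rate calculation can be sketched as follows: each patient's follow-up is split across the time windows, events and person-time are summed within each window, and the rate is expressed per 1,000 person-years. The window boundaries used here (half-open 30-day intervals), the 365.25-day year and the function name are assumptions for illustration, not the authors' exact definitions.

```python
DAYS_PER_YEAR = 365.25
# Days since discharge; None marks the open-ended final window (assumed bounds).
WINDOWS = [(0, 30), (30, 60), (60, 90), (90, 120), (120, None)]


def windowed_rates(patients, windows=WINDOWS):
    """Rates per 1,000 person-years in each window.

    `patients` is an iterable of (follow_up_days, had_event) pairs, where
    the event, if any, occurs at the end of follow-up.
    """
    person_days = {w: 0 for w in windows}
    events = {w: 0 for w in windows}
    for follow_up_days, had_event in patients:
        for lo, hi in windows:
            if follow_up_days <= lo:
                break  # follow-up ended before this window opened
            upper = follow_up_days if hi is None else min(follow_up_days, hi)
            person_days[(lo, hi)] += upper - lo
            # The event is attributed to the window in which follow-up ended.
            if had_event and upper == follow_up_days:
                events[(lo, hi)] += 1
    rates = {}
    for w in windows:
        years = person_days[w] / DAYS_PER_YEAR
        rates[w] = 1000 * events[w] / years if years > 0 else None
    return rates
```

Patients with zero days of follow-up contribute nothing in this sketch; a real analysis would need an explicit rule for events recorded on the day of discharge.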
We used Fine and Gray regression models to estimate subdistribution hazard ratios (SHRs) and 95% confidence intervals (CIs) to compare the risk of each outcome between 1) the discharged COVID-19 group and the matched general population group, and 2) the discharged COVID-19 group and the discharged pneumonia group. In the primary analysis, we investigated crude univariable models and age and sex adjusted models. The same patient can contribute person-time to all exposure groups, although these periods are non-overlapping; we therefore applied robust standard errors. Any evidence of deviation from proportional subdistribution hazards was further investigated using time-period specific SHRs 13,14 .
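For reference, the standard textbook form of the Fine and Gray model is sketched below. The notation is an assumption introduced here and does not come from the paper: T is the time to the first event of any type, D is the event type, k is the outcome of interest, and x is the covariate vector (for example exposure group, age and sex).

```latex
\[
\lambda_k(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t}\,
  \Pr\bigl(t \le T < t + \Delta t,\ D = k \,\bigm|\, T \ge t \ \text{or}\ (T < t,\ D \ne k)\bigr)
\]
\[
\lambda_k(t \mid x) = \lambda_{k0}(t)\,\exp\bigl(x^{\top}\beta\bigr),
\qquad \mathrm{SHR}_j = \exp(\beta_j)
\]
```

Under this definition, patients who have already experienced a competing event remain in the risk set. An SHR therefore describes the effect of a covariate on the cumulative incidence of the outcome; its direction is interpretable, but its magnitude is not a rate ratio among patients who are still event-free.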
In sensitivity analyses we tested the effect of including 1) previously omitted outcomes recorded in the primary care record when there was a recorded outcome within 3 months before the index date, and 2) only events recorded in hospital or as a cause of death on a death certificate.
Finally, in post-hoc analyses we investigated the robustness of results to those obtained from a model further adjusting for comorbidities and available lifestyle information.
Software and reproducibility
Data management was performed using the OpenSAFELY software, Python 3.8 and SQL, and analysis using Stata 16.1. All codelists alongside code for data management and analyses can be found in our GitHub repository. All software for the OpenSAFELY platform is available for review and re-use at our organisation GitHub.
NHS England is the data controller; TPP is the data processor; and the key researchers on OpenSAFELY are acting on behalf of NHS England. OpenSAFELY is hosted within the TPP environment which is accredited to the ISO 27001 information security standard and is NHS IG Toolkit compliant 15,16 . Patient data are pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques. All pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted and access to the platform is via a virtual private network connection, restricted to a small group of researchers who hold contracts with NHS England and only access the platform to initiate database queries and statistical models. All database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts 17 . The OpenSAFELY research platform adheres to the obligations of the UK General Data Protection Regulation (GDPR) and the Data Protection Act 2018. In March 2020, the Secretary of State for Health and Social Care used powers under the UK Health Service (Control of Patient Information) Regulations 2002 (COPI) to require organisations to process confidential patient information for the purposes of protecting public health, providing healthcare services to the public and monitoring and managing the COVID-19 outbreak and incidents of exposure; this sets aside the requirement for patient consent 18 . Taken together, these provide the legal bases to link patient datasets on the OpenSAFELY platform.
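To illustrate how cryptographic hashing allows datasets to be linked without exposing identifiers, the sketch below derives a pseudonym with a keyed hash (HMAC-SHA256) and joins two datasets on it. This is a generic illustration only: the text does not specify which hashing scheme OpenSAFELY uses, and the function names, the normalisation step and the key handling are assumptions.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a patient identifier with a keyed hash."""
    # Normalise so that formatting differences do not break linkage.
    normalised = identifier.replace(" ", "").upper()
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def link(dataset_a, dataset_b):
    """Join two {pseudonym: record} mappings on their shared pseudonyms."""
    shared = dataset_a.keys() & dataset_b.keys()
    return {p: (dataset_a[p], dataset_b[p]) for p in shared}
```

Hashing the same identifier with the same key in each source yields the same pseudonym, so records can be joined without the identifier itself being transmitted; without the key, the pseudonyms cannot be regenerated from a list of known identifiers.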

Results
We identified 77,347 patients discharged following an admission with COVID-19 in the year starting from 1st February 2020 and 127,987 patients discharged with pneumonia in the year starting from 1st February 2019. For each patient discharged following an admission with COVID-19 we matched up to five patients from the general population eligible in TPP on 1st February 2019 on age, sex and region. We successfully matched five patients for over 99.9% of COVID-19 patients.
We present characteristics for the cohorts studied in Table 1 and  Table 2. Compared to the discharged COVID-19 cohort, the discharged pneumonia cohort had a higher proportion aged over 80. The ethnic breakdown was broadly similar between the three groups, although the discharged COVID-19 patients had a higher proportion of patients who were Asian and Asian British. Furthermore, Table 1 highlights that the discharged pneumonia patients had a higher proportion of comorbidities (including history of the investigated outcomes) compared to the discharged COVID-19 and general population cohorts.
By design, this work focuses on describing the burden of disease amongst patients discharged after severe COVID and our aim is not to make causal conclusions surrounding differences in the risk of outcomes. However, for additional context, the characteristics for the admitted COVID-19 and pneumonia populations are provided in Table 2. Comparison between Table 1 and Table 2 highlights the increased in-patient mortality of admitted COVID-19 patients (21%) compared to admitted pneumonia patients (14%). Furthermore, whilst Table 2 highlights expected differences in the age distribution between the admitted and discharged cohorts, the overall pattern of characteristics between the admitted and discharged groups remained largely consistent.
Overall rates of each outcome per 1,000 person years for the whole follow-up are presented in Table 3. For the majority of outcomes, we observed higher rates of serious cardiometabolic and pulmonary complications in discharged pneumonia patients (Figure 1). Across all three cohorts, the largest absolute rates were for AKI and heart failure. Overall rates stratified by age, sex and ethnicity are presented in the Extended Data (Supplemental Tables 5-11).
For COVID-19 patients, rates of stroke were higher amongst the over 80 group compared to the other comparison groups, although the pattern was not consistent across other age groups. Similarly, rates were not constant by ethnic group; for example, the rate of new T2DM diagnoses was slightly higher amongst black patients discharged following an admission with COVID-19.
Stratified overall rates in 30-day time windows are shown in Figure 1. For discharged COVID-19 patients, we observed the highest rates for all outcomes in the first 30 days post-discharge, with a gradual decline in subsequent periods, consistent with the pattern of rates observed for patients discharged with pneumonia in 2019. For the discharged COVID-19 and pneumonia groups, we observed pronounced rates of AKI and heart failure in the first 30 days post-discharge ( Figure 1).
After age and sex adjustment, the discharged COVID-19 group had markedly higher risk for all outcomes compared to the matched general population group (Table 4, Figure 2). Results from a sensitivity analysis investigating robustness of absolute and relative rates to outcome definition are presented in the Extended Data. Change in outcome definitions did not meaningfully alter conclusions. Finally, given evidence of non-proportional subdistribution hazards we present SHRs in 30-day periods (Figure 3). For all outcomes, we observed substantially higher SHRs in the first 30 days post-discharge for COVID-19 patients compared to the general population group, which gradually reduced in subsequent periods. Furthermore, during the first 30 days post-discharge we observed higher risk of AKI and PE in the discharged COVID-19 group compared to the pneumonia group.

*Covariates included in the fully adjusted model were: age (parametrised as a four-knot restricted cubic spline), sex, ethnicity, obesity, smoking status, index of multiple deprivation quintile, region, hypertension, asthma, chronic respiratory diseases other than asthma, chronic heart disease, diabetes, non-haematological and haematological cancer, reduced kidney function, chronic liver disease, dementia, other neurological disease, organ transplant, asplenia (splenectomy or a spleen dysfunction, including sickle cell disease), rheumatoid arthritis, lupus, or psoriasis, other immunosuppressive conditions (permanent immunodeficiency ever diagnosed or aplastic anaemia or temporary immunodeficiency recorded within the past year) and any history of the outcomes studied.
Figure 2. Age and sex adjusted subdistribution hazard ratios (SHRs) for risk of outcomes in patients who were hospitalised with COVID-19 and then discharged, compared to 1) patients who were hospitalised with pneumonia and then discharged, and 2) an age, sex and region matched general population comparator group.

Key findings
In this descriptive study, we set out to report the overall rates of seven cardiometabolic and pulmonary outcomes in three cohorts: 1) patients discharged following hospitalisation with COVID-19 (in the year from 1st February 2020), 2) patients discharged following hospitalisation with pneumonia (in the year from 1st February 2019) and 3) a frequency-matched group of patients from the general population in 2019. We found that the rate of cardiometabolic and pulmonary complications following discharge from hospitalisation with COVID-19 was notably higher compared to an age, sex and region matched general population cohort, especially for PE and AKI. However, patients discharged with COVID-19 followed a broadly similar pattern of elevated risk to those discharged from hospital after pneumonia in 2019. In post-hoc analyses, further adjustment for comorbidities and lifestyle information resulted in similar, though slightly attenuated associations (Table 4).
The pattern of change in the rate of outcomes over time following discharge from hospital was broadly similar between the COVID-19 and pneumonia patients, with the highest rate in the initial 30 days of follow-up, then a 2-3-fold drop in the next 30 days, followed by a more gradual decline. In both discharge-based cohorts, rates remained substantial even after more than 120 days. This highlights an additional short-term and potential long-term burden on healthcare services 19 .
We did not attempt to determine the causal nature of any observed differences. In the context of the rapidly changing pandemic, we aimed to provide a description of the burden of outcomes after discharge from hospitalisation with COVID-19, compared to cohorts discharged from hospitalisation with pneumonia and the general population, to inform health services and provision. Any study aiming to draw causal conclusions would require careful investigation and adjustment for features of the pathophysiology associated with SARS-CoV-2 infection, changes in healthcare provision during the pandemic and potentially differential likelihood of ascertainment of pre-existing conditions, especially in light of the observed higher in-hospital mortality rate for hospitalised COVID-19 patients compared to patients hospitalised with pneumonia.

Strengths and limitations
We were able to source our cohorts from the OpenSAFELY platform, which contains data on over 17m adults. This gave us a population who were discharged following hospitalisation with COVID-19 of 77,347, allowing us to obtain precise estimates of the rate of each outcome. We were also able to draw on multiple linked data sources, including primary care records, hospitalisations and death certificates. This allows a more complete picture to be presented of the clinical activity surrounding each outcome.
Much of the research characterising cardiometabolic outcomes in COVID-19 patients to date has focused on in-hospital clinical activity and describing the proportion of events in a narrow time-window post-discharge 9,10,20-23 . However, our use of both a general population and active control population of patients hospitalised with pneumonia in 2019 provides useful context for the rates of these outcomes in COVID-19 patients who survive hospitalisation. Furthermore, presenting the rates in this context is more informative than within a general population alone and offers an important comparison with a cohort experiencing exposure to another acute respiratory illness event requiring hospitalisation.
Our study aimed to describe clinical events that occurred after discharge from hospital, and not the total additional morbidity burden of COVID-19 hospitalisation: specifically, we did not set out to describe events that occurred during hospital admission with COVID-19 or pneumonia. However, in our view reliable analysis of in-hospital events may only be achievable with bespoke collections of detailed hospital data, due to shortcomings in routinely collected administrative data that are widely used for such analyses. For example, SUS and HES data contain a list of diagnostic codes associated with each hospitalisation, but they do not contain sufficient information to determine the exact timing of all events within each hospitalisation episode. This means that time-to-event analyses are not possible. Similarly, it is not possible to reliably determine the sequence of events during hospitalisation: a patient hospitalised with COVID-19, who later had a stroke, may be coded in a similar way to a patient who was hospitalised with a stroke, and then infected with SARS-CoV-2 while in hospital. In addition, routine PCR testing on hospitalised patients during the pandemic may lead to very high ascertainment of infection with SARS-CoV-2, which may not have occurred to the same extent in the comparison population for pneumonia admissions.
It has been reported that there was a marked reduction in GP and hospital activity during the first wave of the pandemic, for example a 40% reduction in admissions for acute coronary syndrome 24-26 . This may be in part explained by a reluctance of patients to present at healthcare services for fear of contracting the virus. As a result, we believe population-level rates of many outcomes will be under-ascertained during 2020 compared with 2019. It is unknown whether this applies in the same way to patients who have already had severe COVID-19; if ascertainment is lower, then this would result in a possible under-estimate of outcome rates associated with COVID-19 in our study.
A recent observational study measured similar outcome events in a population of patients discharged from hospital following COVID-19 27 . They observed elevated rates in the COVID-19 population compared to a matched general population control group. Our findings are consistent in showing similarly higher rates of outcomes in patients post-discharge with COVID-19.
However, importantly we further show that these higher rates of outcomes are broadly comparable, if not slightly lower, when compared to people discharged from hospital following pneumonia, selected as a major non-COVID respiratory infection.

Conclusion
In this study, the rate of cardiometabolic and pulmonary events in COVID-19 survivors discharged from hospitalisation was elevated in a similar manner to patients discharged from hospitalisation with pre-pandemic pneumonia. Furthermore, the impact of the post COVID-19 hospitalisation events described in this study upon the NHS in England is substantial. Future work should investigate any association between non-hospitalised SARS-CoV-2 infection and these outcomes, and quantify any likely population level impact.

Data and software availability
Underlying data
All data were linked, stored and analysed securely within the OpenSAFELY platform (https://opensafely.org/). Data include pseudonymized data such as coded diagnoses, medications and physiological parameters. No free text data are included. All code is shared openly for review and re-use under MIT open license. Detailed pseudonymized patient data are potentially re-identifiable and therefore not shared.
For security and privacy reasons, OpenSAFELY is very different to other approaches for EHR data analysis. The platform does not give researchers unconstrained access to view large volumes of pseudonymised and disclosive patient data, either via download or via a remote desktop. Instead we have produced a series of open source tools that enable researchers to use flexible, pragmatic, but standardised approaches to process raw electronic health records data into "research ready" datasets, and to check that this has been done correctly, without needing to access the patient data directly. Using this data management framework we also generate bespoke dummy datasets. These dummy datasets are used by researchers to develop analysis code in the open, using GitHub. When their data management and data analysis scripts are capable of running to completion, and passing all tests in the OpenSAFELY framework, they are finally sent through to be executed against the real data inside the secure environment, using the OpenSAFELY jobs runner, inside a container using Docker, without the researcher needing access to that raw, potentially disclosive pseudonymised data themselves. The non-disclosive summary results (output tables, logs, and graphs) are then manually reviewed, as in other systems, before release.
As part of building that resource for the community, we are working with NHS England to cautiously on-board a small number of external pilot users to develop their analyses on OpenSAFELY. This process is described in further detail on our webpage.

Are the conclusions drawn adequately supported by the results? Yes
The comparison with both a general population cohort and a pneumonia cohort is an important strength of the study. The known association between influenza and cardiovascular events is mentioned in the introduction and it could be considered also to include a third comparison cohort of patients hospitalized with influenza. This would allow for a comparison of adverse outcomes between a common and well-known viral infection and COVID-19.

○ In "outcomes and follow-up": "AKI" should be written out the first time it is used.
○ The rationale for using a Fine and Gray regression could be mentioned and the challenges with the interpretation of subdistribution hazards could be discussed (Austin and Fine, 2017 1 ).
○ In the primary analysis patients with a recorded outcome within three months before the index date are excluded. Is it possible that this could result in including patients with prevalent disease, e.g., heart failure or T2DM, if that patient has not seen a physician during the three months before index? This may contribute to the high rate of heart failure after discharge (Table 3). Have the authors considered doing a sensitivity analysis only including incident outcomes with e.g. a 5 year look-back period?
○ Minor comment: AKI should be spelled out in the abstract and in the "Outcomes and follow-up" in the methods section.