Ethnic differences in the incidence of clinically diagnosed influenza: an England population-based cohort study 2008-2018

Background: People of non-White ethnicity have a higher risk of severe outcomes following influenza infection. It is unclear whether this is driven by an increased risk of infection or complications. We therefore aimed to investigate the incidence of clinically diagnosed influenza/influenza-like illness (ILI) by ethnicity in England from 2008-2018. Methods: We used linked primary and secondary healthcare data (from the Clinical Practice Research Datalink [CPRD] GOLD and Aurum databases and Hospital Episodes Statistics Admitted Patient Care [HES APC]). We included patients with recorded ethnicity who were aged 40-64 years and did not have a chronic health condition that would render them eligible for influenza vaccination. ILI infection was identified from diagnostic codes in CPRD and HES APC. We calculated crude annual infection incidence rates by ethnic group. Multivariable Poisson regression models with random effects were used to estimate any ethnic disparities in infection risk. Our main analysis adjusted for age, sex, and influenza year. Results: A total of 3,735,308 adults aged 40-64 years were included in the study; 87.6% White, 5.2% South Asian, 4.2% Black, 1.9% Other, and 1.1% Mixed. We identified 102,316 ILI episodes recorded among 94,623 patients. The rate of ILI was highest in the South Asian (9.6 per 1,000 person-years), Black (8.4 per 1,000 person-years) and Mixed (6.9 per 1,000 person-years) ethnic groups. The ILI rate in the White ethnic group was 5.7 per 1,000 person-years. After adjustment for age, sex, and influenza year, higher incidence rate ratios (IRR) for ILI were seen for South Asian (1.70, 95% CI 1.66-1.75), Black (1.48, 1.44-1.53) and Mixed (1.22, 1.15-1.30) groups compared to White ethnicity. Conclusions: Our results suggest that influenza infection risk differs between White and non-White groups who are not eligible for routine influenza vaccination.


Introduction
People from ethnic minority backgrounds are represented disproportionately among patients with severe coronavirus disease 2019 (COVID-19). Early in the pandemic there were reports of excess COVID-19-related critical care admissions and deaths among people from Black and South Asian ethnic groups 1,2 . Recent research has found that people of Black and South Asian ethnicity have an increased risk of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, COVID-19-related hospitalization and death, independent of deprivation, occupation, household size, and underlying health conditions 3,4 .
The COVID-19 pandemic has reinforced the importance of seasonal influenza vaccination. By preventing influenza-related hospitalization, vaccination can minimize the risk of hospital-acquired COVID-19 (co-)infection for these individuals and reduce health service pressures, particularly the need for isolation of patients with respiratory symptoms awaiting COVID-19 test results.
In the United Kingdom (UK), the influenza vaccine is routinely recommended for adults aged ≥65 years, and for people aged <65 years with underlying health conditions. These recommendations formed the basis of the original guidance to identify patients at moderate and high risk of COVID-19. Influenza vaccine recommendations were expanded for the 2020/21 season to include all adults ≥50 years 5 . However, vaccine uptake among clinical risk groups is low, particularly for Black and Mixed Black ethnic groups 6 . This is consistent with previous findings that Black ethnicity is associated with lower influenza vaccine uptake among children and pregnant women 7-9 . In addition, people of non-White ethnicity have a higher risk of severe outcomes following influenza infection 10,11 . It is unclear whether this is driven by the risk of infection or of complications, as most research has focused on distal outcomes rather than initial infection risk.
Here we use clinical diagnoses data to investigate the incidence of influenza and influenza-like illness (ILI) by ethnicity from 2008-2018 among people not eligible for routine influenza vaccination, to consider disparities in infection risk.

Study design and data sources
We conducted a retrospective cohort study using anonymized primary care data from the UK Clinical Practice Research Datalink (CPRD) GOLD and Aurum databases 12,13 with linked secondary care data from the Hospital Episodes Statistics Admitted Patient Care (HES APC) database and death data from the Office for National Statistics (ONS). CPRD GOLD and Aurum collect records from >35 million patients registered with National Health Service (NHS) general practitioners. The data include diagnoses, prescriptions, immunizations and demographics. HES APC data are collated from inpatient care at all NHS hospitals in England. The NHS is a universal health system, publicly funded through general taxation, which is accessible to all UK residents, although there is an annual surcharge for people who move to the UK. The corresponding author had full access to all CPRD GOLD and Aurum data used in the study, with relevant linked patient HES APC and ONS death data obtained from CPRD.

Study population
We included all adults aged 40-64 years registered at a CPRD contributing practice in England between 01/09/2008 and 31/08/2018 who were present in the GOLD and Aurum datasets. We then excluded from this study population any patients with a health condition indicative of influenza vaccination eligibility (Table 1), and those who had ever received pneumococcal vaccination, or influenza vaccination in the 12 months before baseline (all codes listed here, DOI: https://doi.org/10.17037/DATA.00002102). Among the final study population, we started study follow-up to identify diagnoses of ARI (outcome of interest) at the latest of 12 months after current registration, the up-to-research-standard date (GOLD only), 40th birthday, or 01/09/2008. Follow-up ended at the earliest of: a new diagnosis of a condition conferring eligibility for vaccination, pneumococcal or influenza vaccination, death, transfer out, the practice's last data collection, 65th birthday, or 31/08/2018.
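The follow-up rule above (latest of the entry dates, earliest of the exit dates) can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's code: the function and argument names (`follow_up_window`, `uts_date`, `censor_dates`) are hypothetical, "12 months" is approximated as one calendar year, and dropping patients whose window is empty is an assumption.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

STUDY_START = date(2008, 9, 1)   # 01/09/2008
STUDY_END = date(2018, 8, 31)    # 31/08/2018


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def follow_up_window(
    birth_date: date,
    registration_date: date,
    censor_dates: Iterable[Optional[date]],
    uts_date: Optional[date] = None,
) -> Optional[Tuple[date, date]]:
    """Return (start, end) of follow-up, or None if the window is empty.

    censor_dates holds any of: new eligibility diagnosis, vaccination,
    death, transfer out, practice's last data collection (None if absent).
    uts_date is the up-to-research-standard date (GOLD only, else None).
    """
    starts = [add_years(registration_date, 1), add_years(birth_date, 40), STUDY_START]
    if uts_date is not None:
        starts.append(uts_date)
    start = max(starts)  # latest of the entry criteria

    ends = [add_years(birth_date, 65), STUDY_END]
    ends.extend(d for d in censor_dates if d is not None)
    end = min(ends)      # earliest of the exit criteria

    return (start, end) if start < end else None
```

For example, a hypothetical patient born on 15/03/1965, registered on 01/06/2005 and vaccinated on 10/01/2012 would be followed from 01/09/2008 to 10/01/2012.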

Variables
Our exposure of self-reported ethnicity was captured in CPRD and supplemented with HES APC if missing in CPRD. We grouped ethnicity into the five and 16 census categories (the relevant subgroups from the 16 categorization are shown after the corresponding five category group in brackets in the following list) of White (British, Irish, Other White), South Asian (Indian, Pakistani, Bangladeshi, other South Asian), Black (African, Caribbean, other Black), Other (Chinese, all other), and Mixed (White and Asian, White and African, White and Caribbean, Other Mixed).
Our outcome of influenza/ILI was identified from diagnostic codes in CPRD and HES APC. In a second analysis, we expanded our outcome definition to acute respiratory infection (ARI), additionally including codes for pneumonia, acute bronchitis, or other acute infections suggestive of lower respiratory tract involvement (all codes listed here, DOI: https://doi.org/10.17037/DATA.00002102). We considered the following confounders in our analysis: age (grouped into 40-44, 45-49, 50-54, 55-59 and 60-64), sex (men or women), year of outcome, region of residence and socioeconomic status. Region of residence was classified using the 10 regional breakdowns for England available within CPRD. Socioeconomic status was assigned based on Townsend score quintile.
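As an illustration of the variable coding described above, the mapping from the 16 ethnicity subgroups to the five main groups, and the five-year age bands, might be expressed as follows. This is a sketch only; the names `ETHNIC_GROUPS`, `SUBGROUP_TO_GROUP` and `age_band` are hypothetical, and the subgroup labels are copied from the text rather than from the underlying code lists.

```python
# Five-category groups with their 16-category subgroups, as listed in the text.
ETHNIC_GROUPS = {
    "White": ["British", "Irish", "Other White"],
    "South Asian": ["Indian", "Pakistani", "Bangladeshi", "Other South Asian"],
    "Black": ["African", "Caribbean", "Other Black"],
    "Other": ["Chinese", "All other"],
    "Mixed": ["White and Asian", "White and African", "White and Caribbean", "Other Mixed"],
}

# Reverse lookup: subgroup label -> five-category group.
SUBGROUP_TO_GROUP = {
    subgroup: group
    for group, subgroups in ETHNIC_GROUPS.items()
    for subgroup in subgroups
}


def age_band(age: int) -> str:
    """Return the five-year age band (40-44 ... 60-64) for an age in years."""
    if not 40 <= age <= 64:
        raise ValueError("age outside the 40-64 study range")
    lower = 40 + 5 * ((age - 40) // 5)
    return f"{lower}-{lower + 4}"
```

Keeping the mapping in one dictionary means the five-group and 16-group analyses are derived from the same source of truth.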

Health condition | Study definition
Cardiovascular disease (CVD) | Any previous clinical diagnosis, major intervention for, or clinical review specific to CVD including heart disease (congenital or otherwise), heart failure, stroke or transient ischaemic attack.
Chronic liver disease | Any previous clinical diagnosis of, or clinical review specific to, chronic liver disease including cirrhosis, oesophageal varices, biliary atresia and chronic hepatitis.
Chronic kidney disease (CKD) | Any previous clinical diagnosis of, or clinical review specific to, CKD stages 3-5, history of dialysis or renal transplant in GOLD or Aurum. Or with estimated glomerular filtration rate (eGFR) to classify CKD stage 3-5 in GOLD.
Chronic respiratory disease | Any previous clinical diagnosis of, or clinical review specific to, chronic respiratory disease, including chronic obstructive pulmonary disease, emphysema, bronchitis, cystic fibrosis, or fibrosing interstitial lung diseases.
Asthma | Any previous clinical diagnosis of, or clinical review specific to, asthma with at least two prescriptions of inhaled steroids in the year before baseline. Or any previous hospitalisation for asthma.
Chronic neurological disease | Any previous clinical diagnosis of, or clinical review specific to, a neurological disease such as Parkinson's disease, motor neurone disease, multiple sclerosis (MS), cerebral palsy, dementia or a learning/intellectual disability.
Diabetes mellitus | Any previous diagnosis of, or clinical review specific to, diabetes mellitus, or with a prescription for medication used to treat diabetes.
Asplenia/sickle cell disease | Any previous clinical diagnosis of, or clinical review specific to, asplenia or dysfunction of the spleen (including sickle cell disease but not sickle cell trait).
Severe obesity | Latest body mass index before baseline was ≥40 kg/m².
Immunosuppression | Any previous clinical diagnosis of, or clinical review specific to, HIV, solid organ transplant or other permanent immunosuppression (such as genetic conditions compromising immune function). In the two years before baseline: clinical diagnosis of, or clinical review specific to, aplastic anaemia or haematological malignancy, or receiving a bone marrow or stem cell transplant. In the year before baseline: previous clinical diagnosis of, or clinical review specific to, other/unspecified immune deficiency or receiving chemotherapy or radiotherapy. In the year before baseline: prescription of biological therapy or at least 2 prescriptions for oral steroids or other immunosuppressants including DMARDs, methotrexate, azathioprine, or corticosteroid injections.

Statistical analysis
All analyses were done with Stata (version 16). We calculated crude annual infection incidence rates by ethnic group with age and sex stratification. Multivariable Poisson regression models with random effects, to account for multiple infections in the same patient, were used to estimate any ethnic disparities in infection risk. Our main analysis adjusted for age, sex, and influenza season/year. A second model additionally adjusted for region of residence and socioeconomic status, which may both confound and mediate an association between ethnicity and infection. Influenza circulation may vary regionally, and the ethnic profile of the population also varies by region. Socioeconomic disadvantage is a risk factor for many infectious diseases and is also more prevalent in non-White ethnic groups in England.
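For readers less familiar with this type of model, one common way to write a random-effects Poisson model of the kind described above is given here. The notation is ours and is an assumption: $Y_{ij}$ denotes the number of episodes for patient $i$ in influenza year $j$, $T_{ij}$ the person-time at risk, and $u_i$ a patient-level random effect whose distribution is not stated in the text.

```latex
\log \mathbb{E}\left[ Y_{ij} \mid u_i \right]
  = \log T_{ij}
  + \beta_0
  + \beta^{\text{eth}}_{e(i)}
  + \beta^{\text{age}}_{a(i,j)}
  + \beta^{\text{sex}}_{s(i)}
  + \beta^{\text{year}}_{j}
  + u_i ,
\qquad
\mathrm{IRR}_{e} = \exp\left( \beta^{\text{eth}}_{e} \right)
```

Under this formulation the White group would be the reference category (its coefficient fixed at zero), so each $\mathrm{IRR}_{e}$ compares ethnic group $e$ with the White group after adjustment. The second model would add terms for region of residence and Townsend quintile to the same linear predictor.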

Results
Our cohort included 3,735,308 patients (Figure 1), of whom 87.6% were White (n=3,271,115), 5.2% South Asian (n=196,262), 4.2% Black (n=157,075), 1.9% Other (n=69,440), and 1.1% Mixed (n=41,416) (Table 2). We excluded 511,682 (12.0%) patients with no recorded ethnicity; this group had longer follow-up, fewer consultations and were more likely to be male than the included study population (Table 3). Non-White populations were younger and resided in more deprived areas than the White population, while a higher proportion of the White population were obese.
We identified 102,316 influenza/ILI episodes recorded among 94,623 patients, and 560,860 ARI episodes among 421,349 patients. The rate of influenza/ILI was highest in the South Asian group (9.6 per 1,000 person-years) followed by the Black group (8.4 per 1,000 person-years) (Table 4). In all ethnic groups the influenza/ILI rates were higher in women than men and decreased with age.
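To make the rate arithmetic concrete, the sketch below shows how a crude rate per 1,000 person-years and an unadjusted rate ratio with a Wald 95% confidence interval could be computed. The function names and the event and person-year counts are hypothetical (per-group counts are not given in the text), and this simple interval ignores repeated episodes within a patient, unlike the random-effects models used in the study.

```python
import math
from typing import Tuple


def rate_per_1000(events: int, person_years: float) -> float:
    """Crude incidence rate per 1,000 person-years."""
    return 1000.0 * events / person_years


def crude_rate_ratio(
    events_a: int, py_a: float, events_b: int, py_b: float, z: float = 1.96
) -> Tuple[float, float, float]:
    """Unadjusted rate ratio (group a vs group b) with a Wald CI on the log scale."""
    ratio = (events_a / py_a) / (events_b / py_b)
    se_log = math.sqrt(1.0 / events_a + 1.0 / events_b)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical counts chosen only to reproduce rates of 9.6 and 5.7 per 1,000.
ratio, lower, upper = crude_rate_ratio(960, 100_000, 570, 100_000)
print(f"crude rate ratio {ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")

# Ratio of the two reported crude rates (South Asian vs White).
print(round(9.6 / 5.7, 2))
```

The ratio of the reported crude rates for the South Asian and White groups, 9.6/5.7, is about 1.68, close to the adjusted IRR of 1.70 reported in the abstract.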

Discussion
We showed an increased rate of influenza/ILI among Black, South Asian and Mixed groups based on clinical diagnoses following healthcare attendance. Specifically, those of Indian, Pakistani, Bangladeshi and African ethnicity had the highest rate compared to the White British group. When using our broader outcome of ARI, we only found an increased rate in the South Asian group with decreased rates in Black, Mixed and Other groups.
Our results suggest that the risk of clinical influenza/ILI diagnosis differs between White and non-White groups. Such findings are consistent with studies of other acute viral respiratory infections, including those which investigated ethnic disparities in severe influenza outcomes, particularly during the 2009 H1N1 pandemic 10,11 , as well as studies of COVID-19 infection risk and severe outcomes 3 .
Our study was conducted among patients not eligible for vaccination, and so disparities cannot be explained by differences in vaccine uptake or effectiveness: there are potentially even larger ethnic differences in influenza incidence among those eligible for influenza vaccine due to inequalities in chronic disease patterns. Since social mixing and household contact are important considerations for influenza/ILI transmission, our findings are relevant to the whole population. People of non-White ethnicity tend to live in larger, multi-generational households with extended kinship and social networks 14,15 . Therefore, understanding ethnic disparities in respiratory infections across both high- and low-risk populations remains important for preventing hospitalizations.
Here we have presented results of a large population-based cohort study using nationally representative data. Excluding patients eligible for influenza vaccination due to chronic medical conditions should have reduced confounding. Nevertheless, our study has some limitations. Under-diagnosis of health conditions may differ by ethnicity, with people from some ethnic groups less likely to be excluded from our study population but more likely to have an undiagnosed, and therefore unmanaged, condition which may affect influenza risk. Ethnicity may be less well recorded in GP records for individuals without a chronic condition requiring frequent consultation, but financial incentivization between 2006 and 2011 boosted completion in GP records. Using hospital data boosted the completeness of ethnicity recording in our study population from 74% to 88%.
Influenza/ILI identification in our study was based on clinical diagnosis following healthcare attendance. Clinically identified influenza/ILI depends not only on attendance but also clinical coding practices, both of which may be associated with ethnicity. However, our results are consistent with other studies which used laboratory-confirmed measures of acute viral respiratory infections 3,10,11 . Our differing results for influenza/ILI and ARI outcomes may be attributable to the lack of specificity of ARI codes for influenza. We excluded individuals with known risk factors for influenza; it may be that other conditions are relevant risk factors for ARI generally.
Ethnic inequalities in the incidence of respiratory infections could arise because of differences in risk of exposure. Differences in exposure risk may be driven by factors such as occupation, including working in frontline high-exposure occupations (including healthcare settings), and household composition, with large multigenerational households more common in non-White ethnic groups 16 , as well as inequalities in access to care. Results from analyses of ethnic inequalities in access to care are mixed, and the reasons for any inequalities are complex and likely due to multiple interlinked factors including: different cultural approaches to health, experiences of discrimination, and language barriers 17,18 . Ethnic differences in influenza/ILI incidence could potentially be greater than we have shown, depending on the extent of inequality in access to care. Unequal access to treatments will also affect the likelihood of adverse outcomes after infection.
We excluded children who are a key driver for influenza transmission; examining ethnic inequalities for infection risk in children is an area for future research.
The COVID-19 pandemic has drawn attention to the ethnic inequalities in infection risk. Ethnic disparities in outcomes have been previously highlighted, during the 2009 H1N1 influenza pandemic as well as for seasonal influenza 10,11 . Our study found that ethnic inequalities are also present for seasonal influenza/ILI. This reinforces the urgency of addressing lower influenza, and now COVID-19, vaccine uptake among minority ethnic groups 19 . We suggest targeted public health interventions are implemented to facilitate increased vaccine uptake in non-White ethnic groups.

Data availability
Source data
The patient data used in this study are supplied from Clinical Practice Research Datalink (CPRD; www.cprd.com) but restrictions apply to the availability of these data, which were obtained under licence from the UK Medicines and Healthcare products Regulatory Agency, and so are not publicly available. For re-using these data, an application must be made directly to CPRD. Instructions for how to submit an application and the conditions under which access will be granted are explained at https://www.cprd.com/research-applications.

Ethical approval
The study was approved by the CPRD Independent Scientific Advisory Committee (Protocol number: 19_209A2) and the

We have added further detail on how data populate CPRD and HES, as well as the setup of the NHS in the UK, to the data sources section of the methods to aid international reader understanding: "CPRD GOLD and Aurum collect records from >35 million patients registered with National Health Service (NHS) general practitioners. HES APC data are collated from inpatient care at all NHS hospitals in England. The NHS is a universal health system publicly funded through general taxation which is accessible to all UK residents, although there is an annual surcharge for people who move to the UK". We have also expanded our discussion of ethnic differences in healthcare attendance in our limitations to include more detail on the issues of inequalities in access to care. However, there is no way for us to directly measure access within the dataset other than consultation frequency, which we have included in our baseline characteristics table.

Point 3: When the groupings are clear, it is best to compare these major groups as to socio-demographic profile and establish any significant difference (or no difference) among the ethnic groups. (Show the p-values): age group 40-49, 50-59, 60+, Townsend quintile 4,5 (yes/no); region is unclear to me, best to say urban, periurban or rural.
Response to point 3: We agree that if conducting an analysis of overall risk factors for a clinical diagnosis of influenza/ILI by ethnicity it would be of interest to present statistical differences for baseline characteristics by ethnic group. However, we only intended to describe the baseline characteristics of the study population as we are not investigating the risk factors for ethnic differences in ILI. Additionally, with large electronic health record datasets p-values can be uninformative. We have expanded the description of our cohort in the text at the start of our results section. We used anonymised data with location of residence in the dataset available as regions of England, which do not correspond to rural/urban status. The regions of England are well recognised within the country and are the geography by which public health is organised. Flu circulation may vary regionally with the ethnic profile of the population also varying regionally, so we considered region as a confounder.

Point 4: More than just establishing the incidence of influenza, ILI and ARI by ethnicity, you also can establish the risk factors for such by doing #3.
Response to point 4: Thank you for this suggestion. The aim of our analysis was to investigate whether the incidence of influenza/ILI varied by ethnic group, accounting for key confounding factors. This was a particularly relevant question for public health professionals to answer following the COVID-19 pandemic. We did not aim to investigate the individual risk factors associated with the ethnic variation in influenza/ILI incidence, for which a different analysis would be appropriate.

Point 5: Similar observations have been published for 2009 H1N1 and COVID-19. The conclusion should have a stronger recommendation for health care for ethnic groups at risk.
Response to point 5: We have strengthened our concluding remarks to "The COVID-19 pandemic has drawn attention to the ethnic inequalities in infection risk. Ethnic disparities in outcomes have been previously highlighted, during the 2009 H1N1 influenza pandemic as well as for seasonal influenza. Our study found that ethnic inequalities are also present for the incidence of clinically diagnosed seasonal influenza/ILI. This reinforces the urgency of addressing lower influenza, and now COVID-19, vaccine uptake among minority ethnic groups. We suggest targeted public health interventions are implemented to facilitate increased vaccine uptake in non-White ethnic groups."

Point 6: There are too many tables and figures. Important ones are Figure 1, simplified Table 1 showing statistical significant differences, Figure 3 with the revised or clear groupings.
Response to point 6: We used two breakdowns for ethnicity, one with 5 groupings and one with 16 groupings. While this unfortunately results in long tables, we think it is important to show the baseline characteristics, incidence rates and IRRs for all ethnic groups and both outcomes used (influenza/ILI and ARI). Inclusion of these data aids reader understanding of our key findings and interpretation of the results. Ideally, we would use a supplementary appendix to display the large tables; however, Wellcome Open Research's format requires all tables and figures be in the main text with supplementary material not permitted. We have updated the footnote of Figure 3 (now named Figure 2) to explain the groupings presented, and removed Figure 2.