Risk factors for SARS-CoV-2 seroprevalence following the first pandemic wave in UK healthcare workers in a large NHS Foundation Trust

Background: We aimed to measure SARS-CoV-2 seroprevalence in a cohort of healthcare workers (HCWs) during the first UK wave of the COVID-19 pandemic, explore risk factors associated with infection, and investigate the impact of antibody titres on assay sensitivity. Methods: HCWs at Sheffield Teaching Hospitals NHS Foundation Trust were prospectively enrolled and sampled at two time points. We developed an in-house ELISA for testing participant serum for SARS-CoV-2 IgG and IgA reactivity against Spike and Nucleoprotein. Data were analysed using three statistical models: a seroprevalence model, an antibody kinetics model, and a heterogeneous sensitivity model. Results: Our in-house assay had a sensitivity of 99·47% and specificity of 99·56%. We found that 24·4% (n=311/1275) of HCWs were seropositive as of 12th June 2020. Of these, 39·2% (n=122/311) were asymptomatic. The highest adjusted seroprevalence was measured in HCWs on the Acute Medical Unit (41·1%, 95% CrI 30·0–52·9) and in Physiotherapists and Occupational Therapists (39·2%, 95% CrI 24·4–56·5). Older age groups showed overall higher median antibody titres. Further modelling suggests that, for a serological assay with an overall sensitivity of 80%, antibody titres may be markedly affected by differences in age, with sensitivity estimates of 89% in those over 60 years but 61% in those ≤30 years. Conclusions: HCWs in acute medical units and those working closely with COVID-19 patients were at highest risk of infection, though whether these are infections acquired from patients or other staff is unknown. Current serological assays may underestimate seroprevalence in younger age groups if validated using sera from older and/or more severe COVID-19 cases.


Introduction
Healthcare workers (HCWs) are at increased risk of COVID-19 1,2 . The true number of HCWs infected with SARS-CoV-2 to-date is unknown, particularly during the early stages of the pandemic. Initial methods to estimate HCW COVID-19 cases included extrapolation from work absenteeism rates, and are unlikely to be reliable 3 . Confirmation by molecular testing increased the accuracy of case detection, although access to nucleic acid amplification testing (NAAT) was limited during the early stages of the pandemic in the UK 4 . Serological testing can be performed at large scale, and is less affected by symptom-activated testing pathways, so may provide a more accurate estimate of previously infected HCWs and could be used in conjunction with other data to determine their risk factors for exposure [5][6][7][8] .
To enable the accurate interpretation of seroprevalence readouts, detailed characterisation of antibody evolution relative to the sampling time-frame, immunoglobulin isotype, antigenic target and assay performance is required 9-12 . Several commercial SARS-CoV-2 antibody assays have been validated using samples from patients with more severe COVID-19, and some studies have suggested those with milder or asymptomatic COVID-19 are less likely to develop detectable antibodies [13][14][15][16][17] . Furthermore, antibody levels to some coronaviruses are known to be higher in older individuals [18][19][20][21][22] . We theorised this may lead to age-specific differences in antibody assay sensitivity, which could be a significant confounder in population seroprevalence studies.
In this study we aimed to investigate SARS-CoV-2 seroprevalence in HCWs at Sheffield Teaching Hospitals (STH), a National Health Service (NHS) Foundation Trust in the United Kingdom (UK), following the first wave of the pandemic in the UK. To achieve this, we sought to measure SARS-CoV-2 antibody titres by developing and using an in-house assay, prior to using statistical modelling to explore risk factors associated with seropositivity, the evolving antibody response, and the impact of age on assay sensitivity.

Methods
Background and setting
STH is an NHS trust offering secondary- and tertiary-level care across four sites in South Yorkshire, UK, with 1,669 inpatient beds and 18,500 employees 23 . Patients with a medical reason for admission are typically admitted to STH either through attending the Emergency Department (ED), or through referral to the Acute Medical Unit (AMU) by their General Practitioner (GP). On AMU, patients are given an initial management plan before being triaged to the most appropriate medical specialty ward e.g. respiratory medicine. The first patient with confirmed COVID-19 was admitted to STH on 23 February 2020; the first wave of the UK pandemic occurred between March 2020 and June 2020. Patients with suspected or confirmed COVID-19 were referred directly to the infectious diseases ward by GPs and other STH admission areas as capacity allowed. When capacity was reached, suspected COVID-19 patients would be either placed in side rooms or cohort bays on AMU or other wards, whilst confirmed COVID-19 patients could be moved to cohort wards.
Testing of symptomatic staff for SARS-CoV-2 by NAAT was introduced on 17 March 2020. On the same day, Public Health England (PHE) de-escalated the recommendations for the personal protective equipment (PPE) required by HCWs caring for inpatients with suspected or confirmed COVID-19 from 'Level 3 Airborne' to 'Level 2 Droplet' for routine care 24 . Subsequently, the requirement for universal 'Level 2 Droplet' PPE for all inpatient and outpatient care began on 08 April 2020. Local STH policy was changed on 15 June 2020 to mandate staff use surgical face masks while on hospital premises.

Recruitment and consent
From 13-18 May 2020, all contactable STH staff (n=17,757) were invited to take part in the COVID-19 Humoral ImmunE RespOnses in front-line HCWs (HERO) study by email and intranet alert. To engage staff in areas with limited communications access, additional recruitment posters and face-to-face enrolment sessions were used.
Following an electronic informed consent process, participants provided self-reported data on-line on age, gender, ethnicity, job role, and pandemic working environment ('COVID-19 zones') 24 . Details of any possible or confirmed prior COVID-19 illnesses occurring since 01 February 2020 were also collected. These were categorised as: i) diagnosed with COVID-19 and confirmed by NAAT, ii) clinically diagnosed with COVID-19 but NAAT not performed, and iii) self-reported symptoms only 24 . Together, we defined these three groups as "symptomatic", as asymptomatic testing was only introduced after the study recruitment period. Those reporting no illness between 01 February 2020 and the date of recruitment were defined as "asymptomatic". All enrolled participants were emailed times of phlebotomy appointments, and were invited to attend on a first come, first served basis for the first visit, and then invited by email to book a specific appointment slot to attend for their second visit after four weeks +/-7 days. An 8.5ml serum sample was taken at each visit to outpatient phlebotomy services for serological testing.

Amendments from Version 2
Following reviewer 2's comments, two minor changes have been made to the manuscript:
1. A further difference between ED and AMU infection control practices has been added to the discussion.
2. The reasons for excluding samples from V2 analysis have been added to the manuscript (previously just in Supplementary Figure 3 "Study Flow Diagram" in Extended Data (reference 24 in the manuscript)).
Any further responses from the reviewers can be found at the end of the article.

Serology assay development
To develop the in-house ELISA, we created an assay validation dataset 25 consisting of serum from 190 SARS-CoV-2 NAAT-confirmed cases (52 hospitalised patients and 138 healthcare workers with mild infections sampled between 14 and 120 days from NAAT positivity), and 675 patients sampled prior to 2017 (Extended data: Table S1). Thresholds based on the absorbance value at 450nm (A450) for defining reactivity to spike (A450 0·1750) or NCP (A450 0·1905) were set to optimise the sensitivity of each assay. Given the IDSA guidance for ensuring a specificity of ≥99·5% in assays used for SARS-CoV-2 seroprevalence studies, specificity was enhanced by defining a SARS-CoV-2 seropositive sample as one where both spike and NCP were reactive 26 .
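The dual-antigen rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's analysis code: the function name is hypothetical, and treating a reading equal to the threshold as reactive is an assumption. Only the two A450 thresholds are taken from the text.

```python
# Thresholds on absorbance at 450 nm, as quoted in the text.
SPIKE_A450_THRESHOLD = 0.1750
NCP_A450_THRESHOLD = 0.1905


def is_seropositive(spike_a450: float, ncp_a450: float) -> bool:
    """Return True only when both antigens are reactive.

    Assumption: a reading equal to the threshold counts as reactive.
    """
    spike_reactive = spike_a450 >= SPIKE_A450_THRESHOLD
    ncp_reactive = ncp_a450 >= NCP_A450_THRESHOLD
    return spike_reactive and ncp_reactive


# Hypothetical readings: reactive to spike only, so not classed as seropositive.
print(is_seropositive(0.60, 0.10))
# Hypothetical readings: reactive to both antigens.
print(is_seropositive(0.60, 0.45))
```

Requiring both antigens to be reactive trades some sensitivity for specificity, since a sample that cross-reacts with only one antigen is not counted.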

SARS-CoV-2 serology
Serum samples from study participants were then tested for IgG and IgA reactivity to two SARS-CoV-2 proteins using our in-house ELISA: the full-length extracellular domain (amino acids 14-1213) of Spike glycoprotein, including a replacement of the furin cleavage site R684-R689 by a single alanine residue and replacement of K986-V987 by PP, produced in mammalian cells; and full-length untagged Nucleocapsid protein (NCP) produced in E. coli (UniProt ID P0DTC9 (NCAP_SARS2)) [27][28][29] . High binding microtitre plates (Immulon 4HBX; Thermo Scientific, 6405) were coated overnight with proteins diluted in phosphate buffered saline, washed with 0·05% PBS-Tween, and blocked for one hour with 200 µL/well casein buffer. Following optimisation, sample dilutions used were 1:200 for the IgG assay or 1:100 for the IgA assay 24 . Plates were emptied and 100 µL/well of sample or control loaded. After two hours' incubation, plates were washed and loaded with goat anti-human IgG-HRP conjugate (Invitrogen, 62-8420) at 1:500, or goat anti-human IgA-HRP conjugate (Invitrogen, 11594230) at 1:1000, for one hour. Plates were washed and developed for 10 minutes with 100 µL/well TMB substrate (KPL, 5120-0074). Development was stopped with 100 µL/well HCl Stop solution (KPL, 5150-0021), and absorbance read at 450nm. All steps were performed at room temperature.
A calibration curve of sera pooled from convalescent SARS-CoV-2 NAAT-confirmed patients with high antibody titres for both spike and NCP was included on plates to allow quantification of antibody concentrations. The calibration curve was generated by serially diluting in 1·75× steps from a starting concentration of 1:200 for the IgG assay or 1:100 for the serum IgA assay. When the WHO International Standard for anti-SARS-CoV-2 immunoglobulin (NIBSC, 20/136) later became available, the calibration curve was run in parallel for the IgG assay 24 . Data for the IgG assay are therefore given in WHO antibody units, whereas IgA assay data are given in arbitrary antibody units.
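As an illustration of the 1·75× serial dilution, the sketch below lists the dilution factors along a calibration curve. The number of points on the curve is not stated in the text, so the value of eight used here is an assumption, as is the function name.

```python
def dilution_series(start_factor: float, step: float, n_points: int) -> list:
    """Dilution factors (1:x) for a serial dilution with a fixed fold step."""
    return [start_factor * step ** i for i in range(n_points)]


# IgG curve starts at 1:200, IgA curve at 1:100 (from the text).
# n_points=8 is an assumed curve length for illustration only.
igg_curve = dilution_series(200, 1.75, n_points=8)
iga_curve = dilution_series(100, 1.75, n_points=8)

for factor in igg_curve:
    print(f"1:{factor:.1f}")
```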

Sample size
To meet the primary objective of measuring the SARS-CoV-2 IgG seroprevalence, we calculated a sample size of 1,000 HCWs would provide +/-1·4% precision based on a seroprevalence estimate that ~4% of the UK population may have been infected by April 2020, with a two-sided 95% confidence interval (with n=753, Binomial exact 95%CI has been estimated to be 2·7-5·6%) 30 .
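The precision of a seroprevalence estimate for a given sample size can be approximated with a normal-approximation (Wald) half-width. The sketch below is one way to reproduce this kind of calculation and is not necessarily the method used for the study; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def wald_half_width(prevalence: float, n: int, confidence: float = 0.95) -> float:
    """Half-width of a two-sided normal-approximation CI for a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * math.sqrt(prevalence * (1.0 - prevalence) / n)


# Assumed seroprevalence of 4%, evaluated at the two sample sizes in the text.
for n in (1000, 753):
    print(n, round(100 * wald_half_width(0.04, n), 2), "percentage points")
```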

Statistical modelling
We considered three statistical models: i) a seroprevalence model, ii) an antibody kinetics model, and iii) a heterogeneous sensitivity model. For the seroprevalence model, we used the serostatus of all participants at first blood draw in a sensitivity- and specificity-adjusted Bayesian multilevel logistic regression model. Using seropositivity as the binary response variable, we considered three different Bayesian hierarchical regression model subtypes, all with explanatory demographic variables age, race and gender, and each with a different primary exposure: job location, contact with COVID-19 patients, and job type 24 . In addition, we fitted a symptomatic prevalence model, where the data used were seropositive persons only, and the binary response variable was asymptomatic or symptomatic infection.
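To show what adjusting for sensitivity and specificity does to a raw seroprevalence, the sketch below uses the Rogan-Gladen estimator. This is a deliberately simpler stand-in for the Bayesian multilevel model described above, not the study's method; the function name is hypothetical, and the inputs are the crude counts and assay performance figures reported in the abstract.

```python
def rogan_gladen(apparent_prevalence: float, sensitivity: float, specificity: float) -> float:
    """Point estimate of true prevalence corrected for imperfect test accuracy."""
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    adjusted = (apparent_prevalence + specificity - 1.0) / denominator
    # Clip to the valid range for a proportion.
    return min(1.0, max(0.0, adjusted))


crude = 311 / 1275  # seropositive HCWs at first visit
adjusted = rogan_gladen(crude, sensitivity=0.9947, specificity=0.9956)
print(round(crude, 4), round(adjusted, 4))
```

With an assay this accurate the correction is small; it matters far more for assays with lower sensitivity.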
For the antibody kinetics model, we included samples from individuals who were seropositive at both bleeds, in a Bayesian multilevel linear regression model in two parts: i) using log2 antibody units (logAU) at the first blood draw as the response variable and ii) using the change in antibody titre at the follow up bleed (median 28 days) as the response variable. Age, ethnicity, gender and symptom severity (asymptomatic or symptomatic) were used as covariates and each model was run separately for four different antibody-antigen combinations; Spike-IgG, NCP-IgG, Spike-IgA, NCP-IgA. The time until seroreversion was calculated for each covariate group and antibody-antigen interactions by i) sampling a starting titre value and a rate of decline from the two models, and then ii) calculating the time until the minimum observed antibody value was reached for that antibody-antigen interaction, assuming a continuous rate of decrease.
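The seroreversion calculation can be sketched as follows. This is a minimal illustration under stated assumptions: decline is taken to be linear on the log2 scale, and the starting titres, decline rates and floor value are hypothetical numbers drawn from normal distributions, standing in for posterior samples from the two fitted models.

```python
import math
import random
import statistics


def days_to_seroreversion(start_log2_au, floor_log2_au, decline_log2_per_day):
    """Days until the titre reaches the floor, assuming a constant rate of decline."""
    if decline_log2_per_day <= 0:
        return math.inf  # titre is not falling, so the floor is never reached
    return max(0.0, (start_log2_au - floor_log2_au) / decline_log2_per_day)


def simulate_median_days(n_draws=2000, seed=1):
    """Propagate uncertainty by sampling a starting titre and a decline rate."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_draws):
        start = rng.gauss(8.0, 1.0)      # hypothetical starting titre (log2 AU)
        decline = rng.gauss(0.02, 0.005)  # hypothetical decline (log2 AU per day)
        times.append(days_to_seroreversion(start, 2.0, decline))
    return statistics.median(times)


print(days_to_seroreversion(8.0, 2.0, 0.02))
print(simulate_median_days())
```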
In our heterogeneous sensitivity and specificity model, we explored how estimates for the sensitivity and specificity derived from our assay validation dataset generalise to covariate groups, e.g. participant age. To model the generalisability of these performance measures, we compared the seropositivity classification of our study dataset according to our in-house antibody assay with the predicted seropositivity classification from hypothetical assays with an assumed sensitivity and specificity. Our model considers the different distribution of the A450 values in the assay validation and HERO study datasets to model how reliably the sensitivity and specificity given by the manufacturers of the assays generalise to specific subpopulations. Using the assay validation dataset, we estimated an A450 cut-off value for every chosen sensitivity value, and then used this A450 cut-off to classify seropositivity in the study dataset. We then estimated the "implied" sensitivity on the HERO dataset, i.e. the sensitivity which would arise if the commercial assay alone had been used to detect seropositivity, by comparing the seropositivity classification based on the estimated A450 cut-off value with the seropositivity classification from our in-house assay (which, for ease of comparison, we assume represents the maximum possible sensitivity and specificity (i.e. 100%) in this model). This framework allowed us to estimate the hypothetical performance of serological assays reported in the literature on our HERO dataset, along with covariate-specific sensitivity.
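The idea of an implied sensitivity can be illustrated with a small empirical sketch. It is not the study's model: the cut-off is chosen here as a simple empirical quantile of the validation-set positives, and all A450 values and function names are hypothetical.

```python
import math


def cutoff_for_sensitivity(validation_positive_a450, target_sensitivity):
    """Highest A450 cut-off that still detects the target fraction of validation positives."""
    values = sorted(validation_positive_a450, reverse=True)
    n_needed = math.ceil(target_sensitivity * len(values) - 1e-9)
    return values[n_needed - 1]


def implied_sensitivity(study_positive_a450, cutoff):
    """Fraction of reference-positive study samples at or above the cut-off."""
    detected = sum(1 for value in study_positive_a450 if value >= cutoff)
    return detected / len(study_positive_a450)


# Hypothetical validation positives (e.g. sera with higher titres).
validation = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
cutoff = cutoff_for_sensitivity(validation, target_sensitivity=0.80)

# Hypothetical study samples that the reference assay classed as seropositive.
younger = [0.3, 0.5, 0.7, 0.9, 1.1]
older = [0.5, 0.7, 0.9, 1.1, 1.5]

print(cutoff, implied_sensitivity(younger, cutoff), implied_sensitivity(older, cutoff))
```

In this toy example the same cut-off gives 80% sensitivity on the validation set but detects a smaller share of the lower-titre group, which is the effect the model is designed to quantify.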
All analysis was performed in R version 4.0.2 31 and cmdstanr version 0.2.0 32 . An R package containing all the analysis in this study is available at https://doi.org/10.5281/zenodo.6320552.

Results
Serology assay development
We found our in-house ELISA had a sensitivity of 99·47% (95% confidence interval (CI) 97·10%-99·99%) and specificity of 99·56% (95% CI 98·71%-99·91%) for our IgG assay (Extended data: Figure S1a). Compared with IgG, we saw more rapid waning of the IgA response following SARS-CoV-2 infection, as well as higher levels of cross-reactivity in pre-pandemic samples. These factors complicated defining seropositivity based on an A450 threshold, as there was no clear separation between titres in these two groups. We therefore opted to use our spike and NCP IgA ELISA solely to compare IgA titres of individuals classified as seropositive by our IgG assay (Extended data: Figure S2). Antibody units at each given dilution of the calibration curve are shown in Table S2 (Extended data).
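Confidence intervals of this kind can be computed with an exact (Clopper-Pearson) binomial interval. The sketch below uses only the Python standard library. The counts of 189/190 and 672/675 are an assumption: they are the counts consistent with the reported percentages and the stated validation set sizes, not figures given in the text, and the choice of interval method is likewise assumed.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability mass, computed on the log scale for stability."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def _bisect(func, target, increasing):
    """Solve func(p) = target on (0, 1) for a monotone function."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    def upper_tail(p):  # P(X >= k), increasing in p
        return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

    def lower_tail(p):  # P(X <= k), decreasing in p
        return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

    lower = 0.0 if k == 0 else _bisect(upper_tail, alpha / 2.0, True)
    upper = 1.0 if k == n else _bisect(lower_tail, alpha / 2.0, False)
    return lower, upper


# Assumed counts, inferred from the reported percentages.
sens_ci = clopper_pearson(189, 190)
spec_ci = clopper_pearson(672, 675)
print([round(100 * x, 2) for x in sens_ci], [round(100 * x, 2) for x in spec_ci])
```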
Registration and study visits
1478 STH staff consented to take part between 13 May and 5 June 2020 (Extended data: Figure S3). Of these, 1277 attended for a first visit (V1) between 15 May 2020 and 12 June 2020. As two samples were contaminated in transit, we obtained a valid serostatus for 1275 samples. 1174 attended a second visit (V2) between 15 June and 10 July 2020; however, eight samples were excluded from the V2 analysis as they were either unlabelled (n=2), lost (n=1) or the participant had participated in a COVID-19 vaccine trial (and seroconverted) (n=5) (Extended data: Figures S3 and S4).

Antibody kinetics model
Differences in antibody concentration between samples were calculated for four different antibody-antigen interactions (spike-IgG, NCP-IgG, spike-IgA, NCP-IgA). Though there was a positive correlation between Spike-IgG and NCP-IgG across all samples (R² = 0·53), the correlations between serum IgG and IgA were much weaker (R² between 0·17 and 0·3) (Extended data: Figure S6).

Heterogeneous sensitivity model
The heterogeneous sensitivity model demonstrates that using varying A450 cut-offs (corresponding to varying sensitivity values) to categorise seropositivity in the HERO dataset will result in a lower sensitivity than that defined using our assay validation dataset (Figure 3a). The model also shows that there is no difference in implied sensitivity between using spike or NCP as the antigenic target in the ELISA.
The relationship between the A450 cut-off value and the sensitivity and specificity for the assay validation datasets for each antigen was plotted with the associated ROC curves (Extended data: Figures S8 and S9). We hypothesised that the higher A450 values seen in older adults suggest that some commercially available serological assays may have a higher sensitivity in detecting COVID-19 antibodies in older age groups compared with younger age groups. We therefore used our model to estimate age-specific implied sensitivity for assays of different sensitivity profiles in estimating seroprevalence in our HERO dataset. We found that the sensitivity of a serological assay increases with age due to the higher antibody titres seen in older people, with a clearer trend in an NCP-based assay compared to a spike-based assay (Figure 3b). Assuming a theoretical assay validation set sample sensitivity of 80% for the NCP protein, the resulting median implied sensitivity for age groups <30, 30-39, 40-49, 50-59, and 60+ years was 61%, 77%, 70%, 85%, and 89% respectively.

Discussion
We found a high SARS-CoV-2 seroprevalence in HCWs at a large UK hospital trust compared to national seroprevalence estimates, following the first pandemic wave in the UK 33 . In addition, we identified important risk factors associated with occupational exposure to COVID-19, and described a significant association between age and the likelihood of a positive serological result, which has important implications for the validation of SARS-CoV-2 antibody assays and for the interpretation to date of population-level COVID-19 serology data.
Over 20% of HCWs at STH had evidence of SARS-CoV-2 infection within just over 100 days of the first confirmed COVID-19 patient being admitted to our NHS trust. This high proportion over a short space of time is likely representative of the much higher exposure to SARS-CoV-2 infection among certain subpopulations of the workforce that we tested. Although data from other settings and countries suggested infection risk in HCWs is similar to community exposure, this seroprevalence is much higher than estimated seropositivity in the UK population at a similar time (6·0%, 95% CrI 5·8-6·1 in July 2020) 33,34 .
Despite universal PPE and IPC guidelines across STH, our data show that HCWs working in AMUs are at significantly
Occupational therapists and physiotherapists (OT/PT) had the highest rates of seroprevalence across all of the job roles included in our cohort (45·5%), which is consistent with some other UK
Increasing age was associated with seropositivity, with over a third of our HCWs aged >60 testing seropositive, and with higher antibody titres. We demonstrate that the sensitivity of a serological assay increases with increasing age due to the higher antibody titres seen in older people, with a clearer trend in NCP-based compared to spike-based assays. Our data complement the existing literature, which shows antibody titres against SARS-CoV-2 and other coronaviruses are higher in older individuals, which could be due to a higher risk of exposure to the virus, greater antigenic load or boosting of antibodies from cumulative seasonal coronavirus infections throughout their lifetime [18][19][20][21][22] . Several of the commercial SARS-CoV-2 antibody assays available (e.g. Roche Elecsys, Abbott SARS-CoV-2 IgG and Wantai ELISA) were validated with patient sera collected from those with more severe disease early on in the pandemic (i.e. those who presented to health services) 13-15 . Patients with severe COVID-19 have been shown to have higher antibody titres than those with milder disease (Extended data: Figure S1b), and it would be reasonable to assume these cases were likely to also be older in age 13-17,39,40 . Our antibody kinetic modelling data suggest that using such samples from severe COVID-19 cases for the purposes of assay calibration may result in an assay with lower or insufficient sensitivity when applied to less severe or younger (often community) populations. We also found that NCP-IgG is likely to wane more quickly than Spike-IgG.
Depending on the sampling time frame relative to pandemic wave, serological testing based on NCP-IgG alone may further underestimate seroprevalence. With increasing vaccine coverage, use of spike IgG to determine seroprevalence also becomes more problematic when distinguishing whether an individual is seropositive from vaccination or previous infection. Assays which combine antibody responses to membrane protein with NCP antibodies may overcome these challenges 41,42 .
We note the limitations of our study, which include a potential for selection bias due to participants self-enrolling for convenience, rather than being selected by systematic sampling. While we cannot measure the extent of this effect on the measured seroprevalence, we think any volunteer bias would have been equal across all categories compared, and so would not alter the validity of these comparisons. In addition, we recognise that our cohort has relatively low numbers of HCWs from minority ethnic backgrounds (~10%), compared to the Sheffield general population (19%) 43 .
With the ongoing global devastation caused by the COVID-19 pandemic and its lasting effect on healthcare services, understanding the risk factors leading to HCW exposure is paramount to ensuring the continuity of effective and safe patient care. Our real-world data suggest that NHS HCWs face high levels of exposure to SARS-CoV-2, and highlight the locations and job roles at greatest risk during the first wave of the pandemic. Population seroprevalence data can help guide decision makers on risk management. Using assays that have been validated using serum samples from a broad population, combined with antibody kinetic modelling and/or with age-adjusted cut-offs, could overcome the potential limitations we have highlighted.

Data availability
The project contains the following extended data:
- Table S1. Details of the samples used to set thresholds during assay validation.
- Table S2. Comparison of antibody units in assay calibration curve sera assigned to the assay with WHO international standard antibody units.
- Table S3. Summary of the response variables and the covariates used in the regression model.
- Table S4. Summary of the model parameters used in the regression model.
- Table S5. Summary of the response variables and the covariates used in the regression model.
- Table S6. Summary of the model parameters used in the regression model.
- Table S7. Summary of the model parameters used in the heterogeneous sensitivity model.
- Figure S1a. ROC curves of the spike and NCP assays.
- Figure S1b. Spike- and NCP-specific IgG response in inpatients vs outpatients.
- Figure S2. Comparison of IgA assay A450 based on IgG serostatus.
- Figure S3. Study flow diagram.
- Figure S4. Histogram (overlaid) showing the symptom onset, date of first bleed (all cases and symptomatic cases only), and time at second bleed (all cases and symptomatic cases only).
- Figure S5. Model-predicted proportion of asymptomatic estimates for three different models (A-C), adjusted and unadjusted with covariates gender, age group and ethnicity.
- Figure S6. Correlation between the four different antibody measures for 264 serological samples.
- Figure S7. Rate of decline for the antibody concentrations post-symptom onset for the four antibody measures. The fitted line is from a linear regression, with the 95% CI shown in red.
- Figure S8. Relationship between sensitivity/specificity and the cutoff value for the control dataset.
- Figure S9. ROC curves with the A450 cut-off value indicated in red for the control dataset. x-axis shows the False Positive Rate, y-axis is the sensitivity.
- Figure S10. ROC curves for different age groups and antigen proteins, with the A450 cut-off value indicated in various colours for the control dataset.
- Figure S11. (a) Specificity of the control data set against the implied specificity of the HERO dataset for spike and nucleoprotein. (b) Specificity of the control data set against the implied age-specific specificity of the HERO dataset for spike and nucleoprotein.

This is an interesting paper and the authors have done a very thorough job. Obviously, with the progression of the pandemic, vaccination roll out and development of subsequent variants these data may not have the same relevance as during the first wave. I have a few specific points, but nothing too critical to the article.
As both spike and nucleocapsid antibodies were tested for, vaccination is less of an issue, but as the recruitment period in the study overlapped with vaccine trials that were recruiting, it would be worth clarifying that none of the participants were in those early vaccine trials.
What case definition was used for the symptomatic definition? Was there any major difference in the seroprevalence between those who had classical COVID-19 symptoms and those with more viral respiratory type symptoms?
It would be really helpful if there were some data on local community prevalence/seropositivity around the testing dates (I appreciate that the UK figure of 4% is given for April 2020 but local data would be helpful in assessing the numbers).
Was there a difference in PPE use that could explain the difference in seroprevalence between ED and AMU? Many EDs retained use of higher level respiratory PPE due to performing aerosol generating procedures, which would align them with critical care areas.

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 07 Jun 2022

Hayley Colton, Sheffield Teaching Hospitals NHS Foundation Trust, Sheffield, UK
Thank you for taking the time to review our paper. We've responded to your points below:
1. We were mindful of participation in vaccine trials affecting our results, as we were involved with vaccine trials locally. Following their V1 visit, five of our participants subsequently enrolled in a vaccine trial (and seroconverted), so were excluded from V2 analysis. This is not mentioned explicitly in the manuscript but is stated in Supplementary Figure 3 "Study Flow Diagram" in Extended Data (reference 24 in the manuscript). This has been added to the manuscript: "1174 attended a second visit (V2) between 15 June and 10 July 2020, however eight samples were excluded from the V2 analysis as they were either unlabelled (n=2), lost (n=1) or the participant had participated in a COVID-19 vaccine trial (and seroconverted) (n=5) (Extended data: Figures S3 and S4)."
2. We did not directly compare classical COVID symptoms vs other viral respiratory symptoms within this study. Participants self-reported any previous possible or confirmed COVID-19 episodes on the electronic questionnaire (the wording of all of the categories is on page 3 of Extended data, reference 24 in the manuscript and also below), and then we compared asymptomatic vs symptomatic groups within the results. Beyond the symptomatic / asymptomatic definition, all the other symptom 'questions' asked were about those that had been specifically cited as being associated with COVID-19 at the time, before such a thing as 'classical' COVID-19 was defined, so any comparison between 'classic' COVID and 'other' viral symptoms would be post hoc.
"The certainty of an illness being consistent with COVID-19 was classified as:
- Diagnosed COVID-19 and NAAT confirmed ("YES - I was diagnosed and it was confirmed with a swab test"),
- Diagnosed COVID-19 but NAAT not performed ("YES - I was diagnosed but it has not been confirmed with swab test"),
- Self reported Symptomatic ("MAYBE - I've had symptoms of cough, fever or headache or required days of work for a new medical problem"),
- Asymptomatic ("NO" or "NO and I've had a negative PCR test")"
I was unable to find any specific seroprevalence data for

My overall comments are:
1. Figure 1 shows that the association between high seroprevalence and age attenuated after adjustment with zones, job and wards. Could there be confounding relationships? If zones, job and wards were added to the antibody kinetics model, would the association between age and high titre attenuate as well?
2. It appears that a major aim here is to validate an in-house assay. The manuscript would be clearer if this was stated at the beginning. An important caveat for this work is that the dataset was based on this in-house assay.
My complete list of comments is below:

Abstract:
1. "SARS-CoV-2 antibodies were tested using an in-house assay for IgG and IgA reactivity against Spike and Nucleoprotein (sensitivity 99·47%, specificity 99·56%)." This was in the results section of the manuscript. Was validation of the in-house assay an objective of this work? If yes this should be clearly specified and the quoted text should go into results.
2. "HCWs in acute medical units working closely with COVID-19 patients were at highest risk of infection, though whether these are infections acquired from patients or other staff is unknown." I would assume HCWs from other ward types also work closely with patients?
3. "Current serological assays may underestimate seroprevalence in younger age groups if validated using sera from older and/or more symptomatic individuals." It was not shown in the results that symptomatic individuals had higher titres than asymptomatic ones?

Introduction:
1. "The true number of HCWs exposed to SARS-CoV-2 to-date is unknown, particularly during the early stages of the pandemic. Initial methods to estimate HCW exposure included extrapolation from work absenteeism rates, and are unlikely to be reliable3. Confirmation by molecular testing increased the accuracy of case detection, although access to nucleic acid amplification testing (NAAT) was limited during the early stages of the pandemic in the UK4. Detection of exposure by antibody seroconversion may provide a more accurate estimate of risk in HCW populations, can be performed at large scale, and is less affected by symptom activated testing pathways5-8." Do the authors mean infected/infection instead of exposed/exposure since confirmation testing by NAAT refers to infection rather than exposure?
2. "Many antibody assays have been evaluated using samples from hospitalised patients; it is unclear how these assays perform with the lower antibody levels found in those with more mild or asymptomatic SARS-CoV-2 infection10,12" There have been a number of studies looking at serology response in mild COVID cases, e.g. https://www.nature.com/articles/s41598-020-77125-8. 1 Suggest to include in the discussion how the findings from this study compare to the existing ones.
3. Please specify the aims in the introduction.

refer to?
10. "Using the assay validation dataset, we estimated the A450 cut-off value for a range of chosen sensitivity values, and then used this A450 cut-off to classify seropositivity in the study dataset." How did you choose a single A450 value from a range of sensitivity values?
11. "We then estimated the implied sensitivity on the HERO dataset by comparing seropositivity classification based on the estimated A450 cut-off value, with the seropositivity classification from our in-house assay (which for ease of comparison, we assume represents the maximum possible sensitivity and specificity (i.e. 100%) in this model)." What does 'implied' mean here?

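One possible reading of the cut-off and implied-sensitivity procedure quoted in points 10 and 11 is sketched below. This is a minimal illustration, not the authors' model: it uses a plain empirical quantile rather than the Bayesian analysis described in the Extended Data, the function names `cutoff_for_sensitivity` and `implied_sensitivity` are hypothetical, and every A450 reading is invented. As in the quoted text, the in-house assay classification is treated as the reference.

```python
import numpy as np


def cutoff_for_sensitivity(positive_a450, target_sensitivity):
    """Return the A450 cut-off giving `target_sensitivity` on known positives.

    A sample is called positive when its A450 is at or above the cut-off, so
    the cut-off is the (1 - sensitivity) quantile of the known-positive readings.
    """
    return float(np.quantile(positive_a450, 1.0 - target_sensitivity))


def implied_sensitivity(study_a450, reference_positive, cutoff):
    """Fraction of reference-positive study samples also called positive at `cutoff`."""
    study_a450 = np.asarray(study_a450, dtype=float)
    reference_positive = np.asarray(reference_positive, dtype=bool)
    called_positive = study_a450 >= cutoff
    return float(called_positive[reference_positive].mean())


# Hypothetical A450 readings, invented for illustration only.
validation_positives = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
cutoff = cutoff_for_sensitivity(validation_positives, target_sensitivity=0.8)

# Hypothetical study samples that the reference (in-house) assay calls seropositive.
younger_a450 = [0.3, 0.5, 0.7, 0.9]
older_a450 = [0.6, 1.0, 1.5, 1.9]
younger = implied_sensitivity(younger_a450, [True] * 4, cutoff)
older = implied_sensitivity(older_a450, [True] * 4, cutoff)
print(f"cut-off={cutoff:.3f}, younger={younger:.2f}, older={older:.2f}")
```

Under this reading there is one cut-off per chosen sensitivity value, and the implied sensitivity can differ between subgroups whenever their titre distributions differ from that of the validation samples.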
Results:
1. "Rapid waning of IgA responses following SARS-CoV-2 infection complicated defining positive and negative samples based on the convalescent sera we used for assay validation." Why is this so?
2. "1174 attended for a second visit (V2) between 15 June and 10 July 2020" 'for' is unnecessary in this sentence.
5. Figure 1: The raw data showed that a higher proportion of HCWs who were >60 years old were seropositive compared to the other age groups. However, there's no longer this association after adjustment. What was/were the confounder(s)?
6. In the antibody kinetics model, neither zones, job and wards were used as an independent variable. As mentioned in the earlier point, would this finding of age and higher titres modulate/disappear if zones, job and/or wards are included in this model as well?
7. Figure 3: Please specify the meaning of 'control' in the x-axis.

Discussion:
1. Were the infection prevention and control measures similar in the ED and AMU? Could the higher prevalence in AMU be due to transmissions from asymptomatic/mildly symptomatic patients who were not triaged to be isolated by the ED?
2. "Previous studies clearly demonstrate that patients with more severe COVID-19 have higher antibody titres" However this was not found in this study?
3. "Reassuringly, our seroprevalence rates are similar to those seen in other UK based seroprevalence studies" Citation 5 also enrolled volunteer participants (publicised study using social media), and citation 21 is the HERO extended data?
It would be informative to have a more detailed discussion of whether the finding that age affects serology titres has been reported in other studies.
between age and high titre attenuate as well?
David Hodgson: The reviewer's interpretation of Figure 1 seems correct, but to be crystal clear: Figure 1 shows the (unadjusted) association between high seroprevalence and the covariates age, gender, and race, and the (adjusted) association between high seroprevalence and the covariates age, gender, race, and one primary exposure variable (either zones, jobs or wards). In Figure 1, the similarity between the unadjusted and adjusted seroprevalence estimates suggests that none of zones, jobs or wards was a strong confounder of the age-dependent prevalence; they are therefore dropped in the subsequent models for parsimony.
2. It appears that a major aim here is to validate an in-house assay. The manuscript would be clearer if this were stated at the beginning. An important caveat for this work is that the dataset was based on this in-house assay.
Hailey Hornsby: As seen later in this response, the manuscript intro has been expanded to include the ELISA validation as an objective of the work: "To achieve this, we sought to measure SARS-CoV-2 antibody titres by developing an in-house assay (...)"

results that symptomatic individuals had higher titres than asymptomatic ones?
Hayley Colton: Commercial assays referred to in this sentence were validated against samples from patients who were hospitalised with severe disease, whereas our assay was validated using samples from both inpatients with severe disease and outpatients with mild (or no) symptoms. Although we found that symptomatic cases showed similar titres compared to asymptomatic cases within our HCWs, we did not analyse the titres against severity, whereas other studies have. For our assay development, we did compare inpatient and outpatient antibody titres; this has now been added to the Extended Data as Figure S1b. Thank you for pointing this out; the manuscript has been amended to reflect that we are referring to severity rather than symptoms exclusively: "Current serological assays may underestimate seroprevalence in younger age groups if validated using sera from older and/or more severe COVID-19 cases."

Introduction:
1. "The true number of HCWs exposed to SARS-CoV-2 to-date is unknown, particularly during the early stages of the pandemic. Initial methods to estimate HCW exposure included extrapolation from work absenteeism rates, and are unlikely to be reliable3. Confirmation by molecular testing increased the accuracy of case detection, although access to nucleic acid amplification testing (NAAT) was limited during the early stages of the pandemic in the UK4. Detection of exposure by antibody seroconversion may provide a more accurate estimate of risk in HCW populations, can be performed at large scale, and is less affected by symptom activated testing pathways5-8." Do the authors mean infected/infection instead of exposed/exposure since confirmation testing by NAAT refers to infection rather than exposure?
Hayley Colton: Thank you for pointing this out; rephrased as below: "The true number of HCWs infected with SARS-CoV-2 to-date is unknown, particularly during the early stages of the pandemic. Initial methods to estimate HCW COVID-19 cases included extrapolation from work absenteeism rates, and are unlikely to be reliable3. Confirmation by molecular testing increased the accuracy of case detection, although access to nucleic acid amplification testing (NAAT) was limited during the early stages of the pandemic in the UK4. Serological testing can be performed at large scale, and is less affected by symptom activated testing pathways, so may provide a more accurate estimate of previously infected HCWs and could be used in conjunction with other data to determine their risk factors for exposure 5-8."

2. "Many antibody assays have been evaluated using samples from hospitalised patients; it is unclear how these assays perform with the lower antibody levels found in those with more mild or asymptomatic SARS-CoV-2 infection10,12" There have been a number of studies looking at serology response in mild COVID cases, e.g. https://www.nature.com/articles/s41598-020-77125-8.1 Suggest to include in the discussion how the findings from this study compare to the existing ones.

2. It appears that willing healthcare workers enrolled themselves into the study. Were there any measures to prevent volunteer bias? Could volunteer bias influence the results?
Paul Collini: Volunteer bias could conceivably increase the observed prevalence if those who thought they had been exposed to COVID were more likely to volunteer and really had been more likely to have been exposed and infected. The only measure we had to prevent bias was that we approached domestic services staff through their managers as well as by email, as we anticipated they wouldn't be accessing emails. However, we had no measures to mitigate any volunteer enrolment bias. While we cannot measure the extent of this effect on the measured prevalence of antibody detection, we suspect that it is small: at the time of the study there was a huge interest among staff as to whether they had yet been exposed to SARS-CoV-2, which appeared to be irrespective of perceived exposure and was reflected in the rapid enrolment; also, at that time it was still not appreciated how common asymptomatic COVID was, yet a large number of people without any prior symptoms enrolled. Even if volunteer bias were to artificially increase the observed prevalence, we think any volunteer bias would have been equal across all categories we compared, and so would not alter the validity of these comparisons. We have altered the discussion to better reflect this (see response to 'Discussion point 3').

3. "These were categorised as: i), diagnosed with COVID-19 and confirmed by NAAT, ii), clinically diagnosed with COVID-19 but NAAT not performed, and iii), self-reported symptoms only21. Together, we defined these three groups as "symptomatic", as asymptomatic testing was only introduced after the study recruitment period." If we expect a proportion of those in groups ii) and iii) are not true COVID-19 cases, how would that affect the results? Were any sensitivity analyses performed to address this uncertainty?
David Hodgson: Just to be clear, case positivity was defined through seropositivity, and this sentence refers to the classification of these seropositive individuals as either a symptomatic or asymptomatic COVID-19 case. For patients in group ii), the COVID case was diagnosed by a clinical professional but a NAAT test was not possible at the time. It is therefore unlikely that these are misdiagnosed cases, especially given the short time-scale between reporting COVID-like symptoms (Feb 2020) and the first bleed of the experiment (Apr 2020). For people in group iii), it is possible that some were asymptomatic with COVID-19 and were reporting symptoms of another respiratory disease; however, the sample size of this group is small (22 / 340 (6.4%) of non-asymptomatic cases), and even if such misdiagnoses were common, they are unlikely to impact the results.
4. "Using seropositivity as the binary response variable, we considered three different model subtypes with varying primary exposures; job location, contact with COVID-19 patients, and job type21." It is not clear from this sentence what were the actual independent variables used in the model subtypes? Which variable was given a random effect in the multilevel model?
In the GitHub account, where the README says 'The stan code and the raw mcmc outputs is contained in the outputs/ folder.', I could not find the model codes?
David Hodgson: Further details of the model are given in the Extended Data (reference 21). To make the model's independent variables as clear as possible in the manuscript, we have edited the manuscript as below: "Using seropositivity as the binary response variable, we considered three different Bayesian Hierarchical Regression model subtypes all with explanatory demographic variables age, race and gender, and each model with a different primary exposure; job location, contact with COVID-19 patients, and job type21." Regarding the stan code and raw mcmc outputs: there is a typo in the README file, and these files are actually contained in the 'include' folder. This has now been updated in the GitHub repository. Thank you for pointing out this error.

5. "For the antibody kinetics model, we included samples from seropositive individuals in a Bayesian multilevel linear regression model in two parts:…" Does this refer to the HCWs who were positive in the first blood draw only?
David Hodgson: We included HCWs who were positive at both their first and second bleed, as stated in the Extended Data (reference 21 in the manuscript). The manuscript has been edited as follows: "For the antibody kinetics model, we included samples from individuals who were seropositive at both bleeds, in a Bayesian multilevel linear regression model in two parts:"

6. "In our heterogeneous sensitivity and specificity model, we explored how estimates derived from our assay validation dataset generalise to covariate groups, e.g. participant age." What is the assay validation dataset? It is probably clearer to move the explanation from results to the methods section.
Hayley Colton: The validation of the in-house assay has now been added as an objective: "To achieve this, we sought to measure SARS-CoV-2 antibody titres by developing and using an in-house assay, prior to using statistical modelling to explore risk factors associated with seropositivity, the evolving antibody response, and the impact of age on assay sensitivity" The development technique of the in-house assay has been moved to methods, whilst the validation results have been moved to results.

7. "In our heterogeneous sensitivity and specificity model, we explored how estimates derived from our assay validation dataset generalise to covariate groups, e.g. participant age. To model the generalisability of these performance measures, we compared the seropositivity classification of our study dataset using our in-house antibody assay, with the predicted seropositivity classification from hypothetical assays with a quoted sensitivity and specificity. Our" These sentences are a little confusing. Firstly, what does 'estimates' in 'estimates derived from our assay validation dataset generalise to covariate groups, e.g. participant age' refer to? Is it referring to sensitivity and specificity? Secondly, "we compared the seropositivity classification of our study dataset using our in-house antibody assay" sounds like you compared the classification using the antibody assay. Thirdly, in "with the predicted seropositivity classification from hypothetical assays with a quoted sensitivity and specificity", what does 'quoted' mean? Does it mean 'assumed' sensitivity and specificity, since it is a hypothetical assay?
David Hodgson: In answer to the reviewer's first question: yes, "estimates" in this sentence refers to sensitivity and specificity. For the reviewer's second point, we have clarified the sentence to ensure that this confusion does not arise. Finally, for the reviewer's third point: yes, "quoted" in this sentence can be read as "assumed". With these comments in mind, we have changed the text accordingly: "In our heterogeneous sensitivity and specificity model, we explored how estimates for the sensitivity and specificity derived from our assay validation dataset generalise to covariate groups, e.g. participant age. To model the generalisability of these performance measures, we compared the seropositivity classification of our study dataset according to our in-house antibody assay, with the predicted seropositivity classification from hypothetical assays with an assumed sensitivity and specificity."

8. Please introduce A450 when it was first mentioned in the manuscript.
Hailey Hornsby: Added an introduction to A450 to Serology Assay Development: "Thresholds based on the absorbance value at 450 nm (A450) for defining reactivity to spike (A450 0·1750) or NCP (A450 0·1905) were set to optimise the sensitivity of each assay."

9. In "to model how reliably quoted performance measures generalise.", what does 'quoted' refer to?
David Hodgson: "Quoted" refers to the sensitivity and specificity given by the manufacturers of the assays. To make this clearer we have changed the text: "to model how reliably the sensitivity and specificity given by the manufacturers of the assays generalise to specific subpopulations."

10. "Using the assay validation dataset, we estimated the A450 cut-off value for a range of chosen sensitivity values, and then used this A450 cut-off to classify seropositivity in the study dataset." How did you choose a single A450 value from a range of sensitivity values?
David Hodgson: There is an A450 value for each sensitivity value considered. To make this as clear as possible, we have changed the sentence accordingly: "Using the assay validation dataset, we estimated an A450 cut-off value for every chosen sensitivity value, and then used this A450 cut-off to classify seropositivity in the study dataset."

11. "We then estimated the implied sensitivity on the HERO dataset by comparing seropositivity classification based on the estimated A450 cut-off value, with the seropositivity classification from our in-house assay (which for ease of comparison, we assume represents the maximum possible sensitivity and specificity (i.e. 100%) in this model)." What does 'implied' mean here?
David Hodgson: "Implied sensitivity" is defined by the description following the sentence. To make this clearer we have rearranged the sentence accordingly: "By comparing seropositivity classification based on the estimated A450 cut-off value, with the seropositivity classification from our in-house assay (which for ease of comparison, we assume represents the maximum possible sensitivity and specificity (i.e. 100%) in this model), we estimate an "implied" sensitivity on the HERO dataset which would arise if the commercial assay alone had been used to detect seropositivity."

12. Please cite R and cmdstanr.
David Hodgson: Apologies for this oversight; these are now referenced in the updated manuscript.

Results:
1. "Rapid waning of IgA responses following SARS-CoV-2 infection complicated defining positive and negative samples based on the convalescent sera we used for assay validation." Why is this so?
Hailey Hornsby: This sentence has been expanded and rewritten in the manuscript to clarify: "Compared with IgG, we saw more rapid waning of the IgA responses following SARS-CoV-2 infection, as well as higher levels of cross-reactivity in pre-pandemic samples. These factors complicated defining seropositivity based on an A450 threshold, as there was no clear separation between titres in these two groups. We therefore opted to use our spike and NCP IgA ELISA solely to compare IgA titres of individuals classified as seropositive by our IgG assay."

2. "1174 attended for a second visit (V2) between 15 June and 10 July 2020" 'for' is unnecessary in this sentence.
Hayley Colton: Amended in manuscript: "1174 attended a second visit (V2) between 15 June and 10 July 2020"

3. Figure 1: Why do the black stars not overlap with the unadjusted mean estimates?
David Hodgson: As explained in the caption, the black stars are point estimates from the raw data (no regression was performed), calculated as the mean value for that specific subgroup. The remaining estimates on the Figure are from a hierarchical regression analysis. We remind the reviewer that here "unadjusted" means relative to the primary exposure (job, zone, wards), and that regularisation from the hierarchical model means that estimates from this model are likely to be different from the raw point estimates. For example, the oldest (60+ years) age group had a much smaller sample size compared to the other age groups; therefore it is more susceptible to regularisation in a regression analysis.
4. Why were zones, job and wards not used in the same model? Was there any model selection done?
David Hodgson: There were two issues with using more than one of the primary exposure variables (zones, jobs, and wards) in the model. First, we found multicollinearity between these exposures, making parameter inference difficult to interpret when more than one primary exposure was included in the regression analysis model. Second, the number of categories of these age groups gave rise to data sparsity issues (many groups have 0 entries), meaning that, again, results were unreliable. These reasons are explained in the Extended Data (reference 21 in the manuscript).
5. Figure 1: The raw data showed that a higher proportion of HCWs who were >60 years old were seropositive compared to the other age groups. However, there's no longer this association after adjustment. What was/were the confounder(s)?
David Hodgson: This is due to regularisation of the hierarchical regression model. The oldest (60+ years) age group had a much smaller sample size compared to the other age groups; therefore it is more susceptible to regularisation in a regression analysis. In other words, the magnitude of the high seroprevalence observed in the older age group (with such a small sample size) is unlikely to be a true reflection of the seroprevalence in that age group, given the consistency in seroprevalence across the other age groups with a large sample size.
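The effect of sample size on regularisation described in this response can be illustrated with a deliberately simplified stand-in. The sketch below is not the authors' hierarchical Stan model: it assumes a Beta prior centred on a pooled prevalence, the function name `shrunk_prevalence` and the `prior_strength` parameter are hypothetical, and all counts are invented. It shows only that, for the same raw proportion, a small group is pulled further towards the pooled value than a large one.

```python
def shrunk_prevalence(positives, n, pooled_prevalence, prior_strength):
    """Posterior mean prevalence under a Beta prior centred on the pooled value.

    `prior_strength` acts like a number of pseudo-observations: the larger it
    is relative to `n`, the more the estimate is pulled towards the pooled value.
    """
    a = pooled_prevalence * prior_strength
    b = (1.0 - pooled_prevalence) * prior_strength
    return (positives + a) / (n + a + b)


# Hypothetical counts: both groups have a raw seroprevalence of 40%.
pooled = 0.25
small_group = shrunk_prevalence(12, 30, pooled, prior_strength=40)
large_group = shrunk_prevalence(160, 400, pooled, prior_strength=40)
print(f"small group: {small_group:.3f}, large group: {large_group:.3f}")
```

In a full hierarchical regression the degree of pooling is estimated from the data rather than fixed in advance, but the qualitative behaviour for a small subgroup is the same.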
6. In the antibody kinetics model, neither zones, job and wards were used as an independent variable. As mentioned in the earlier point, would this finding of age and higher titres modulate/disappear if zones, job and/or wards are included in this model as well?