Designing a multi-layered surveillance approach to detecting SARS-CoV-2: A modelling study [version 1; peer review: 2 approved with reservations]

Background: Countries achieving control of COVID-19 after an initial outbreak will continue to face the risk of SARS-CoV-2 resurgence. This study explores surveillance strategies for COVID-19 containment based on polymerase chain reaction tests.
Methods: Using a dynamic SEIR-type model to simulate the initial dynamics of a COVID-19 introduction, we investigate COVID-19 surveillance strategies among healthcare workers, hospital patients, and community members. We estimate surveillance sensitivity as the probability of COVID-19 detection using a hypergeometric sampling process. We identify test allocation strategies that maximise the probability of COVID-19 detection across different testing capacities. We use Beijing, China as a case study.
Results: Surveillance subgroups are more sensitive in detecting COVID-19 transmission when they are defined by more COVID-19-specific symptoms. In this study, fever clinics have the highest surveillance sensitivity, followed by respiratory departments. With a daily testing rate of 0.07 per 1,000 residents, testing exclusively at fever clinics or exclusively at respiratory departments, there would have been 598 [95% eCI: 35, 2154] and 1373 [95% eCI: 47, 5230] cases in the population by the time of first case detection, respectively. Outbreak detection can occur earlier by including non-syndromic subgroups, such as younger adults in the community, as more testing capacity becomes available.
Conclusions: A multi-layer approach that considers both the surveillance sensitivity and administrative constraints can help identify the optimal allocation of testing resources and thus inform COVID-19 surveillance strategies.


Introduction
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and was first detected in Wuhan, China towards the end of 2019 1 . On 11 March 2020, the World Health Organization declared COVID-19 a global pandemic 2 . Within eight months of its emergence, COVID-19 has led to over 25 million reported cases and over 850,000 reported deaths globally (as of 7th September 2020) 3 . Many countries and regions, such as China and New Zealand, have succeeded in reducing COVID-19 incidence after the initial epidemics.
Nevertheless, it is unlikely that SARS-CoV-2 will be eradicated in the near future. Given the high transmissibility of the pathogen 4,5 , the non-trivial proportion of infectious individuals showing mild to no symptoms 6 , the non-specific nature of symptoms 7 , the highly intertwined global travel network 8 , and a lack of effective pharmaceutical measures for prevention or therapy 9 , countries that successfully contain the initial spread of COVID-19 will likely continue to face risks introduced by international travellers and unidentified local cases. With physical distancing measures gradually easing, sporadic infection clusters have already been observed 10 .
To prevent these sporadic infection clusters from seeding new epidemics, rapid infection detection is vital. Containment strategies, such as case isolation and contact tracing, rely on early detection of COVID-19 infections and can quickly be overwhelmed if transmission remains undetected for too long. Thus, sustainable, cost-effective, and highly sensitive surveillance systems capable of early warning for potential SARS-CoV-2 resurgence can guide ongoing implementation of control measures and are essential to the success of COVID-19 containment 11 . This study explores different surveillance strategies that maximise the probability of COVID-19 detection using polymerase chain reaction (PCR) while minimising the material and human resources required. We use Beijing as a case study and explore the potential benefits of conducting COVID-19 surveillance among healthcare workers, hospital patients, and regular community members. Hospital settings are further broken down into layers: fever clinics, respiratory departments, and general hospital departments. Fever clinics, a triage system that emerged during the 2003 outbreak of SARS-CoV, play a crucial role in the COVID-19 response in China 12 and could be considered a potentially effective surveillance option elsewhere. The framework introduced is relevant to containment strategies in countries exiting the initial phases of COVID-19 epidemics, where only a small number of cases are observed sporadically.

Methods
The epidemic process
We simulate the spread of SARS-CoV-2 using a deterministic age-stratified compartmental SEIR-type model (Figure 1) 13 . Additionally, the infectious compartment is split into I_pc, I_c and I_sc to account for differences in disease progression. The compartment I_c represents infectious individuals whose symptoms are sufficiently severe ("clinical" illnesses) for them to seek healthcare; the compartment I_pc represents the pre-clinical infectious individuals who have not yet developed symptoms that would lead them to seek care for COVID-19; the compartment I_sc represents infectious individuals who may not seek healthcare for COVID-19 due to mild symptoms ("subclinical" illnesses). Individuals in the compartments I_pc and I_sc may seek care for other causes. All of I_pc, I_c and I_sc contribute to the force of infection (FOI), although subclinical individuals may transmit disease at a lower rate 14,15 . This framework has been used to study COVID-19 in several previous studies 6,15,16 . More details about this model can be found in the Extended data Section 1 17 .
Chau et al. discovered that COVID-19 patients may remain PCR positive for up to 10 days after hospital admission, at which point they may no longer be infectious 18 . To account for this, building upon the existing model structure, we break the R compartment ("Removed", a state in which individuals are no longer able to transmit illness) down into R+_sc (PCR-positive individuals from I_sc), R+_c (PCR-positive individuals from I_c) and R- (PCR-negative individuals). Based on the results of Chau et al. 18 , we assume that PCR is more likely to detect a COVID-19 infection in clinical infectious compartments (I_pc and I_c) than in the subclinical infectious compartment (I_sc), and more likely in infectious compartments (I_pc, I_c, and I_sc) than in post-infectious compartments (R+_c and R+_sc). The PCR diagnostic sensitivity in the context of COVID-19 is digitised from Chau et al. 18 . PCR diagnostic specificity, on the other hand, is assumed to be 100%, consistent with Grassly et al. 19 . Potential false-positive results may be compensated for by double-testing positive samples.
Importantly, we assume that at the start of the second wave there is a sufficiently high number of susceptible individuals that early epidemic dynamics are not limited by widespread and long-lasting population immunity (for which there is limited evidence) 20 .
The age structure of the model considers five-year age categories for 0-74 year-olds and a single group for everyone above 75 years of age. We obtained other model parameters from the literature or online databases (Table 1). At the start of the simulation, we assume there was one exposed (infected but not yet infectious) younger adult (i.e., 15-64 year-old); the population is otherwise susceptible, consistent with low seropositivity found in serological surveys 21,22 . A meta-analysis by Davies et al. showed an unmitigated reproduction number of COVID-19 to be 2.7 (95% credible interval: 1.6-3.9) 15 . Here, we assume an effective reproduction number (R e ) of 2 that reflects a 25% reduction from 2.7 as a result of public health measures.

Table 1. Parameters table. Respiratory illnesses are defined as "respiratory infections and tuberculosis" (excluding latent tuberculosis infections), "chronic respiratory diseases", and "Tracheal, bronchus, and lung cancer".

Parameters | Baseline scenario | Sensitivity analyses | Source
Epidemiological parameters
Latent period | 2.6 days | 1-4 days | Li et al. 1 , Backer et al. 23 , Davies et al. 15
Pre-clinical infectious period | 2.4 days | 1-3 days | Liu et al. 24 , He et al. 25
Clinical infectious period | 3.6 days | 3-6 days | He et al. 25
Subclinical infectious period | (assumed to be the same as the sum of pre-clinical and clinical infectious periods)
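To make the compartmental structure concrete, the following is a minimal Python sketch of a simplified, non-age-structured version of the model described above. It uses the baseline durations from Table 1 and R e = 2, and it omits the PCR-positive removed compartments. The clinical fraction and the relative infectiousness of subclinical cases are hypothetical placeholders (the study uses age-specific values not reproduced here), and the function names `transmission_rate` and `simulate` are illustrative only.

```python
# Simplified, non-age-structured sketch of an SEIR-type model with
# pre-clinical (I_pc), clinical (I_c) and subclinical (I_sc) compartments.
# Durations and R_e are the baseline values quoted in the text.
LATENT_DAYS = 2.6
PRECLINICAL_DAYS = 2.4
CLINICAL_DAYS = 3.6
SUBCLINICAL_DAYS = PRECLINICAL_DAYS + CLINICAL_DAYS
R_EFFECTIVE = 2.0
CLINICAL_FRACTION = 0.5          # hypothetical, not from the study
SUBCLINICAL_RELATIVE_INF = 0.5   # hypothetical, not from the study


def transmission_rate():
    """Per-day rate giving R_EFFECTIVE secondary cases in a susceptible population."""
    weighted_infectious_days = (
        CLINICAL_FRACTION * (PRECLINICAL_DAYS + CLINICAL_DAYS)
        + (1 - CLINICAL_FRACTION) * SUBCLINICAL_RELATIVE_INF * SUBCLINICAL_DAYS
    )
    return R_EFFECTIVE / weighted_infectious_days


def simulate(population, days, dt=0.05):
    """Euler integration starting from one exposed individual; returns daily states."""
    beta = transmission_rate()
    s, e, i_pc, i_c, i_sc, r = population - 1.0, 1.0, 0.0, 0.0, 0.0, 0.0
    cumulative = 1.0  # cumulative infections, counting the index case
    trajectory = []
    for _ in range(days):
        for _ in range(round(1 / dt)):
            infectious = i_pc + i_c + SUBCLINICAL_RELATIVE_INF * i_sc
            new_exposed = beta * infectious / population * s * dt
            leave_e = e / LATENT_DAYS * dt
            leave_pc = i_pc / PRECLINICAL_DAYS * dt
            leave_c = i_c / CLINICAL_DAYS * dt
            leave_sc = i_sc / SUBCLINICAL_DAYS * dt
            s -= new_exposed
            e += new_exposed - leave_e
            i_pc += CLINICAL_FRACTION * leave_e - leave_pc
            i_c += leave_pc - leave_c
            i_sc += (1 - CLINICAL_FRACTION) * leave_e - leave_sc
            r += leave_c + leave_sc
            cumulative += new_exposed
        trajectory.append({"S": s, "E": e, "I_pc": i_pc, "I_c": i_c,
                           "I_sc": i_sc, "R": r, "cumulative": cumulative})
    return trajectory


if __name__ == "__main__":
    for day, state in enumerate(simulate(22_000_000, 30), start=1):
        if day % 10 == 0:
            print(day, round(state["cumulative"], 1))
```

Because this sketch ignores age structure and contact patterns, its trajectories are not expected to reproduce the figures reported in the Results.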

Surveillance layers and subgroups
Surveillance layers are settings where COVID-19 surveillance is possible; surveillance subgroups are defined by specific characteristics (e.g., age and occupation) within a given surveillance layer. In this study, we consider three surveillance layers and nine surveillance subgroups within these surveillance layers (Table 2).
Fever clinics are triage systems established in China during the 2003 SARS epidemic to limit the spread of pandemic pathogens in hospital settings and allow rapid detection 33 . Telephone-based triaging systems combined with drive-through testing centres used in the United States (e.g., Georgia 34 ) during the COVID-19 pandemic are motivated similarly. A fever clinic is part of a hospital and often staffed by healthcare workers (HCWs) from respiratory departments or clinical laboratories. In the context of COVID-19, anyone with any respiratory symptoms and potential exposure to a confirmed case, as well as anyone with a combination of fever and any respiratory symptoms but no known exposure, is encouraged to present to a fever clinic in China 35 . In China, the national sentinel influenza surveillance system reports the proportion of all outpatients with influenza-like illness (ILI) (defined as fever ≥38°C and cough or sore throat) 27 . In this study, the percentage of ILI among all outpatients is used to model the background rate of fever clinic visits for non-COVID-19 related causes.
For public health surveillance, this study assumes 1) that individuals with clinical respiratory illness who seek care would be captured by a surveillance system in either fever clinics or respiratory departments; 2) that those with a subclinical COVID-19 infection who are seeking care for non-respiratory causes may be detected in other hospital departments; 3) that those with a COVID-19 infection who do not actively seek care may be detected in the community whether they exhibit symptoms or not. Specific pathways to the detection of different surveillance layers and subgroups are shown in Table 2.
The risk of a susceptible individual, whether an HCW or not, contracting COVID-19 while present in a fever clinic or respiratory department is elevated due to the concentration of infected individuals. HCWs may face a higher COVID-19 infection risk than patients do in these locations due to prolonged exposures. In this study, the expected numbers of infectious encounters for HCW and non-HCW adults are calculated by taking the product of the number of contacts (either patient or non-patient, digitised from 36 ) and the corresponding COVID-19 prevalence (in patient or non-patient settings). The HCW to non-HCW ratio of infectious encounters (α) is then used as a risk multiplier for different HCW subgroups. More details can be found in the Extended Data Section 2 17 .
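The calculation of the risk multiplier described above can be illustrated with a short Python sketch. The contact numbers and prevalences below are hypothetical placeholders chosen only to show the arithmetic; the study's values are digitised from the cited contact data and are not reproduced here. The function names are illustrative.

```python
def infectious_encounters(patient_contacts, nonpatient_contacts,
                          prevalence_patients, prevalence_nonpatients):
    """Expected daily infectious encounters: contacts multiplied by prevalence."""
    return (patient_contacts * prevalence_patients
            + nonpatient_contacts * prevalence_nonpatients)


def risk_multiplier(hcw_encounters, non_hcw_encounters):
    """alpha: ratio of HCW to non-HCW expected infectious encounters."""
    return hcw_encounters / non_hcw_encounters


# Hypothetical inputs (not from the study): an HCW with 20 patient contacts and
# 10 non-patient contacts per day, versus a non-HCW adult with 12 non-patient
# contacts per day; prevalence is assumed higher among patients.
hcw = infectious_encounters(20, 10, prevalence_patients=1e-3,
                            prevalence_nonpatients=1e-4)
non_hcw = infectious_encounters(0, 12, prevalence_patients=1e-3,
                                prevalence_nonpatients=1e-4)
alpha = risk_multiplier(hcw, non_hcw)
print(f"alpha = {alpha:.1f}")
```

Since prevalence changes as the outbreak grows, a multiplier computed this way would vary by day, which is consistent with the time index on the multiplier in the text.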
Surveillance sensitivity and probability of detection
The surveillance sensitivity (φ_{i,k,t}), i.e., the probability of detecting at least one individual infectious with COVID-19 in a subgroup i on a given day t (since the first case), is characterised by a hypergeometric distribution:

$$P(q) = \frac{\binom{\varepsilon_i N_i p_{i,t}}{q}\binom{N_i - \varepsilon_i N_i p_{i,t}}{k - q}}{\binom{N_i}{k}}, \qquad \varphi_{i,k,t} = 1 - P(q = 0)$$

where N_i is the number of all members in subgroup i, p_{i,t} is the proportion infected in subgroup i on day t, ε_i is the PCR diagnostic sensitivity in subgroup i, ε_i N_i p_{i,t} is the total number of detectable infected individuals in subgroup i, k is the number of individuals tested on day t, and q is the number of infectious individuals identified through testing (of k individuals). Unlike the binomial distribution, the hypergeometric distribution describes sampling without replacement and is thus more appropriate for capturing healthcare-seeking behaviour. In this study, p_{i,t} depends on the surveillance pathways outlined in Table 2. For example, for a patient seeking care at a fever clinic, the probability that they are infectious with SARS-CoV-2 is:

$$p_{fc.pat,t} = \frac{I_{c,t}}{I_{c,t} + ILI_t}$$

where I_{c,t} is the number of clinically infectious individuals on day t and ILI_t is the number of non-COVID-19 patients who could be triaged into fever clinics based on influenza-like illness (ILI). This equation does not include I_pc and I_sc as they would not be triaged into a fever clinic by definition. Parameters used to capture the proportions of the patient population who would be seen at fever clinics, respiratory departments, and other hospital departments are summarised in Table 1.
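As an illustration of the hypergeometric sampling process described above, the following Python sketch computes the probability of drawing at least one detectable individual when a given number of people are tested without replacement. It is a sketch under stated assumptions rather than the study's implementation: the function name and argument names are hypothetical, and the number of detectable individuals is rounded to an integer because the hypergeometric distribution is defined on counts.

```python
def surveillance_sensitivity(group_size, proportion_infected, pcr_sensitivity, tests):
    """Probability that at least one of `tests` sampled individuals is detectable.

    Sampling is without replacement (hypergeometric). The detectable count is
    rounded to the nearest integer, which is an assumption of this sketch.
    """
    detectable = round(pcr_sensitivity * group_size * proportion_infected)
    tests = min(tests, group_size)
    p_none = 1.0  # probability that every tested individual is non-detectable
    for j in range(tests):
        p_none *= (group_size - detectable - j) / (group_size - j)
        if p_none <= 0.0:
            return 1.0  # more tests than non-detectable individuals
    return 1.0 - p_none


# Hypothetical example: 1,600 patients, 0.5% infected, 80% PCR sensitivity.
for k in (100, 400, 1600):
    print(k, round(surveillance_sensitivity(1600, 0.005, 0.8, k), 3))
```

The product form avoids evaluating very large binomial coefficients, which matters when subgroup sizes reach hundreds of thousands.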
The probability of an individual in subgroup i on day t testing positive (p_{i,t}) decreases as the background population size increases (under the assumption of perfect specificity). In the example above, the background population in a fever clinic is captured by ILI_t. In respiratory departments, the background population is the number of patients who visit respiratory departments for all non-COVID-19 respiratory causes (Rsp_t). Thus, at any given time t, Rsp_t must be at least as large as ILI_t. The proportion of COVID-19-infected patients in fever clinics (p_{fc.pat,t}) is always larger than that in respiratory departments (p_{rsp.pat,t}) by definition.
We assume that HCWs who have clinical respiratory illness behave identically to non-HCW individuals with clinical respiratory illness, and hence will attend fever clinics or respiratory departments. Surveillance targeting the HCW subgroups will therefore only capture those currently in I_pc, I_sc, or R+_sc (see Table 2). These HCWs face a higher risk of infection due to occupational exposure (α_{fc.hcw,t}). For example, the probability that an HCW working at a fever clinic is PCR detectable on a given day t can be expressed as:

$$p_{fc.hcw,t} = \alpha_{fc.hcw,t}\,\frac{I_{pc,t} + I_{sc,t} + R^{+}_{sc,t}}{N}$$

where α_{fc.hcw,t} is the increased risk multiplier for an HCW working at a fever clinic on day t and N is the total population size. The probability π_t of detecting COVID-19 infection by time t in at least one subgroup using k tests per day can be expressed as:

$$\pi_t = 1 - \prod_{\tau \le t}\,\prod_{i}\left(1 - \varphi_{i,k_i,\tau}\right), \qquad \sum_i k_i = k$$

Comparing fever clinics and respiratory departments
The testing capacity needed at the respiratory departments to achieve a similar probability of COVID-19 detection while testing at maximum capacity at fever clinics (1,600 tests/day or 0.07 tests per thousand residents per day (t/k/day)) is found using mean absolute errors. Then, at different levels of daily testing capacity, we estimate the time it takes for the outbreak to be detected. For each of the surveillance processes simulated, a corresponding time at which we become aware of the on-going transmission is generated. We extract the corresponding cumulative incidences at these times of transmission detection from the results of the dynamic model, forming a probability distribution of epidemic sizes by the time first COVID-19 cases are detected. The relevant uncertainty is expressed using the middle 95% of the simulated sample, i.e., the 95% empirical confidence intervals (eCI).
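The probability of detecting transmission by time t in at least one subgroup can be sketched in Python as the complement of missing every case in every subgroup on every day so far. This sketch assumes that detection events are independent across days and subgroups, which is a simplification introduced here; the function names are hypothetical.

```python
def cumulative_detection_probability(daily_sensitivities):
    """Probability of at least one detection by each day.

    `daily_sensitivities` is a list over days; each entry is a list of
    per-subgroup surveillance sensitivities for that day. Independence across
    days and subgroups is an assumption of this sketch.
    """
    p_missed_so_far = 1.0
    cumulative = []
    for sensitivities_today in daily_sensitivities:
        for phi in sensitivities_today:
            p_missed_so_far *= 1.0 - phi
        cumulative.append(1.0 - p_missed_so_far)
    return cumulative


def first_day_reaching(cumulative, threshold=0.5):
    """1-indexed first day on which the cumulative probability meets the threshold."""
    for day, probability in enumerate(cumulative, start=1):
        if probability >= threshold:
            return day
    return None


# Hypothetical sensitivities for two subgroups over three days.
example = [[0.5, 0.0], [0.5, 0.0], [0.0, 0.5]]
print(cumulative_detection_probability(example))
```

A day such as the one returned by `first_day_reaching` corresponds to the "50% chance of detection" comparisons reported in the Results.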

Characteristics of surveillance systems
Table 2 (excerpt). Surveillance layer | Surveillance subgroup
Fever clinic | Patients seeking care with influenza-like symptoms
Respiratory department | Patients seeking care for any respiratory conditions
Other hospital departments | Patients seeking care for any non-respiratory conditions

We cap the daily testing capacity at 4.15 tests per 1,000 residents, consistent with the observed testing capacity in Beijing 29 . In a city with the population size and age structure of Beijing (with a population of 22 million as of 2017) 40 , this translates to a maximum daily testing capacity of approximately 100,000 tests per day. Additionally, we assume that a maximum of 10% of HCWs at non-fever clinic hospital departments, and 30% of HCWs at fever clinics, can be tested each day as a part of HCW surveillance to minimise interference with everyday work. We consider optimal resource allocation strategies by gradually increasing the testing capacity in increments of 200 tests.
The number of tests performed in the individual subgroups is also constrained by the number of individuals available for testing, driven by the local healthcare infrastructures and demographics. In this study, we extracted values for Beijing (Table 1). These values may apply to other large cities in the region.

Determining the most efficient strategy
We consider a wide range of testing capacities between 0 and 100,000 tests daily, in increments of 200 tests, totalling 501 different levels of testing capacity. Therefore, the number of daily tests available at level n is 200(n-1). We then determine how these 200(n-1) tests should be allocated across different subgroups to maximise the probability of detection (π_{k,t}).
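Instead of an exhaustive search among all possible allocations of resources, we used a recursive algorithm. Assuming there are only two subgroups, i and j, the probability of detecting COVID-19 transmission can be expressed as:

$$\pi_{200(n-1),t} = 1 - \left(1 - \varphi_{i,200k_i,t}\right)\left(1 - \varphi_{j,200k_j,t}\right)$$

where the sum of k_i and k_j is (n-1). Then, in the next step, 200n tests can be best allocated as:

$$\pi_{200n,t} = \max\left\{1 - \left(1 - \varphi_{i,200k_i,t}\right)\left(1 - \varphi_{j,200(k_j+1),t}\right),\; 1 - \left(1 - \varphi_{i,200(k_i+1),t}\right)\left(1 - \varphi_{j,200k_j,t}\right)\right\}$$

where the sum of k_i and k_j + 1 and of k_i + 1 and k_j are both n. This recursive algorithm is applied to all nine surveillance subgroups to identify the optimal surveillance strategies, i.e., the allocation of COVID-19 tests that can achieve the highest π_{k,t}.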
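One way to read the recursive allocation described above is as an incremental (greedy) procedure: starting from zero tests, each additional batch of 200 tests goes to whichever subgroup most reduces the probability of missing all cases, subject to how many individuals can be tested in each subgroup. The Python sketch below follows that reading for a single day. It is an interpretation under stated assumptions, not the authors' code: the subgroup sizes, detectable counts, and caps are hypothetical, and independence between subgroups is assumed.

```python
def probability_none_detected(group_size, detectable, tests):
    """Hypergeometric probability that `tests` draws contain no detectable case."""
    p_none = 1.0
    for j in range(min(tests, group_size)):
        p_none *= max(group_size - detectable - j, 0) / (group_size - j)
    return p_none


def probability_missed(subgroups, allocation):
    """Probability that no subgroup detects a case (independence assumed)."""
    missed = 1.0
    for name, (size, detectable, _cap) in subgroups.items():
        missed *= probability_none_detected(size, detectable, allocation[name])
    return missed


def allocate_tests(subgroups, total_tests, batch=200):
    """Add one batch at a time to the subgroup that minimises the miss probability.

    `subgroups` maps a name to (group_size, detectable_count, max_tests_per_day).
    """
    allocation = {name: 0 for name in subgroups}
    for _ in range(total_tests // batch):
        best_name, best_missed = None, None
        for name, (size, _detectable, cap) in subgroups.items():
            if allocation[name] + batch > min(size, cap):
                continue  # subgroup exhausted or administratively capped
            trial = dict(allocation)
            trial[name] += batch
            missed = probability_missed(subgroups, trial)
            if best_missed is None or missed < best_missed:
                best_name, best_missed = name, missed
        if best_name is None:
            break  # nobody left to test
        allocation[best_name] += batch
    return allocation


# Hypothetical subgroups: (size, detectable individuals, daily testing cap).
example_subgroups = {
    "fever_clinic_patients": (1_600, 4, 800),
    "respiratory_patients": (29_400, 6, 29_400),
    "community_younger_adults": (1_000_000, 20, 100_000),
}
print(allocate_tests(example_subgroups, 1_200))
```

In this hypothetical example the tests go first to the subgroup with the highest proportion of detectable individuals and move to the next subgroup only once the first reaches its cap, which mirrors the turning points on the efficiency frontiers described in the Results.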
All analyses were done in R 4.0.0 41 .

Results
The subgroup-specific surveillance sensitivity
Surveillance sensitivity is the probability that on-going COVID-19 transmission (i.e., one or more cases) can be detected on a given day t. During an emerging yet undetected epidemic, the surveillance sensitivities of different subgroups are shown in Figure 2 for testing capacities between 0 and 100,000 tests/day. At the baseline scenario (R e = 2), it takes 20 days for the epidemic process to incur over 100 cases (17 and …).

Overall, without considering administrative constraints such as hospital service capacity, the surveillance sensitivities (i.e., the probability of capturing at least one infectious case) among HCWs are slightly higher than those among patients. Within a hospital, fever clinics have the highest surveillance sensitivity, followed by respiratory departments, and then by other hospital departments. The most sensitive age group for COVID-19 surveillance is younger adults. Surveillance sensitivities are comparable between younger adults in the community and respiratory patients in hospitals.
The surveillance sensitivity of patient-based surveillance strategies may vary within a year. We found that in the northern hemisphere summer, when other respiratory pathogens are less common in a city like Beijing, patient-based surveillance at respiratory departments becomes a more sensitive surveillance option; in winter, when other respiratory pathogens are common and the number of respiratory patients per day goes up, patient-based surveillance performs worse than its annual average in terms of COVID-19 detection (Extended data Section 3) 17 .

Comparing fever clinics and respiratory departments
To further understand the potential benefits of fever clinics, we compared the probability of detecting at least one case of COVID-19 by day t using realistic administrative constraints (e.g., fever clinic service capacity) for Beijing, China (Figure 3). Using the parameters outlined in Table 1, we estimated that the maximum service capacity at fever clinics is 1,600 patients/day, at respiratory departments 29,400 patients/day, and at other hospital departments more than a quarter-million patients/day. If there is a capacity to conduct 1,600 tests/day (i.e., 0.07 t/k/day), the surveillance system was able to reach a 50% chance of detection among fever clinic patients 3 days earlier than among respiratory department patients. Using an unmitigated R t shrunk the time gained to 2 days; a 50% reduced …

Optimal strategies at different testing capacity
Figure 4 shows the highest probability of COVID-19 detection achievable at different daily testing capacities. These are also referred to as efficiency frontiers, as no higher probability of COVID-19 detection can be achieved without improving the testing capacity. The probability of COVID-19 detection increases as cumulative incidence increases and as daily testing capacities increase. As expected, when cumulative incidence is already high, a small number of tests will already allow us to detect on-going transmission. When daily testing capacities increase, we observe a diminishing return on investment: the additional increase in the probability of detecting COVID-19 from every additional 200 tests conducted decreases as overall daily testing capacity increases. Distinct turning points on the efficiency frontiers indicate when the surveillance system switches from conducting more tests in a given subgroup to testing a new subgroup due to exhausting available individuals. Figure 5 shows the specific composition of optimal surveillance strategies along the efficiency frontiers in Figure 4.
Surveillance of COVID-19 for the purpose of early detection should prioritise testing among fever clinic patients and HCWs. When more resources become available, respiratory department patients and HCWs should be included in the surveillance strategy. When daily testing capacity exceeds 30,000 tests in a city with 24 million residents, testing among HCWs in other hospital departments and younger adults in the community should be included in the surveillance strategy. When cumulative incidence is extremely low (i.e., ≤100), including these two groups earlier on can improve the probability of detection (see also Extended data Section 5) 17 .

The surveillance processes in this study currently assume randomness in healthcare-seeking decisions. In other words, clinically infectious COVID-19 individuals are equally likely to use the different healthcare service options available to them. In reality, at the beginning of an outbreak, patients are likely to use only a few of the available fever clinics or respiratory departments due to spatial clustering. Consequently, HCWs across the whole city may not have an equal amount of increased exposure to potentially infectious COVID-19 patients. Using fever clinic HCWs as an example, we found that the optimal number of tests identified in this study should be allocated in a stratified manner (Extended data Section 6) 17 . For example, compared to randomly testing 30% of all fever clinic HCWs, testing a random 30% of HCWs in each fever clinic is slightly more efficient, especially when cumulative incidence is low.

Discussion
This study evaluated different allocations of resources for designing COVID-19 surveillance strategies in Beijing, China. We found that surveillance sensitivity is highest at fever clinics, followed by respiratory departments, and then by community-based subgroups. Testing at maximum capacity among fever clinic patients (i.e., 1,600 tests/day, 0.07 t/k/day) gives a 50% probability of detecting on-going transmission by day 29 after the first infection given an R t of 2 (day 25 with an R t of 2.7, day 35 with an R t of 1.4). To achieve similar levels of efficiency, a surveillance system based in the respiratory department would need more than twice the daily testing capacity. Testing at maximum capacity in respiratory patients (29,400 tests/day, 1.35 t/k/day) gives a 50% probability of detecting on-going transmission a week earlier. Surveillance sensitivity is slightly higher among HCWs than among patients in hospitals under within-year average conditions. However, when other respiratory pathogens are less common, surveillance among fever clinic and respiratory department patients becomes significantly more sensitive.
The surveillance system in this study is particularly relevant to locations where epidemic "suppression" strategies (e.g., school and workplace closures combined with testing and contact tracing) have been successful in reducing COVID-19 incidence to a very low level, such as China, South Korea, and New Zealand 3 . These countries may experience extended periods without any new COVID-19 cases detected. However, epidemic risks will persist due to the non-trivial proportion of subclinical cases 6 , importation from other locations 8 , and the potentially short-lived immunity following infection 43 . Sporadic infection chains have already been observed in countries that previously reported no new cases for a while, such as China and South Korea, both of which have relied on intensive testing policies to guide their follow-up interventions and to prevent large outbreaks. As more countries move past their epidemic peaks, surveillance and early detection will continue to play a central role in the COVID-19 response. The population age distribution, contact patterns, and healthcare system administrative and service capacities from Beijing were used to parameterise the model in this study. The framework is easily transferable to studying other health systems. A wide range of testing capacities is examined, allowing national or subnational public health entities at different resource levels to benefit from the concepts derived here.
The concepts of fever clinics and respiratory departments should be interpreted broadly as triaging systems, each with a different baseline population defined by a different set of symptoms with varying degrees of syndromic specificity, leading to different surveillance sensitivity to detect COVID-19. In other words, the more specific the symptoms are to COVID-19, the higher the surveillance sensitivity achievable. An internal medicine department, for example, has lower surveillance sensitivity than the respiratory department but is more sensitive than an oncology department. We quantified the benefits of setting up a triage system such as fever clinics. The current criteria for a patient to be admitted into fever clinics follow the World Health Organisation's definition of influenza-like illness (defined as fever ≥38°C and cough or sore throat) 44 . In response to the ongoing outbreak that started at Beijing's Xinfadi Market, some hospitals have revised the body temperature threshold down to 37.3°C 45 .
We are aware of additional triaging criteria based on the assessment of dyspnea, hypoxia, and chest x-ray interpretation 12 . These criteria may further decrease the background populations for fever clinics and respiratory departments, increasing surveillance sensitivity and thus reducing the number of tests needed. In Mao et al. 46 , based on only expert assessment and (when needed) chest x-ray, fever clinics in Shanghai were able to triage non-COVID-19 cases with 100% accuracy. This level of triaging accuracy, however, is not always possible. A COVID-19 transmission cluster in Harbin, China, included a patient who was triaged to be non-COVID-19, admitted for cerebral stroke, and thus not isolated. This patient, directly and indirectly, infected 35 persons including three HCWs while hospitalised 47 .
Another potential mechanism that can be used to reduce the number of tests needed is the pooling of samples. Instead of testing the sample of each person individually, this method combines swab specimens from many people and tests them collectively. When positive pools of specimens are identified, additional individual-based tests are carried out only within these pools. This method has been used for influenza surveillance and for a one-time community-based COVID-19 screening in Wuhan 48,49 . While pooling methods may not meet diagnostic needs in a clinical setting (e.g., timeliness), community-based surveillance could potentially rely on this approach to increase coverage. However, specific laboratory standards may need further supporting evidence.
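The potential saving from the two-stage pooling scheme described above can be illustrated with a short Python sketch of the expected number of tests per person. It assumes a perfectly sensitive and specific test with no loss of sensitivity from sample dilution, and independent infection status within a pool; these are simplifying assumptions of the sketch, and the prevalence and pool sizes below are hypothetical.

```python
def expected_tests_per_person(prevalence, pool_size):
    """Expected tests per person under two-stage pooling.

    Each pool is tested once; every member of a positive pool is then tested
    individually. Assumes a perfect test and independent infection status.
    """
    if pool_size <= 1:
        return 1.0  # individual testing
    probability_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    return 1.0 / pool_size + probability_pool_positive


# Hypothetical prevalence of 1 in 1,000 and several candidate pool sizes.
for size in (1, 5, 10, 30):
    print(size, round(expected_tests_per_person(0.001, size), 4))
```

At low prevalence, which is the setting this study targets, most pools are negative, so the expected number of tests per person falls well below one.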
Routine testing in the community (other hospital departments or age-specific community groups) is less likely to yield a high probability of detecting infections but can be added if early detection is prioritised, in which case we find that the highest probability for detection is among younger adults. Although many studies have found older adults to be the most susceptible age group with the highest clinical fraction who therefore are most likely to seek healthcare, surveillance of younger adults has higher sensitivity, possibly for two main reasons: (1) at the initial phase of an outbreak, younger adults may be more likely to be infected due to higher numbers of daily contacts 16 ; (2) a larger proportion of COVID-19 infections among younger adults may never progress clinically and thus may go undetected with symptom-based surveillance in the healthcare systems; infections among older adults, on the other hand, can already be detected via surveillance in healthcare settings and thus do not need further community testing 1,6,50 .
This study only considers virological tests based on PCR to detect COVID-19 infections and does not consider serological tests, for several reasons. Firstly, the window of possible viral RNA detection occurs earlier than seroconversion for COVID-19 51,52 . The rate of seroconversion among subclinical cases is largely unknown. Serological tests are thus suboptimal given the objective of early outbreak detection. Secondly, it remains unclear if current serological tests can differentiate between current and previous infections. Lastly, follow-up interventions to interrupt onward transmission based on serological test results, such as contact tracing and quarantine, would be challenging. The window of potential infectiousness may be uncertain and long past, making it difficult for infected individuals to recall their close contacts.
This modelling study has limitations. Firstly, an important assumption in this study is that clinically infectious COVID-19 patients would seek medical care at the same rate as patients with other respiratory pathogens. For example, if 10% of patients infected with other respiratory pathogens seek care, then 10% of COVID-19 patients will seek care. However, this assumption may not be valid. Clinically infectious COVID-19 patients may be more likely to seek care due to pandemic awareness. They may also be less likely to seek care due to uncertainty around associated healthcare costs or the risk perception surrounding healthcare settings. Public messaging that motivates people to use healthcare may increase the sensitivity of hospital-based COVID-19 surveillance at fever clinics and respiratory departments.
Secondly, it is also important to note that there is still much unknown about SARS-CoV-2, a novel pathogen. The results of this paper are thus built on our best current knowledge, which may be challenged in the future. The underlying epidemic processes in this study assume the population to be largely susceptible at the beginning, with a reproduction number of 2. This assumption is justifiable based on low seropositivity in the general population and among HCWs 21,22,53 . However, a smaller reproduction number may be required if seroprevalence is high 54 . In that case, epidemic processes will be slower than shown here, increasing the time gained to respond via public health surveillance. However, the order of prioritisation among surveillance layers and subgroups, as well as the distribution of resources, will not change unless large differentials in seropositivity are observed by age or occupation groups. Thirdly, as population-level prevalence decreases, the positive predictive value of any individual test will necessarily be lowered. Despite the high diagnostic specificity of PCR, there may be a small number of false-positive tests. For example, in an environment completely free of SARS-CoV-2, conducting 30,000 tests/day may still lead to three false-positive test results per day if the diagnostic specificity is 99.99%. Hence, as testing capacity increases, near-perfect but imperfect specificity may lead to false alarms. This will need to be accounted for when deciding how to act upon identified COVID-19 cases. In practice, imperfect diagnostic specificity (resulting from factors such as laboratory contamination) could be compensated for by re-testing the patients or samples. This, however, was not explicitly accounted for in our model, although we do not expect the number of tests required for re-testing to be high at the beginning of outbreaks.
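The false-positive arithmetic in the preceding paragraph, and the dependence of positive predictive value on prevalence, can be checked with a short Python sketch. The function names are illustrative, and the prevalence and sensitivity values in the second example are hypothetical.

```python
def expected_false_positives(tests_per_day, specificity, prevalence=0.0):
    """Expected daily false positives among the uninfected people tested."""
    return tests_per_day * (1.0 - prevalence) * (1.0 - specificity)


def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive results that are true positives."""
    true_positive = prevalence * sensitivity
    false_positive = (1.0 - prevalence) * (1.0 - specificity)
    if true_positive + false_positive == 0.0:
        return float("nan")  # no positive results are expected at all
    return true_positive / (true_positive + false_positive)


# Example from the text: 30,000 tests/day, 99.99% specificity, no infections.
print(round(expected_false_positives(30_000, 0.9999), 2))

# Hypothetical low-prevalence setting: 1 infection per 100,000 people tested,
# 80% diagnostic sensitivity, 99.99% specificity.
print(round(positive_predictive_value(1e-5, 0.8, 0.9999), 3))
```

Under the hypothetical second setting, most positive results would be false positives, which illustrates why re-testing of positive samples matters when prevalence is very low.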
Highly sensitive COVID-19 surveillance contributes to timely outbreak detection, thereby enabling and guiding a swift and targeted response associated with a high probability of containment. In this study, we assessed the surveillance sensitivity in different surveillance layers and subgroups. When designing a COVID-19 surveillance system, prioritising fever clinics over respiratory departments, and patients over HCWs, tends to yield a higher COVID-19 detection probability given limited testing capacity. Community-based testing that targets non-respiratory patients and HCWs or age-specific community groups can capture subclinical infections but may only be considered when more testing capacity becomes available. Future research may further assess the value of information on COVID-19 surveillance systems by estimating the onward transmission prevented.
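To illustrate why a subgroup with a higher share of infected individuals is prioritised under limited capacity, the following minimal sketch computes a detection probability under hypergeometric sampling, the sampling process named in the Methods. It is not the study's implementation: the function name, the treatment of test sensitivity, and all subgroup sizes and case counts are hypothetical values chosen for illustration.

```python
from math import comb


def detection_probability(population: int, infected: int, tests: int,
                          test_sensitivity: float = 1.0) -> float:
    """Probability that at least one test is positive when `tests` people are
    sampled without replacement from a subgroup of size `population` that
    contains `infected` infected individuals.

    Each sampled infected person is assumed to test positive independently
    with probability `test_sensitivity` (an assumption of this sketch).
    """
    total = comb(population, tests)
    p_miss = 0.0
    for k in range(0, min(infected, tests) + 1):
        # Hypergeometric probability of drawing exactly k infected people.
        p_k = comb(infected, k) * comb(population - infected, tests - k) / total
        # All k infected people sampled must test falsely negative.
        p_miss += p_k * (1.0 - test_sensitivity) ** k
    return 1.0 - p_miss


if __name__ == "__main__":
    # Hypothetical subgroups: same number of cases and tests, different sizes.
    fever_clinic = detection_probability(1_000, 5, 100, test_sensitivity=0.7)
    respiratory = detection_probability(5_000, 5, 100, test_sensitivity=0.7)
    print(fever_clinic, respiratory)
```

With the same number of tests, the smaller and more concentrated subgroup gives the higher detection probability in this toy comparison, which matches the direction of the prioritisation described above.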

Weibing Wang
School of Public Health, Fudan University, Shanghai, China
Building on a currently well-recognised COVID-19 model framework, the study accounts for recovered individuals who remain PCR-positive but are no longer infectious by further subdividing the recovered compartment. This brings the extended SEIR model framework closer to reality.
This study considered the practical circumstances of different detection settings (e.g., fever clinics, respiratory departments, etc.) and evaluated how to allocate medical resources among them. For example, in the early stage of an epidemic, patients in fever clinics should be given priority for testing. This also provides scientific advice to the departments carrying out detection, which can more quickly estimate the number of people already infected when the first infected person is detected, thus giving early warning to relevant departments for further prevention and control work.
There are some issues that need to be answered or considered: The authors used an effective reproduction number (R_e) of 2, which reflects a 25% reduction from 2.7 as a result of public health measures. This choice needs more supporting evidence, as a number of publications have shown that the effective implementation of NPIs can reduce R_e below 1.
day) may not be sustainable. However, the measures taken in Qingdao and Yunnan have shown that testing capacity may not be a problem in cities in a country like China and in other countries with abundant resources.
Large-scale use of pooled testing, which has been included in the guideline in China, will compromise the sensitivity and specificity of COVID-19 testing, and you may need to consider it in your modeling.

I did not see how close contacts would be tested and detected following exposure, although this is very important for epidemic control measures.
The authors should pay more attention to the details of the article, as errors may confuse the reader. For example, the formula in line 7 on page 2 of the attachment seems ambiguous and should refer to the clinically infectious compartments.
A few newly published papers may need to be included in the references.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions drawn adequately supported by the results? Yes
© 2020 Pei S. This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Sen Pei
Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, USA
In this study, the authors used a dynamic SEIR-type model to investigate different surveillance strategies among healthcare workers, hospital patients and the general community to detect a new COVID-19 outbreak in a population. Using extensive model simulations, the authors evaluated the surveillance sensitivity in different settings with various testing capacities, and proposed a multilayer surveillance approach that optimally allocates testing resources. Results from this modeling work have direct implications for COVID-19 control and can inform the allocation of testing to achieve rapid detection of the onset of a new outbreak. The study is in general technically sound, and the results are clearly communicated. Here I have a few questions and suggestions that I hope the authors find useful.
The study employed a deterministic compartmental model and assumed that the new outbreak is initiated by a single exposed person. As the initial transmission dynamics are highly stochastic, it is more appropriate to use a stochastic version of the model. This model stochasticity could also affect the surveillance sensitivity and the uncertainty in the results, for instance in Figure 3.
Figure 2 is hard to read using the current color map. A contour map would be better.
In Table 1, does the PCR diagnostic sensitivity depend on the sampling time relative to infection time?
It would be good to explain the meaning of crosses in Table 2.
In the model, HCWs are assumed to face a higher infection risk than patients due to prolonged exposures. However, there are reports showing that the use of PPE could actually reduce the infection risk compared with the general population. A discussion on this factor would be helpful.

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? Yes