Characterising the persistence of RT-PCR positivity and incidence in a community survey of SARS-CoV-2 [version 1; peer review: 1 approved with reservations]

Background: The REal-time Assessment of Community Transmission-1 (REACT-1) study has provided unbiased estimates of swab-positivity in England approximately monthly since May 2020 using RT-PCR testing of self-administered throat and nose swabs. However, estimating infection incidence requires an understanding of the persistence of RT-PCR swab-positivity in the community. Methods: During round 8 of REACT-1 from 6 January to 22 January 2021, we collected up to two additional swabs from 896 initially RT-PCR positive individuals approximately 6 and 9 days after their initial swab. Results: Test sensitivity and duration of positivity were estimated using an exponential decay model, for all participants and for subsets by initial N-gene cycle threshold (Ct) value, symptom status, lineage and age. A P-spline model was used to estimate infection incidence for the entire duration of the REACT-1 study. REACT-1 test sensitivity was estimated at 0.79 (0.77, 0.81) with median duration of positivity at 9.7 (8.9, 10.6) days. We found greater duration of positivity in those exhibiting symptoms, with low N-gene Ct values, or infected with the Alpha variant. Test sensitivity was found to be higher for those who were pre-symptomatic or with low N-gene Ct values. Compared to swab-positivity, our estimates of infection incidence included sharper features with evident transient increases around the time of changes in social distancing measures. Conclusions: These results validate previous efforts to estimate incidence of SARS-CoV-2 from swab-positivity data and provide a reliable means to obtain community infection estimates to inform policy response.


Introduction
Symptom-initiated community testing for SARS-CoV-2 is a vital public health intervention that enables the isolation of known positives and the quarantining of close contacts 1 . However, the trend of cases obtained from routine surveillance is often unreliable because of capacity issues and changing propensity to seek tests 2 . Therefore, data from representative community studies of swab-positivity such as the REal-time Assessment of Community Transmission-1 (REACT-1) study 3 are used to infer prevalence of swab-positivity in order to maintain situational awareness during periods when routine testing is less reliable. In order to infer infection incidence from these data, estimates are required of test sensitivity and the duration that people continue to test positive, but these values have so far been poorly characterised and estimates depend heavily on factors that will vary between settings, study designs and due to individual level characteristics. For example, estimates of sensitivity of RT-PCR to detect SARS-CoV-2 have been made using samples taken by health care professionals from hospitalized patients 4,5 , and from other small non-representative groups 6 . Also, the specific criteria used to declare results from RT-PCR assays as positive, such as use of multiple gene targets and limits of cycle threshold (Ct) values, vary from laboratory to laboratory.
REACT-1 is tracking the spread of SARS-CoV-2 in England over time at national and regional scales and in different socio-demographic groups 3 , based on self-administered throat and nose swabs obtained from non-overlapping random samples of the population 7 . However, estimates of daily incidence during the pandemic have relied on unvalidated assumptions concerning RT-PCR sensitivity and the duration of RT-PCR positivity.
Here, in a substudy of REACT-1 round 8 (6 January 2021 to 22 January 2021) 8 we asked participants who tested positive to obtain two additional swabs approximately five days apart. In this way we aimed to estimate the study-specific sensitivity of RT-PCR based on a single swab, and the average duration for which individuals remained positive, allowing us to provide daily incidence estimates over time.

Results
Of the 2,282 individuals testing positive in round 8, 896 (39%) agreed to take part in this substudy, of whom 874 (98%) had more than one successful RT-PCR test with valid date information (662 with three tests, 212 with two tests). The median interval between the first positive test and second test was 6 days, and it was 9 days between the first and third test, with the largest delay between the first and last test being 17 days. Of the 874 participants with at least one valid additional test: 323 (37%) were positive on all additional tests (237 with two additional tests and 86 with one additional test); 412 (47%) were negative on all additional tests (286 with two additional tests and 126 with one additional test); and the remainder had a mix of positive and negative results on additional tests (Figure 1A).
We developed a statistical model of positivity to describe the probability of participants testing positive as a function of time since the first positive sample (see Methods). We fit an exponential decay in the probability of being positive, with a decay rate and initial proportion of positive individuals detected, P0, estimated from the data (Figure 2). For all participants, we estimated P0 as 0.79 (0.77, 0.81) and daily decay rate 0.071 (0.065, 0.078) corresponding to a median duration positive of 9.7 (8.9, 10.6) days, or a mean duration positive of 14.0 (12.9, 15.4) days (under the assumption of an exponential trend, Table 1).
Fitting a P-spline model to incidence and swab-positivity separately (see Methods) we find that patterns in incidence preceded changes in swab-positivity (Figure 3) and that features of the incidence time series were sharper than the corresponding features of the swab-positivity time series. For example, there was a more pronounced peak in mid-October in incidence compared with swab-positivity, which fell sharply until mid-November. From mid-November onwards incidence increased and appeared to begin decreasing in early January, though with no data from REACT-1 for December there was considerable uncertainty in the estimates for this period. From early May to the beginning of July 2021, both incidence and swab-positivity increased exponentially. However, by the end of round 13 (12 July 2021) it appeared that incidence was no longer increasing, though with wide credible intervals on the P-spline.

Subgroup analyses
We found evidence of differences in the time course of positivity driven by a number of factors: Ct value, symptom status, lineage, and age (Figure 2, Table 1, Extended data supplementary figure 1). Also, we extended the statistical model of positivity to include a plateau before the start of the exponential decay and fit models with and without the plateau to subsets of the data, selecting the best-fitting version (see Methods).

Ct value
We found that the estimates of median duration of positivity and P0 were dependent on N-gene Ct value (Figure 4, Table 1). The estimate of P0 was highest at 0.95 (0.91, 0.98) for participants whose initial Ct value was in the lowest ~1/3 (less than or equal to 24.5), indicating high sensitivity to detect strong positives, decreasing to 0.52 (0.39, 0.61) in those participants with a Ct value in the highest ~1/3 (greater than 32.5). The estimated median duration of positivity was highest in those with a Ct <= 24.5 at 14.8 (12.4, 18.9) days, decreasing to 4.9 (4.0, 6.6) days for those with a Ct greater than 32.5. The best-fitting model for those with a Ct value less than or equal to 24.5 included a 4.0 (3.0, 4.5) day plateau before the exponential decay (difference in expected log predictive density [ELPD] = 9.5).
In addition to positivity, we analysed Ct values across repeated tests for the same individuals (Figure 1). N-gene Ct values in the first (positive) test ranged from 11.6 to 40.4 with a mean of 28.0. In subsequent tests, overall, N-gene Ct values increased, though for a small proportion of individuals they decreased from the first test to subsequent tests. However, we were unable to estimate N-gene Ct values for 41% of second tests and 47% of third tests, either due to a negative test or because the Ct value was higher than the limit of reporting (N-gene Ct = 50).

Symptom status
We found differences in the estimates of P0 and the median duration of positivity according to symptom status. Participants who reported any symptoms and those with the most predictive COVID-19 symptoms (loss or change of sense of smell, loss or change of sense of taste, new persistent cough, fever 9 ) in the month prior to their first test had median durations of positivity of 12.2 (10.7, 14.4) days and 13.1 (11.0, 16.6) days respectively. In contrast, those with no reported symptoms in the month prior to their first test had a shorter median duration of positivity at 5.3 (4.6, 6.1) days (Table 1). Subsetting these individuals into those who reported any symptoms in subsequent tests (pre-symptomatic) and those who reported no symptoms (asymptomatic) also identified clear differences (Figure 4, Table 1) [...]

Figure 1. [...] "1" represents a positive test result, "0" represents a negative test result, "X" that one test was undetermined and so excluded from the analysis. The ordering represents the order in which test results were obtained. (B) Distribution of changes in Ct value between the first (positive) test and following tests. Only tests with an accurate estimate of N-gene Ct value are included; tests in which the N-gene Ct was not measured successfully, which were included with an N-gene Ct value of 50 (the limit of detection) in panel A, have not been included. (C) N-gene Ct value measured in repeat tests over time. Points show the data for the first measurement (red), second measurement (green) and third measurement (blue) of N-gene Ct value against time (jittered). The lines connect each individual's repeat measurements; decreases in Ct (dark grey) between first measurement and second measurement have been highlighted through a darker coloring. Points with an N-gene Ct value of 50 did not have any virus detected and so have been placed at the limiting value of the test.

Figure 2. Exponential decay models fit to all data and subsets of the data. Best-fitting exponential decay model with no plateau (dark green line) and 95% credible interval (green shaded region). Best-fitting exponential decay model with a plateau (dark orange) and 95% credible interval (orange shaded region) are included for the subsets of data where the model was preferred. Data (black points) and 95% Binomial confidence intervals (black error bars) are included for the proportion of tests that were positive. (A) Model fit to all available data. (B,C) Model fits to subsets of the data based on the determined lineage (Alpha, wildtype). (D,E,F) Model fits to subsets of the data based on N-gene Ct value of the first test (Ct<=24.5, 24.5<Ct<=32.5, Ct>32.5). (G,H,I) Model fits to subsets of the data based on symptom status: those reporting any symptoms in the month prior to the first test (Any symptoms), those reporting no symptoms prior to all their tests (Asymptomatic) and those reporting no symptoms prior to their first test but symptoms prior to subsequent tests (Pre-symptomatic). Note that the models are not fit to the data as presented, but to the exact ordering of test outcomes for each individual.

Table 1. Parameter estimates with 95% credible intervals for the exponential decay models fit to all data and fit to subsets of the data. Model 2 is only presented for the subsets of the data where it was found to be a better fit. Parameters shown are the exponential decay rate (k), initial proportion of positive individuals detected (Sensitivity proxy) and for Model 2 the additional time delay parameter. Values estimated from parameters are median duration positive, calculated as log(2)/k with the time delay added on for Model 2, and mean duration positive, calculated as 1/k with the time delay added on for Model 2. Parameter ranges represent the 95% credible intervals.
* Model 2 is only presented for the subsets of data where it was found to be a better fit (see Methods)
** Loss or change of sense of smell, loss or change of sense of taste, new persistent cough, fever
*** Those reporting no symptoms have been subset into those that report symptoms in the past month during their subsequent tests (Pre-symptomatic) and those that do not (Asymptomatic)

Lineage
[...] Table 1). We also fit models to three random samples of individuals infected with the Alpha variant (204 individuals out of 368 sampled without replacement), selected so that the proportion of individuals with initial N-gene Ct value <=24.5, 24.5<Ct<=32.5 and >32.5 was the same as for those infected with wildtype; this led to a slight reduction in the estimated median duration of positivity for Alpha variant, but it was still greater than the corresponding value for those infected with wildtype (Extended data supplementary table 1).
The best-fitting model for those infected with the Alpha variant included a 3.4 (2.2, 4.2) day plateau before the exponential decay (difference in ELPD = 5.4).

Age
There were small differences in the estimates of P0 and the median duration positive between those aged 41-59 years and those aged 60 years and above, but no significant pairwise differences between those aged 18-40 years and these groups (Table 1).

Discussion
We estimate the overall sensitivity of the study based on a single swab to be ~79%, rising to 95% for strong positives, demonstrating why RT-PCR remains the gold standard in testing for the presence of SARS-CoV-2. Additionally we characterised the median and mean duration that an individual remains positive after an initial positive test in REACT-1 at ~10 and ~14 days respectively (assuming a continued exponential decay beyond the follow-up period of our study), comparable to and validating previous estimates 10 .
With both sensitivity and duration of positivity well-characterised we were able to convert our previous estimates of swab-positivity into incidence, allowing estimates of new daily infections across the whole study period of REACT-1. This allowed us to assess the effects of various non-pharmaceutical interventions by comparing the patterns in incidence with the timing of changes in COVID-19 restrictions. For example, a fall in incidence was seen from mid-October 2020 suggesting that behaviour may have changed before the formal start of the lockdown introduced on 3 November 2020, and sooner than would be implied from the swab-positivity data. This is consistent with findings from a previous study that reported a decrease in mobility in the last two weeks of October 2020 11 .
[...] Note that for the subset of data in which the lineage was determined to be the Alpha variant, and for the subset of data in which N-gene Ct value was less than or equal to 24.5 this is the density from the exponential decay model with an initial plateau (Model 2).
The model used in calculating daily incidence has a number of limitations. Firstly, we assumed an exponential decay to describe the probability of a participant testing positive in our study, which would not capture non-exponential longer-term trends in the waning of positivity. This limitation is unavoidable because our maximum follow-up was 17 days after the initial positive test, while it is known that people may remain positive for much longer periods 12 . Secondly, in estimating incidence we have assumed that the parameter estimates obtained for the decay of positivity from our round 8 substudy are representative of the entire study duration. Subgroup analysis showed there were differences in sensitivity and duration of positivity in individuals based on initial N-gene Ct value, symptomatology and viral lineage. With N-gene Ct value highly dependent on the contemporaneous growth rate of the pandemic 13 , the distribution of Ct values may vary between rounds. Furthermore, lineages responsible for infections changed over the study period with the emergence of the Alpha variant in late 2020 14 , and the Delta variant in April 2021 15 . We were unable to use lineage specific parameter estimates in estimates of infection incidence as lineage was only determined (via viral sequencing) in samples with lower Ct values (<34).
The median and mean durations of positivity we report do not directly inform isolation and quarantine policy. For example, the shorter duration of positivity we estimate for asymptomatic positives does not immediately suggest a shorter duration of isolation for asymptomatic contacts, despite the persistence of RT-PCR-positivity likely indicating some level of continued infectiousness: from our results, we cannot estimate the time at which symptoms occur in the 1/3 of those who initially did not report symptoms. Also, isolation and quarantine policies often reflect upper limits for durations of positivity, such as the 90th, 95th or even 97.5th percentiles 16 , and must also take into account practical and logistical constraints on their implementation.
In subgroup analyses, as noted, we obtained higher sensitivity estimates for infections with a lower Ct value, suggesting that the test is more sensitive for stronger positives, which predominantly indicate recent infections (Ct is lowest at ~3 days 10 ). Furthermore, estimated sensitivity was highest in pre-symptomatic individuals, whereas it was comparable between asymptomatic and symptomatic individuals, again suggesting that RT-PCR testing is highly effective at detecting early-stage infections. Similar to a previous analysis 10 we found that median duration of positivity was lower in asymptomatic individuals, compared to those exhibiting any symptoms or one or more of the most predictive COVID-19 symptoms. Previous estimates of the proportion of infections that are asymptomatic, based on RT-PCR testing, may therefore have been too low, because asymptomatic infections remain detectable for a shorter time. Finally, our results indicate that Alpha variant infections may have a greater median duration of positivity than wildtype, similar to other work 6 , though with wide credible intervals. With the emergence of the Delta variant 15 and Omicron variant 17 in England since January 2021, characterising any possible changes in RT-PCR test sensitivity and duration of positivity will be necessary to estimate infection incidence from swab-positivity data.
Our study provides similar data on Ct values to previous studies 6,10 that have estimated the duration of the proliferation and clearance stage of the virus. However, due to a lower rate of sampling in our study we are unable to estimate these parameters. The median duration of 6 days between the first and second test is greater than the estimated proliferation stage of ~3 days. Additionally, as recruitment to our study was based upon an initial positive result, and we only had swab results up to a maximum of 17 days post-first swab, we were poorly placed to estimate the course of the virus, as we will not have captured individuals at both early and late stages of infection.
As well as estimating test sensitivity and duration of positivity for the REACT-1 study, our results identify factors that will drive sensitivity and duration of positivity for community-based sampling more generally. Given likely reductions in many populations in the provision of routine community testing and in people's propensity to seek tests, representative community PCR-based surveys similar to REACT-1 can continue to provide valuable situational awareness during periods of rapid changes in infection incidence.

Consent
Consent was obtained from all participants or their parent/guardian for minors. During initial registration for the study participants are asked "Are you willing to take part in this study?/Are you willing for your child to take part in this study?" with possible answers being "1. Yes, I want to take part in this study" or "2. No, I do not want to take part." Those who answered "2. No, I do not want to take part." were not sent testing kits and did not participate further in the study. Full registration forms for all rounds of REACT-1 are available.

Data
The methods for REACT-1 have been described before 7 .
Since May 2020 there have been 14 rounds of data collected, approximately every month (except December 2020). For each round of the study a random cross-section of the population of England aged 5 years and over is sent a letter inviting them to take part. The invitees are randomly selected at the lower tier local authority level (n=315) through the list of general practitioners' patients held by the National Health Service (NHS) in England. Those participants that agree to take part are sent a self-administered swab test (parent/guardian administered for those aged 5 to 12 years old). They also answer a short questionnaire providing information on age and other demographics as well as details on any symptoms that they exhibited in the month prior to their test. Swabs were collected in dry sample tubes and transported via cold chain to the laboratory for RT-PCR testing. RT-PCR testing was performed using the assay of the Viroboar V1 kit (manufacturer: Eurofins Genomics), using the Roche Lightcycler 480 qPCR thermal cyclers. In REACT-1, we obtain a single swab from each participant and test for N-gene and E-gene targets. We define a sample as positive if it has a valid positive signal for both gene targets (regardless of Ct value) or if it is positive only for N-gene with a Ct value equal to or lower than 37 7 .
During round 8 (6 January 2021 to 22 January 2021) the study participants who tested positive were invited to take an additional two swab tests. For those who agreed, swabs were administered and collected as before.
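The positivity rule stated above can be written as a small decision function. This is only an illustrative sketch: the function name, the constant name, and the use of `None` to encode "no valid positive signal for this target" are assumptions made here, not part of the REACT-1 laboratory pipeline.

```python
from typing import Optional

# Threshold from the text: N-gene-only positives need Ct <= 37.
N_ONLY_CT_LIMIT = 37.0


def is_positive(n_gene_ct: Optional[float], e_gene_ct: Optional[float]) -> bool:
    """Classify one swab under the rule described in the text.

    A Ct of None is a hypothetical encoding for "no valid positive signal".
    """
    n_detected = n_gene_ct is not None
    e_detected = e_gene_ct is not None
    if n_detected and e_detected:
        return True  # both targets positive, regardless of Ct value
    if n_detected:
        return n_gene_ct <= N_ONLY_CT_LIMIT  # N-gene only: Ct must be <= 37
    return False  # E-gene only, or neither target detected


print(is_positive(39.0, 38.5))  # both targets detected
print(is_positive(38.0, None))  # N-gene only, Ct above the limit
```

Under this encoding a sample with signal only for the E-gene is treated as negative, which is how the stated rule reads.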

Lineage designation
During round 8 of the study RT-PCR positive samples with a low enough N-gene Ct value (N-gene Ct value <34) and a high enough volume underwent genomic sequencing (Public Health England Research Ethics Governance Group (reference: R&D NR0195)). The methods have been described before 18 . In short, viral RNA was amplified using the ARTIC protocol 19 and sequence libraries were then prepared using CoronaHIT 20 . Raw sequence data were then analysed using the ARTIC bioinformatic pipeline 21 and lineage designation was performed using the machine learning-based assignment algorithm PangoLEARN 22 . Not all sequences obtained were of a sufficient quality for a lineage to be designated, and samples in which less than 50% of bases were covered were also deemed to have no lineage designation. Due to the repeat measurements in round 8 we were able to determine the lineage of infection as long as one swab test returned a definitive result. When discordant lineages were designated for the same individual the most advanced lineage was selected (e.g. B.1.1.7 over B.1) where one measurement just reflected a lower quality call, or if they were truly discordant (B.1.1.7 and B.1.117 for example) no lineage was designated. For the purposes of this paper, lineage-segregated analysis only included individuals infected with the B.1.1.7 lineage (Alpha variant) or the wildtype (any lineage not a variant of concern or variant under investigation 23 ).

Decaying positivity model
We model the probability of a positive test t days after an initial positive test. The models contain two components. The first is the initial proportion of positive individuals detected, P0, equivalent to the sensitivity of RT-PCR obtained from a single swab. The second component is the probability of an individual being positive after t days given they were positive at time t=0. We allow it to take two forms. The first (Model 1) is a simple exponential:

$$P(\text{positive}, t \mid \text{positive}, t=0) = e^{-kt}$$

where k is the exponential decay rate.
The second (Model 2) includes an extra parameter, τ, that introduces a plateau of duration τ before the exponential decay occurs:

$$P(\text{positive}, t \mid \text{positive}, t=0) = \begin{cases} 1, & t \le \tau \\ e^{-k(t-\tau)}, & t > \tau \end{cases}$$

The Bayesian models are fit using a No-U-Turn Sampler 24 in Stan 25 with 10000 iterations and a burn-in of 500. Four chains are run for each model to assess whether there has been successful convergence. Writing P(positive, t | positive, t=0) as P(p,t) for ease of notation, the likelihood of the model for each possible individual outcome is:

[...]

where the probability on the left hand side denotes the outcome of all tests (two or three) with 1 representing a positive test and 0 representing a negative test. The times t 2 and t 3 are the number of days from the first test to the second test and third test respectively.
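One way to build outcome probabilities from the two components described above is sketched below. It is not necessarily the authors' exact likelihood, which is not reproduced in this text. It rests on three assumptions introduced here: once an individual stops being positive they stay negative, each test taken while they are still positive is detected independently with probability P0, and there are no false positives. The function and argument names are hypothetical.

```python
import math


def p_still_positive(t, k, tau=0.0):
    """P(p, t): tau = 0 gives Model 1, tau > 0 gives Model 2 (plateau, then decay)."""
    if t <= tau:
        return 1.0
    return math.exp(-k * (t - tau))


def outcome_probability(outcome, times, p0, k, tau=0.0):
    """Probability of a test sequence such as (1, 1, 0).

    outcome: results of all tests, first test included (always 1).
    times:   days from the first test to each follow-up test, in increasing order.
    """
    if outcome[0] != 1 or len(outcome) != len(times) + 1:
        raise ValueError("outcome must start with 1 and have one more entry than times")
    follow_up = outcome[1:]
    n = len(times)
    # Probability of still being positive at t=0, at each follow-up, and "never" (0).
    surv = [1.0] + [p_still_positive(t, k, tau) for t in times] + [0.0]
    total = 0.0
    for m in range(n + 1):  # m = number of follow-up tests taken while still positive
        if any(follow_up[i] == 1 for i in range(m, n)):
            continue  # a positive result after clearance is impossible (assumption)
        detect = 1.0
        for i in range(m):
            detect *= p0 if follow_up[i] == 1 else (1.0 - p0)
        total += (surv[m] - surv[m + 1]) * detect
    return total


# Point estimates from the Results, at the median follow-up times of 6 and 9 days.
for oc in [(1, 1, 1), (1, 1, 0), (1, 0, 1), (1, 0, 0)]:
    print(oc, round(outcome_probability(oc, (6, 9), p0=0.79, k=0.071), 3))
```

Under these assumptions a single follow-up test at time t is positive with probability P0 × P(p,t), and the probabilities of all possible outcomes sum to one.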
Model comparison between Model 1 and Model 2 was done by performing Pareto smoothed importance-sampling leave-one-out cross-validation (PSIS-LOO) 26 . Estimates of the ELPD and its standard error were calculated for both models and the more complex model (Model 2) was preferred if its ELPD was greater than the ELPD of the simple model (Model 1).
Posterior samples of model fits were used to estimate the median and mean duration of positivity. Model 1 is simply an exponential distribution, for which the median value is given by log_e(2)/k, and the mean value is given by 1/k. For Model 2 the median and mean values are the same as for Model 1 but with the addition of the duration of the plateau.

Converting swab-positivity to incidence
Estimates of the weighted swab-positivity for the first 13 rounds of REACT-1 have previously been calculated 27 . Weighted swab-positivity is converted to average daily incidence by dividing by the sensitivity of the study (P0 from the decaying positivity model fit to all data), and by the mean duration of positivity (estimated from the decaying positivity model).
The daily infection incidence is converted to an estimate for the number of daily infections in England using mid-2020 population estimates 28 . Estimates of average daily infection incidence are calculated for the entire posterior distribution of P0 and the mean duration of positivity with weighted swab-positivity randomly sampled from a normal distribution, with mean value the central estimate and standard deviation the width of the 95% confidence interval divided by 3.92. From this posterior the median and 95% confidence interval are estimated for average daily incidence.
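The two duration formulas can be checked with a few lines of code. The sketch below uses hypothetical function names and plugs in the rounded point estimate k = 0.071 from the Results; this gives roughly 9.8 and 14.1 days, slightly different from the reported 9.7 and 14.0, presumably because the reported values summarise posterior samples rather than transform a rounded point estimate.

```python
import math


def median_duration(k, tau=0.0):
    """Median days positive: log(2)/k, plus the plateau tau for Model 2."""
    return tau + math.log(2) / k


def mean_duration(k, tau=0.0):
    """Mean days positive: 1/k, plus the plateau tau for Model 2."""
    return tau + 1.0 / k


k_all = 0.071  # rounded daily decay rate for all participants (from the Results)
print(f"median: {median_duration(k_all):.1f} days")
print(f"mean:   {mean_duration(k_all):.1f} days")
```

For Model 2 the same functions apply with `tau` set to the estimated plateau length.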
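The conversion and its uncertainty propagation can be sketched as follows. The swab-positivity values and the posterior draws below are made-up placeholders, not study outputs, and the function names are hypothetical; only the formula (divide by sensitivity and by mean duration) and the 3.92 rule for the standard deviation come from the text.

```python
import random
import statistics


def daily_incidence(swab_positivity, sensitivity, mean_duration_days):
    """Average daily incidence = swab-positivity / (sensitivity * mean duration)."""
    return swab_positivity / (sensitivity * mean_duration_days)


def incidence_summary(central, ci_low, ci_high, posterior_draws, seed=1):
    """Median and 95% interval of daily incidence.

    posterior_draws: (sensitivity, mean duration) pairs, one per posterior sample.
    One swab-positivity value is drawn per posterior sample from a normal
    distribution with sd = (width of the 95% CI) / 3.92.
    """
    rng = random.Random(seed)
    sd = (ci_high - ci_low) / 3.92
    samples = [
        daily_incidence(rng.gauss(central, sd), sens, duration)
        for sens, duration in posterior_draws
    ]
    cuts = statistics.quantiles(samples, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return statistics.median(samples), cuts[0], cuts[-1]


# Placeholder draws standing in for the real posterior (illustrative only).
_rng = random.Random(0)
fake_posterior = [(_rng.gauss(0.79, 0.01), _rng.gauss(14.0, 0.6)) for _ in range(4000)]

# Hypothetical weighted swab-positivity of 1.0% (0.9%, 1.1%).
median, low, high = incidence_summary(0.010, 0.009, 0.011, fake_posterior)
print(f"daily incidence: {median:.5f} ({low:.5f}, {high:.5f})")
```

Multiplying the resulting proportion by a population size would give the corresponding number of daily infections.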

P-spline incidence model
We fit a Bayesian P-spline model of incidence to the daily swab-positivity data for the first 13 rounds of REACT-1. The period of time spanning from 50 days prior to the first day of the study to the last day of the study is divided by regularly spaced knots approximately 5 days apart, with a further three knots defined before and after this period (to remove edge effects). A system of fourth-order basis-splines (B-splines) is then defined over these knots, and the model of infection incidence is the linear combination of these B-splines. Overfitting of the model is minimised through the inclusion of a second-order random-walk prior distribution on the coefficients of the B-splines. [...]

$$SP_j = \mathrm{Sens} \sum_{n=0}^{N} I_{j-n} \, e^{-kn}$$

where SP_j is the swab-positivity on the j th day, I_{j-n} is the incidence (from the P-spline) on the (j-n) th day, k is the exponential decay rate and Sens is the sensitivity of the test. The exponential decay rate and sensitivity are taken to be the best estimates from the decaying positivity model. The summation should technically run from 0 to infinity, but because it has to be computed we take N to be 50, which was selected to be large enough that the exponential term in the model becomes negligible. By converting the daily incidence into a daily swab-positivity we are then able to fit our model to the daily data for swab-positivity that the study collects. The model is fit using a No-U-Turn Sampler (NUTS) 24 implemented in Stan 25 .
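The step from daily incidence to daily swab-positivity can be illustrated with a short sketch. It assumes the conversion is a truncated sum of past incidence weighted by exponential decay and scaled by sensitivity, SP_j = Sens × Σ_{n=0}^{N} I_{j-n} exp(-kn), which is what the symbol definitions above suggest; the function name and the flat incidence series are illustrative only.

```python
import math


def swab_positivity_from_incidence(incidence, k, sens, n_max=50):
    """Convert a daily incidence series (oldest first) into daily swab-positivity.

    Only days with a full n_max-day history are returned, so the output is
    n_max entries shorter than the input.
    """
    weights = [math.exp(-k * n) for n in range(n_max + 1)]
    result = []
    for j in range(n_max, len(incidence)):
        weighted_sum = sum(incidence[j - n] * weights[n] for n in range(n_max + 1))
        result.append(sens * weighted_sum)
    return result


# Hypothetical flat incidence of 0.1% per day, with the all-participant estimates.
flat = [0.001] * 80
sp = swab_positivity_from_incidence(flat, k=0.071, sens=0.79)
print(len(sp), round(sp[0], 5))
```

With constant incidence the sum is a finite geometric series, so the output should equal Sens × I × (1 - exp(-(N+1)k)) / (1 - exp(-k)) on every day, which gives a quick check of any implementation.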
P-spline swab-positivity model
A Bayesian P-spline model of swab-positivity was similarly fit. The entire study duration was segmented into regularly spaced knots of approximately 5 days, with extra knots extending beyond the period of the study to remove edge effects.
Again a system of fourth-order B-splines is defined over these knots and the P-spline model is a linear combination of these B-splines. As before, overfitting is minimised through the inclusion of a second-order random-walk prior distribution on the B-spline coefficients. As the P-spline is now modelling the swab-positivity there is no need for a transformation and the model can be fit directly to the daily swab-positivity data using NUTS 24 .
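A second-order random-walk prior penalises changes in the slope of successive B-spline coefficients. The sketch below shows one common form, in which each second difference b[i] - 2·b[i-1] + b[i-2] is given a zero-mean normal distribution; the exact parameterisation used in the study is not stated in this text, so the function name, the scale parameter `rho` and the example values are assumptions.

```python
import math


def rw2_log_prior(coeffs, rho):
    """Log-density of the second differences of coeffs under Normal(0, rho^2).

    The first two coefficients are left unconstrained in this sketch.
    """
    log_p = 0.0
    for i in range(2, len(coeffs)):
        second_diff = coeffs[i] - 2.0 * coeffs[i - 1] + coeffs[i - 2]
        log_p += -0.5 * math.log(2.0 * math.pi * rho ** 2)
        log_p += -0.5 * (second_diff / rho) ** 2
    return log_p


smooth = [0.0, 0.5, 1.0, 1.5, 2.0]   # straight line: all second differences are zero
wiggly = [0.0, 1.5, 0.2, 1.8, 0.1]   # rapid changes in slope
print(rw2_log_prior(smooth, rho=0.5) > rw2_log_prior(wiggly, rho=0.5))
```

Coefficients lying on a straight line incur no penalty, which is why this kind of prior discourages overfitting while still allowing sustained growth or decline.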

Data availability
Underlying data
Access to individual level REACT-1 data is restricted due to ethical and security considerations and data protection issues due to individual data being identifiable. Summary statistics and data, including the daily weighted number of positive tests and daily weighted number of tests (used for the P-spline calculations for incidence and swab-positivity), are available in the extended data.
Additional summary statistics and results from the REACT-1 programme (not required for this paper) are also available here. Further REACT-1 study materials are available for each round (not required for this paper) here.
Requests for materials should be addressed to Steven Riley and Paul Elliott, s.riley@imperial.ac.uk, p.elliott@imperial.ac.uk, School of Public Health, Imperial College London, Norfolk Place, London, W2 1PG. Requests should contain the data variables and details of why the data is needed (faster consideration will be given where there is a clear benefit to public health). For information on the questions that are asked to participants (data variables available) please refer to the REACT-1 study materials (https://www.imperial.ac.uk/medicine/research-and-impact/groups/react-study/for-researchers/react-1-study-materials/). Aggregate data can only be shared if there is an appropriate number of participants in each category such that it remains unidentifiable. Therefore, requests for fewer data variables are more likely to be successful. If the data requested is identifiable a different subset of data variables may be recommended. Requests will be presented to the steering committee for approval before being made available online.
This project contains the following extended data:
• mrc-ide/reactidd/inst/extdata/positive_weighted.csv and mrc-ide/reactidd/inst/extdata/total_weighted.csv: daily weighted number of positive tests and total tests respectively (used in calculation of incidence and prevalence over time)
• mrc-ide/reactidd/vignettes/PCR_Positivity_Paper/EstimatingDurationOfSwabPositivity.rmd, which contains: (1) code used to fit and plot the exponential decay model of positivity, with and without initial plateau, demonstrated on simulated individual level data; (2) code used to fit and plot the P-spline model of prevalence and incidence data using the data for daily weighted number of positive tests and daily weighted number of tests in rounds 1-13 of REACT-1
• mrc-ide/reactidd/vignettes/PCR_Positivity_Paper/SupplementaryFigure1.pdf (Supplementary Figure 1)
• [...]

Samuel Clifford
Centre for the Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, London, UK
The work is clear but there's a little detail of the statistical methods missing. The explanation of the data source and how it is used here is well explained, with further detail available in previous publications.
From the description in the Methods, the exponential decay models appear to use a normal likelihood function and do not consider the repeat observations structure in the data. A two-parameter sigmoid (e.g. logistic regression) with random effects would be better placed to capture both the individual level variation and group level variation. A three- or four-parameter logistic model can also handle asymptotic values other than 0 and 1, e.g. the implementation in R's nplr package (https://cran.r-project.org/package=nplr).
It would be good to see the model fits (in Table 1 and Figure 2) for all of the plateau models to see how well supported they are even when they are not the preferred model. The mean (or median) duration positive is quite different between the two models so while overall fit may be better for one model, the implications for how long people stay positive for is not. I would like to see the authors present this information and discuss it further, noting that the authors are clear that they do not have data more than 17 days out from the first test, that this work doesn't directly inform policy and that policy is set around extremes.
The likelihood for the spline models is not explicitly stated. Are the authors log-transforming the outcome and using a normal likelihood, or are they using a likelihood with a log or logit link?
The P-spline model may need to account for changes in testing patterns over the course of the week, e.g. with a random effect. This may not make a huge difference in the end, but breaks in the time series may cause local bias in the estimated positivity.
Figure 3 should be split up so that positivity and incidence are shown on separate axes rather than a double y-axis plot where the units are the same (%) but the magnitudes are not (and even if they were, two different phenomena are being described by the plot). Faceting such that the x axis is still common but there are two rows (e.g. positivity on top, estimated incidence below) would help show the patterns without being too visually busy.