SARS-CoV-2 detection by a clinical diagnostic RT-LAMP assay

The ongoing pandemic of SARS-CoV-2 calls for rapid and cost-effective methods to accurately identify infected individuals. The vast majority of patient samples are assessed for viral RNA presence by RT-qPCR. Our biomedical research institute, in collaboration with partner hospitals and an accredited clinical diagnostic laboratory, established a diagnostic testing pipeline that has reported on more than 252,000 RT-qPCR results since its commencement at the beginning of April 2020. However, due to ongoing demand and competition for critical resources, alternative testing strategies were sought. In this work, we present a clinically validated procedure for high-throughput SARS-CoV-2 detection by RT-LAMP that is robust, reliable, repeatable, specific, and inexpensive.


Introduction
The current pandemic caused by the novel coronavirus SARS-CoV-2, first detected in late 2019 in the city of Wuhan, China, has rapidly spread worldwide, infecting more than 68 million individuals as of 11 December 2020 [1][2][3]. Infection with SARS-CoV-2 can lead to development of COVID-19, a disease associated with severe acute respiratory syndrome, responsible for hundreds of thousands of deaths globally. Controlling the spread of SARS-CoV-2 relies on the ability of healthcare systems to quickly identify infected individuals, which has mainly depended on RT-qPCR for viral RNA detection 4. International competition for commercial kits and reagents has negatively impacted the ability of many countries to scale up testing capacity to deal with the increased demand caused by rampant infection. Implementing RT-qPCR testing programs requires specialised laboratory equipment and reagents, presenting additional challenges.
In an effort to increase the diagnostic capacity for SARS-CoV-2 infection in the UK, the Francis Crick Institute, a biomedical research institute based in London, rapidly repurposed its staff and facilities in late March 2020 to serve as a clinical diagnostic testing facility through a partnership between a major local healthcare provider (University College London Hospitals National Health Services Trust) and an accredited clinical diagnostic laboratory (Health Services Laboratories, HSL), termed the CRICK COVID-19 Consortium (CCC) 5 . The pipeline utilises a series of in-house buffers to first inactivate patient samples received from care homes and hospitals, and to then extract RNA before using a CE marked commercial kit to detect SARS-CoV-2 by RT-qPCR. Patient results are reported through a custom online web portal that interfaces with the reference laboratory 5 . In order to avoid dependence on any singular testing methodology, to continue increasing testing capacity, and to provide a potential means to deliver diagnostics at the point-of-care, the CCC was also tasked with developing and validating alternative SARS-CoV-2 testing strategies.
Herein, we describe the use of loop-mediated isothermal amplification coupled with reverse transcription (RT-LAMP) as a robust method for SARS-CoV-2 detection in clinical specimens 6. The strand displacement and amplification procedure is carried out at a single temperature in less than 30 minutes. Diagnostic tests utilising this technique have been developed for RNA viruses and other pathogens [7][8][9], and there are now several works focusing on the use of RT-LAMP to detect SARS-CoV-2 [10][11][12][13][14][15][16][17]. We set up a SARS-CoV-2 RT-LAMP assay using the WarmStart Colorimetric LAMP 2X Master Mix commercialised by New England Biolabs (NEB), which allows for visual assessment of DNA amplification. Alternatively, DNA amplification can be quantified on a real-time PCR machine by supplementing the reaction with the DNA dye SYTO 9 10. We made use of primers developed by Zhang et al. targeting the nucleocapsid phosphoprotein (N gene) of SARS-CoV-2 10 and, in a parallel reaction, primers that detect human 18S rRNA to control for specimen quality 9.
Our results demonstrate that within the CCC pipeline, RT-LAMP can readily replace RT-qPCR as a means for detecting SARS-CoV-2 transcripts within RNA extracted from nose-throat swabs and endotracheal secretions/bronchoalveolar lavage fluid. RT-LAMP for the N gene shows absence of non-specific amplification and cross-reactivity with other human coronaviruses or respiratory viruses, and displays a faster reaction time than the gold standard RT-qPCR. Switching to RT-LAMP translates into a ten-fold decrease in total reagent cost and a potential four-fold increase in our current pipeline's output. Additionally, we provide preliminary data suggesting that RT-LAMP can be performed without prior RNA extraction, allowing rapid and cost-effective testing that could potentially be extended to point-of-care. The entire workflow was validated under extended governance by public health authorities during the pandemic and inspected by a qualified UKAS assessor against GenQA guidelines to verify compliance to ISO 15189:2012 equivalent standard (US equiv. CAP/CLIA). As such, the procedure developed here is ready to be deployed for diagnostic testing of SARS-CoV-2.

Patient samples
In response to the pandemic, the Francis Crick Institute formed a partnership with University College London Hospitals (the Crick-COVID-Consortium) and set up a SARS-CoV-2 RT-qPCR testing pipeline (CCC pipeline). Nasopharyngeal swabs tested in this study were obtained from the CCC pipeline 5 .
CCC pipeline procedures and HSL RT-qPCR
All samples processed through the CCC pipeline were handled in accordance with Standard Operating Procedures (SOPs) described recently 5 , which can be found at https://www.crick.ac.uk/research/covid-19/covid19-consortium. A redefined reaction, running, and reporting SOP can be found in the Supplemental Methods (Extended data 18 ).

Amendments from Version 1
We thank all reviewers for their constructive comments and suggestions. We have addressed their points with additional data and revisions to the text, as detailed below. We hope that the reviewers will find the revised manuscript improved and acceptable for publication.
For the main figures, we made modifications to Figure 2 to include experiments that were previously contained within the supplemental data and clarified data presented in Figure 3. Additionally, we have added four new supplemental figures. These figures include experiments assessing other RT-LAMP primers specific for SARS-CoV-2, analysis of sequence mutations in the SARS-CoV-2 genome targeted by the RT-LAMP N gene primers, confirmation of sufficient viral genomes present in the specificity controls used, and preliminary data regarding RNA extraction-free RT-LAMP.
The manuscript has been modified to include additional methods and references, minor modifications to the abstract, expansion of the discussion, and new figure legend descriptions.
RNA from the NIBSC SARS-CoV-2 standard was purified using the QIAGEN RNeasy Kit according to the manufacturer's protocol.
RT-LAMP assay
The RT-LAMP reaction was performed in a total volume of 15 µL, mixing 7.5 µL WarmStart Colorimetric LAMP 2X Master Mix (New England Biolabs, M1800), 1.5 µL of 10X primer mix, 1.5 µL SYTO 9 Green Fluorescent Nucleic Acid Stain (ThermoFisher Scientific, S34854), and 4.5 µL of sample unless stated otherwise. Indicated experiments utilised only 3 µL of RNA due to sample availability; in these, 1.5 µL of nuclease free water (Ambion, AM9932) was added to the reaction mixture to replace the volume lost by this modification. 10X SYTO 9 solution at 5 µM in nuclease free water was prepared for a final concentration of 0.5 µM in the final RT-LAMP reaction. 10X primer mix was prepared from 100 µM desalted DNA primers obtained from Sigma Custom Oligos Service. 10X primer mixes for N gene and 18S contained FIP and BIP primers at 16 µM, F3 and B3 at 2 µM, LF and LB at 4 µM. RT-LAMP was run following an SOP (cf. Extended data: Supplementary Methods 1 and 2 18 ) on a 7500, 7500 Fast, QuantStudio 3, 5, or 7 Real-Time PCR System (Applied Biosystems). A negative no template control (NTC) and a positive control (supplied by HSL, SARS-CoV-2 clinical sample) were included on every run. Experiments utilising laboratory grown SARS-CoV-2 were performed in containment level 3 at the Francis Crick Institute (FCI) by trained personnel according to health and safety guidelines.
We found that the assembled reaction mix was unstable when kept at 4°C for more than a few hours, and sensitive to freeze-thaw when kept at -20°C for more than a day. The RT-LAMP should therefore be performed using freshly prepared reaction mix. The individual components should be stored at -20°C until use, and repeated freeze/thaw cycles should be avoided.
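The volumes and concentrations given for the reaction can be checked with a short calculation. The Python sketch below is illustrative only: the per-reaction volumes and 10X stock concentrations are those stated in the text, whereas the function names and the 10% pipetting overage are assumptions that do not come from the SOP.

```python
# Per-reaction reagent volumes (µL) as stated in the Methods.
PER_REACTION_UL = {
    "WarmStart Colorimetric LAMP 2X Master Mix": 7.5,
    "10X primer mix": 1.5,
    "10X SYTO 9 (5 µM)": 1.5,
}
SAMPLE_UL = 4.5   # standard RNA input
TOTAL_UL = 15.0   # total reaction volume

# 10X primer mix concentrations (µM) as stated in the Methods.
PRIMER_10X_UM = {"FIP": 16.0, "BIP": 16.0, "F3": 2.0, "B3": 2.0, "LF": 4.0, "LB": 4.0}


def master_mix(n_reactions, overage=0.10):
    """Return µL of each reagent for n reactions plus a pipetting overage.

    The 10% default overage is an assumption, not a value from the paper.
    """
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


def final_concentration(stock_um, stock_ul=1.5, total_ul=TOTAL_UL):
    """Final concentration (µM) after adding stock_ul of stock to a total_ul reaction."""
    return stock_um * stock_ul / total_ul


if __name__ == "__main__":
    print(master_mix(96))
    for primer, stock in PRIMER_10X_UM.items():
        print(primer, final_concentration(stock), "µM final")
```

With these inputs the reagent and sample volumes sum to the stated 15 µL, and the 5 µM SYTO 9 stock dilutes to the stated 0.5 µM.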

Accreditation and governance
As outlined recently 5 , the CCC was formed in partnership with HSL, a UCLH UKAS accredited lab, who already had a COVID-19 RT-qPCR test in scope. All samples were received and communicated by HSL under their accreditation and the CCC RT-LAMP assay was validated against their existing RT-qPCR test and the CCC's validated RT-qPCR test, which uses a CE marked commercial kit (BGI). Given the urgent timeframe required to implement testing, it was not possible to secure official clinical laboratory accreditation for the FCI. However, full measures were taken to ensure that the CCC RT-LAMP test was evaluated, verified and performed for diagnostic use in an environment that adhered to equivalent international standards (ISO 15189:2012, US equiv. CAP/CLIA), overseen and audited by HSL. These measures were implemented under the advice and oversight of registered professionals from existing nearby ISO accredited medical laboratories, and included writing and following clinical diagnostic SOPs for every stage of the pipeline from sample reception, processing to result reporting by qualified clinical scientists prior to results being communicated to patients by HSL. Additional SOPs were followed for sample storage, disposal of materials, batch certification of reagents and incident reporting. Appropriate risk assessments, training and competency assessment procedures were established and documented. Record sheets were created to document the receipt, batch acceptance testing, and start/end of use dates for key reagents and consumables. An inventory of all key equipment was compiled which, where appropriate, included details of service and calibration records. 
Systems were also established for the control of all key documents (version implementation, distribution and acknowledgement), audit trailing (what samples were tested when, by whom, with what equipment and using which consumable/reagent batches), and the recording of all untoward incidents/issues (thus facilitating appropriate investigation, rectification and recurrence prevention). Samples were barcoded and tracked using the Crick Clarity laboratory information management system (LIMS). All key documents are available at https://www.crick.ac.uk/research/covid-19/covid19-consortium. NHS governance was extended by a specific memorandum of understanding for diagnostic testing between UCLH and the FCI enabled by NHS England. Assurance of the pipeline was performed in collaboration with GenQA, following their checklist for non-accredited laboratories, and the lab and CCC workflow were inspected by a qualified UKAS assessor against the GenQA guidelines to verify compliance to ISO 15189:2012 equivalent standard.

RT-LAMP test validity
RT-LAMP was performed with 4.5 µL of RNA in a total reaction volume of 15 µL unless stated otherwise. Each sample was tested in two separate reactions, one with primers targeting the N gene of SARS-CoV-2, and the second targeting human 18S rRNA to control for specimen integrity and quality (Figure 1A). The RT-LAMP mastermix contains a colorimetric pH indicator that turns from pink to yellow upon DNA amplification (Figure 1B). Other primer sets previously described in the early months of the pandemic were also examined for their amplification of SARS-CoV-2 RNA compared to negative controls (Extended data: Figure S1A 18 ) 10 . However, the primers we moved forward with (Figure 1A) were chosen because they correctly identified positive samples, did not falsely amplify negative samples (Extended data: Figure S1B 18 ), and target a region that has minimal mutational burden (Extended data: Figure S2 18 ). In addition, we benchmarked the RT-LAMP method by measuring DNA amplification using a SYBR based dye in a real-time PCR machine. When measuring fluorescence every minute ('cycle'), double stranded DNA accumulation follows a characteristic exponential amplification phase that eventually plateaus (Figure 1C and 1D).
Figure legend (fragment): ... are plotted as "Ct = 40" (left y-axis) or "Ct = 25" (right y-axis) for illustrative purposes. Data are normalised by assay run/cycles determined by the two methods for comparative purposes. The clinical call from the reference laboratory is indicated above the graph.
An initial characterization of the technique was performed using 24 RNA samples purified from patient nasopharyngeal swabs, 12 of which were positive for SARS-CoV-2 as assessed by the CCC pipeline via RT-qPCR 5 . RT-LAMP could detect SARS-CoV-2 in the 12 positive samples with no amplification detected in all 12 negative samples, displaying 100% concordance to our current clinical diagnostic platform (Figure 1C and 1E).
Internal control signal was detected in all samples ( Figure 1D).

RT-LAMP background noise and specificity
The background signal of the SARS-CoV-2 RT-LAMP assay was assessed by running RNA elution buffer (7 samples) or nuclease free water (7 samples) as no template controls (NTCs), performed 4 separate times by 4 different operators. Non-specific signal could be detected in one well of one experiment after 25 minutes (Figure 2A). A similar experiment was performed for the 18S rRNA internal control RT-LAMP assay and revealed signal in NTCs after 20 minutes (Figure 2B). Based on these NTC data, a detection threshold was set for each RT-LAMP assay.
We then asked if melting curves obtained from thermocycler runs could be used to help discriminate true positives from false positives. N gene and 18S rRNA RT-LAMP reactions each gave single peaks in melt curve plots (Figure 2E) and consistent melting temperature values (Figure 2F) over multiple experiments performed on positive and negative patient samples. Moreover, positive samples that amplified near the assay endpoint (i.e., high time threshold) still displayed a clear melting curve peak consistent with being true positives (Extended data: Figure S3A 18 ) and were distinguishable from false positives (Extended data: Figure S3B 18 ). Therefore, when implementing the assay using a real-time PCR machine (thermocycler) with SYBR-based detection, incorporation of a traditional melting curve stage allows for elimination of false positives and increases reporting confidence for SARS-CoV-2 positive samples (see Extended data: Supplemental Methods 18 ).
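As an illustration of how a time threshold and a melt curve check might be combined when calling a sample, consider the following Python sketch. The 25 minute (N gene) and 20 minute (18S rRNA) cut-offs follow the NTC observations above. Everything else is a hypothetical placeholder: the expected melting temperatures, the ±1 °C tolerance, the names, and the three-way interpretation rule are assumptions for illustration, and the validated interpretation table is the one in the Supplemental Methods.

```python
from dataclasses import dataclass
from typing import Optional

# Time thresholds in minutes (one fluorescence read, or 'cycle', per minute).
THRESHOLD_MIN = {"N gene": 25.0, "18S": 20.0}


@dataclass
class Well:
    assay: str                        # "N gene" or "18S"
    time_to_signal: Optional[float]   # minutes; None if no amplification
    melt_tm: Optional[float]          # melt peak in °C; None if no peak


def well_positive(well, expected_tm, tm_tolerance=1.0):
    """True if the well amplified before its threshold with a consistent melt peak."""
    if well.time_to_signal is None or well.melt_tm is None:
        return False
    in_time = well.time_to_signal < THRESHOLD_MIN[well.assay]
    tm_ok = abs(well.melt_tm - expected_tm) <= tm_tolerance
    return in_time and tm_ok


def call_sample(n_well, control_well, expected_tm, tm_tolerance=1.0):
    """Combine the N gene and 18S wells into one result (assumed rule, not the SOP)."""
    if well_positive(n_well, expected_tm["N gene"], tm_tolerance):
        return "SARS-CoV-2 detected"
    if well_positive(control_well, expected_tm["18S"], tm_tolerance):
        return "SARS-CoV-2 not detected"
    return "invalid (repeat)"


if __name__ == "__main__":
    # Hypothetical melting temperatures; the measured values are in Figure 2F.
    tm = {"N gene": 85.0, "18S": 88.0}
    print(call_sample(Well("N gene", 14.0, 85.3), Well("18S", 10.0, 88.1), tm))
```

In this sketch a late amplification (at or after the threshold) or an off-target melt peak is treated as negative for that well, which mirrors the use of the melt curve stage to exclude false positives.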
To further assess the specificity of the N gene RT-LAMP assay, we first determined possible cross-reactivity with human RNA. 95 wells of RNA from a human cell line extracted by the CCC pipeline gave no signal in the assay (Figure 3A). To assess the assay's specificity for SARS-CoV-2, the RT-LAMP assay was performed on RNA purified from clinical samples from COVID-19 negative patients infected with a variety of RNA viruses, including seasonal coronavirus strains 229E, NL63, OC43 and HKU1, MERS-CoV, Influenza A and B viruses, metapneumovirus (MPV), respiratory syncytial virus (RSV) or parainfluenza viruses type 3 and 4 (PIV3, PIV4) (Figure 3B and Extended data: Figure S4 18 ). All samples gave a signal in the 18S rRNA RT-LAMP, confirming the presence of human RNA (Figure 3B). With the N gene RT-LAMP, no signal was observed with any of the 14 distinct virus-containing specimens tested, confirming that the assay is highly specific for SARS-CoV-2 (Figure 3B).

RT-LAMP sensitivity and precision
In order to ascertain the sensitivity of the N gene RT-LAMP assay, RNA from a SARS-CoV-2 quantified gene copy number standard obtained from the UK National Institute for Biological Standards and Control (NIBSC) was extracted and assessed by limiting dilution. The results indicate that the limit of reliable detection of the N gene RT-LAMP assay is between 10² and 10³ copies of N gene (Figure 4A and 4B), representing a limit of detection of > 10⁴ N gene copies per mL. This was confirmed using 10-fold serial dilutions of RNA extracted from laboratory-grown SARS-CoV-2 quantified using the NIBSC standard (Figure 4C). Notably, a highly linear response was observed, even in the presence of 1% Triton X-100, which has been proposed as a means of inactivating SARS-CoV-2 19 .
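One way to quantify the linearity of a dilution series is an ordinary least-squares fit of time-to-threshold against log10 copy number. The Python sketch below assumes that this is the relationship meant by a linear response; the function name and the example values are hypothetical and are not data from this study.

```python
import math


def fit_standard_curve(copies, minutes):
    """Least-squares fit of time-to-threshold (minutes) against log10(copies).

    Returns (slope, intercept, r_squared). Needs at least two distinct copy
    numbers and non-constant times, otherwise a division by zero occurs.
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(minutes) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in minutes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, minutes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical 10-fold dilution series (copies per reaction, minutes to signal).
    slope, intercept, r2 = fit_standard_curve([1e3, 1e4, 1e5, 1e6], [18.0, 15.0, 12.0, 9.0])
    print(f"slope={slope:.2f} min per log10, intercept={intercept:.2f} min, R^2={r2:.3f}")
```

A negative slope is expected, since higher input copy numbers reach the detection threshold sooner.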
The RT-LAMP assay reproducibility and precision were determined by extracting RNA 5 times from a confirmed COVID-19 positive patient sample through the CCC pipeline and assessing by N gene and 18S RT-LAMP in 5 independent experiments, performed by two different operators ( Figure 4D). The coefficient of variation was 0.03151 for the N gene RT-LAMP analysis and 0.03595 for the 18S rRNA internal control. We also verified that RT-LAMP can be performed in a 384-well plate format using a QuantStudio 5 real-time PCR machine with equivalent results (Extended data: Figure S5 18 ).
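For reference, the coefficient of variation is the standard deviation of the replicate measurements divided by their mean. The Python sketch below uses the sample standard deviation, which is an assumption since the text does not state which estimator was used; the replicate values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return stdev(values) / mean(values)


if __name__ == "__main__":
    # Hypothetical time-to-threshold values (minutes) from five replicate extractions.
    replicates = [14.0, 14.5, 13.8, 14.2, 14.6]
    print(f"CV = {coefficient_of_variation(replicates):.4f}")
```

A CV of about 0.03, as reported above, corresponds to a standard deviation of roughly 3% of the mean.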

Clinical validation of RT-LAMP against two clinical diagnostic RT-qPCR assays
The RT-LAMP assay was benchmarked against the RT-qPCR methods used by the reference clinical diagnostic laboratory, HSL, and that used by the CCC validated clinical diagnostic platform by assessing 37 clinical samples processed in parallel by both laboratories with duplicate RT-qPCR analyses. RNA extracted by the CCC pipeline was then tested by RT-LAMP in duplicate runs. RT-LAMP detected 16 positives in both experiments, which was 100% concordant with results obtained by the CCC's RT-qPCR assay using a CE marked kit from BGI and HSL's N gene RT-qPCR assay. However, positives within four cycles of the limit of detection (Ct = 37) of the BGI RT-qPCR assay, which were termed 'borderline positives', were not consistently detected by the N gene RT-LAMP assay (Figure 5A). In an additional set of experiments, 71 clinical specimens in viral transport medium (VTM) stored at HSL were subjected to RNA extraction through the CCC pipeline and examined by HSL's RT-qPCR and CCC's RT-LAMP assay in duplicate (Figure 5B). Again, samples with Ct < 39 in the HSL RT-qPCR analysis (which runs for 45 cycles as opposed to 40 in the BGI assay) were confidently detected by RT-LAMP, whilst samples with Ct values 39 < Ct < 41, late cycles indicative of low levels of SARS-CoV-2 RNA, displayed inconsistent amplification by N gene RT-LAMP. Altogether, these data demonstrate that RT-LAMP performed on extracted RNA accurately detects SARS-CoV-2 in clinical samples as verified by a clinical diagnostic laboratory and a validated clinical diagnostic pipeline. However, samples with Ct values nearing the limit of detection of either RT-qPCR assay were not consistently identified in duplicate RT-LAMP runs, with some samples detected in one run but not in the other. These results suggest that RT-LAMP is less sensitive than RT-qPCR when performed with extracted RNA.
However, it is important to bear in mind that during these validation experiments the RT-LAMP assay was performed with roughly a third of the input RNA used by either RT-qPCR method (3 µL versus 10 µL), partly due to limitations in sample availability.
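Comparisons of this kind are commonly summarised as overall percent agreement plus a detection rate stratified by RT-qPCR Ct value. The Python sketch below shows one way to compute both; the function names, the Ct bins, and the example data are hypothetical and are not the values measured in this study.

```python
def percent_agreement(pairs):
    """pairs: iterable of (reference_positive, lamp_positive) booleans."""
    pairs = list(pairs)
    agree = sum(1 for ref, lamp in pairs if ref == lamp)
    return 100.0 * agree / len(pairs)


def detection_rate_by_ct(results, bins):
    """Fraction of RT-qPCR-positive samples detected by RT-LAMP in each Ct bin.

    results: list of (ct, lamp_positive) for RT-qPCR-positive samples.
    bins: list of (low, high) Ct intervals, low inclusive and high exclusive.
    Bins with no samples map to None.
    """
    rates = {}
    for low, high in bins:
        hits = [lamp for ct, lamp in results if low <= ct < high]
        rates[(low, high)] = (sum(hits) / len(hits)) if hits else None
    return rates


if __name__ == "__main__":
    # Hypothetical paired calls: (RT-qPCR positive, RT-LAMP positive).
    calls = [(True, True), (False, False), (True, False), (False, False)]
    print(percent_agreement(calls))
    # Hypothetical Ct values for RT-qPCR-positive samples and their RT-LAMP calls.
    positives = [(25.0, True), (30.0, True), (39.5, False), (40.2, True)]
    print(detection_rate_by_ct(positives, [(0, 39), (39, 41)]))
```

Stratifying by Ct makes explicit where the two methods diverge, which in the data above is at late Ct values close to the RT-qPCR limit of detection.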
Bypassing RNA extraction
Lastly, we asked if RT-LAMP could be performed on dry swabs without the need for RNA extraction. Recent work shows that 0.5% Triton X-100 inactivates SARS-CoV-2 19 , and we observed that performing RT-LAMP with samples diluted in 1% Triton X-100 does not affect the sensitivity of the assay (Figure 4C). We reverse-engineered swab specimens by coating hospital-grade swabs with serial dilutions, in HEK293T cell supernatant, of a crude SARS-CoV-2 suspension generated in the laboratory. Swabs were left to dry overnight and placed in 0.5% Triton X-100 for 15-30 minutes at room temperature before assessment. The RT-LAMP test used 4.5 µL of the solubilised material to maximise the sensitivity of the assay (Extended data: Figure S6 18 ). In parallel, samples were processed by the CCC pipeline and interrogated by RT-qPCR with a calculated equivalent amount of RNA. 'RT-LAMP Pre' (without RNA purification) results were 94% concordant to those generated by the CCC pipeline using RT-qPCR, suggesting that RT-LAMP may potentially be performed without traditional RNA extraction. However, the use of RT-LAMP on material from swabs without RNA extraction requires clinical validation using bona fide nasopharyngeal specimens to ascertain reliability and compatibility with various transport media.

Discussion
Rapid and reliable detection of SARS-CoV-2 is required to efficiently diagnose infected individuals and to provide governments and health systems with guidance for treatment and quarantine strategies to reduce the risk of transmission. The CCC was formed to address a critical testing shortage in the London area, especially for frontline healthcare workers, with the additional goal of rapidly validating and disseminating SOPs for others to scale up their own diagnostic programs. The majority of testing is currently performed by RT-qPCR amplification of viral RNA obtained from nasopharyngeal samples. In the face of continued global demand and competition for reagents and resources and to minimise reliance on any singular testing strategy, we have outlined here procedures to utilise RT-LAMP as a cost-effective and high throughput alternative to RT-qPCR for detecting SARS-CoV-2 in clinical specimens.
Implementing this testing method would reduce the CCC's operational costs ten-fold when accounting for all consumables (data not shown), provide the ability to scale up our output further (using 384-well vs. current 96-well plate setup, Extended data: Figure S5 18 ), and decrease reporting turnaround time in comparison to our current method. We confirmed that the assay is highly specific for SARS-CoV-2 and displays lack of cross-reactivity with other respiratory viruses, including seasonal coronaviruses. Assay accuracy and robustness are indicated by an absence of false positives, and the introduction of a melt curve stage allows increased confidence in the identification and reporting of genuine positives. Samples suspected of bearing a false-positive result (e.g. late amplification with inconsistent melting temperatures) could be compared against results obtained by performing RT-qPCR. When performed with extracted RNA, the N gene RT-LAMP assay is less sensitive than clinically approved RT-qPCR methods, although determining the true false negative rate of our proposed method would require additional testing of clinical specimens, specifically those that display late stage amplification for SARS-CoV-2 RNA by RT-qPCR. Dilutions containing ≤ 10² copies of NIBSC SARS-CoV-2 standard were only sporadically detected in a limited set of experiments (unpublished observations), consistent with our clinical validation data. Lastly, preliminary experiments demonstrate that RT-LAMP can be performed directly without RNA extraction, by inactivating virus-containing dry swabs in a detergent-based solution. However, these experiments did not make use of true nasopharyngeal swabs, which are likely to contain material that can cause RNA degradation or inhibit the assay (unpublished observations).
Our methodology makes use of a real-time qPCR machine that measures DNA amplification using a SYBR based dye, allowing for real-time detection and standardised reporting. We did not test extensively if RT-LAMP could be assessed by the colorimetric indicator alone, although some of our data suggest that the colour change readout is concordant with the SYTO 9 dye results (Figure 1, Figure 3 and Figure 4). If RT-LAMP were coupled with a colorimetric read-out and the need for RNA extraction could be obviated, the result would be a testing modality that tremendously reduces the cost and time for SARS-CoV-2 diagnostics and allows its application at point-of-care and in remote areas where sophisticated testing infrastructures currently do not exist.
In the initial months of the pandemic, preprints were released that highlighted alternative strategies apart from RT-qPCR for SARS-CoV-2 detection [10][11][12][13][14][15][16][17] , although robust validation of these techniques using bona fide COVID-19 positive clinical specimens was sometimes lacking. We were able to validate that RT-LAMP could be a specific, reliable, and inexpensive assay, albeit less sensitive, with clinical confirmation by cross comparison with our newly established RT-qPCR pipeline. We also include a method of automatically excluding concerns of false positive amplicons in calling results, which was absent from previous studies. Measures including separating RNA isolation, RT-LAMP reagent preparation, and amplification analysis areas must be in place to avoid contamination of reagents 20 . Even though RT-LAMP has the advantage of having a faster amplification time and utilising alternative materials, the assay was ultimately not adopted by the institute for several reasons: 1) the supply chain for RT-qPCR reagents was restored, 2) our pipeline was analysing samples from a variety of settings, ranging from care home facility personnel, to patients and staff from COVID-19 free hospitals and surgeries, to employees in workplace surveillance programs, each with different requirements for sensitivity of detection, and 3) the demand for alternative testing strategies had diminished from an institutional perspective by the time governance and clinical validation had finally been established. Regarding the diminished sensitivity of the assay in comparison to RT-qPCR, strategies have been proposed to rectify this issue, such as the addition of GuHCl and the inclusion of multiple primer sets targeting other sections of the genome 21 .
However, other methods for detecting SARS-CoV-2 infection, such as lateral flow antigen tests, have known reductions in sensitivity compared to RT-qPCR or even RT-LAMP, but are acceptable as a testing modality for specific situations 22 . RT-LAMP based assays for COVID-19 detection continue to evolve 23 and some of the issues that have hindered its widespread implementation may become inconsequential in future iterations.

Open Peer Review
Veterinary Diagnostic Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL, USA
I am fine with the changes made by the authors. Thanks.

Competing Interests:
No competing interests were disclosed.

Reviewer Expertise: Molecular virology and clinical diagnostics.
We confirm that we have read this submission and believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Version 1
Reviewer Through testing water and RNA elution buffer, they optimized the assay with reduced reaction time to minimize background noise. They tested different species of human coronaviruses, and other human viruses and showed no cross reaction with non-SARS-CoV-2 viruses. They also determined the LOD of the assay and performed clinical validation on 37 and 71 samples tested by other labs. To address the false positive of RT-LAMP, evaluation of melting temperature was introduced in the procedure, however, there is no approach taken to address false negative due to the relative lower sensitivity compared with realtime qPCR.

Major comments:
1. A conventional real-time PCR enables multiplex capability, how can the validated assay of running two separate PCRs for one sample improve the workflow? Based on the data (Figure 5A and 5B), it seems switching from real-time PCR to LAMP could potentially lose sensitivity, as RT-LAMP missed weak positives. In addition, the incubation time for 18S is shorter than that for SARS-CoV-2, which means not same procedure used and 18S may not serve well as an internal control. The abundance of 18S (Figure 1D) may not reflect the presence of any inhibition in the test. Please clarify.

2. As two sets of N gene primers in the reference 10 (Zhang et al.), this study uses one set of them, have the authors tested both of them? While the N gene is highly conserved among different coronaviruses, what is the sensitivity/reliability of using one set of primers? Other publications have noted the use of multiple primer sets to increase sensitivity of the RT-LAMP assay and especially with the increasing discovery of variants of SARS-CoV, it would be crucial to include a combination of target genes.

3. In the pages 3 and 9, the RT-LAMP assay procedure, 3 µL of the RNA sample is used, however, in page 8 of the manuscript file, and pages 3 and 5 of the Supplementary Method file, 4.5 µL of the RNA was used. It is kind of confused. What correct amount is used in the procedure?

4. In the Introduction section, they mentioned that the LAMP assay alleviates the need for a thermocycler qPCR system, as the processing time on machine is less 30 minutes. The sentence needs to be rewritten. Instead using a dry bath in the Reference 10, this study still uses thermocycler. qPCR using the new fast chemistry (like TaqCheck) on QuantStudio, only takes 43 minutes on PCR machine. Therefore, the developed procedure may not help much to alleviate the need for thermocyclers.

5. This study only has a very low number (37 and 71, a total of 108) of the samples used for clinical validation of RT-LAMP, compared with their large testing volume completed by RT-qPCR. This part needs including much more samples for the clinical validation.

6. On numerous occasions, the authors refer to the RNA extraction through the CCC pipeline. There is missing extraction method in the materials and methods section.

Minor comments:
1. In Page 3, the authors stated they provided the preliminary data about running RT-LAMP on samples without RNA extraction, where are these data?

2. In the page 16 of Supplementary Method file, result interpretation table did not include possibility of real SARS-CoV-2 positive, but Internal control negative. Please clarify it.
It is confused that Figure 1B    For colorimetric read-out results, inconsistent data format was presented as Figure 1B was tubes containing the whole reaction, however, Figure 3B and 3C, and Figure 4 B were bottom part of tube, plate, or others? These need clearly indication in the figure legend.
6. Figure 4, 1% Triton-X 100 was shown. It is unclear which step Triton-X 100 was used and what is the purpose of using it.

If any results are presented, are all the source data underlying the results available to ensure full reproducibility? Yes
Are the conclusions about the method and its performance adequately supported by the findings presented in the article? Yes

A conventional real-time PCR enables multiplex capability, how can the validated assay of running two separate PCRs for one sample improve the workflow? Based on the data (Figure 5A and 5B), it seems switching from real-time PCR to LAMP could potentially lose sensitivity, as RT-LAMP missed weak positives. In addition, the incubation time for 18S is shorter than that for SARS-CoV-2, which means not same procedure used and 18S may not serve well as an internal control. The abundance of 18S (Figure 1D) may not reflect the presence of any inhibition in the test. Please clarify.
We have modified the manuscript to include further discussion of the advantages and disadvantages of the outlined RT-LAMP assay in comparison to conventional RT-qPCR.
Switching from RT-PCR to RT-LAMP would result in a loss in sensitivity. However, the loss in sensitivity is less than what is experienced for currently approved antigen tests and the assay provides other advantages. Supplies of RT-LAMP reagents, especially in the beginning of the pandemic, were more readily available, easier to acquire and markedly less expensive than RT-PCR reagents due to lower demand. The assay can also be performed without a thermocycler if need be.

In terms of the workflow, the assay can be run in either 96- or 384-well plate format with a thermocycler, and both the SARS-CoV-2 and internal control reactions can be run on the same plate for the same amount of time, as amplification results are only assessed at a defined time point even if data are collected beyond that time point (e.g. N-gene assay, amplification before 25 minutes; 18S rRNA assay, amplification before 20 minutes).
Two sets of N gene primers are described in reference 10 (Zhang et al.), but this study uses only one of them; have the authors tested both? While the N gene is highly conserved among different coronaviruses, what is the sensitivity/reliability of using one set of primers? Other publications have noted the use of multiple primer sets to increase sensitivity of the RT-LAMP assay and especially with the increasing discovery of variants of SARS-CoV, it would be crucial to include a combination of target genes.
Figure 1 with an explanation of our rationale for moving forward with the particular N gene primers that we used throughout the rest of our studies.

In regards to current questions of reliability in light of new variants, Reviewer #1 had similar concerns and we have now included analyses that compare the mutational rate in the specific sequences that these primers target (see Supplemental Figure 2).
As mentioned with Reviewer #1, we have now included discussion of various strategies, including the use of multiple primer pairs, to increase the sensitivity of the RT-LAMP assay in the revised manuscript.

2.
On pages 3 and 9, the RT-LAMP assay procedure uses 3 µL of the RNA sample; however, on page 8 of the manuscript file and pages 3 and 5 of the Supplementary Method file, 4.5 µL of RNA was used. This is confusing. What is the correct amount used in the procedure?

3.
We apologise for the confusion: the standard RNA input amount is 4.5 µL, however some experiments were performed using 3 µL of input due to limitations in sample availability. Experiments that used 3 µL as opposed to the standard amount have been denoted in the corresponding figure legends and text.
In the Introduction section, they mentioned that the LAMP assay alleviates the need for a thermocycler qPCR system, as the processing time on the machine is less than 30 minutes. The sentence needs to be rewritten. Instead of using a dry bath as in Reference 10, this study still uses a thermocycler. qPCR using the new fast chemistry (like TaqCheck) on QuantStudio only takes 43 minutes on the PCR machine. Therefore, the developed procedure may not help much to alleviate the need for thermocyclers.
We have modified the specific sentence mentioned from the introduction and have expanded our discussion in the revised text of the advantages and disadvantages of RT-LAMP. There are still instances where access to a thermocycler may not be readily available and a procedure that can be carried out without one if necessary (RT-LAMP) would be a valued asset. We have mentioned this fact in our revised discussion.

4.
This study used only a very low number of samples (37 and 71, a total of 108) for clinical validation of RT-LAMP, compared with the large testing volume completed by RT-qPCR. Many more samples need to be included in this part for the clinical validation.
We acknowledge that increasing the number of clinical samples tested by RT-LAMP would allow more robust clinical validation of the assay's performance. However, as seen in our response to Reviewer #1, point 7, the assay was ultimately not adopted for use at the institute. We have included discussion of this particular issue in the revised manuscript.

5.
On numerous occasions, the authors refer to the RNA extraction through the CCC pipeline. The extraction method is missing from the materials and methods section.
We apologise for not including this information in the original submission and have amended the methods accordingly.

6.
Minor comments: In Page 3, the authors stated they provided the preliminary data about running RT-LAMP on samples without RNA extraction, where are these data?
We apologise that these data were not included in the original submission. They now appear in new Supplemental Figure 6.

1.
In the page 16 of Supplementary Method file, result interpretation table did not include possibility of real SARS-CoV-2 positive, but Internal control negative. Please clarify it.

2.
Please see our response to Reviewer #1, point 12, first paragraph.
It is confused that Figure 1B
As the primary readout in Figure 1C-E was Syto9 amplification, we do not have the corresponding images demonstrating the colorimetric conversions.
The Ct values from Figures 1C and 1D are displayed in Figure 1E (grey coloured dots, refer to left y-axis) as indicated in the figure legend.
3.
We have removed the pink dotted line, which referred to the threshold set for the assay, from the indicated figures as its depiction is not necessary.
Figure 3, C was not labeled/marked for 18S graph. In addition, in the colorimetric run for the RT-LAMP 18S rRNA, the 5th well looks to be pink/red and not yellow, indicating that one sample did not give signal.

4.
We thank the reviewer for their careful eye and have corrected the corresponding figure legend.
As for the 5th well that appears lighter in hue compared to other samples, we have explained the discrepancy in our response to Reviewer #1 (see response 17, paragraph 3 and response 19).

5.
For colorimetric read-out results, inconsistent data format was presented as Figure 1B was tubes containing the whole reaction, however, Figure 3B and 3C, and Figure 4B were bottom part of tube, plate, or others? These need clear indication in the figure legend.
We have amended the indicated figure legends as suggested to clarify how our colorimetric data are presented.

In the ongoing pandemic, alternative methods to detect SARS-CoV-2 infections at a scalable, faster, and cheaper level, are still in need. In this manuscript, Buck et al. report on the use of the NEB warmstart enzyme preparation for colorimetric reverse transcription loop-mediated isothermal amplification (RT-LAMP) in combination with primers published by NEB (Zhang et al.) for the amplification and detection of SARS-CoV-2 viral RNA in clinical samples. For detection, a colorimetric implementation of the RT-LAMP assay (based on a pH shift and color change of an indicator dye upon amplification of nucleic acids) along with a fluorescent dye for DNA detection was used. The assay was tested for COVID-19 diagnostics and the report includes data on sensitivity, specificity as well as a positive sample control reaction. All established procedures were set up in agreement with the requirements for patient diagnostics, including SOPs for all relevant procedures. The authors furthermore claim that the cost of the assays would be a fraction of the ones used for RT-qPCR.

6.
Major points: The authors use colorimetric LAMP reagents but then add a green fluorescent dye for the detection of the fluorescence signal using a light cycler throughout their experiments, with the exception of the few figure panels where they show color images of the reaction tubes? Is this correct? Importantly, the authors advertise the speed of the RT-LAMP assay in the abstract (25 min). This is somewhat misleading to non-expert readers who might think that the total time of the procedure (from sample taking to result) takes only that long. But the time to isolate the RNA and to set up the assays also needs to be considered.
The preprint in its current form appeared online in July 2020, and the authors claim clinical validation of the assay, higher speed, and almost similar sensitivity to qPCR at much lower costs. Given these advantages, may I ask whether the assay has been used for SARS-CoV-2 diagnostics? If yes, for how many samples? It would be essential to present such data including statistics from validation of LAMP positives by qPCR. On the other hand, if the assay was not used for diagnostics, it would be good to carefully explain the reasons behind it; otherwise, this may raise concerns about the validity of some of the data and acclaimed features of the assay.
Average RT-qPCR assays using well-established primer sets meanwhile reach sensitivities where 10 molecules of the viral genome are detected in 95% of the cases, which for most RT-qPCR kits corresponds to a CT approx. = 37. This can be considered as standard. Is this given for the used PCR assay (CRICK BGI)?
The manuscript does not elaborate on the used RNA isolation procedures. This should be included in the methods section. Different RNA extraction protocols exist, either using silica matrix columns or magnetic beads. These protocols allow purification and simultaneous concentration of nucleic acids relative to the original specimen. For diagnostic analysis of nucleic acids in SWAB specimen, a concentration factor of 2 to 5 usually applies. This yields a LOD of approx. 500 or fewer viral RNA molecules per ml of patient specimen for an RT-qPCR that uses 10 μl input volume. The authors here claim a sensitivity of the RT-LAMP of 11000-22000 molecules per ml patient specimen using a SARS-CoV-2 standard (Figure 4), while they claim a sensitivity up to Ct=35 with patient samples (Figure 5a). A Ct=35 in an RT-qPCR assay corresponds roughly to 40 molecules per 10 μl purified RNA, while only 3 μl are used for the RT-LAMP, which would imply a sensitivity limit of approx. 15 molecules per reaction. This is inconsistent with the data shown in Figure 4 and with previous reports for the N-gene LAMP primer sets used here, which was reported to have a sensitivity in the range of 100 molecules per reaction (e.g. 1). A closer inspection of the data shown in Figure 5a indicates that all RT-LAMP positive samples have CT<<32 (Figure 4a), but not enough samples in the CT range of 25-35 are shown to confidently estimate a CT value that corresponds to the LOD. These issues should be clarified and, if necessary, faulty standard calibrations or additional means to calibrate the sensitivity should be used.
Additional points: The authors mention twice preliminary data on an RNA-extraction free procedure. Where are these data? I suggest either adding the data or removing the corresponding sentences in the introduction and discussion.
The number of cells in a swab can be affected by many factors. Even in the absence of human cells/detectable housekeeping genomes, viral RNA can be or should be present in a positive sample in the form of cell-free, viral particles. How do the authors suggest proceeding if 18S rRNA is not detected in a sample? How does the detection of the housekeeping gene affect the analysis of the clinical samples (e.g. in Figure 5)? For example, in Figure 3B. The fifth sample appears to be 18S rRNA negative, as the reaction is still pink. How is such a result taken into account?
The citations are not exhaustive. Many manuscripts have been published meanwhile that make use of the colorimetric RT-LAMP assay for COVID-19 diagnostics. In addition, a description of the major differences or similarities to previously published work is missing.
The manuscript would benefit from a report on the frequency of mutations in the binding sequences of the primer set used, which could lead to false negatives. This can easily be done using online resources (https://www.gisaid.org or https://covidcg.org) and the great number of viral genomes sequenced in the UK.
The authors need to report on measures to avoid contamination by LAMP products. This is an important point that may be less relevant for a diagnostic setting (since people are aware of this issue), but in an academic environment, researchers may need a reminder, which could be given for example by pointing out relevant literature 4 or by indicating strategies to eliminate those in the first place (UNG/UDG), or both.
Figure 4 D: When determining the RT-LAMP assay precision, why extracting the RNA 5 times independently and then running the assay? I expect higher variations from the RNA extraction than from the RT-LAMP assay itself, which should be highly stable and reproducible on purified samples unless the sample contains only very few amounts of target RNA and the detection can become stochastic.
The manuscript does not elaborate on the used RNA isolation procedures. This should be included in the methods section. Different RNA extraction protocols exist, either using silica matrix columns or magnetic beads. These protocols allow purification and simultaneous concentration of nucleic acids relative to the original specimen.
We apologise for not including the information in our original submission as the complete RNA isolation/purification procedure utilised by the institute's RT-qPCR pipeline had already been published in its entirety. However, we have now included a brief description in the revised methods.
For diagnostic analysis of nucleic acids in SWAB specimen, a concentration factor of 2 to 5 usually applies. This yields a LOD of approx. 500 or fewer viral RNA molecules per ml of patient specimen for an RT-qPCR that uses 10 μl input volume. The authors here claim a sensitivity of the RT-LAMP of 11000-22000 molecules per ml patient specimen using a SARS-CoV-2 standard (Figure 4), while they claim a sensitivity up to Ct=35 with patient samples (Figure 5a). A Ct=35 in an RT-qPCR assay corresponds roughly to 40 molecules per 10 μl purified RNA, while only 3 μl are used for the RT-LAMP, which would imply a sensitivity limit of approx. 15 molecules per reaction. This is inconsistent with the data shown in Figure 4 and with previous reports for the N-gene LAMP primer sets used here, which was reported to have a sensitivity in the range of 100 molecules per reaction (e.g. 1). A closer inspection of the data shown in Figure 5a indicates that all RT-LAMP positive samples have CT<<32 (Figure 4a), but not enough samples in the CT range of 25 -35 are shown to confidently estimate a CT value that corresponds to the LOD. These issues should be clarified and, if necessary, faulty standard calibrations or additional means to calibrate the sensitivity should be used.
We apologise for the confusion and have clarified our discussion of the assay's LOD, including the acknowledgement that a true determination of the LOD/false negative rate would require additional clinical specimen testing. We thank the reviewer also for their careful analysis of Figure 5A. Samples that had Ct > 33, not 35 (a typo), were not consistently detected, which would be more consistent with the existing LOD determinations.
We have also modified the text to more explicitly state that the assay is less sensitive than RT-qPCR.
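The copy-number arithmetic in the reviewer's comment above can be checked with a minimal Python sketch. The function names are hypothetical, and the inputs are the reviewer's approximate figures (about 40 molecules per 10 μl of purified RNA at Ct=35, a per-reaction LOD of 10 molecules, and a concentration factor of 2 to 5), not values measured in this study.

```python
def molecules_per_reaction(molecules_in_reference, reference_volume_ul, input_volume_ul):
    """Scale a molecule count known for one eluate volume to another input volume."""
    return molecules_in_reference / reference_volume_ul * input_volume_ul


def specimen_lod_per_ml(lod_molecules_per_reaction, input_volume_ul, concentration_factor):
    """Convert a per-reaction LOD into molecules per ml of original specimen.

    concentration_factor is how much the extraction concentrates nucleic acids
    relative to the original specimen (assumed 2 to 5 in the reviewer's comment).
    """
    per_ul_eluate = lod_molecules_per_reaction / input_volume_ul
    per_ml_eluate = per_ul_eluate * 1000
    return per_ml_eluate / concentration_factor


# Reviewer's assumption: Ct=35 is roughly 40 molecules per 10 ul of purified RNA.
print(molecules_per_reaction(40, 10, 3))    # 12.0 molecules with 3 ul input
print(molecules_per_reaction(40, 10, 4.5))  # 18.0 molecules with 4.5 ul input

# Reviewer's assumption: RT-qPCR detects 10 molecules per reaction with 10 ul input.
print(specimen_lod_per_ml(10, 10, 2))       # 500.0 molecules per ml of specimen
print(specimen_lod_per_ml(10, 10, 5))       # 200.0 molecules per ml of specimen
```

Under these assumptions, a 3 μl input at Ct=35 corresponds to roughly 12 molecules per reaction, in line with the reviewer's estimate of approximately 15.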

6.
The authors represent a consortium that has generated more than 252000 RT-qPCR results since the beginning of the pandemic. Obviously, knowing the exact sensitivity limit of the RT-LAMP assay and the data of the RT-qPCR would permit an estimation of the false-negative rate of the RT-LAMP assay in a much more realistic scenario as compared to the limited number of samples presented here.
We acknowledge that knowing the exact sensitivity and running the RT-LAMP assay in parallel against RT-qPCR for the same samples would allow a true estimation of the false-negative rate of the RT-LAMP assay. However, as stated above in response 3, the assay was ultimately not adopted and therefore, we can only speculate. We have now included some discussion of this particular issue in the revised text.

7.
The assay makes use of one primer set only for SARS-CoV-2 detection. Meanwhile, many more primer sets have been described, and a publication by NEB scientists 2 furthermore presented means to improve the sensitivity of the assay by adding GuHCL to the reaction buffer and by using multiple primer sets in combination. This could be discussed and the benefits of using better primers/conditions to be estimated.
In the first months of the pandemic during the initiation of our studies, we tested several LAMP primers specific for SARS-CoV-2 as outlined in Ref 1. We moved forward with our particular primers because they did not display aberrant non-specific amplification seen with the other primers. We have now mentioned this in the revised text and included these data in new Supplement Figure 1.
As the reviewer has suggested, we have expanded our discussion of possible ways to optimise the assay.

8.
Specificity: The authors allude that the RT-LAMP assay does not cross-react with other viruses. If the authors really want to use this experiment as a valid assay to demonstrate the sensitivity of the RT-LAMP assay for SARS-CoV-2, they should provide evidence that the used samples indeed contain viral nucleic acids in sufficient numbers (> 10000 molecules per ml) of the patient specimen.
Figure 3B and Supplement Figure 4.

9.
Another issue that needs to be addressed is the relatively well-established propensity of RT-LAMP to produce false-positive results by spurious amplification (e.g. see 3). The frequency of these may be small, i.e. below 1%, and hence such cases escaped detection in this work. As shown by the authors, melting curves allow to identify those, but upscaling and the use of the colorimetric RT-LAMP will certainly bear the risk of false positives. Here, validation of positives by RT-qPCR could be used.
We agree that samples suspected of being false-positives (e.g. late amplification) could be ruled out by validation with RT-qPCR and have included this suggestion in the text.

10.
Additional points: The authors mention twice preliminary data on an RNA-extraction free procedure. Where are these data? I suggest either adding the data or removing the corresponding sentences in the introduction and discussion.
We apologise for this oversight and have described the data now in the corresponding text for new Supplemental Figure 6.

1.
The number of cells in a swab can be affected by many factors. Even in the absence of human cells/detectable housekeeping genomes, viral RNA can be or should be present in a positive sample in the form of cell-free, viral particles. How do the authors suggest proceeding if 18S rRNA is not detected in a sample? How does the detection of the housekeeping gene affect the analysis of the clinical samples (e.g. in Figure 5)? For example, in Figure 3B. The fifth sample appears to be 18S rRNA negative, as the reaction is still pink. How is such a result taken into account?

2.
We acknowledge that SARS-CoV-2 could be present in samples as cell-free viral particles that may not be associated with human cells. However, a positive signal could arise either from a genuine nasopharyngeal swab or from a sample that was inadvertently contaminated with SARS-CoV-2 during processing. To avoid confusing these two scenarios during diagnosis, we explicitly state in the supplemental methods that a SARS-CoV-2 positive result would only be made if there was sufficient amplification in the internal control.
The analysis of the housekeeping gene should not affect the analysis of SARS-CoV-2 as they are carried out in separate reactions (excluding any possible errors associated with loading of the RNA input into the reaction).
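As an illustration of this interpretation rule, a minimal Python sketch is given below. The time cutoffs (25 minutes for the N gene, 20 minutes for 18S rRNA) are those stated elsewhere in these responses. The function names and the "invalid" label for a failed internal control are assumptions made for the example, not the wording of the supplemental result interpretation table.

```python
N_GENE_CUTOFF_MIN = 25.0    # N-gene assay: amplification must occur before 25 minutes
RRNA_18S_CUTOFF_MIN = 20.0  # 18S rRNA assay: amplification must occur before 20 minutes


def amplified(time_to_threshold_min, cutoff_min):
    """Return True if amplification was detected before the cutoff.

    time_to_threshold_min is None when no amplification was observed.
    """
    return time_to_threshold_min is not None and time_to_threshold_min < cutoff_min


def interpret(n_gene_time_min, rrna_18s_time_min):
    """Combine the two separate reactions into one call for a sample."""
    control_ok = amplified(rrna_18s_time_min, RRNA_18S_CUTOFF_MIN)
    if not control_ok:
        # Assumption: without internal-control amplification no result is
        # called, even if the N-gene reaction amplified.
        return "invalid"
    if amplified(n_gene_time_min, N_GENE_CUTOFF_MIN):
        return "positive"
    return "negative"


print(interpret(15.0, 12.0))  # positive
print(interpret(None, 18.0))  # negative
print(interpret(15.0, None))  # invalid
```

The third call corresponds to the case raised by the reviewers: an N-gene signal without internal-control amplification is not reported as positive.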
The sample in Figure 3B was 18S rRNA positive and was one of the last samples to amplify (around 18 minutes) as seen in the corresponding amplification plot (now denoted by asterisk). One of the reasons we utilised a thermocycler for detection of Syto9 incorporation was to more accurately quantitate amplification during the assay because as the reviewer points out, sometimes a change in sample colour is not always obvious to the naked eye.
The citations are not exhaustive. Many manuscripts have been published meanwhile that make use of the colorimetric RT-LAMP assay for COVID-19 diagnostics. In addition, a description of the major differences or similarities to previously published work is missing.
We agree that many studies since our original submissions have been published and now reference a recent review that compares and contrasts various RT-LAMP assays for COVID-19 diagnostics that have been developed. We have revised our discussion to include a comparison of our assay against previous work.

3.
The manuscript would benefit from a report on the frequency of mutations in the binding sequences of the primer set used, which could lead to false negatives. This can easily be done using online resources (https://www.gisaid.org or https://covidcg.org) and the great number of viral genomes sequenced in the UK.
We have undertaken an analysis of the frequency of mutations present in the binding sequences of the primers we utilised as the reviewers have suggested. These new data are included in Supplemental Figure 2 and the corresponding revised text.
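The kind of calculation involved can be sketched in Python as follows. This is an illustrative example only, not the analysis pipeline used for Supplemental Figure 2: the function name is hypothetical, the sequences are invented toy data rather than genomes from GISAID or covidcg.org, and the sketch assumes the genomes are already aligned to the same coordinates.

```python
def mismatch_frequency(aligned_genomes, start, primer_site):
    """Fraction of aligned genomes that differ from the reference primer-binding site.

    aligned_genomes: sequences sharing one coordinate system (assumed pre-aligned).
    start: 0-based position of the primer-binding site in the alignment.
    primer_site: reference sequence of the binding site.
    """
    if not aligned_genomes:
        raise ValueError("at least one genome is required")
    end = start + len(primer_site)
    reference = primer_site.upper()
    mismatched = sum(
        1 for genome in aligned_genomes if genome[start:end].upper() != reference
    )
    return mismatched / len(aligned_genomes)


# Toy data: four invented 10-nt "genomes"; the third carries one substitution.
genomes = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGTACGTAC"]
print(mismatch_frequency(genomes, 2, "GTAC"))  # 0.25
```

A real analysis would repeat this for every primer in the set and would need to handle alignment gaps and ambiguous bases, which the sketch ignores.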

4.
The authors need to report on measures to avoid contamination by LAMP products. This is an important point that may be less relevant for a diagnostic setting (since people are aware of this issue), but in an academic environment, researchers may need a reminder, which could be given for example by pointing out relevant literature 4 or by indicating strategies to eliminate those in the first place (UNG/UDG), or both.

5.
We have mentioned in the revised discussion the need and strategies to avoid contamination by LAMP products, including the relevant literature that the reviewer has brought our attention to.
Specific comments to the Figures: Figure 1: Suggest moving primer sequences to a separate table.
For increased visibility, we have decided to continue highlighting the primers utilised in Figure 1.
1. Figure 2: Panels D and E are redundant with Panels A and B as the threshold is indicated by the grey background (just add an explanation of the threshold to the legend). Panel C is also redundant and difficult to read, the information is already in A and B. Give an explanation of the false negatives on the 18S control, which already give fluorescent signals at 20 mins into the LAMP reaction. The primer design seems suboptimal as spurious amplification usually occurs at later time points. At what time point did the authors analyze the colorimetric read-out for 18S, e.g. in Fig 3B. Please define up to when is the reaction reliable.
We have removed the grey background in Figures 2A and 2B and removed Figure 2C.
We acknowledge that the 18S rRNA specific primers we adopted from a previous study were not exhaustively assessed in our own studies and may not be optimally designed, which may account for the exhibition of spurious amplification. However, the inclusion of the melting curve analyses should help distinguish these false positives from true positives.
The colorimetric reactions were visualised either after 25 minutes of amplification (N-gene SARS-CoV-2) or 20 minutes of amplification (18S rRNA). Although the images are included alongside the amplification plots (corresponding to Syto9 dye incorporation), positive amplification by Syto9 read-out was more sensitive than the subjective colour conversion read-out as explained in response 12.

2.
Suggest moving the melting curves from S1A to the main figures. When calculating sensitivity thresholds-does the melting curve change anything? Does it improve the sensitivity?
We have done as the reviewer has suggested and modified Figure 2 to include the data featured in Figure S1A.
The inclusion of the melting curve analyses does not improve the sensitivity of the reaction (e.g. carrying out amplification for longer periods in the hopes of amplifying low copy number positives), but merely aids in distinguishing true amplification from false amplification. Samples that have copy numbers below the LOD do not appear to amplify with longer incubation times (data not shown).

3.
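How a melting curve can separate true from spurious amplification may be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, the 1 °C tolerance, and the example temperatures and fluorescence values are all hypothetical, and the expected melting temperature of the specific product would have to be determined empirically for each primer set.

```python
def melt_peak_temperature(temperatures_c, fluorescence):
    """Return the temperature at which fluorescence drops fastest (peak of -dF/dT).

    Returns None if fewer than two readings are supplied.
    """
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temperatures_c)):
        dt = temperatures_c[i] - temperatures_c[i - 1]
        slope = -(fluorescence[i] - fluorescence[i - 1]) / dt
        if slope > best_slope:
            best_slope = slope
            # Report the midpoint of the interval with the steepest drop.
            best_temp = (temperatures_c[i] + temperatures_c[i - 1]) / 2
    return best_temp


def is_specific_product(peak_temp_c, expected_tm_c, tolerance_c=1.0):
    """Treat a product as specific if its melt peak lies near the expected Tm."""
    return abs(peak_temp_c - expected_tm_c) <= tolerance_c


# Hypothetical readings: the steepest drop lies between 84 and 86 degrees C.
temps = [80, 82, 84, 86, 88]
signal = [100, 98, 95, 40, 38]
peak = melt_peak_temperature(temps, signal)
print(peak)                             # 85.0
print(is_specific_product(peak, 85.0))  # True
print(is_specific_product(peak, 80.0))  # False
```

As noted above, such a check only classifies products that did amplify; it does not recover samples below the LOD.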
Figure 3B, right panel: All different samples are just blue lines. Like this, it is impossible for the reader to understand which line corresponds to which sample. As the coloration is somewhat different between the samples as well as the time points where the fluorescent signal becomes positive, it would be good to see if some sort of correlation exists or not.
For the one sample that had late 18S rRNA amplification and concurrently, suboptimal colour change, we now denote this particular amplification curve and plate well with an asterisk in Figure 3B to assist in the visualisation of the correlation between the two analyses and cite this in the figure legend.
The reviewer is correct that the rationale in using Triton-X treated RNA would be applicable for an extraction-free setting. We have included new data in Supplementary Figure 6 that explains the reasoning for having performed the experiment depicted in Figure 4C.

5.
Figure 4 D: When determining the RT-LAMP assay precision, why extracting the RNA 5 times independently and then running the assay? I expect higher variations from the RNA extraction than from the RT-LAMP assay itself, which should be highly stable and reproducible on purified samples unless the sample contains only very few amounts of target RNA and the detection can become stochastic.
One of the criteria that needed to be met for approving the assay for clinical diagnostics (in collaboration with the accredited clinical diagnostic laboratory and hospital partners) was to ensure that, for any one sample going through the entire pipeline (from RNA extraction of the original swab sample to final read-out), the result would be highly reproducible and precise. Therefore, we performed the exact same assessment our institute used to validate its use of a CE marked RT-qPCR assay, which was to process a nasopharyngeal sample that had an accompanying clinical diagnosis for COVID-19 five times from start to finish (lysate to result).

6.
Competing Interests: No competing interests were disclosed.