Epidemiology of COVID-19 infections on routine polymerase chain reaction (PCR) and serology testing in Coastal Kenya

Background: There are limited studies in Africa describing the epidemiology, clinical characteristics and serostatus of individuals tested for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We tested routine samples from the Coastal part of Kenya between 17th March 2020 and 30th June 2021. Methods: SARS-CoV-2 infections were identified using reverse transcription polymerase chain reaction (RT-PCR), and clinical surveillance data collected at the point of sample collection were used to classify individuals as either symptomatic or asymptomatic. IgG antibodies were measured in serum samples using a well-validated in-house enzyme-linked immunosorbent assay (ELISA). Results: Mombasa accounted for 56.2% of all the 99,694 nasopharyngeal/oropharyngeal swabs tested, and males constituted the majority tested (73.4%). A total of 7,737 (7.7%) individuals were SARS-CoV-2 positive by RT-PCR. The majority (92.4%) of the RT-PCR positive individuals were asymptomatic. Testing was dominated by mass screening and travellers, and even at health facility level 91.6% of tests were from individuals without symptoms. Of the 97,124 tests from asymptomatic individuals, 7,149 (7%) were positive, and of the 2,568 symptomatic individuals, 588 (23%) were positive. In total, 2,458 serum samples were submitted with paired nasopharyngeal/oropharyngeal samples; 45% of the RT-PCR positive samples and 20% of the RT-PCR negative samples were paired with positive serum samples. Symptomatic individuals had significantly higher antibody levels than asymptomatic individuals and became RT-PCR negative on repeat testing earlier than asymptomatic individuals. Conclusions: The majority of SARS-CoV-2 infections identified by routine testing in Coastal Kenya were asymptomatic. This reflects the testing practice of health services in Kenya, but also implies that asymptomatic infection is very common in the population. Symptomatic infection may be less common, or it may be that individuals do not present for testing when they have symptoms.


Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection began in Wuhan, China in December 2019 and rapidly spread around the world, causing a disease called coronavirus disease 2019 (COVID-19) (Cucinotta & Vanelli, 2020). Patients with COVID-19 present with a wide range of symptoms such as fever, cough, shortness of breath, headache, body aches, fatigue, loss of taste and smell, conjunctivitis, and diarrhea (Guan et al., 2020). The World Health Organization (WHO) declared the disease a pandemic on 11th March 2020 and two days later, on 13th March 2020, the first case of COVID-19 was detected in Nairobi, Kenya (MoH, 2020). On the Kenyan Coast, the first case was detected in Mombasa county on 21st March 2020. By 30th June 2021, the total number of confirmed cases in Kenya was 95,843, with 1,655 deaths (WHO, 2021). Between 2020 and June 2021, Kenya experienced three waves: the first in July-August 2020; the second shortly after, in November-December 2020, attributable to the epidemic spreading to new socio-economic groupings; and the third in April-May 2021, attributable to variants of concern (Brand et al., 2021).
The KEMRI-Wellcome Trust Research Programme (KWTRP) is a government-designated SARS-CoV-2 testing laboratory for the Coastal region of Kenya, which comprises six counties: Mombasa, Kilifi, Lamu, Taita-Taveta, Tana River and Kwale. The KWTRP laboratory received nasopharyngeal/oropharyngeal (NP/OP) samples from several hospitals and clinics for acute testing and screening of contacts and, occasionally, serum and plasma samples for serology.
There is no published detailed description of the epidemiology and clinical characteristics of SARS-CoV-2 infections in Kenya. Here we report the epidemiology and symptoms for individuals in Coastal Kenya who were tested at KWTRP in Kilifi. In a subset of the samples, based on sera availability, we also describe the temporal seroprevalence of anti-SARS-CoV-2 IgG and examine the impact of age, sex, viral load and infection status on the time to clear infections.

Study population and data collection
We conducted a retrospective analysis of surveillance data obtained between 17th March 2020 and 30th June 2021 from the Coastal part of Kenya, including Mombasa, Taita-Taveta, Kilifi, Kwale, Lamu and Tana River. Trained public health rapid response team members and healthcare personnel investigated suspected COVID-19 cases based on contact tracing and history of travel to endemic regions, completed a detailed case investigation form (CIF) and collected a nasopharyngeal and oropharyngeal swab in viral transport media. Collected specimens were triple-packed and transported in cold chain (4°C) to KWTRP where the RT-PCR testing was performed. The data from all the CIFs were entered into Research Electronic Data Capture (REDCap version 10.5.1, RRID:SCR_003445), a web application at KWTRP. COVID-19 test results were relayed daily to the Ministry of Health following their guidelines. The CIF can be found as Extended data (Nyagwange et al., 2022b).

Data cleaning
The COVID-19 surveillance dataset from 17th March 2020 up to 30th June 2021, covering the three epidemic waves, was systematically curated to ensure uniform entries for the categorical sex variable, ages recorded in years for all participants, and consistent spelling of the county of residence. For each participant, the timepoint used for analysis was the first positive test. An asymptomatic episode was defined as being positive (by RT-PCR) for COVID-19 at the time of sampling, without any of the 14 symptoms: fever (temperature ≥ 37.5°C), cough, general weakness, history of fever, headache, sore throat, shortness of breath, runny nose, chest pain, nausea, muscular pain, diarrhea, irritation, joint pain and abdominal pain. A symptomatic episode was defined as being positive for COVID-19 at the time of sampling, with at least one of the above-mentioned symptoms.
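The case definitions above can be illustrated with a short Python sketch. This is not the authors' curation code (their analysis was run in R and is provided as Extended data). The field names (`pcr_positive`, `date`, and the symptom keys) are hypothetical, and symptoms are assumed to be recorded as simple yes/no flags on each record.

```python
# Hypothetical field names; the real case investigation form may label these differently.
SYMPTOMS = (
    "fever", "cough", "general_weakness", "history_of_fever", "headache",
    "sore_throat", "shortness_of_breath", "runny_nose", "chest_pain",
    "nausea", "muscular_pain", "diarrhea", "irritation", "joint_pain",
    "abdominal_pain",
)


def first_positive(tests):
    """Return a participant's earliest RT-PCR positive test, or None if there is none.

    Dates are assumed to be ISO strings (YYYY-MM-DD), which sort chronologically.
    """
    positives = [t for t in tests if t.get("pcr_positive")]
    return min(positives, key=lambda t: t["date"]) if positives else None


def classify_episode(record):
    """Label an RT-PCR positive record by the symptoms reported at sampling.

    Returns None for RT-PCR negative records, which are not episodes.
    """
    if not record.get("pcr_positive", False):
        return None
    has_symptom = any(record.get(symptom, False) for symptom in SYMPTOMS)
    return "symptomatic" if has_symptom else "asymptomatic"
```

Under these assumptions, a participant with several tests is analysed at `first_positive(tests)`, and that record is then passed to `classify_episode`.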

RT-PCR testing
SARS-CoV-2 RNA was extracted from NP/OP samples using available commercial kits: QIAamp Viral RNA Mini Kit (Qiagen, Catalog Number: 52906), DAAN kit (DAAN Gene Co., Ltd of Sun Yat-sen University, Catalog Number: DA-0591) and SpinX (TCG Pharma Inc., Catalog Number: PPT-TN04) according to the manufacturer's protocol. To identify SARS-CoV-2 infections, RT-PCR amplifications were done using ABI real-time system, model 7500 (Applied Biosystems, Warrington, United Kingdom) with a number of kits as described before (Said et al., 2020). The primer-probe sequences and RT-PCR cycling conditions have been described elsewhere and the primers and probes generally target RNA dependent RNA polymerase (RdRp), envelope (E), ORF1ab and nucleocapsid (N) regions (Said et al., 2020). Two negative and positive controls were included in each run for quality control.

Spike antigen production and ELISA assay
We have recently described the production and purification of full-length SARS-CoV-2 spike protein in a mammalian expression system (Uyoga et al., 2021). The IgG antibodies were measured using a previously described ELISA developed and validated at KWTRP, Kilifi, Kenya with sensitivity of 92.7% (95% CI 87.9-96.1%) and specificity of 99.0% (95% CI 98.1-99.5%) (Uyoga et al., 2021). The assay has also been validated in a WHO-sponsored multi-laboratory study of SARS-CoV-2 antibody assays, in which it performed as well as the assays of the other international laboratories (Adetifa et al., 2021), and it performed as well as a WHO-endorsed commercial ELISA (Nyagwange et al., 2022a). The sample results were expressed as the optical density (OD) ratio, which is the ratio of the test OD to the OD of the plate negative control; an OD ratio greater than two was considered seropositive for SARS-CoV-2 IgG (Uyoga et al., 2021).
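The OD ratio and the seropositivity rule described above amount to a simple calculation, sketched here in Python for illustration. The function names are hypothetical and this is not the laboratory's own code. The only values taken from the text are the definition of the ratio (test OD divided by the plate negative control OD) and the cut-off of two.

```python
SEROPOSITIVE_CUTOFF = 2.0  # from the text: OD ratio greater than two is seropositive


def od_ratio(test_od, negative_control_od):
    """Ratio of the sample OD to the OD of the plate negative control."""
    if negative_control_od <= 0:
        raise ValueError("negative control OD must be positive")
    return test_od / negative_control_od


def is_seropositive(test_od, negative_control_od, cutoff=SEROPOSITIVE_CUTOFF):
    """Apply the strict 'greater than' rule; a ratio exactly at the cut-off is negative."""
    return od_ratio(test_od, negative_control_od) > cutoff
```

For example, a sample OD of 0.9 against a negative control OD of 0.3 gives a ratio of about 3 and would be called seropositive, whereas a sample OD of 0.5 against a control of 0.25 gives a ratio of exactly 2 and would not.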

Statistical analysis
IgG antibody responses were compared between asymptomatic and symptomatic groups using the Wilcoxon test. Participants were stratified into two groups based on age (≤20 and >20 years) as there was a skew towards participants of younger age groups pre-COVID-19. Temporal changes in Ct values and IgG responses were examined using the Wilcoxon and Kruskal-Wallis tests. Differences in Ct value by clinical status (asymptomatic vs. symptomatic) and by age group were compared using the Wilcoxon and Kruskal-Wallis tests. Survival analysis was used to compare (i) the time to the first negative COVID-19 test between asymptomatic and symptomatic patients, and (ii) the time to the first negative COVID-19 test in asymptomatic individuals, stratified by age (>0-20, >20-50 and >50 age groups). Patients who did not complete the follow-up were censored. All statistical analyses were conducted in R v4.0.2 (R Core Team, 2013, RRID:SCR_001905), and all plots were generated using the packages ggplot2 v3.3.2 (Wickham, 2016) and ggpubr v0.4.0. Survival analysis was performed using the survival package v3.2-7 (Therneau & Grambsch, 2000; Therneau, 2020) and Kaplan-Meier plots were generated using the survminer package v0.4.8 (Kassambara et al., 2018). The survival curves were compared using the log-rank test. A linear regression model was used to determine whether age, sex, infection status (i.e., asymptomatic or symptomatic) and viral load were associated with a change in the OD ratio. The analysis code can be found as Extended data (Nyagwange et al., 2022b).
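For readers unfamiliar with the time-to-clearance analysis, the following is a minimal, self-contained Python sketch of a Kaplan-Meier estimate and a two-group log-rank test. It is illustrative only: the authors used the R survival and survminer packages, and the function names, the input layout (follow-up time in days, with an event flag of 1 for an observed first negative test and 0 for censoring) and the pure-Python implementation are assumptions made here.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, probability of still being RT-PCR positive)] at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surviving = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0   # first negative tests observed at time t
        n_leaving = 0  # everyone (events and censored) leaving the risk set at t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surviving *= 1.0 - n_events / at_risk
            curve.append((t, surviving))
        at_risk -= n_leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value) on 1 df."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A just before t
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of the group A event count
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

In this sketch, comparing symptomatic with asymptomatic individuals would mean passing each group's follow-up times and event flags to `logrank_test`, and plotting the output of `kaplan_meier` for each group as a step function.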

Ethical approval
This study was approved by the Kenya Medical Research Institute Scientific and Ethics Review Unit (KEMRI-SERU) under protocol number SERU4081: Integrated studies of the natural history of Sars-Cov-2 infections in Kenya.

Results
General characteristics of the study population
A total of 99,694 individuals were tested for SARS-CoV-2 from the six Coastal counties of Kenya between 17th March 2020 and 30th June 2021. The majority of the samples were from Mombasa (49.3%), followed by Kilifi (22.1%), Taita-Taveta (17.3%), Kwale (7.1%), Lamu (3.1%) and Tana River (0.9%) (Table 1). The majority of tested individuals were male (73%), and the most commonly tested age bracket was 30 to 40 years (30.4%) (Table 1). The median age was 35 years, with the youngest and oldest participants being 7 months and 105 years old, respectively.
The most common source of tests was mass testing exercises (28.3%), followed by truck drivers (27.6%) and healthcare facilities (11.4%). However, 30.2% of tests had missing information on the source (Table 2).
The bulk of the testing was conducted between May and December 2020 (Figure 1). Cumulatively, testing increased steadily from March and peaked in June 2020 before declining and rising slightly in August-September 2020 and November 2020 and thereafter in May 2021 (Figure 1).
Throughout the testing period and the three waves of increasing positivity, asymptomatic cases predominated (Figure 1 and Figure 3). There was a slight increase in the proportion of symptomatic cases in wave three, 8.2% (86/1051), in comparison to waves one, 7.4% (218/2959), and two, 7.6% (284/3727). There was no trend over time for the specific type of symptom reported (Figure 3).
On average, symptomatic individuals had significantly higher viral loads (lower Ct values) compared to asymptomatic individuals, irrespective of the RT-PCR assay kit and gene tested (Figure 4). Based on longitudinal data obtained from individuals with follow-up samples, symptomatic individuals cleared the infection faster than asymptomatic individuals (p=0.017, Figure 5A). However, there was no difference in the time taken to clear SARS-CoV-2 infection by age (Figure 5B).
Individuals in the oldest age group (>50 years) and the youngest age group (>0-20 years) had the highest and the lowest antibody responses, respectively (Figure 7). The antibody responses generally increased with age, with individuals >50 years showing significantly higher antibody responses than all the other age groups (Figure 7). Furthermore, there were significantly higher antibody responses in symptomatic individuals than in asymptomatic individuals (p<0.001, Figure 8). Considering all the data together, age and viral load appear to be the factors most likely to result in a change in spike IgG antibody levels (Table 5).

Discussion
We report a predominantly asymptomatic case burden (92.2%) along the Coast of Kenya, spread across three waves of peaks in disease transmission. However, since the symptoms were mostly self-reported during mass testing, or by travellers and truck drivers, the data may be skewed towards the asymptomatic population. The proportion of asymptomatic SARS-CoV-2 infections in other populations around the world ranges from 5.54% to 75% (Yanes-Lane et al., 2020; Zheng et al., 2020). It is also possible that some of the individuals classified as asymptomatic were pre-symptomatic at the time of reporting and would become symptomatic on follow-up. He et al. (2021) reported that 48.9% of pre-symptomatic patients categorized as asymptomatic at the time of screening developed symptoms on follow-up. Furthermore, because of our reliance on presentation to routine testing services, we cannot be sure why so few individuals present with symptomatic disease. This could reflect reluctance in the population to present for testing when mildly symptomatic or could reflect that symptoms are relatively rare in our population.
Figure 7. The anti-spike IgG optical density (OD) ratio across different age groups. The median spike OD ratio for each age group was 0.088, 0.156, 0.219, 0.198 and 0.323 for >0-20, >20-30, >30-40, >40-50 and >50, respectively. The total number of samples in each age group is indicated. The OD ratios between each group were statistically compared using a Wilcoxon test. The p-values are shown only for the groups with a significant difference.
However, we can conclude that asymptomatic infection is very common in our setting, and other lines of evidence confirm widespread transmission in Kenya without overwhelming the health system or causing high numbers of deaths (Brand et al., 2021).
In contrast to 2020, when the Wuhan strain was predominant, the introduction of new variants in the population such as the Alpha, Beta and Delta variants in 2021 led to an increased proportion of symptomatic individuals but with no clear change in the type of symptoms (Figure 2 and Figure 3), unlike in the United Kingdom, where the Delta variant was associated predominantly with heavy cold and coryza (Burki, 2021). We did not detect any shift in the type or prevalence of symptoms over time in our population.
The percentage seropositive in the asymptomatic and symptomatic groups in our study was about 45% and 60%, respectively, consistent with the 31-54% and 21-73% seropositivity reported among asymptomatic and symptomatic individuals, respectively (Jiang et al., 2020). Higher seropositivity has been reported by others among RT-PCR positive samples but in all cases, as in our study, seropositivity was higher in the symptomatic than in the asymptomatic group (Chen et al., 2021; Long et al., 2020). We also observed a higher viral load in the symptomatic than in the asymptomatic group, consistent with other studies (Chen et al., 2021; Jiang et al., 2020; Long et al., 2020).
Hypotheses to explain the higher proportion of asymptomatic infections in Africa include protective trained immunity and a younger population structure (Ghosh et al., 2020). Kenya, in common with many other African countries, has a relatively young population compared to high income countries which would directly influence the course of COVID-19 (Ghosh et al., 2020).
Significantly more males than females tested positive for COVID-19, which is not unique to this population. Reasons for sex imbalances in COVID-19 datasets include higher expression of angiotensin-converting enzyme-2 (Jin et al., 2020), lifestyle factors such as smoking and alcohol use, and possible variation in adherence to frequent handwashing and wearing of face masks (Bwire, 2020). Given the predominance of asymptomatic infection in our dataset, a bias towards male involvement with essential services such as truck driving and a tendency to travel is likely more relevant (Kagucia et al., 2021).
This study has limitations, such as missing data regarding sex, age and clinical features of the tested individuals. Since most of the samples were from surveillance activities, there were not many matched sera/plasma samples, limiting the numbers available for stratified analysis. For example, there were few symptomatic individuals, which can skew interpretation of the data and lead to conclusions that do not reflect the trends in the general population. Data on symptoms were self-reported by patients and hence may be inaccurate due to recall bias.
Nevertheless, our data show clearly the predominance of asymptomatic testing in routine health systems in Kenya, and that asymptomatic infection has been very widespread throughout the course of the epidemic in Kenya.

Underlying data
The underlying data are owned by the Kenyan Government through the Ministry of Health and as the data contains highly sensitive and confidential information relating to participants, the authors are not permitted to share the data directly. Users who wish to reuse the source data are able to make a request through the KEMRI-Wellcome Trust Research Programme data governance committee: dgc@kemri-wellcome.org.