Chronic kidney disease (CKD) and associated risk in rural South Africa: a population-based cohort study

Background: In Africa, the true prevalence of chronic kidney disease (CKD) is unknown, and associated clinical and genetic risk factors remain understudied. This population-based cohort study aimed to investigate CKD prevalence and associated risk factors in rural South Africa. Methods: A total of 2021 adults aged 20-79 years were recruited between 2017 and 2018 from the Agincourt Health and Socio-Demographic Surveillance System in Bushbuckridge, Mpumalanga, South Africa. The following were collected: sociodemographic, anthropometric, and clinical data; venous blood samples for creatinine, hepatitis B serology, and DNA extraction; and spot urine samples for dipstick testing and urine albumin:creatinine ratio (UACR) measurement. Point-of-care screening determined prevalent HIV infection, diabetes, and hypercholesterolemia. DNA was used to test for apolipoprotein L1 (APOL1) kidney risk variants. Kidney Disease Improving Global Outcomes (KDIGO) criteria were used to diagnose CKD as low eGFR (<60mL/min/1.73m²) and/or albuminuria (UACR ≥3.0mg/mmol) confirmed with follow-up screening after at least three months. eGFR was calculated using the CKD-EPI (creatinine) equation 2009 with no ethnicity adjustment. Multivariable logistic regression was used to model CKD risk. Results: The WHO age-adjusted population prevalence of CKD was 6.7% (95% CI 5.4 - 7.9), mostly from persistent albuminuria. In the fully adjusted model, APOL1 high-risk genotypes (OR 2.1; 95% CI 1.3 - 3.4), HIV infection (OR 1.8; 95% CI 1.1 - 2.8), hypertension (OR 2.8; 95% CI 1.8 - 4.3), and diabetes (OR 4.1; 95% CI 2.0 - 8.4) were risk factors. There was no association with age, sex, level of education, obesity, hypercholesterolemia, or hepatitis B infection. Sensitivity analyses showed that CKD risk factor associations were driven by persistent albuminuria, and not low eGFR. One third of those with CKD did not have any of these risk factors.
Conclusions: In rural South Africa, CKD is prevalent, dominated by persistent albuminuria, and associated with APOL1 high-risk genotypes, hypertension, diabetes, and HIV infection.


Introduction
Infectious and non-communicable diseases pose a substantial risk for chronic kidney disease (CKD) in Africa, but its true prevalence remains unknown. Methodological differences in sampling frames and in the criteria used to diagnose CKD, and limited understanding of the best measures to assess kidney function in African populations, make prevalence data difficult to interpret. Recent large epidemiological studies have highlighted regional differences in CKD prevalence, which was lower in West Africa (Ghana and Burkina Faso) compared to East (Kenya) and South Africa, and higher in eastern compared to southern Uganda 1,2.
Risk factors associated with kidney disease are understudied. In many African studies, traditional risk factors associated with CKD include hypertension, diabetes, HIV infection, older age, and female sex 1,3. Studies from Tanzania, Malawi, Uganda and Kenya suggest non-traditional risk factors are an important contributor to CKD risk 2,4,5. These include endemic and other infectious diseases, such as undiagnosed genitourinary tuberculosis (TB), schistosomiasis, and viruses other than human immunodeficiency virus (HIV), which can manifest as nitrite-negative leukocyturia and/or hematuria, or tubulointerstitial injury related to occupational or environmental toxin exposure 3,5.
Compared to other US population groups, African Americans have a three-to-four times higher risk of kidney failure associated with recessive inheritance of apolipoprotein L1 (APOL1) kidney risk variants (KRV) 6. APOL1 KRV comprise two missense single nucleotide polymorphisms (SNPs) defining the G1 allele and a six-base-pair deletion defining the G2 allele. G1 and G2 alleles originated in West Africa with recent positive selection from protection against trypanosomal African sleeping sickness. APOL1 KRV frequencies vary widely in Africa: Nigeria's Igbo and Yoruba people have the highest frequencies (40%), with lower frequencies in South Africa (18%), and near-absence in East Africa 7.
The role of APOL1 KRV in the pathogenesis of CKD in African populations is unclear. APOL1 KRV have been associated with hypertension-attributed and non-diabetic CKD in the Democratic Republic of Congo and Nigeria, persistent albuminuria despite well-controlled HIV disease in Nigeria, and HIV-associated nephropathy, systolic hypertension and low eGFR in South Africa 2,8-11. One familial study from South Africa failed to demonstrate an association between APOL1 KRV and hypertension-attributed CKD compared to unaffected family members 12. Recently, a large population-based study showed an association between APOL1 KRV and albuminuria (but not eGFR), and this association was attenuated when compared to African American populations 13.
The aim of this study was to determine the prevalence of CKD and identify associated clinical and genetic risk factors in a rural South African population. We hypothesized that CKD prevalence would be high and associated with APOL1 KRV, infectious and non-communicable disease.

Study setting and sampling strategy
3. We included data regarding prior vs incidental HIV and hypertension status, stratified by sex, in the footnote of Table 1.
4. We clarified with reviewers that point-of-care screening was performed for hypertension, HIV, anaemia, hypercholesterolaemia, and hyperglycaemia. If indicated, participants were referred for further investigation and/or treatment. Samples for eGFR and albuminuria testing were batched and shipped for laboratory testing. After obtaining results for kidney function, those with low eGFR (<60mL/min/1.73m²) and/or albuminuria (urine albumin:creatinine ratio >3mg/mmol) were rescreened after a minimum of 3 months. There were no interventions during this minimum 3-month period. However, there may have been some overlap between those referred because of their point-of-care results (who may also have had low eGFR and/or albuminuria). We have addressed this as a limitation in the discussion section of the revised manuscript.

A biostatistician, Dr Petra Gaylard (co-author of paper), performed the statistical analysis.
[…] 115,000 people. A minimum sample size of 1800 was required to provide at least 80% power to determine CKD prevalence of at least 5%, provided the true prevalence was equal to or more than 6.5%. Proportional allocation of Black African adults aged 20 to 79 years ensured a representative sample based on the most recent annual population census. Sample size was increased proportionately to 2759 individuals to accommodate a 25% non-participation rate.

Laboratory procedures
A 20µL aliquot of DNA was shipped to the Frederick National Laboratory at the National Cancer Institute, USA, for APOL1 genotyping 16. DNA was genotyped using TaqMan assays (ThermoFisher Scientific, USA). The APOL1 G1 KRV comprised a missense G nucleotide at rs73885319 (G1g) and either a T or G nucleotide at rs60910145; presence of only the G1g (p.342Gly) variant was sufficient to define the G1 KRV 17. The APOL1 G2 KRV consists of a six-base-pair in-frame deletion, rs71785313. The number of APOL1 KRV (G1 or G2) carried by each participant was coded as 0 for the G0/G0 genotype, 1 for the G0/G1 or G0/G2 genotype, or 2 for the G1/G1, G1/G2, or G2/G2 genotypes. APOL1 genotypes were further coded as "high-risk (HR)" if the participant carried any combination of 2 risk alleles or "low-risk (LR)" if the participant had 0 or 1 risk allele. This classification was used for statistical analyses 16. All remaining specimens were shipped at -80°C to the Central Laboratory Services (CLS) in Johannesburg, South Africa. Serum and urine creatinine were measured by an isotope-dilution mass spectrometry traceable modified Jaffe method and urine albumin by a colorimetric method (Cobas 6000/c501 analyzer); urine albumin:creatinine ratios (UACR) were calculated and reported (mg/mmol); and hepatitis B status was determined using Immulite serological assays (ARCHITECT i1000SR analyzer, Abbott USA). The CLS laboratory adhered to standard daily internal quality control procedures and complied with the requirements of the external quality control program through the College of American Pathologists.
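The genotype coding described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual code: the function names and the slash-separated genotype strings (for example "G0/G1") are assumptions made for the example.

```python
# Illustrative sketch only: names and input format are hypothetical,
# chosen to mirror the coding rules described in the text.
VALID_ALLELES = {"G0", "G1", "G2"}
RISK_ALLELES = {"G1", "G2"}


def count_risk_alleles(genotype: str) -> int:
    """Return the number of APOL1 kidney risk variants (0, 1, or 2)."""
    alleles = genotype.upper().split("/")
    if len(alleles) != 2 or any(a not in VALID_ALLELES for a in alleles):
        raise ValueError(f"unexpected genotype: {genotype!r}")
    return sum(a in RISK_ALLELES for a in alleles)


def risk_category(genotype: str) -> str:
    """Return 'HR' for any two risk alleles, otherwise 'LR'."""
    return "HR" if count_risk_alleles(genotype) == 2 else "LR"


if __name__ == "__main__":
    for g in ("G0/G0", "G0/G2", "G1/G2"):
        print(g, count_risk_alleles(g), risk_category(g))
```

Coding the allele count first and deriving the HR/LR label from it keeps both variables consistent if either is later used in a model.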
Chronic kidney disease prevalence
Kidney Disease Improving Global Outcomes (KDIGO) criteria were used to diagnose CKD 22. eGFR was calculated using the CKD-EPI (creatinine) equation 2009 without adjusting for African American ethnicity, as race-based coefficients have been shown to overestimate GFR in this rural South African population 23. Albuminuria was quantified with spot UACR. Participants with low eGFR (<60mL/min/1.73m²) and/or albuminuria (UACR ≥3.0mg/mmol) were followed up with repeat measurements after a minimum of three months. CKD was defined as low eGFR, albuminuria, or both, provided these measures were confirmed on repeat testing, and this definition was used for all analyses.
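As a worked illustration of this definition, the sketch below implements the published CKD-EPI 2009 creatinine equation with the race coefficient omitted, together with the screening thresholds given above. The function names are hypothetical, serum creatinine is assumed to be in mg/dL (a µmol/L converter is included), and requiring the same marker to be abnormal at both visits is one reading of "confirmed on repeat testing", not necessarily the rule the authors applied.

```python
# Illustrative sketch; function names and the confirmation rule are assumptions.

def umol_l_to_mg_dl(creatinine_umol_l: float) -> float:
    """Convert serum creatinine from µmol/L to mg/dL."""
    return creatinine_umol_l / 88.4


def ckd_epi_2009(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """CKD-EPI 2009 creatinine eGFR (mL/min/1.73m²), no race coefficient."""
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * (min(ratio, 1.0) ** alpha)
            * (max(ratio, 1.0) ** (-1.209))
            * (0.993 ** age_years))
    if female:
        egfr *= 1.018
    return egfr


def low_egfr(egfr: float) -> bool:
    return egfr < 60.0


def albuminuria(uacr_mg_mmol: float) -> bool:
    return uacr_mg_mmol >= 3.0


def confirmed_ckd(egfr_1: float, uacr_1: float,
                  egfr_2: float, uacr_2: float) -> bool:
    """CKD if low eGFR and/or albuminuria is present at both visits.

    Assumption: the same marker must be abnormal on repeat testing.
    """
    persistent_low_egfr = low_egfr(egfr_1) and low_egfr(egfr_2)
    persistent_albuminuria = albuminuria(uacr_1) and albuminuria(uacr_2)
    return persistent_low_egfr or persistent_albuminuria


if __name__ == "__main__":
    print(round(ckd_epi_2009(0.9, 50, female=False), 1))
```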

Statistical analysis
Continuous variables were represented as mean (standard deviation [SD]) if normally distributed and median (interquartile range) if non-normally distributed. Categorical variables were expressed as frequencies (percentage). Study variables were compared between sexes using the chi-squared test (Fisher's exact test was used for 2×2 tables). To identify factors associated with CKD, logistic regression analysis was used to estimate odds ratios (OR), with corresponding 95% confidence intervals (CIs). Hierarchical models based on existing knowledge of known CKD risk factors were developed, with all models adjusted for age and sex. Model 1 incorporated level of education and BMI. Model 2 added APOL1 genotype status, and Model 3 added comorbid infectious and non-communicable conditions: hepatitis B, HIV, hypertension, diabetes, and hypercholesterolaemia. Nested models were compared using the likelihood ratio test. Because CKD was a composite variable (low eGFR and/or albuminuria), sensitivity analyses compared whether there were differences in association between risk factors and (i) low eGFR alone, or (ii) albuminuria alone.
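For readers less familiar with how an odds ratio and its confidence interval follow from a logistic regression coefficient, here is a minimal sketch of the standard Wald calculation. The coefficient and standard error are illustrative values back-calculated from the OR of 2.1 (95% CI 1.3 - 3.4) reported in the abstract for APOL1 high-risk genotypes; they are not taken from the study's model output, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(beta: float, se: float, level: float = 0.95):
    """Return (OR, lower, upper) using a Wald interval on the log-odds scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


if __name__ == "__main__":
    # Illustrative inputs: beta is roughly ln(2.1); se is back-calculated
    # from the width of the published interval, not a reported value.
    or_, lower, upper = odds_ratio_ci(beta=0.742, se=0.245)
    print(f"OR {or_:.1f} (95% CI {lower:.1f} - {upper:.1f})")
```

The interval is symmetric on the log scale, which is why published odds ratio intervals look asymmetric around the point estimate.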
Missing data were reported in figures and tables. CKD population prevalence was age-standardized using the revised World Health Organization (WHO) World Standard Population Distribution for ages 20-79 (direct method) 24,25 . Statistical analyses were performed using SAS (Stata Corp, Texas, USA) and can be performed in R (R Core Team, 2014) 26 .
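Direct age standardization can be sketched as a weighted average of age-specific prevalences, with weights taken from a standard population. The age bands, prevalences, and weights below are made-up placeholders for illustration only; they are neither the study's data nor the actual WHO World Standard weights, and the function name is hypothetical.

```python
def direct_standardized_prevalence(age_specific: dict, std_weights: dict) -> float:
    """Weighted mean of age-specific prevalences using standard-population weights.

    Weights are renormalised so they sum to 1 over the age bands supplied
    (for example, when only ages 20-79 of a wider standard are used).
    """
    if set(age_specific) != set(std_weights):
        raise ValueError("age bands of prevalences and weights must match")
    total_weight = sum(std_weights.values())
    return sum(age_specific[band] * std_weights[band]
               for band in std_weights) / total_weight


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    prevalence = {"20-39": 0.03, "40-59": 0.08, "60-79": 0.15}
    weights = {"20-39": 0.50, "40-59": 0.35, "60-79": 0.15}
    print(round(direct_standardized_prevalence(prevalence, weights), 4))
```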

Results
The flow diagram in Figure 1 details sample selection, reasons for non-participation, and CKD screening procedures. Overall, 2021/2759 adults consented (73% participation rate), with the final study sample representative of the Agincourt HDSS population (Figure 2). Participant socio-demographic and clinical characteristics overall, and stratified by sex, are summarized in Table 1. For participants with complete data for eGFR and UACR (n=2004): 32 had low eGFR at first screening, and of these, 12/29 (41%) were confirmed with low eGFR at follow-up; 247 had albuminuria at first screening, and of these, 118/220 (54%) were confirmed with albuminuria at follow-up (Figure 1). Overall, the WHO age-standardized prevalence for low eGFR was 0.9% (95% CI 0.4 - 1.4); for albuminuria was 6.2% (95% CI 5.0 - 7.4); and for CKD (low eGFR and/or albuminuria) was 6.7% (95% CI 5.4 - 7.9).
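The crude percentages quoted in this paragraph can be checked directly from the reported counts. The helper name below is hypothetical; the counts are those stated in the text.

```python
def percent(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a whole-number percentage."""
    return round(100 * numerator / denominator)


if __name__ == "__main__":
    print("Participation:", percent(2021, 2759), "%")
    print("Low eGFR confirmed at follow-up:", percent(12, 29), "%")
    print("Albuminuria confirmed at follow-up:", percent(118, 220), "%")
```

Note that the denominators for confirmation (29 and 220) are those retested at follow-up, not everyone who screened positive at the first visit (32 and 247).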
Results from multivariable adjusted logistic regression analyses are summarised in […]. Because CKD was a composite variable, sensitivity analyses were conducted to determine whether the associations observed were driven by low eGFR or albuminuria. The number of events was too small for a sensitivity analysis restricted to eGFR <60mL/min/1.73m². Instead, a sensitivity analysis was performed using eGFR <90mL/min/1.73m² on initial screen, which showed an association with advancing age, obesity, and diabetes, but not with APOL1 high-risk genotypes, HIV infection, or hypertension (Table 3). For albuminuria, two sensitivity analyses were performed for those with albuminuria on (i) initial screening, and (ii) follow-up screening (Table 4-Table 5). Both confirmed that the associations observed with the composite endpoint (CKD defined as low eGFR and/or albuminuria) were primarily driven by persistent albuminuria.
For participants with CKD, overall, there was no identified risk factor in 32% (37/117) of participants (Table 6), most had one risk factor, and none had more than three. Women had fewer identified risk factors than men. CKD risk factors included those identified in the multivariable regression analyses: high-risk APOL1 genotype, hypertension, HIV infection, and diabetes.

Discussion
This rigorously conducted study determined CKD prevalence in a rural South African population using recommended KDIGO criteria for eGFR and albuminuria with confirmatory testing. Far more than low eGFR, persistent albuminuria was the […]

Table notes: Column percentages may not sum to exactly 100 due to rounding; odds ratios presented with 95% confidence intervals; categories presented as frequency (%); 1 N = 1975: total number eligible for inclusion after CKD screening and follow-up; 2 N = 1885: total number with complete data for variables included in regression models; 3 BMI: body mass index: non-obese <30.0; obese ≥30.0; bold text indicates 5% level of significance (p-value <0.05).

Table notes: Column percentages may not sum to exactly 100 due to rounding; odds ratios presented with 95% confidence intervals; categories presented as frequency (%); 2 eGFR<90: estimated GFR less than 90mL/min/1.73m²; 3 BMI: body mass index: non-obese <30.0; obese ≥30.0; bold text indicates 5% level of significance (p-value <0.05).

There are several strengths to the study, including the rural population-based sampling frame, combined evaluation of eGFR and albuminuria, and confirmation with follow-up testing, which reduced the risk of over-reporting prevalent kidney disease. The strong contribution of persistent albuminuria to CKD prevalence is relevant, as many large epidemiological studies rely solely on the estimation of GFR. Limitations include evaluation of few non-traditional risk factors for CKD, and low power for evaluating risk in those with eGFR <60mL/min/1.73m². Participants with hypertension, HIV (not on treatment), anaemia, hypercholesterolaemia, and hyperglycaemia were referred to their local care facility for further investigation and/or treatment. In so doing, those with concomitant low eGFR and/or albuminuria at baseline might have received treatment that affected subsequent kidney function testing.
The relatively small proportion of participants with low eGFR might be explained by the absence of appropriate care for those with severe kidney disease, thus creating a survival bias, or potential to overestimate creatinine-based GFR with the CKD-EPI (creatinine) equation with consequent underdiagnosis of CKD 27 .

The association of APOL1 high-risk genotypes with persistent albuminuria is consistent with population-based studies in continental African and African American populations 13,28,29 .
While the population frequencies of APOL1 high-risk genotypes approximated those reported in African Americans (~10-15%) and the association with persistent albuminuria was similar, our study did not show any association with low eGFR 7. This might relate to limited analytic power because so few had low eGFR, but it is worth noting that similar findings have been described in a population-based study from West, East, and Southern Africa 13. The absence of longitudinal follow-up to evaluate the impact of APOL1 status on incident CKD, CKD progression, and survival restricts interpretation of current findings.
The study confirmed known associations with HIV infection, hypertension, and diabetes 2,30,31, but one third of participants with CKD had none of these risk factors. Potential context-specific risks for kidney disease not accounted for in this study include endemic malaria, endemic genitourinary schistosomiasis, genitourinary tuberculosis, ingestion of traditional and over-the-counter medicines, and environmental exposures such as agricultural pesticides and heavy metal toxins 32-34. Such exposures might result in repeated bouts of acute, or acute-on-chronic, kidney injury, or comprise the "second hit" needed for APOL1-induced kidney injury.
Our findings show that CKD is prevalent and that those with HIV infection, hypertension, and diabetes may benefit from screening strategies to control risk and prevent progression. Research is needed to evaluate the performance of creatinine-based eGFR equations in African populations and to investigate the contribution of genetic and non-traditional risk factors to CKD risk in South Africa.
Furthermore, there is still a paucity of genetic studies in Africa, including on the link between CKD and genes; it is remarkable that this study goes beyond the traditional CKD risk factors in Africa to include genomics.

Data availability
We agree with and applaud the use of CKD-EPI without race/ethnicity adjustment. Still, it would be good to also report results with the race/ethnicity adjustment for clarity and future reference. The jury is still out on whether or not to use race/ethnicity adjustments.
It would be helpful to state how many hypertensive participants had pre-existing hypertension and how many had an incidental diagnosis. Similarly, prior vs. incidental HIV diagnosis should have been documented.
At the first screening, certain patients had hypertension and/or impaired renal function (eGFR/albuminuria); any intervention done between the initial and second screenings should have been documented. Perhaps these interventions could explain the drop in the number of participants with eGFR <60mls/min/1.73m² or albuminuria from 220 to 118 and 29 to 12, respectively.
Though the statistical analysis seems adequate, a statistician may also review that aspect.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate?
feel it is necessary to include the eGFR results with a race-based adjustment in this paper, as it has been addressed separately with reference to measured GFR -the gold standard for evaluating performance of eGFR equations.
To make this clear in the revised manuscript, we added the above reference to the methods section "Chronic kidney disease prevalence" where we state "eGFR was calculated using the CKD-EPI ( […]
4. "At the first screening, certain patients had hypertension and/or impaired renal function (eGFR/albuminuria); any intervention done between the initial and second screenings should have been documented. Perhaps, these interventions could explain the drop in the number of participants with eGFR <60mls/min/1.73m² or albuminuria from 220 to 118 and 29 to 12, respectively."
Thank you for raising this point. At baseline screening, participants with hypertension, HIV (not on treatment), anaemia, hypercholesterolaemia, and hyperglycaemia were referred to their local care facility for further investigation and/or treatment, made possible because tests were performed in real time using point-of-care technology. The eGFR and albuminuria testing was not