Profiles of inflammatory markers and their association with cardiometabolic parameters in rural and urban Uganda [version 1; peer review: 1 not approved]

Background: Inflammation may be one of the pathways explaining differences in cardiometabolic risk between urban and rural residents. We investigated associations of inflammatory markers with rural versus urban residence, and with selected cardiometabolic parameters previously observed to differ between rural and urban residents: homeostatic model assessment of insulin resistance (HOMA-IR), fasting blood glucose (FBG), blood pressure (BP) and body mass index (BMI). Methods: From two community surveys conducted in Uganda, 313 healthy individuals aged ≥ 10 years were selected by age- and sex-stratified random sampling (rural Lake Victoria island communities, 212; urban Entebbe municipality, 101). Fluorescence intensities of plasma cytokines and chemokines were measured using a bead-based multiplex immunoassay. We used linear regression to examine associations between the analytes and rural-urban residence and principal component analysis (PCA) to further investigate patterns in the relationships. Correlations between analytes and metabolic parameters were assessed using Pearson’s correlation coefficient. Results: The urban setting had higher mean levels of IL-5 (3.27 vs 3.14, adjusted mean difference [95% confidence interval] 0.12 [0.01, 0.23] p=0.04), IFN(26.80 vs 20.52, 6.30 [2.18, 10.41] p=0.003), EGF (5.67 vs 5.07, 0.60 [0.32, 0.98] p<0.00001), VEGF (3.68 vs 3.28, 0.40 [0.25, 0.56] p<0.00001), CD40 Ligand (4.82 vs 4.51, 0.31[0.12,


Introduction
Inflammation is a necessary component of the immune system response to injury and infection. However, chronic, low-grade inflammation has been implicated in the development and progression of cardiometabolic disorders such as type 2 diabetes 1 and atherosclerosis 2 . The burden of these diseases is increasing globally, with cardiovascular diseases being the top contributors to global mortality and morbidity 3,4 . This increase has been attributed to changes in lifestyle and diet as a result of urbanisation, and the burden is heavier among individuals in urban areas. Currently, more than 50% of the world's population lives in urban areas and this is expected to increase 5 . Environmental exposures such as microbial exposure, helminth infections, diet and pollution influence the inflammatory status of individuals. This may lead to differences in immune status between individuals in rural and urban settings 6 . Therefore, inflammation may be one of the pathways explaining the disparity in cardiometabolic disease burden and risk between rural and urban areas.
Few studies have investigated rural-urban differences in inflammatory markers in the general population. A study in Senegal showed that urban inhabitants had lower pro-inflammatory and T-cell activation profiles than rural inhabitants 6 . Another study in Morocco investigating gene expression in blood leukocyte samples showed differences in the leukocyte transcriptome among rural, urban and nomadic populations 7 . Other studies have focused on particular disease states and the results are mixed and inconsistent. For example, compared with rural upbringing, urban upbringing has been associated with higher concentrations of the pro-inflammatory cytokine interleukin (IL)-6 and suppressed concentrations of the anti-inflammatory cytokine IL-10 following acute social stress 8 . A study investigating the effect of exposure to a forest environment on stress levels and cytokines showed that, compared with two hours of forest exposure, urban exposure for two hours resulted in higher levels of the chemokine IL-8 and the pro-inflammatory cytokine tumour necrosis factor (TNF)-alpha 9 . In an investigation of diabetes risk in India, urban dwellers had higher levels of TNF-alpha, IL-6 and the adipokine leptin than rural dwellers 10 . A study conducted in Ghana among individuals with type 2 diabetes showed higher levels of TNF-alpha but lower levels of IL-6 among urban participants than among rural participants 11 . Conversely, plasma levels of the chemoattractant C-C motif chemokine (CCL) 11 have been shown to be lower in urban than in rural dwellers in a study investigating the effect of CCL11 on cognitive function 12 . Among individuals with atopy, urban inhabitants had higher levels of gene expression for IL-4R, a type 2 cytokine receptor, compared with rural inhabitants 13 . In another study in Ghana, urban children had increased gene expression of IL-10 compared with rural children, irrespective of helminth infection status 14 .
Different environments exert different influences on the immune status of the inhabitants. To our knowledge, no study has investigated rural-urban differences in inflammatory markers in Uganda or explored the implications of rural-urban differences for cardiometabolic risk. We, therefore, exploited a unique opportunity to measure inflammatory markers in individuals living in a rural and an urban setting. We investigated rural-urban differences in inflammatory cytokines and chemokines and examined correlations between the inflammatory markers and the metabolic parameters HOMA-IR, fasting blood glucose, blood pressure and body mass index (BMI), which we had found to differ between the urban and rural settings 15 .

Study design and setting
This was a cross-sectional comparative study. We measured pro- and anti-inflammatory cytokines and chemokines in stored plasma of individuals who participated in two cross-sectional surveys conducted in rural and urban Uganda. The rural survey was conducted from April to December 2017 in 26 island fishing communities in Lake Victoria, Koome sub-county, Mukono district, as part of the Lake Victoria Island Intervention Study on Worms and Allergy-related diseases (LaVIISWA). These fishing villages are located on the shores of the islands and have a high burden of schistosomiasis. LaVIISWA was a cluster-randomised trial designed to investigate the effects of community-wide intensive (quarterly single-dose praziquantel and triple-dose albendazole) versus standard (annual single-dose praziquantel and six-monthly single-dose albendazole) anthelminthic intervention on health outcomes 16,17 . The LaVIISWA protocol has been published 18 . This survey, the rural metabolic survey, was conducted in participants aged ≥10 years after four years of the anthelminthic intervention to investigate its effects on metabolic outcomes 16 . The urban survey was conducted between September 2016 and September 2017 in 23 sub-wards of Entebbe municipality, Wakiso district, Uganda. Entebbe is located on a peninsula on the northern shore of Lake Victoria. The urban survey was a cross-sectional study that collected data on allergy-related and metabolic outcomes 19 . In both surveys, data were collected on sociodemographic characteristics (obtained using a questionnaire), fasting plasma glucose, fasting lipid profile (triglycerides, total cholesterol, high-density lipoprotein [HDL]-cholesterol and low-density lipoprotein [LDL]-cholesterol), blood pressure, body mass index, and waist and hip circumference. The field tools and procedures in both surveys were aligned to enable a comparison of the outcomes between the two surveys.

Participants and samples
We used plasma samples from 313 healthy participants, 212 from the rural setting (104 from the intensive arm and 108 from the standard arm of the LaVIISWA trial) and 101 from the urban setting. These were randomly selected from each survey using STATA software (STATA v13, College Station, Texas, USA). No formal sample size calculation was performed because this work was exploratory. Participants with incomplete data on helminth infection status, blood pressure, glucose, insulin, lipid profile, BMI, or waist and hip circumference, and those without stored plasma samples, were excluded from the sampling frame. The selection was stratified by age (10-19 years, 20-29 years, 30-39 years and ≥40 years) and sex. This was to ensure that the age and sex distribution of participants whose samples were used was similar between the two settings, because age and sex might be predictive of inflammatory marker levels. Stored plasma samples were retrieved and cytokines/chemokines measured using a bead-based multiplex immunoassay (Luminex).
Plasma samples, stored in freezers at -80 degrees Celsius, were thawed to room temperature and tested. The multiplex assays were performed using the Human Premixed Multi-analyte Magnetic Luminex Assay kit (R&D Systems Inc, Minneapolis, Minnesota, USA; catalogue numbers, LXSAHM-01, LXSAHM-02, LXSAHM-27; Lot numbers L128094-6) according to the manufacturer's instructions. The thirty analytes were assessed using three panels: a one-plex assay for VEGF, a two-plex assay for adiponectin and serpin E1, and a 27-plex assay for the other analytes. In brief, all reagents were prepared as instructed by the kit manufacturer. To each well, 50 µl of sample (or standard) was added, followed by 50 µl of diluted Microparticle Cocktail. The mixture was incubated on a shaker for two hours and washed with wash buffer three times. Diluted Biotin-Antibody Cocktail was added (50 µl to each well), incubated on the shaker and washed with wash buffer three times. Thereafter, diluted Streptavidin-PE was added (50 µl to each well), incubated on the shaker for 30 minutes and again washed three times. Wash buffer was then added to each well and incubated for two minutes on a shaker. All incubations were at room temperature and, each time, the shaker was rotating at 800 rpm.
The results were read using the Bio-Plex 200 System (Bio-Rad Laboratories, Hercules, California, USA) and the Bio-Plex Manager software (version 6.0; Bio-Rad Laboratories, Hercules, California, USA) after entering the assay working range (lower and upper limits of quantification) for each analyte according to the manufacturer's instructions. Raw fluorescence intensity values and observed concentrations were obtained.

Statistical methods
The analyses were performed using STATA version 13.0 (College Station, Texas). The 30 cytokines/chemokines were the outcomes of interest in this analysis. The exposures of interest were the rural-urban setting and the metabolic parameters homeostatic model assessment of insulin resistance (HOMA-IR), fasting plasma glucose, blood pressure and body mass index (BMI). In all our analyses, we used the raw fluorescence intensity values of the cytokines. This is because the analytes were predominantly low in abundance and the observed concentrations were mostly extrapolated beyond the standard range. Using the observed concentrations would require censoring i.e. not assigning values for concentrations that are out of the standard range. This would affect most of the cytokines measured and therefore limit the statistical power of the analysis. Analysis using fluorescence intensities does not require censoring and provides more statistical power than analysis based on concentrations. It also does not require correction for background fluorescence because the automatic calibration steps produce a highly reproducible signal even across different Luminex machines. This approach for analysing Luminex data has been validated 20,21 and used previously 22,23 .
The fluorescence intensities of cytokines were compared between the rural and urban setting. All analyte intensities were first log-transformed. The mean log intensity of each analyte in each setting was then calculated and regression models fitted to assess for differences. Crude and adjusted mean differences (adjusted for age and sex) and their 95% confidence intervals are presented. Further analyses investigated correlation patterns between the outcomes and then used principal component analysis to transform from a set of correlated outcome variables to a set of uncorrelated variables (linear combinations of the original outcomes), aiming to capture as much variability as possible with the first few components. The newly generated uncorrelated variables were then compared between the settings. Selection of the number of principal components to retain was based on the Kaiser criterion; components with an eigenvalue of ≥1 were therefore retained. A scree plot was constructed to give a visual representation of the PCA eigenvalues. In the interpretation of the principal components, we computed the unrotated matrix containing the factor loadings of each variable for each component. For each variable, the principal component with the highest loading for that variable was interpreted as representative of the variable (adapted from 24). In all the analyses, data from the rural setting included all rural participants regardless of LaVIISWA trial arm. This is because no strong differences in the cytokines were found between the study arms.
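The adjusted comparison described above can be sketched as an ordinary least-squares regression of log intensity on setting, age and sex. This is a minimal illustration on synthetic data, not the authors' STATA code: the function name, the simulated values and the use of the normal quantile 1.96 for the confidence interval (rather than a t quantile) are all assumptions made for the example.

```python
import numpy as np

def adjusted_mean_difference(log_intensity, urban, age, sex):
    """OLS of log intensity on setting, adjusted for age and sex.

    Returns the coefficient for `urban` (the adjusted mean difference)
    and an approximate 95% confidence interval (normal quantile 1.96).
    """
    y = np.asarray(log_intensity, dtype=float)
    X = np.column_stack([np.ones_like(y), urban, age, sex]).astype(float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]               # residual degrees of freedom
    sigma2 = resid @ resid / dof            # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the estimates
    se = np.sqrt(cov[1, 1])                 # standard error for `urban`
    return beta[1], (beta[1] - 1.96 * se, beta[1] + 1.96 * se)

# Synthetic illustration only: 101 "urban" and 212 "rural" records.
rng = np.random.default_rng(0)
n = 313
urban = (np.arange(n) < 101).astype(float)
age = rng.integers(10, 70, size=n).astype(float)
sex = rng.integers(0, 2, size=n).astype(float)
log_intensity = 3.1 + 0.4 * urban + 0.002 * age + rng.normal(0, 0.3, size=n)

diff, (lo, hi) = adjusted_mean_difference(log_intensity, urban, age, sex)
print(f"adjusted mean difference {diff:.2f} [{lo:.2f}, {hi:.2f}]")
```

The coefficient on the setting indicator is the quantity reported as the adjusted mean difference; the crude difference would be obtained by dropping age and sex from the design matrix.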
To determine the relationship between cytokines/chemokines and metabolic parameters, we computed Pearson correlation coefficients to assess the strength and direction of any linear relationship. We obtained coefficients for the correlation of the log-transformed intensity for each analyte with HOMA-IR, fasting plasma glucose, blood pressure (systolic and diastolic) and BMI. We also correlated the metabolic outcomes with the retained components from the PCA. Correlation coefficients and their p-values are reported. No formal adjustment was made for multiple testing.
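As an illustration of the correlation step, the sketch below computes a Pearson coefficient and an approximate two-sided p-value. It is not the authors' code: the function name and the synthetic BMI and log-intensity values are hypothetical, and the p-value uses the Fisher z approximation rather than the exact t-based test that a statistics package would typically report.

```python
import math
import numpy as np

def pearson_with_p(x, y):
    """Pearson r with an approximate two-sided p-value (Fisher z)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = float(np.corrcoef(x, y)[0, 1])
    # Clip so that atanh stays finite when r is exactly +1 or -1.
    r_safe = max(min(r, 0.999999), -0.999999)
    z = math.atanh(r_safe) * math.sqrt(len(x) - 3)
    p = math.erfc(abs(z) / math.sqrt(2))
    return r, p

# Synthetic illustration only: a log intensity that rises with BMI.
rng = np.random.default_rng(2)
bmi = rng.normal(24, 4, size=313)
log_analyte = 2.0 + 0.05 * bmi + rng.normal(0, 0.25, size=313)

r, p = pearson_with_p(log_analyte, bmi)
print(f"r = {r:.2f}, p = {p:.2g}")
```

Repeating this for every analyte and metabolic parameter produces many tests, which is why the absence of a multiple-testing adjustment matters when reading the individual p-values.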

Ethical approvals and consent
Ethical clearance for this work was obtained from the Uganda Virus Research Institute Research Ethics Committee (reference number GC/127/17/01/573), the Uganda National Council for Science and Technology (reference number HS 2185) and the London School of Hygiene and Tropical Medicine Research Ethics Committee (reference number 9917). All participants provided written informed consent to participate in the surveys and have their samples stored for this work (see consent forms and information sheets in the extended data 25 ).

Characteristics of the participants
The characteristics of the participants selected for this analysis are shown in Table 1 25 . Samples from 313 participants (rural, 212; urban, 101) were analysed. The participants had a median age of 30 years and were selected to have comparable distributions of age and sex. Participants in the urban setting had a lower prevalence of the helminths Schistosoma mansoni (28.7% vs 53.6%) and Trichuris trichiura (1.0% vs 9.5%) than those in the rural setting.

Associations between the analytes and study setting
The associations (both unadjusted and adjusted) between rural-urban setting and level of each analyte are shown in Table 2. Somewhat higher levels of the Th2-associated cytokine IL-5 were observed in the urban setting on adjusted analysis (3.27 vs 3.14, adjusted mean difference [95% confidence interval] 0.12 [0.01, 0.23] p=0.04). However, no differences were observed for the other Th2-associated cytokines (IL-4 and IL-13), Th1-associated cytokines (IL-2, IFN-gamma and TNF-alpha), Th17-associated cytokines (IL-17) or the regulatory cytokines (IL-10, TGF-beta) between the rural and urban settings. Furthermore, no differences were observed for the adipokines adiponectin and leptin.
Comparison of chemokines, growth factors and other cytokines between the rural and urban setting showed interesting differences. Individuals in the urban setting had higher levels of

Associations between the analytes and selected metabolic parameters
The correlations between the analytes and metabolic parameters are shown in Table 3. There was a positive correlation between insulin resistance (HOMA-IR) and the pro-inflammatory analytes CCL22, CXCL1, IL-5, IL-17 and leptin. Similarly, fasting blood glucose was positively correlated with IL-5, IL-17 and leptin. Systolic blood pressure and diastolic blood pressure were positively correlated with VEGF-C and leptin, respectively. BMI was positively correlated with IL-5 and leptin but negatively correlated with CCL22.

Principal component analysis
Correlations were observed among the analytes and the dataset was summarised using principal component analysis. On principal component analysis, six principal components were retained and these accounted for 73% of the variance in the original 30 outcomes. The scree plot constructed to give a visual representation of the PCA eigenvalues is shown in Figure 1.
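The retention rule and the "variance accounted for" figure can be illustrated with a small sketch of PCA on the correlation matrix. It runs on synthetic data with six variables rather than the 30 analytes, and the function and variable names are hypothetical; it is meant only to show how the eigenvalue ≥1 rule, the unrotated loadings and the cumulative variance are obtained, not to reproduce the authors' analysis.

```python
import numpy as np

def pca_kaiser(data):
    """PCA on the correlation matrix, keeping components with eigenvalue >= 1."""
    X = np.asarray(data, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardise columns
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)            # ascending order
    order = np.argsort(eigvals)[::-1]                  # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= 1.0                              # Kaiser criterion
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])  # unrotated loadings
    scores = Z @ eigvecs[:, keep]                      # uncorrelated component scores
    explained = eigvals[keep].sum() / eigvals.sum()    # share of total variance
    return eigvals, loadings, scores, explained

# Synthetic illustration only: two latent factors, three variables each.
rng = np.random.default_rng(1)
n = 313
f1, f2 = rng.normal(size=n), rng.normal(size=n)
data = np.column_stack(
    [f1 + 0.5 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.5 * rng.normal(size=n) for _ in range(3)]
)

eigvals, loadings, scores, explained = pca_kaiser(data)
print(f"retained {scores.shape[1]} components, {explained:.0%} of variance")
```

The sorted eigenvalues are what a scree plot displays, and the component scores are the variables that can then be compared between settings or correlated with metabolic parameters.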
The resulting principal components were challenging to interpret and did not represent any immediately obvious standard classifications. The first component explained 38% of the total variance and was constituted by CCL3, CCL11, CD40 Ligand, Fas, IFN-alpha, IFN-gamma, IL1-alpha, IL-2, IL-12 and IL-13. These are a mixed selection of chemoattractants, cytokines and chemokines generally associated with inflammation. All the analytes in this component had positive loadings (coefficients). The second principal component explained a further 13% of the variance in the 30 outcomes and was characterised by chemoattractants and vasoactive mediators: Serpin E1, CXCL1, CXCL2, CXCL10, EGF, GMCSF and VEGF-C (all had positive loadings except for GMCSF and CXCL10). The third principal component was characterised by negative loadings (coefficients) of CCL4, IL-8/CXCL8, and the classical markers of inflammation, IL1-beta, IL-6, and TNF-alpha, and explained a further 9% of the variance. The other components and variables contributing to them are shown in Table 4. Associations between the principal components and study setting are shown in Table 5 and Figure 2 and Figure 3. Compared with the rural setting, the urban setting had higher scores for principal component 2, implying higher representation of chemoattractants and vasoactive mediators among individuals in the urban setting.
Associations between the principal components and metabolic parameters are shown in Table 6. Principal component 4 (characterised by IL-5 and leptin) was positively correlated with HOMA-IR, fasting blood glucose, diastolic blood pressure and BMI. The strongest correlations were observed with BMI (correlation coefficient 0.61, p<0.0001) and HOMA-IR (0.31, p<0.0001). Other significant correlations were found between principal component 1 and HOMA-IR (0.13, p=0.02), principal component 2 and systolic blood pressure (0.12, p=0.04), and principal component 3 and BMI (0.17, p=0.003).

Discussion
In this paper, we have shown that, compared with the rural environment, the urban environment was associated with higher levels of pro-inflammatory chemokines: EGF, VEGF-C, CD40L and serpin E1, but lower levels of GMCSF, CCL2 and CXCL10. Furthermore, the rural environment was associated with lower levels of the Th2 cytokine IL-5, but there was no difference in the other Th2, Th1, Th17 and regulatory cytokines between the settings. On principal component analysis, the urban environment had higher representation of chemoattractants and vasoactive mediators (PC2) but lower representation of the classical inflammatory mediators (PC3). The metabolic parameters HOMA-IR, fasting blood glucose, blood pressure and BMI correlated positively with a largely distinct set of principal components characterised by pro-inflammatory analytes.
It was interesting to find that the levels of the growth factors EGF and VEGF-C were higher in the urban setting. EGF is a pro-inflammatory peptide that stimulates cell growth and proliferation. VEGF-C promotes lymphangiogenesis by directing the proliferation and migration of lymphatic endothelial cells and is also involved in the stimulation of growth of vascular tissue 26 . We also found lower levels of soluble CD40L in the rural compared with the urban population. CD40L is a pro-inflammatory molecule found both in a soluble form and expressed on activated T lymphocytes. It is pro-inflammatory to endothelial cells, has been suggested to be a marker of atherogenesis 27 and is elevated in individuals with diabetes, metabolic syndrome and insulin resistance 27 . The higher levels of these cytokines in the urban setting are consistent with the higher burden of these conditions in urban settings.
We found higher levels of serpin E1 in the urban setting compared to the rural setting and this is consistent with findings elsewhere 28 . Serpins have a role in inflammation and immune function. Serpin E1, also known as plasminogen activator inhibitor 1, is a serine protease inhibitor that inhibits tissue plasminogen activator and urokinase, which are activators of plasminogen and fibrinolysis. Increased levels of serpin E1 are a risk factor for atherosclerosis and thrombosis 30 and evidence suggests that increased levels accelerate vascular disease in patients with diabetes 31,32 . The lower levels of serpin E1 in the rural setting, therefore, accord with evidence that the rural environment has lower cardiovascular risk than the urban environment. Serpin E1 levels have been found to positively correlate with insulin resistance 33 but we did not find that in our study. On data reduction by principal component analysis, the individual findings were supported by higher levels of PC2 (characterised by various chemoattractants and vasoactive peptides) in the urban, compared to the rural environment.
The rural environment is associated with a higher microbial and parasite load and burden of chronic and recurrent infections 34 . It was therefore surprising to find no differences in the Th1, Th2, Th17 cytokines except for lower levels of IL-5 in the rural population compared to the urban population. Despite the higher burden of helminth infections in this rural setting 15 , there was no difference in the regulatory cytokines IL-10 and TGF-beta between the settings. It is possible that there is substantial exposure of individuals in this particular urban setting to infectious agents (both current and previous) sufficient to achieve a similar regulatory profile to individuals in the rural setting. Owing to the complexity of the distribution of antigens which trigger the immune response, investigating cellular profiles and responses to stimulation, to complement studies of plasma levels of cytokines, would be necessary to obtain a fuller understanding of immunological profiles in different settings. For example, contrary to our findings, a study in Senegal found that rural individuals had significantly higher expression of Th1, Th2, and Th22 cells compared to the urban population in response to stimulation 6 .
In contrast, we found lower levels of the pro-inflammatory analytes GMCSF, CCL2 and CXCL10 in the urban population compared to the rural population, underscoring the complexity of immune interactions. After data reduction with principal component analysis, the urban environment had higher values for PC3, implying lower representation of the classical inflammatory mediators IL1-beta, IL-6 and TNF-alpha. This may suggest that, in our study, the urban setting is associated with less classical inflammation than the rural environment. Despite the difficulty in interpreting the principal components, it is clear that the inflammatory profiles of individuals in the rural and urban settings are different.
Our study had a number of limitations, which we have considered. The analytes were measured in plasma and, generally, levels of most cytokines are low in plasma. This could have contributed to the inability to find any differences in the Th1, Th2 and Th17 cytokines. Stimulation of peripheral blood mononuclear cells may have resulted in higher measurable levels of the analytes. Genetic factors influencing the immune system were not studied; however, the ethnic groups in the two settings are similar 15 . Also, many tests were performed with no formal adjustment for multiple testing; therefore, chance findings cannot be ruled out. However, of the 30 analytes tested, 10 showed evidence of differing between the rural and urban settings at p<0.05 (more than the 30 x 0.05=1.5 that we would expect to see by chance alone).
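The chance-findings arithmetic above can be made explicit with a short sketch. It computes the expected number of false positives (30 x 0.05) and, under the simplifying assumption that the 30 tests are independent, the binomial probability of seeing 10 or more significant results by chance. That independence assumption is ours and does not hold for correlated analytes, so the tail probability is only a rough gauge; the function name is hypothetical.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """P(X >= k) for X ~ Binomial(n, alpha), assuming independent tests."""
    return sum(
        comb(n, i) * alpha**i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )

n_tests, alpha, observed = 30, 0.05, 10

expected = n_tests * alpha                      # expected chance findings
tail = prob_at_least(observed, n_tests, alpha)  # chance of >= 10 by luck alone

print(f"expected by chance: {expected:.1f}")
print(f"P(at least {observed} of {n_tests} significant by chance): {tail:.2g}")
```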
Taken together, our exploratory work on inflammatory analytes suggests that investigating differences in inflammatory profiles between rural and urban populations is complex but that differences between these two populations exist. This could have broad implications for health such as susceptibility to infections and response to vaccines. However, the focus of this study was on the implications for cardiometabolic risk. Considering that inflammation correlates with some metabolic diseases, inflammation may be one of the pathways explaining the difference in cardiometabolic diseases between the urban and rural environments.

quantification as these are clinically uninterpretable, even if the calibration curves were linear. As the calibration curves for 30 different analytes are not identical, the same fluorescence returns different concentrations, differing even by order(s) of magnitude. In biomedical research, statistics is an auxiliary method, helping in the interpretation of the results. An interpretation should be done strictly in a clinical context on clinically relevant data, and only these clinically relevant data (not physical units) should be statistically evaluated.

Data availability
I suppose that the problem lies in the fact that several markers were under the detection limit of the assay (which could be anticipated, as healthy subjects were studied). The statistician did not perceive this as a problem since the fluorescence units were available. The physicians did not comment on the fact that they evaluate concentrations and not fluorescence units. A physicist or biochemist, who would be alarmed on learning that fluorescence is to be evaluated and serve as a basis for clinical interpretation, was probably not in the team. This, unfortunately, resulted in an extended statistical exercise that has no clinical relevance. If the analyte is under the detection limit of the assay, this should be acknowledged. These data should be handled as unavailable instead of looking for a way to circumvent this reality, which eventually leads to misinterpretation.
I suppose that the authors should give the data as concentrations (not fluorescence units); for each analyte, the number of subjects in whom valid concentrations were obtained should be indicated; numbers of subjects should also be given for the simple correlations; and multivariate analyses should be conducted using selected data sets, namely those which yielded reliable concentrations in all, or a vast majority, of the probands. (PCA is robust, among other things, in the handling of missing data, allowing for a reasonable arbitrary selection of what percentage of data in the set must be available to include the variable in the analysis. The missing data should probably also be random, not always missing in the same subjects.)
Even though the 95% Hotelling's ellipse is not depicted in the dot plots, it is obvious that some subjects (e.g., in the lower right corner of Figure 1) are major outliers. If such outliers are retained in the analysis, their characteristics should be given, and it should be explained why they were retained and what changed after their exclusion.
It is not stated whether the median or the mean of the sum of fluorescence signal intensities has been used, but it is rather marginal in the context.
It is not indicated how the sample size was calculated and, thus, what the expected power was.

Methods -
The authors should indicate whether standard laboratory methods were used to quantify blood chemistry variables, and give the brand and manufacturer of the analyzer(s) employed. Table 1: Regarding decimals, for clarity, FPG, BP, and BMI data should be reported in the manner generally used in clinics. Indication of decimals or more decimals than necessary does not add to