All thresholds of maternal hyperglycaemia from the WHO 2013 criteria for gestational diabetes identify women with a higher genetic risk for type 2 diabetes

Background: Using genetic scores for fasting plasma glucose (FPG GS) and type 2 diabetes (T2D GS), we investigated whether the fasting, 1-hour and 2-hour glucose thresholds from the WHO 2013 criteria for gestational diabetes (GDM) have different implications for genetic susceptibility to raised fasting glucose and type 2 diabetes in women from the Hyperglycemia and Adverse Pregnancy Outcome (HAPO) and Atlantic Diabetes in Pregnancy (DIP) studies. Methods: Cases were divided into four subgroups: (i) FPG ≥5.1 mmol/L only, n=222; (ii) 1-hour glucose post 75 g oral glucose load ≥10 mmol/L only, n=154; (iii) 2-hour glucose ≥8.5 mmol/L only, n=73; and (iv) both FPG ≥5.1 mmol/L and either of a 1-hour glucose ≥10 mmol/L or 2-hour glucose ≥8.5 mmol/L, n=172. We compared the FPG and T2D GS of these groups with controls (n=3,091) in HAPO and DIP separately. Results: In HAPO and DIP, the mean FPG GS in women with a FPG ≥5.1 mmol/L, either on its own or with 1-hour glucose ≥10 mmol/L or 2-hour glucose ≥8.5 mmol/L, was higher than controls (all P <0.01). Mean T2D GS in women with a raised FPG alone or with either a raised 1-hour or 2-hour glucose was higher than controls (all P <0.05). GDM defined by 1-hour or 2-hour hyperglycaemia only was also associated with a higher T2D GS than controls (all P <0.05). Conclusions: The different diagnostic categories that are part of the WHO 2013 criteria for GDM identify women with a genetic predisposition to type 2 diabetes as well as a risk for adverse pregnancy outcomes.


Introduction
Gestational diabetes mellitus (GDM) has been variably defined since criteria were first developed over 50 years ago 1 . The World Health Organization (WHO) introduced diagnostic criteria for GDM in 1999, based on criteria for overt diabetes in the general population, with a fasting plasma glucose (FPG) ≥7.0 mmol/L or impaired glucose tolerance with a 2-hour glucose ≥7.8 mmol/L post 75 g oral glucose load as part of an oral glucose tolerance test (OGTT), measured between 24 and 28 weeks gestation 2 . However, lesser degrees of maternal fasting hyperglycaemia have long been associated with a higher risk for adverse perinatal outcomes 3 , so a FPG ≥6.1 mmol/L (indicative of impaired fasting glycaemia in the non-pregnant population 4 ) was also integrated into the WHO criteria.
The Hyperglycemia and Adverse Pregnancy Outcome (HAPO) Study 5 followed 23,316 women who underwent a 2-hour OGTT between 24 and 32 weeks gestation throughout pregnancy and found a continuous association between maternal glucose values and adverse perinatal outcomes, including birth weight ≥90th centile (large for gestational age, LGA) and primary caesarean section. In 2010, the International Association of Diabetes and Pregnancy Study Groups (IADPSG) determined cut-off values equivalent to 1.75 times the odds for adverse pregnancy outcomes at mean glucose values, resulting in diagnostic thresholds for FPG ≥5.1 mmol/L, 1-hour glucose ≥10 mmol/L and 2-hour glucose ≥8.5 mmol/L 6 . WHO adopted the recommendations of IADPSG in 2013 2 , which has resulted in a higher number of cases identified as GDM due to the lower FPG threshold (estimated up to 17.8% prevalence of GDM for IADPSG 2010 criteria 6 vs 9.4% prevalence for WHO 1999 criteria 7 ). Whilst these thresholds were chosen for their obstetric risks, the HAPO Follow-Up Study found that women diagnosed by the newer criteria have a higher risk of developing disorders of glucose metabolism, including T2D, 10 years after the episode of GDM 8 . A proportion of this risk can be attributed to genetic predisposition, since genome-wide association study (GWAS) data from large, non-pregnant population-based studies have identified multiple loci associated with FPG 9 and type 2 diabetes 10 and some of these are shared with GDM 11-16 . Specific to the WHO 2013 criteria, single nucleotide polymorphisms (SNPs) at the GCK and TCF7L2 loci were shown to be associated with FPG and 2-hour glucose levels post-OGTT in women with GDM 17 . In addition, genetic risk scores for glycaemic traits, including FPG and type 2 diabetes, have been associated with a higher odds for GDM according to the WHO 2013 criteria 18 .
However, it is not known whether the underlying genetic predisposition to fasting hyperglycaemia and type 2 diabetes varies depending on how the diagnosis of GDM is met.
The objective of this study was to investigate whether there is a difference in genetic risk for fasting hyperglycaemia and type 2 diabetes according to the different diagnostic thresholds of glucose tolerance from the WHO 2013 criteria for GDM. To do this, we used a genetic score (GS) for FPG (FPG GS) or T2D (T2D GS) (consisting of previously-identified loci 9,19 ).

Methods
Study population
Women of European ancestry (self-reported white ethnicity) with singleton pregnancies and without known pre-existing diabetes from the Hyperglycemia and Adverse Pregnancy Outcome (HAPO) Study 5 (n=2,628) and Atlantic Diabetes in Pregnancy (DIP) study 20 (n=1,084) were included. The HAPO study was an observational, multi-centre study (N=23,316 participants from 15 centres) to which women were recruited during pregnancy if they were over 18 years of age 5 . The 2,665 European-ancestry participants included in the current study were those with genotype data available on selected SNPs (see below). The DIP study had a case-control design: approximately three genotyped control participants without GDM (defined initially as a maternal FPG <5.6 mmol/L and/or 2-hour glucose post oral glucose load <7.8 mmol/L) were available for every genotyped case participant included in our analyses.

Sample collection and clinical characteristics
The study methods used in HAPO and DIP have been described in detail previously 5,7,20-22 . Maternal FPG in mmol/L was measured prior to a standard 2-hour OGTT with 75 g of glucose between 24 and 32 weeks in HAPO and 24 and 28 weeks in DIP. Information on maternal age, pre-pregnancy body mass index (BMI) and systolic blood pressure (SBP, in mmHg) was collected at the OGTT appointment. Clinical characteristics of participants in HAPO and DIP with and without GDM were different (women in DIP were older, had a higher BMI and higher SBP, all P <0.01), hence clinical characteristics (where available) have been presented separately.

GDM diagnostic criteria subgroups
We used the WHO 2013 cut-offs (previously IADPSG 2010) to define fasting and 2-hour hyperglycaemia. Thus, in the

Amendments from Version 2
We would like to thank all of the reviewers for their time in reviewing the second version of the manuscript and providing additional comments to help further improve the work.
We have responded to each of the reviewers' points in detail. The following summarises the changes made to Version 2 of the manuscript: We have clarified the source of effect sizes (Betas) for the T2D GS and added additional information on effect allele frequencies for proxies and how linkage disequilibrium was calculated in Table 1 and Table 2. We have given an indicative calculation of power of the samples to detect associations between FPG and the FPG GS in the Methods section. We have added the mean and SD FPG and T2D GS for the different groups to Table 3. We have removed the multiple instances of mmol/l in Table 4. We have confirmed that sensitivity analyses adjusting for maternal BMI and age did not affect the relationships between GSs and GDM diagnostic categories in the Results section. We have added additional information to the Discussion section describing the possible implications of small sample sizes and power, why we did not meta-analyse the results from both studies and the possibility of analysing incident type 2 diabetes in the HAPO Follow-Up Study in the future.

Genotyping
Genotyping of individual SNPs in DNA samples from both the DIP and HAPO studies was carried out at LGC Genomics (Hoddesdon, UK), using the PCR-based KASP™ genotyping assay. We first selected 41 SNPs that had been previously associated with type 2 diabetes, and 16 SNPs associated with fasting glucose in non-pregnant individuals, for genotyping in the DIP study. Overlap between the type 2 diabetes and FPG SNPs meant that seven FPG loci were also in the list of type 2 diabetes loci. The median genotyping call rate in the DIP samples was 0.992 (range 0.981-0.996), and there was >99% concordance between duplicate samples (8% of total genotyped samples were duplicates). We excluded one FPG SNP and one type 2 diabetes SNP that showed deviation from Hardy-Weinberg Equilibrium (Bonferroni-corrected P value <0.05). For details of included and excluded SNPs and their sources, see Table 1 and Table 2. In the HAPO study, we selected SNPs from the same 16 FPG and 41 type 2 diabetes loci for genotyping in women of European ancestry with DNA available. The selection and genotyping of SNPs in the HAPO study was performed at different times from that in the DIP study. Owing to the differing availability of published GWAS results at these times, the genotyped SNPs differed between HAPO and DIP at nine of the associated loci. The HAPO SNPs at the nine loci were generally well correlated with those genotyped in DIP (r² >0.7, apart from at the ADAMTS9 locus where r² = 0.45). The median genotyping call rate in the HAPO samples was 0.984 (range 0.955-0.991), and the mean concordance between duplicate samples was >98.5% (at least 1% of samples were duplicated). We excluded one SNP that showed deviation from Hardy-Weinberg Equilibrium in the HAPO study (Bonferroni-corrected P value <0.05; see Table 1 and Table 2).
After exclusion of SNPs that showed deviation from Hardy-Weinberg equilibrium and one SNP from the type 2 diabetes score whose main effect was on BMI (rs11642841, FTO locus 23 ), a total of 15 SNPs at FPG-associated loci and 38 SNPs at type 2 diabetes-associated loci were available in both studies for analysis.
All glucose values are in mmol/L. The 1-hour and 2-hour glucose measures refer to the glucose level measured at 1 and 2 hours, respectively, following a 75 g oral glucose load as part of an oral glucose tolerance test. Women with a FPG ≥5.1 mmol/L and either a 1-hour glucose ≥10 mmol/L or 2-hour glucose ≥8.5 mmol/L, or both, were combined as one group for analyses.

Generating a genetic score for FPG and type 2 diabetes
Weighted genetic scores for FPG (FPG GS) and type 2 diabetes (T2D GS) were generated using the 15 SNPs and 38 SNPs, respectively. All weights for the FPG GS and T2D GS were taken from Dupuis et al. 9 and Voight et al. 26 , respectively. The GSs were calculated by taking the sum of the number of FPG-raising or type 2 diabetes risk alleles (0, 1 or 2) for each SNP, multiplied by its corresponding beta value (effect size, see Table 1 and Table 2) for association with FPG or type 2 diabetes, divided by the sum of all beta values and multiplied by the total number of SNPs analysed (see Figure 2 for formula). GS were generated for participants with complete data for all included SNPs only.
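The weighted score described in words above (sum of beta-weighted risk allele counts, divided by the sum of the betas, multiplied by the number of SNPs) can be sketched in a few lines of Python. This is an illustrative sketch only: the analyses in this study were run in Stata, and the function name and the three beta values below are hypothetical placeholders, not values from Table 1 or Table 2.

```python
def weighted_genetic_score(allele_counts, betas):
    """Weighted GS = (sum(beta_i * count_i) / sum(beta_i)) * number of SNPs.

    allele_counts: risk (or FPG-raising) allele count per SNP, each 0, 1 or 2.
    betas: per-SNP effect sizes, in the same SNP order as allele_counts.
    """
    if len(allele_counts) != len(betas):
        raise ValueError("allele_counts and betas must have the same length")
    if any(count not in (0, 1, 2) for count in allele_counts):
        # Mirrors the complete-data rule: no missing or non-integer genotypes.
        raise ValueError("each allele count must be 0, 1 or 2")
    weighted_sum = sum(beta * count for beta, count in zip(betas, allele_counts))
    return weighted_sum / sum(betas) * len(betas)


# Hypothetical effect sizes for three SNPs (illustration only).
example_betas = [0.075, 0.067, 0.030]
example_counts = [2, 1, 0]
example_score = weighted_genetic_score(example_counts, example_betas)
```

Because of the rescaling, the score lies between 0 (no risk alleles) and twice the number of SNPs (two risk alleles at every SNP), so it can be read on the same scale as an unweighted allele count.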

Statistical analyses
Analysis of clinical characteristics. Clinical characteristics were compared between participants with and without GDM in HAPO and DIP using unpaired t-tests for normally distributed data and the Wilcoxon Rank-Sum test for non-normally distributed data. P values were corrected for 24 comparisons using the Bonferroni method.
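As a reminder of what the Bonferroni correction does, the sketch below multiplies each P value by the number of comparisons (capped at 1), which is equivalent to dividing the significance threshold by that number. It is a generic illustration written in Python rather than the Stata code used for the analyses, and the function names and example P values are hypothetical.

```python
def bonferroni_adjust(p_values, n_comparisons):
    """Return Bonferroni-adjusted P values, capped at 1.0."""
    return [min(1.0, p * n_comparisons) for p in p_values]


def bonferroni_alpha(alpha, n_comparisons):
    """Return the per-test significance threshold after correction."""
    return alpha / n_comparisons


# Hypothetical unadjusted P values, corrected for 24 comparisons.
adjusted = bonferroni_adjust([0.001, 0.05], 24)
per_test_alpha = bonferroni_alpha(0.05, 24)
```

With 24 comparisons, a nominal threshold of 0.05 corresponds to a per-test threshold of 0.05/24, which is approximately 0.002.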

Analysis of associations between FPG GS or T2D GS with glucose levels and GDM.
Associations of the FPG GS or T2D GS with FPG, 1-hour and 2-hour glucose in women with and without GDM (cases and controls) were analysed using linear regression in HAPO (which was a representative sample of European participants from the whole study cohort) and DIP. P values were corrected for 24 comparisons using the Bonferroni method. Our sample of N=2,628 HAPO and N=1,084 DIP participants both provided 100% power to detect associations between fasting glucose levels and the fasting glucose GS at α=0.002, assuming the GS explains 6% variance in glucose levels, which was estimated in 849 pregnant individuals without diabetes in an external study (Exeter Family Study of Childhood Health (EFSOCH)) 43 . Means for FPG GS and T2D GS in women with and without GDM were compared using unpaired t-tests in each study cohort separately, as the genetic scores were higher overall in DIP. P values were Bonferroni corrected for 16 comparisons. Sensitivity analyses adjusting GS for maternal pre-pregnancy BMI and age (where available) were performed using ANCOVA.
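The indicative power figure can be reproduced approximately with a simple calculation. The sketch below assumes a Fisher z-transform approximation for testing a correlation between the GS and glucose, taking the correlation as the square root of the variance explained (6%). This approximation is our own choice for illustration, not necessarily the method used for the figure quoted above, and the function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_power(n, r_squared, alpha):
    """Approximate two-sided power to detect a correlation of sqrt(r_squared).

    Uses the Fisher z-transform: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3). The negligible opposite tail is ignored.
    """
    standard_normal = NormalDist()
    z_effect = atanh(sqrt(r_squared))
    standard_error = 1.0 / sqrt(n - 3)
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    return standard_normal.cdf(z_effect / standard_error - z_critical)


power_hapo = approx_power(2628, 0.06, 0.002)
power_dip = approx_power(1084, 0.06, 0.002)
```

Under these assumptions both sample sizes give power that rounds to 100%, in line with the statement above. The same function shows how quickly power falls for much smaller samples, which is relevant to the small case subgroups.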
Statistical software. All statistical analyses were performed using Stata version 14.0 (StataCorp LP, College Station, TX, USA). P-values <0.05 were considered to indicate evidence of association, unless otherwise stated.

Ethics approval
Ethics approval was obtained from the Northwestern University Office for the Protection of Research Participants for HAPO (Protocol # 0353-001). The HAPO study protocol was approved by the institutional review board at each field center and all participants gave written, informed consent. Ethics approval was obtained from the local Galway University Hospital Research Ethics Committee for Atlantic DIP (Ref: 54/05) and all participants gave written, informed consent.

Clinical characteristics in women with and without GDM
Clinical characteristics for women with and without GDM in HAPO and DIP are summarised in Table 3. Women with a FPG ≥5.1 mmol/L (on its own or with either 1-hour or 2-hour hyperglycaemia) had a higher pre-pregnancy BMI than women without GDM in HAPO and DIP (P values <0.001). Women with both fasting and either 1-hour or 2-hour hyperglycaemia were older compared with controls in HAPO (P value <0.05 after Bonferroni correction). In HAPO we observed a higher SBP for women diagnosed with GDM by a FPG ≥5.1 mmol/L only compared with controls (P value <0.001) and they had a higher SBP when either their 1-hour or 2-hour glucose was also raised, but the P value was >0.05 after Bonferroni correction. In DIP there was a higher SBP for women diagnosed by both fasting and either 1-hour or 2-hour hyperglycaemia criteria compared with controls (P value <0.05 after Bonferroni correction).
Table 4. Adjusting for the different measures of glucose tolerance suggested that these associations were not independent of one another.

Women diagnosed with GDM by fasting glucose criteria have a higher FPG GS
We observed a higher FPG GS in women diagnosed with GDM by fasting hyperglycaemia only and by both fasting and either 1-hour or 2-hour criteria, compared with controls (Figure 3A, all P values for comparison with control group <0.05 after Bonferroni correction). There was also evidence that women with a raised 1-hour glucose only had a higher FPG GS in HAPO (P value for comparison with controls <0.01 before, but >0.05 after, Bonferroni correction), but this was not as strong in DIP (P value =0.05). In contrast, women diagnosed with GDM by 2-hour only criteria did not have a higher FPG GS overall (P values for comparison with controls >0.05 in both studies). Sensitivity analyses adjusted for maternal BMI and age did not materially alter the GS relationships (Extended data Tables 1A and 1B).

Women diagnosed with GDM by fasting, 1-hour or 2-hour criteria have a higher T2D GS than controls
The T2D GS was higher than controls in women with fasting, 1-hour or 2-hour hyperglycaemia in HAPO and DIP (Figure 3B): all P values for comparison with controls were <0.05 after correction except for the fasting and 1-hour only groups. As with the FPG GS, sensitivity analyses adjusted for maternal BMI and age did not materially affect the associations seen (Extended data Tables 1A and 1B).

Discussion and conclusions
In this study of 3,712 pregnant women of European ancestry, we have confirmed that women diagnosed with GDM according to the WHO 2013 criteria have a raised genetic risk for type 2 diabetes and shown for the first time that this risk was raised across all of the different measures of glucose tolerance. A genetic predisposition to a higher FPG was present for women who met the fasting glucose criteria (and 1-hour glucose criteria in HAPO), but was not present for women who met the 2-hour criteria.
We confirmed that FPG in pregnant women both with and without GDM was positively associated with a FPG GS generated using SNPs identified in a non-pregnant population 9 . The 1-hour and 2-hour glucose values were also correlated with the FPG GS, but this could potentially be explained by their association with FPG, since this association was not as strong once this was taken into account. Thus, the observation that the FPG GS was not higher in women diagnosed with GDM due to a 2-hour glucose ≥8.5 mmol/L alone was likely because these women did not have fasting hyperglycemia. However, larger sample sizes would be needed to confidently rule out differences in FPG GS between these groups. Maternal FPG was also associated with the T2D GS, which would be expected, as there are loci within the T2D GS which also raise fasting glucose (e.g. GCK, MTNR1B) 9 . The ADCY5 locus has also been found to be associated with 2-hour glucose values 44   for 1-hour glucose values was not available at the time of writing, but since we found the T2D GS to be associated with 1-hour glucose values in HAPO, it is likely that this explains the higher T2D GS seen in the women meeting this criterion for diagnosis of GDM, and will contribute to the higher T2D GS seen in women with both a fasting and either 1-hour or 2-hour hyperglycaemia. However, it is important to note that the relationships between the T2D GS and the different glucose categories did not appear to be independent of one another, and again, although women meeting the diagnosis for GDM in one category may not meet the thresholds for GDM in other categories, they are likely to have a degree of fasting and postprandial hyperglycaemia which will contribute to their higher genetic risk for type 2 diabetes compared with women without GDM.

Table 4. Associations for fasting plasma glucose (FPG) and type 2 diabetes (T2D) genetic scores (GS) with different measures of glucose tolerance in women with and without diabetes in the Hyperglycemia and Adverse Pregnancy Outcome Study (A) a and the Atlantic Diabetes in Pregnancy
One might expect that women with both fasting and postprandial hyperglycaemia would have the highest genetic risk for type 2 diabetes, but we did not observe this for the T2D GS in women with both a FPG ≥5.1 mmol/L and either a 1-hour glucose ≥10 mmol/L or 2-hour glucose ≥8.5 mmol/L. On the whole, the relationship between GDM and a higher T2D GS was clearest for women with a raised 2-hour glucose or a combination of raised fasting and 1-hour or 2-hour glucose, but studies with greater statistical power will be needed to confirm whether genetic risk of T2D is heterogeneous across the different thresholds of glucose tolerance that are part of the WHO 2013 criteria for GDM.
Previous studies investigating the association between genetic risk scores for glycaemic traits and GDM have provided interesting insights into the biology of GDM. For example, in a study including women from HAPO as well as a Canadian cohort-study (Gen3G), a fasting glucose genetic risk score was strongly associated with FPG in pregnant women, explaining a similar variance in FPG to the non-pregnant population 18 . Furthermore, genetic risk scores for insulin sensitivity and insulin secretion were associated with these traits in pregnancy, emphasising there to be an important shared genetic component to these both in and outside of pregnancy. Several SNPs at risk loci included in the genetic scores for FPG and type 2 diabetes risk in this study have previously been associated with GDM at genome-wide significance (CDKAL1, G6PC2, GCKR, MTNR1B 14,45 ) and at lesser-degrees of significance (e.g. HNF1A, TCF7L2, HHEX/IDE, PPARG 15,17,18,45 ). These loci have been implicated in diverse physiological processes influencing glucose metabolism, such as beta cell function and insulin secretion, and insulin resistance secondary to lipodystrophy and disrupted liver lipid metabolism 46,47 . Along with previous studies showing associations between GDM and genetic risk scores including SNPs at risk loci associated with type 2 diabetes 15,18 , our study supports the growing evidence that genetic determinants of glycaemic traits influence both of these phenotypes.
Although genetic predisposition will contribute to the underlying pathophysiology of GDM, it explains only part of GDM risk. So far, models including genetic risk scores for glycaemic traits have shown limited predictive ability 18,48 , suggesting they may not be sufficiently accurate to be used on their own in determining who should be screened for GDM. Therefore, it is still important to consider other well established risk factors, such as parity, maternal age, BMI, ethnicity and socioeconomic background in stratifying risk of GDM 5 .
This work specifically examining the genetic risk of type 2 diabetes in women diagnosed with GDM according to different measures of glucose tolerance supports the results from the recent HAPO Follow-Up Study 8 , which showed that women diagnosed with GDM post-hoc according to WHO 2013 criteria had a higher risk for type 2 diabetes 10 to 14 years after pregnancy. We observed the highest BMIs in women diagnosed with GDM by fasting hyperglycaemia only or both criteria, which is consistent with previous research showing that women diagnosed with GDM by the WHO 2013 criteria were more overweight than those diagnosed by WHO 1999 criteria 7,49 . However, the associations seen for GDM with FPG GS and T2D GS are not driven by BMI (the genetic variants included within the scores do not primarily affect FPG and T2D risk because of an effect on BMI), suggesting that women with fasting hyperglycaemia in pregnancy are likely to have both BMI-related metabolic factors and a genetic predisposition contributing to type 2 diabetes risk. Furthermore, women with an isolated 2-hour hyperglycaemia did not have a significantly higher BMI than controls, which could suggest a more important role for genetic predisposition in this group of women. However, non-genetic environmental factors related to development of type 2 diabetes such as diet, exercise and socioeconomic deprivation remain a key consideration, as genetics will explain only a portion of risk for type 2 diabetes in women with a history of GDM. For example, a recent study of 2,434 women of white ethnicity found that ~26% of women with a history of GDM and a type 2 diabetes genetic risk score in the highest quartile had developed diabetes at follow-up compared with ~23% of women in the lowest genetic risk score quartile 16 .
The GSs in this study were not analysed for their association with incident T2D in the HAPO Follow-Up Study, but this would be useful to establish and make direct comparisons with the results of this work in the future. In the longer term, although using the lower FPG threshold from the WHO 2013 criteria for identifying GDM will result in more cases diagnosed, these women will be an important target for long-term follow-up. It is not known whether lifestyle interventions such as guided diet and exercise programmes could modify risk of progressing to type 2 diabetes in women with a history of GDM and a high genetic risk. A study of 1,744 white women suggested that risk of developing type 2 diabetes after a pregnancy affected by GDM was greatest in women with a high genetic risk score and poor diet 16 . On the other hand, while the Diabetes Prevention Program (DPP) 50 trial found that lifestyle intervention or metformin treatment reduced risk of progression to type 2 diabetes in women with impaired glucose tolerance and a history of GDM (according to relevant criteria at time of diagnosis), a genetic risk score for type 2 diabetes did not influence treatment response 51 . Neither of these studies included women specifically diagnosed by WHO 2013 criteria, but it is clear from this work and that of the HAPO Follow-Up Study that these women would benefit from monitoring after pregnancy and should be considered for targeted lifestyle interventions in public health policies focussing on prevention of type 2 diabetes in adults.
There are limitations of this study that are important to consider. The small number of cases of GDM included has been mentioned and this could have meant that the study was underpowered to show clear differences in T2D GS between the different diagnostic categories. We also studied women from two different studies, where there were notable differences in clinical characteristics, even for women without GDM. Additionally, the FPG and T2D GS were consistently higher in DIP than in HAPO. This is likely to reflect differences in SNPs used to generate the genetic scores and possibly a slightly higher genetic disposition to a raised FPG and type 2 diabetes in DIP. Meta-analysis would have improved power, but was not appropriate due to these differing aspects of each study. However, there were remarkably similar patterns for the genetic score associations amongst the different diagnostic groups in both studies. The results of these analyses are therefore likely to be applicable to women of European ancestry, but further larger-scale studies, including analysis of women with diverse ancestry, will be needed to confirm the associations identified in this study.
In conclusion, women diagnosed with GDM according to the newest WHO 2013 criteria, regardless of how the diagnosis is met, have a higher genetic risk for type 2 diabetes compared with women without GDM. Overall, the criteria identify an important group of women at risk for adverse pregnancy outcomes as well as a higher risk for developing future type 2 diabetes 8 , which can partly be explained by genetic predisposition. In addition, this study has added to the literature confirming genetic predisposition to type 2 diabetes in women with GDM and supports the call for considering GDM as a key area of investigation in the field of genetics-led precision medicine.

Data availability
Underlying data
Data is not freely available due to it consisting of potentially identifiable information, and as such is held securely to protect the interests of research participants in line with the guidance from the relevant ethics committees. However, the ethics committees will allow data analysed and generated in this study to be available
The file contains an extended data table with sensitivity analyses adjusting the genetic scores for maternal pre-pregnancy BMI and age and a figure with a directed acyclic graph (DAG) showing how the relationships between the genetic scores and GDM diagnostic category are not driven by maternal pre-pregnancy BMI or age.
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Department of Medicine, McMaster University, Hamilton, ON, Canada
Hughes et al. derived fasting plasma glucose (FPG) and type 2 diabetes (T2D) genetic risk scores (GS) in participants from the Hyperglycemia and Adverse Pregnancy Outcome (HAPO) and Atlantic Diabetes in Pregnancy (DIP) studies. Pregnant women were assigned to gestational diabetes (GDM) case or control groups based on their 75 g oral glucose tolerance test results using the WHO 2013 criteria. Four different subgroups of cases were devised depending on whether participants exceeded the diagnosis threshold for fasting, 1-hour or 2-hour glucose measures, or a combination of these. The authors show that FPG GS was lower in GDM controls compared to GDM cases. This association was the strongest when comparing women with a FPG ≥5.1 mmol/L only (1-hour glucose <10 mmol/L and 2-hour glucose <8.5 mmol/L) vs. controls. T2D GS was also lower in GDM controls compared to cases. I concur with all the comments from my fellow Reviewers (1 and 2) on this 2nd version of the manuscript.
In addition, I have the following comments/suggestions:
○ Tables 1 and 2: Please specify the effect/other allele, effect allele frequency, and beta of the proxy SNPs.
○ Table 3: describe the mean and SD of the two GSs in controls as well as in each case group.
○ All non-significant results in relation to GS and GDM associations (results shown in Figure 3): There seems to be a big issue with the small samples size in each case group and hence, large SEs. Non-significant results (or results that do not pass multiple testing) are likely due to a lack of statistical power. Similarly, I do not agree with statement (P9 -discussion):" the observation that the FPG GS was not higher in women diagnosed with GDM due to a 2-hour glucose ≥8.5 mmol/L alone was expected". The authors could (and should) meta-analyze results from HAPO and DIP to increase their power. Given the differences between the two studies, adjustments for age and BMI should be considered to reduce between study heterogeneity.
Other Minor Modifications:
○ There is no mention in the methods about how LD was estimated in UK Biobank (data version, samples selection, software)
○ Table 4: please remove all the mmol/L within the table to facilitate reading, (units can be mentioned in the table's headers or in the legend).
○ Page 9: under "Women diagnosed with GDM by fasting, 1-hour or 2-hour criteria have a higher T2D GS than controls", please refer to Figure 3B instead of Figure 2B.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Partly

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Genetics and genomics of complex cardio-metabolic traits
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 08 Mar 2021
Alice Hughes, University of Exeter, Exeter, UK
We would like to thank reviewer 3 for their detailed comments on version 2 of the manuscript. We have addressed their points in turn below:
1. Tables 1 and 2: Please specify the effect/other allele, effect allele frequency, and beta of the proxy SNPs.
We have provided the effect/other alleles, EAF and beta for the proxy SNPs in the Tables 1 and 2 as requested. We used the same beta value for proxy SNPs and have clarified this in the table footnote. Thank you for alerting us to this.
2. Table 3: describe the mean and SD of the two GSs in controls as well as in each case group.
We have provided the mean and SD of the GSs to Table 3 as requested.
3. All non-significant results in relation to GS and GDM associations (results shown in Figure 3): There seems to be a big issue with the small samples size in each case group and hence, large SEs. Non-significant results (or results that do not pass multiple testing) are likely due to a lack of statistical power. Similarly, I do not agree with statement (P9 -discussion):" the observation that the FPG GS was not higher in women diagnosed with GDM due to a 2-hour glucose ≥8.5 mmol/L alone was expected". The authors could (and should) meta-analyze results from HAPO and DIP to increase their power. Given the differences between the two studies, adjustments for age and BMI should be considered to reduce between study heterogeneity.
We agree with the reviewer's comment that the wide confidence intervals and nonsignificant P values after adjustment for multiple testing is likely due to the small sample sizes and resulting low power. We mentioned the small sample size could lead to less distinct associations between the diagnostic categories in the eighth paragraph of the discussion, but have clarified this to include the fact that this could be related to low power. It now reads: "The small number of cases of GDM included has been mentioned and this could have meant that the study was underpowered to show clear differences in T2D GS between the different diagnostic categories".
Regarding the comment on it being expected that there was not a higher FPG GS in individuals with a raised 2-hour glucose, we agree that, in the context of a small study, this might reflect the sample size rather than a considerably lower genetic predisposition to raised fasting glucose. However, we do think it remains the case that women who meet the criteria for GDM because of a raised 2-hour glucose alone but not a raised fasting glucose will likely have a lower FPG GS than women who meet the criteria because of fasting hyperglycaemia, and that was supported by the results in both cohorts. We have changed the wording of this to reflect these points to: "Thus, the observation that the FPG GS was not higher in women diagnosed with GDM due to a 2-hour glucose ≥8.5 mmol/L alone was likely because these women did not have fasting hyperglycemia. However, larger sample sizes would be needed to confidently rule out differences in FPG GS between these groups."

Meta-analysis of the results from both cohorts would improve power, and we did intend to do this when we set out to complete this work. However, due to the different structures of the separate studies (in HAPO a random subset were genotyped, whilst DIP genotyped three women without GDM for every one woman diagnosed with GDM using the clinical criteria at the time of recruitment), different SNPs (proxies were used in the HAPO GS, please see the Methods section for more details), higher GSs in the DIP cohort and different proportions of women in the different diagnostic categories, we felt that meta-analysis would not be combining like-for-like. We do think that the similar patterns of association support the findings in both studies, but for the reasons mentioned we took the decision not to meta-analyse.
These reasons are shown in paragraph 8 of the discussion, but we agree that a comment making it clear that these were reasons precluding meta-analysis should be added and have done so in paragraph 8 of the discussion: "Meta-analysis would have improved power, but was not appropriate due to these differing aspects of each study".
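
For readers unfamiliar with what pooling the two cohorts would involve, the following is a minimal sketch of a fixed-effect inverse-variance meta-analysis, the simplest form of the approach the reviewer suggested. It was not performed in the study. The function name and the effect estimates and standard errors are hypothetical, chosen only to show the arithmetic. A fixed-effect model assumes both cohorts estimate the same underlying effect, which is the like-for-like assumption the response argues does not hold for HAPO and DIP.

```python
import math


def fixed_effect_meta(estimates, standard_errors):
    """Pool per-study estimates using inverse-variance weights.

    Returns the pooled estimate and its standard error.
    """
    if len(estimates) != len(standard_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    # Each study is weighted by the inverse of its variance (1 / SE^2).
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical per-cohort differences in genetic score (not study results).
cohort_estimates = [0.30, 0.50]
cohort_ses = [0.10, 0.20]

pooled, pooled_se = fixed_effect_meta(cohort_estimates, cohort_ses)
print(f"pooled estimate = {pooled:.3f}, SE = {pooled_se:.3f}")
```

The pooled standard error is smaller than either input standard error, which is the gain in power the reviewer had in mind; the estimate itself is only meaningful if the cohorts are comparable.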

There is no mention in the methods about how LD was estimated in UK Biobank (data version, sample selection, software).
Thank you for pointing out this important detail about LD estimates. We have added to the footnotes of Tables 1 and 2 how LD was calculated. We have edited Table 4 to remove the multiple instances of mmol/L to help make it more readable.

5.
6. Page 9: under "Women diagnosed with GDM by fasting, 1-hour or 2-hour criteria have a higher T2D GS than controls", please refer to Figure 3B instead of Figure 2B.
Thank you for pointing out this typo, we have corrected it accordingly.  Table 2.
Thank you for pointing this out. We have added a sentence to that effect in the Methods section (please see response 2 to reviewer 1 for more details).
expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
ancestries. Thus, the authors need to comment on how this approach could have affected their results and why they selected this approach over using a non-weighted GRS or using only estimates from the largest study in European ancestry regardless of the p-value reported. An alternative approach is to use one of the cohorts to define the estimates for each variant and then use these estimates in the second cohort to calculate the GRS.

and 95% confidence intervals for each GDM diagnostic criteria and controls in an Extended Data table which can be found at the following link: https://doi.org/10.6084/m9.figshare.14180033 The estimates and associations are very similar to those shown in the main manuscript (see Figure 3; to aid comparison we have added a column in the Extended data table with the unadjusted values). Since BMI and maternal age were not available in all participants (93% of HAPO participants and 51% of DIP participants), the sample size for the adjusted analyses was smaller and there was a corresponding attenuation of the strength of associations (as shown by wider confidence intervals). However, as can be seen by comparing the adjusted and non-adjusted GSs, the estimates were within the 95% CI of the non-adjusted associations, indicating that they were consistent between the adjusted and non-adjusted analyses. We would prefer not to include these data in the main manuscript for the reasons discussed above and in our previous response, but we have included the sentences "sensitivity analyses adjusted for maternal BMI and age did not materially alter the GS relationships (Extended data Tables 1A and 1B)" and "As with the FPG GS, sensitivity analyses adjusted for maternal BMI and age did not materially affect the associations seen (Extended data Tables 1A and 1B)." in the "Women diagnosed with GDM by fasting glucose criteria have a higher FPG GS" and "Women diagnosed with GDM by fasting, 1-hour and 2-hour criteria have a higher T2D GS than controls" sections of the results, respectively.
4. Although this was not the scope of the study, the authors should add a short comment on why the performance of these GRS for T2D in the HAPO and DIP cohorts was not studied, particularly when they mentioned that women in the HAPO cohort with GDM had a higher risk for T2D 10 to 14 years after pregnancy. This is a question that probably every reader would have.
The individuals genotyped in this study do not completely overlap with individuals included in the HAPO Follow-Up Study and, since the follow-up time has been short so far, we are unlikely to have the power to study the performance of the T2D score with incident T2D in this cohort, although this is something that would be valuable to do in the future. We agree that a comment on this would be helpful and have added the following sentence to the discussion at the end of the sixth paragraph: "The GSs in this study were not analysed for their association with incident T2D in the HAPO Follow-Up Study, but this would be useful to establish and make direct comparisons with the results of this work in the future".

The manuscript by Hughes A.E. and collaborators titled "All thresholds of maternal hyperglycaemia from the WHO 2013 criteria for gestational diabetes identify women with a higher genetic risk for type 2 diabetes" tests the hypothesis that each diagnostic criterion has a different genetic risk, using an FPG GS and a T2D GS.

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? I cannot comment. A qualified statistician is required.
genetic risk for fasting hyperglycaemia in women who had met the fasting plasma glucose criteria (≥5.1 mmol/L). We observed a higher genetic risk for type 2 diabetes across all diagnostic categories. Therefore, these criteria identify an important group of women with a genetic risk for type 2 diabetes. This is important for the clinical community to be aware of, since many different criteria for GDM are used across the world, with different thresholds (for example, in the United Kingdom the fasting glucose cut-off is 5.6 mmol/L and in Denmark a fasting glucose is not included). Our findings support the findings of the HAPO Follow-Up Study, which showed women diagnosed with GDM according to the WHO 2013 criteria had higher rates of disorders of glucose metabolism 10 years later (Reference: https://doi.org/10.1001/jama.2018.11628). The genetic risk is of particular interest for women who meet the criteria for GDM due to a high 2-hour glucose, as we did not observe a significantly higher BMI in this group of women. Therefore, if they had not been identified as having GDM, these women may not necessarily be considered a high-risk group who would require monitoring and follow-up (e.g. with an annual HbA1c). We have added to the discussion to emphasise the potential implications for long-term follow-up (paragraph 6, "Discussion and conclusion" section).
Of course, it is not known whether preventative lifestyle interventions could modify this genetic risk; we touch on this in the discussion in relation to women with a history of GDM in the Diabetes Prevention Program (DPP) trial (Reference: https://dx.doi.org/10.2337%2Fdc13-0700). A recent study suggested that a genetic score for type 2 diabetes was more strongly associated with developing type 2 diabetes in women with a history of GDM and a poor diet (Reference: https://dx.doi.org/10.1136%2Fbmjdrc-2019-000850). However, the authors of that study also point out that the relationship between the T2D genetic risk score and incident type 2 diabetes was modest. We have added to this area of the discussion to provide a more balanced account of the potential implications of a high genetic risk for type 2 diabetes, emphasising the uncertainty of whether the genetic risk could be of benefit in targeting public health interventions in relation to post-GDM care (paragraphs 6 and 7).

We agree that the genetic risk for fasting hyperglycaemia and type 2 diabetes will only explain a portion of risk for GDM. The genetic risk will not explain risks associated with BMI, maternal age, parity and socioeconomic deprivation. In addition, these genetic scores will not explain the higher risks for GDM seen in women of African and South Asian heritage, for example. Whilst this study did not aim to assess the power of the genetic scores to diagnose or predict GDM, we have added a new paragraph to the discussion that takes these considerations into account and references studies that have focussed on this (paragraph 5 of "Discussion and conclusion" section).

Having an increased genetic risk for T2D or fasting hyperglycemia is irrelevant if the authors cannot show what the clinical implications of this genetic susceptibility are. It is known that women with GDM are at higher risk for developing T2D in the future, but the real question is whether a high genetic susceptibility or other clinical factors are the main drivers of this risk in women who have already experienced GDM.
We have partially responded to the clinical implications in response to comment 1 and point to the updated version of the discussion. Similar to comment number 2, we agree that genetic risk is likely to explain only a portion of risk for type 2 diabetes and we have added to the discussion to emphasise the importance of other factors (please see paragraph 6 of the "Discussion and conclusion" section and the response to comment 1 above). However, we also believe that genetic risk for type 2 diabetes and fasting hyperglycaemia are likely to explain part of the relationship between GDM and type 2 diabetes in later life. We discuss this in more detail in paragraphs 4 and 6 of the "Discussion and conclusion" section.

A major drawback of using GS is that they cannot provide mechanistic insights, so it is unclear how the use of GS for T2D and FPG for GDM can provide any relevant information in GDM.
We believe that genetic scores provide important insights into the biology of GDM. For example, the risk loci included in the scores have been implicated in beta cell function, proinsulin secretion and impaired insulin action secondary to unfavourable metabolic patterns of adiposity and liver lipid metabolism (References: https://doi.org/10.1371/journal.pmed.1002654 and https://doi.org/10.1038/s41588-018-0084-1). The relevance of this to GDM was underlined in a recent paper by Powe et al. (Reference: https://doi.org/10.2337/db18-0203), which showed associations between genetic risk scores for fasting glucose, fasting insulin, and insulin secretion and sensitivity with GDM. In particular, they also observed strong associations between a fasting glucose GS and fasting glucose in pregnancy, similar to the associations seen outside of pregnancy, emphasising an important shared genetic predisposition. We did not seek to repeat the biological relevance of these associations demonstrated in this paper in our work, but rather show that there is a higher genetic risk for type 2 diabetes associated with the different measures of glucose tolerance in the WHO 2013 criteria for GDM. However, we agree that a discussion considering the likely underlying biology of genetic scores in relation to type 2 diabetes and GDM will help add to the context of the paper and have included a new paragraph in relation to this in the discussion (paragraph 4).

Please provide a more comprehensive explanation for the GS calculation.
We have provided a description of the genetic score (GS) calculation in the Methods section under the heading titled "Generating a genetic score for FPG and type 2 diabetes". We have also included the formula used in Figure 2. We have added a sentence referring to Tables 1 and 2 which show where the beta (effect size) was obtained for each SNP used in the score to clarify this further.
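
As a rough illustration of what a weighted genetic score involves, the sketch below sums each SNP's effect-allele count multiplied by its beta. The exact formula used in the paper is the one given in Figure 2, which is not reproduced here, so this should be read as a common general form and not as the authors' implementation. The function name, SNP labels, betas and genotypes are hypothetical placeholders, not values from Tables 1 and 2.

```python
def weighted_genetic_score(allele_counts, betas):
    """Sum of effect-allele counts (0, 1 or 2) weighted by per-SNP beta."""
    missing = set(betas) - set(allele_counts)
    if missing:
        raise ValueError(f"missing genotypes for: {sorted(missing)}")
    return sum(betas[snp] * allele_counts[snp] for snp in betas)


# Hypothetical weights and genotypes, for illustration only.
example_betas = {"snp_a": 0.10, "snp_b": 0.05, "snp_c": 0.20}
example_genotype = {"snp_a": 2, "snp_b": 0, "snp_c": 1}

score = weighted_genetic_score(example_genotype, example_betas)
print(round(score, 2))
```

Under this form, a proxy SNP that is assigned the same beta as the index SNP contributes to the score in the same way as the index SNP would.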

What is the definition of European ancestry: self-reported white, or based on population stratification analysis?
European ancestry was based on self-reported white ethnicity in both HAPO and DIP. We have added this to the Methods section under the heading titled "Study population".
7. Pre-pregnancy BMI and maternal age are well known risk factors for GDM. I suggest supporting the statement that the association seen between the GS and the different diagnostic criteria are not driven by BMI with data. We cannot assume that variants have the same impact in BMI variability in non-pregnant and pregnant populations.
The SNPs at the risk loci included in the genetic scores influence glycaemic traits independent of BMI. We did not include SNPs which had their main effect on BMI (e.g. FTO).
As BMI is not on the causal pathway between the genetic score and the outcome (GDM diagnostic criterion), it is not necessary to adjust for it in analyses, which is why it is not included in the data presented in this paper. Adjusting for BMI also has the potential to bias estimates due to collider-stratification bias (Reference: https://doi.org/10.1093/hmg/ddw433). For example, we observe a paradoxical negative correlation between the T2D GS and pre-pregnancy BMI in HAPO (Spearman's Rho -0.05, P value 0.01). This is not a true association and comes about because individuals with a higher genetic risk score are more likely to develop type 2 diabetes at a lower BMI than individuals with a lower genetic risk score. However, we agree that there are key differences in GDM risk according to BMI, which we have shown in the comparison of BMI between the different diagnostic criteria in Table 3 summarising the clinical characteristics of the participants in the studies. We have expanded on the importance of the role of BMI contributing to risk of type 2 diabetes and GDM in the discussion (see responses to Comments 1 and 3 and paragraphs 5 and 6 of the "Discussion and conclusion" section).

Table 4, please confirm that the outcomes follow normal distribution and provide results for DIP.
Fasting glucose, 1-hour and 2-hour glucose have a positive skew in both HAPO and DIP (hence their presentation of median with IQR in Table 3). A normal distribution of outcomes is not a pre-requisite for linear regression models (Reference: https://doi.org/10.1017/CBO9780511790942), so we do not think this is of concern for this analysis, but we are happy to supply additional information if the reviewer would like this.
We have now included a second part to the table with the associations in DIP (which had a similar pattern of associations to HAPO), but with a cautionary note (footnote b to Table 4) that the beta coefficients seen in a case-control study design will not be applicable to a general pregnant population due to the relative over-representation of cases of GDM. Now that we have included these analyses, P values for association have been corrected for 24 comparisons and we have updated the Methods section under the heading "Statistical analyses" accordingly.
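
To make the effect of correcting for 24 comparisons concrete, here is a minimal sketch that assumes a Bonferroni-style correction. The response does not state which correction method was used, so the method is an assumption, and the function name and the unadjusted P values are hypothetical.

```python
def bonferroni_adjust(p_values, n_comparisons):
    """Multiply each P value by the number of comparisons, capped at 1."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return [min(1.0, p * n_comparisons) for p in p_values]


N_COMPARISONS = 24  # number of comparisons stated in the response

# Hypothetical unadjusted P values, for illustration only.
raw_p = [0.0005, 0.01, 0.2]

adjusted = bonferroni_adjust(raw_p, N_COMPARISONS)
# Equivalent per-test threshold for an overall alpha of 0.05.
per_test_threshold = 0.05 / N_COMPARISONS

print(adjusted)
print(per_test_threshold)
```

Under this assumption, an unadjusted P value needs to fall below roughly 0.002 to remain below 0.05 after correction, which is why modest associations in small case groups can lose significance.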

Competing Interests:
We have no competing interests to declare.