Development and validation of an early pregnancy risk score for the prediction of gestational diabetes mellitus in Chinese pregnant women
  1. Si Gao1,2,
  2. Junhong Leng3,
  3. Hongyan Liu3,
  4. Shuo Wang4,
  5. Weiqin Li4,
  6. Yue Wang3,
  7. Gang Hu5,
  8. Juliana C N Chan6,7,
  9. Zhijie Yu8,
  10. Hong Zhu1,2,
  11. Xilin Yang1,2
  1. 1Department of Epidemiology and Biostatistics, School of Public Health, Tianjin Medical University, Tianjin, China
  2. 2Tianjin Key Laboratory of Environment, Nutrition and Public Health, Tianjin Medical University, Tianjin, China
  3. 3Department of Child Health, Tianjin Women and Children’s Health Center, Tianjin, China
  4. 4Project Office, Tianjin Women and Children’s Health Center, Tianjin, China
  5. 5Chronic Disease Epidemiology Laboratory, Pennington Biomedical Research Center, Baton Rouge, Louisiana, USA
  6. 6Department of Medicine and Therapeutics, Prince of Wales Hospital-International Diabetes Federation Centre of Education, The Chinese University of Hong Kong, Hong Kong, China
  7. 7Hong Kong Institute of Diabetes and Obesity, The Chinese University of Hong Kong, Hong Kong, China
  8. 8Population Cancer Research Program and Department of Pediatrics, Dalhousie University, Halifax, Nova Scotia, Canada
  1. Correspondence to Dr Hong Zhu; zhuhong{at}tmu.edu.cn

Abstract

Objective To develop and validate a set of risk scores for the prediction of gestational diabetes mellitus (GDM) before the 15th gestational week using an established population-based prospective cohort.

Methods From October 2010 to August 2012, 19 331 eligible pregnant women were registered in the three-tiered antenatal care network in Tianjin, China, to receive their antenatal care and a two-step GDM screening. The whole dataset was randomly divided into a training dataset (for development of the risk score) and a test dataset (for validation of performance of the risk score). Logistic regression was performed to obtain coefficients of selected predictors for GDM in the training dataset. Calibration was estimated using Hosmer-Lemeshow test, while discrimination was checked using area under the receiver operating characteristic curve (AUC) in the test dataset.

Results In the training dataset (total=12 887, GDM=979 or 7.6%), two risk scores were developed, one only including predictors collected at the first antenatal care visit for early prediction of GDM, like maternal age, body mass index, height, family history of diabetes, systolic blood pressure, and alanine aminotransferase; and the other also including predictors collected during pregnancy, that is, at the time of GDM screening, like physical activity, sitting time at home, passive smoking, and weight gain, for maximum performance. In the test dataset (total=6444, GDM=506 or 7.9%), the calibrations of both risk scores were acceptable (both p for Hosmer-Lemeshow test >0.25). The AUCs of the first and second risk scores were 0.710 (95% CI: 0.680 to 0.741) and 0.712 (95% CI: 0.682 to 0.743), respectively (p for difference: 0.9273).

Conclusion Both developed risk scores had adequate performance for the prediction of GDM in Chinese pregnant women in Tianjin, China. Further validations are needed to evaluate their performance in other populations and using different methods to identify GDM cases.

  • gestational diabetes mellitus
  • risk factor modeling
  • behavioral interventions
  • epidemiology
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Significance of this study

What is already known about this subject?

  • Gestational diabetes mellitus (GDM) is prevalent globally. Lifestyle modification before the 15th gestational week can reduce the risk of GDM.

What are the new findings?

  • We have developed a set of clinical risk scores for the prediction of GDM among Chinese pregnant women before the 15th gestational week and at the screening for GDM.

  • The performance of the two risk scores was adequate with good calibration and moderate discrimination.

  • The first risk score (including six baseline predictors) was preferentially recommended with respect to its acceptable validation and relative simplicity.

How might these results change the focus of research or clinical practice?

  • Further validations are needed to evaluate the performance of the risk scores in other populations. In addition, randomized controlled trials are required to verify whether women identified with high risk of GDM by our risk scores can benefit more from early lifestyle intervention than those identified with low risk, so that lifestyle intervention can be done in a more cost-effective manner.

Introduction

Gestational diabetes mellitus (GDM) is prevalent, affecting about 16.4% of women globally and 25.0% in the Southeast Asia region.1 GDM is associated with short-term and long-term adverse health consequences for both the mother and her offspring. Women with GDM are at increased risk of perinatal morbidity.2 These women are also at particularly high risk of diabetes and cardiovascular disease in later life.3 4 Offspring born to women with GDM are at increased risk of obesity in childhood5 and cardiovascular disease traits in adulthood.6

Several randomized controlled trials (RCTs) demonstrated that lifestyle intervention among patients with GDM could improve pregnancy outcomes and reduce insulin resistance in the female offspring around 5 years of age.7 8 However, no studies have reported that intervention for GDM during pregnancy was able to reduce the long-term risk of diabetes in this high-risk group of women or the risk of childhood obesity in offspring of GDM mothers.9 10 It is therefore critical to prevent the occurrence of GDM. In this regard, several published RCTs, such as UPBEAT,11 RADIEL,12 and St CARLOS,13 have demonstrated that interventions on modifiable risk factors or lifestyle during pregnancy could decrease the incidence of GDM among pregnant women. Early interventions, for example with the Mediterranean diet, have shown benefits in women even at low risk of GDM13 or diagnosed as GDM.14 Furthermore, our meta-analysis15 found that lifestyle modification before the 15th gestational week (GW) could reduce the risk of GDM, but such intervention was ineffective once the pregnancy advanced beyond the 15th GW. We also showed that the benefits of lifestyle intervention were not limited to overweight or obese women but extended to women with normal body weight prior to pregnancy. Therefore, the key issue is to identify the group at high risk of GDM before the 15th GW, or in early pregnancy, so that lifestyle intervention can be done in a more cost-effective manner.

To achieve this purpose, some risk scores have been developed for the prediction of GDM.16 However, to date, almost all the risk score models have been derived from European or North American countries or Australia, such as the UK,17 Germany,18 the Netherlands,19 Canada,20 the USA,21 22 and Australia,23 24 and only a few were from Asian25 or African populations.26 Thus, a specific risk model targeting Asian populations is urgently needed because of the heterogeneity of different ethnicities. Besides, the internal and external validity of some previous risk scores might be limited due to relatively small sample size,17 22 retrospective27 or cross-sectional design,26 or single-center source of sample.19 24 25 In addition, changes of lifestyle and behaviors during pregnancy were not considered in most risk score models.16

Thus, the current study, using an established population-based prospective cohort in Tianjin, China, aimed to develop and validate risk scores for the prediction of GDM based on baseline characteristics and during-pregnancy modifiable risk factors.

Materials and methods

Study population and settings

This study was conducted in Tianjin, a metropolitan city in Northern China, ranking fourth in population size (14 million in 2012) among Chinese cities. Antenatal care in Tianjin urban districts was delivered by a three-tiered antenatal care network (consisting of primary, secondary, and tertiary hospitals) in a relatively structured manner.28 29 In brief, all pregnant women were registered at a primary hospital and received antenatal care there until the 32nd GW. Then, they were referred to one of the secondary or tertiary care hospitals of their choice for continued care until delivery.

From October 2010 to August 2012, 22 302 pregnant women were registered to receive their antenatal care and attended the screening for GDM. The detailed methods of establishment of this cohort were described previously.30–32

Screening for and diagnosis of GDM

A two-step GDM screening procedure, which was initiated in 1998, was used for the screening of GDM. First, all pregnant women were offered 50 g 1-hour glucose challenge test (GCT) at primary hospitals between 24th and 28th weeks of gestation. Then, women with plasma glucose (PG) at GCT ≥7.8 mmol/L were referred to Tianjin Women and Children’s Health Center for a standard 75 g 2-hour oral glucose tolerance test (OGTT).

When the International Association of Diabetes and Pregnancy Study Groups (IADPSG) criteria were developed in 2010, we changed from the old WHO criteria for GDM to the IADPSG criteria. GDM was defined by meeting any one of the cut-off values: fasting PG ≥5.1 mmol/L, 1-hour PG ≥10.0 mmol/L, or 2-hour PG ≥8.5 mmol/L.33 However, to maintain the logistics and operation of the screening and management system, that is, GCT at primary care hospitals and OGTT at a centralized GDM clinic within Tianjin Women and Children’s Health Center, we continued to use a two-step procedure to identify GDM. Considerations for the use of the two-step procedure are available in previous publications.30–32

Data collection

Data were collected longitudinally using self-administered questionnaires and anthropometric and laboratory measurements at two time points: at registration for pregnancy (≤15th GW, mean±SD: 10.2±1.9) and at the time of GCT (between the 24th and 28th GWs, mean±SD: 24.8±2.5).30–32 First, baseline information, for example, demographic and socioeconomic information, lifestyle, and personal and family history of disease, was collected at registration; then, at the time of GCT, information on changeable lifestyle factors was remeasured and recorded, such as sleeping time and quality, smoking, physical activity, and weight gain from registration to GCT.

The definitions of the variables were as follows: maternal age at registration was calculated as the period in years from the date of birth to the date of registration. Family history of diabetes was defined as having one or more first degree relatives with diabetes. Active smoking before pregnancy or during pregnancy was defined as continuously smoking one or more cigarettes per day for at least 6 months before pregnancy or smoking one or more cigarettes per day during pregnancy. Passive smoking information was collected by asking “are you currently exposed to cigarette smoking from others in working and/or living places before pregnancy or during this pregnancy?” Information on sleeping status during pregnancy was collected by asking two questions: “how many hours of sleep (including nap) did you get during the index pregnancy?” and “how did you feel about your sleep quality during the index pregnancy, good, moderate or poor?”34 Physical activity (including occupational, commuting, leisure-time, and housework physical activity) during pregnancy was assessed and categorized into low level and middle-to-high level.31 Sitting time at home referred to the hours daily spent on sitting at home, including watching TV, reading, using the computer, and other sitting times at home, including meal time. Detailed definitions of these variables could be referred to our previous reports.30–32

Maternal height, weight, waist circumference, and blood pressure (BP) were measured by uniformly trained nurses at primary care hospitals using a standardized protocol.30 Body weight at registration was treated as prepregnancy weight due to small weight gain during the first 12 GWs.35 Weight gain from registration to GCT was calculated as the difference in body weight from registration to GCT. Weight change during pregnancy was also assessed using gestational weight gain rate (GWGR, kg/week) according to the following formula: [equation image not available]. We categorized GWGR as inadequate, adequate, or excessive according to the 2009 Institute of Medicine guidelines.36 Body mass index (BMI) was calculated as weight in kilograms divided by the square of body height in meters. Obesity and overweight were defined by the criteria recommended by the Working Group on Obesity in China,37 that is, underweight: BMI <18.5 kg/m2, normal weight: BMI 18.5 to 23.9 kg/m2, overweight: BMI 24.0 to 27.9 kg/m2, and obesity: BMI ≥28.0 kg/m2.
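The BMI definition and the Chinese cut-offs quoted above can be illustrated with a minimal Python sketch. The function names and the example woman are hypothetical, and treating the categories as contiguous (for example, any value from 18.5 up to but not including 24.0 is normal weight) is an assumption about how values falling between the rounded bounds are handled.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by the square of height (m)."""
    return weight_kg / height_m ** 2


def bmi_category_china(bmi_value: float) -> str:
    """Classify BMI (kg/m^2) with the cut-offs quoted in the text."""
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 24.0:  # assumption: 23.9 < BMI < 24.0 counts as normal weight
        return "normal weight"
    if bmi_value < 28.0:
        return "overweight"
    return "obesity"


# Hypothetical woman: 60 kg and 1.62 m at registration
value = bmi(60.0, 1.62)
print(round(value, 1), bmi_category_china(value))
```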

ABO blood types and serum alanine aminotransferase (ALT) were measured after an overnight fasting. ABO blood types were determined by hemagglutination reactions between antigens and antibodies by the slide method. ALT was measured using an automated enzymatic method (Toshiba TBA-120FR, Japan).

Women with any of the following conditions were excluded from data analysis: a history of type 1 or type 2 diabetes before pregnancy, registration for first antenatal care after the 15th GW, or failure to complete the two-step GDM screening procedure.

Statistical analysis

IBM SPSS Statistics V.19.0 (IBM SPSS, Chicago, Illinois, USA) was used to perform the statistical analyses. The characteristics of the study population were summarized by means±SD for continuous variables and by percentages for categorical variables. Characteristics at the first antenatal care visit and the change of lifestyle during pregnancy (from prepregnancy to GCT) were compared between GDM and non-GDM groups using Student’s t-test or χ2 test where appropriate.

The dataset was randomly divided into two parts using a computer-generated random number: the training dataset and the test dataset, with the ratio of sample size of 2:1. The training dataset was used to develop the risk score and the test dataset was used to validate its performance. In view of the relatively big sample size, simple randomization method, rather than block randomization or stratified randomization, was used for allocation without need to consider any variables or characteristics of participants.
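A simple 2:1 random allocation of this kind might look like the following Python sketch. It is not the authors' procedure (they used SPSS); the function name, the seed, and the use of integer participant IDs are illustrative assumptions.

```python
import random


def split_two_to_one(ids, seed=2012):
    """Randomly allocate IDs to training and test sets in a 2:1 ratio.

    Simple randomization: one shuffle of the whole list, with no
    blocking or stratification. The seed is an arbitrary assumption.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * 2 / 3)
    return shuffled[:n_train], shuffled[n_train:]


# 19 331 hypothetical participant IDs, matching the cohort size
train_ids, test_ids = split_two_to_one(range(19331))
print(len(train_ids), len(test_ids))  # prints "12887 6444"
```

With 19 331 participants, rounding two-thirds gives 12 887 for training and 6444 for testing, which matches the sizes reported in the abstract.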

Risk score development

We chose to develop two sets of risk scores, one only including predictors collected at the first antenatal care visit for early prediction of GDM and the other also including risk factors collected during pregnancy, that is, at the time of GCT.

In the training dataset, binary logistic regression was performed to obtain ORs and 95% CIs of related factors for GDM. The dependent variable was the development of GDM, and the candidate independent variables were the characteristics of participants before and during pregnancy which had a univariate significance level of p<0.20 and/or were judged to be of clinical importance and/or had been proved to be associated with GDM by our previous analyses. In multivariate logistic regression, enter method rather than stepwise method was used for the selection of independent variables to avoid overfitting, and only statistically significant variables (p<0.05) remained in the model. The shrinkage factor was calculated using (χ2-k)/χ2, where χ2 denotes the likelihood ratio χ2 and k the number of the predictors in the model (below 0.85 raises concern of overfitting). If necessary, the regression coefficients of the predictors were multiplied by the shrinkage factor (uniformly shrunken) to adjust for optimism.38 All continuous independent variables, such as age, BMI, weight gain, systolic BP, ALT, and body height were included in the model without being categorized with the aim of minimizing the loss of information caused by dichotomization.
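The shrinkage calculation described above is simple arithmetic, sketched here in Python. The function names, the likelihood ratio χ2 of 400, and the example coefficient are hypothetical inputs chosen for illustration and are not values reported by the study; how the intercept is handled after shrinkage is not covered.

```python
def shrinkage_factor(lr_chi2: float, k: int) -> float:
    """Heuristic shrinkage factor (chi2 - k) / chi2.

    lr_chi2 is the model likelihood ratio chi-square and k is the
    number of predictors in the model.
    """
    return (lr_chi2 - k) / lr_chi2


def shrink_coefficients(coefs: dict, factor: float) -> dict:
    """Uniformly shrink slope coefficients (intercept handling not shown)."""
    return {name: beta * factor for name, beta in coefs.items()}


# Hypothetical example: likelihood ratio chi2 of 400 with six predictors
factor = shrinkage_factor(400.0, 6)
print(round(factor, 3))  # prints 0.985
print(factor < 0.85)     # False: no concern of overfitting by the 0.85 rule
print(shrink_coefficients({"example_predictor": 0.5}, factor))
```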

The interactions between independent variables were assessed by generating new variables (the value of “1” indicated that either of the two variables was abnormal, and the value of “0” indicated that both variables were normal) and entering them into the model.

Validation of the developed risk scores

Validation of the developed risk score was performed in the test dataset. Calibration and discrimination were used to check the performance of the developed risk score. First, the Hosmer-Lemeshow χ2 test was used to check the calibration. Pregnant women in the test dataset were divided into deciles according to their predicted probability of GDM. The observed and expected probabilities of GDM in the deciles were compared using the Hosmer-Lemeshow test (df=8). A p value of more than 0.10 indicated similarity in the predicted and observed probability, that is, an acceptable calibration. Second, discrimination was assessed using area under the receiver operating characteristic curve (AUC). Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) at different cut-off points of the risk score were calculated for possible use of the risk score in different antenatal care scenarios.
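The two validation measures can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the SPSS procedure used in the study: the function names are hypothetical, the decile boundaries are formed by simple equal-size slicing of the sorted predictions, the AUC uses the pairwise (Mann-Whitney) definition, and the data are simulated.

```python
import random


def hosmer_lemeshow_chi2(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow statistic over groups of ascending predicted risk."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events in the group
        expected = sum(p for p, _ in chunk)   # expected events in the group
        # Events and non-events both contribute to the statistic
        chi2 += (observed - expected) ** 2 / expected
        chi2 += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return chi2


def auc(y_true, y_score):
    """AUC as the share of case/non-case pairs ranked correctly (ties count 1/2).

    Quadratic in sample size, which is acceptable for a small illustration.
    """
    cases = [s for s, y in zip(y_score, y_true) if y == 1]
    controls = [s for s, y in zip(y_score, y_true) if y == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))


# Simulated, well-calibrated data (hypothetical, not study data)
rng = random.Random(0)
probs = [rng.uniform(0.02, 0.30) for _ in range(2000)]
outcomes = [1 if rng.random() < p else 0 for p in probs]
print("Hosmer-Lemeshow chi2:", round(hosmer_lemeshow_chi2(outcomes, probs), 2))
print("AUC:", round(auc(outcomes, probs), 3))
```

The statistic is compared with a χ2 distribution with df equal to the number of groups minus 2, which gives the df=8 stated above for deciles.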

Results

Characteristics of the study population

Among 22 302 pregnant women, we sequentially excluded 21 women who had a history of type 1 or type 2 diabetes before pregnancy, 936 who registered and attended their first antenatal care after the 15th GW, 1163 women who did not undergo GCT, and 851 women who had a positive GCT but did not undergo OGTT. Finally, 19 331 women were included in the analysis (online supplementary figure 1).

Among 19 331 eligible participants, 1485 women (7.68%) developed GDM. Women with GDM were more likely to be older, of Han ethnicity, and multiparous, and to have non-AB blood type, habitual use of tobacco before pregnancy, and a family history of diabetes in first-degree relatives. They also had higher BMI, waist circumference, BP, and ALT, but shorter height, than those without GDM. Besides, during pregnancy, passive smoking, shorter (<7 hours/day) or longer (≥9 hours/day) duration of sleep, and more sitting time at home were also more common among GDM cases than their counterparts. No significant differences were found between the two groups with respect to education, weight gain, and physical activity during pregnancy (table 1).

Table 1

Characteristics of participants according to the occurrence of GDM

In addition, we compared the basic characteristics of participants between the training dataset and the test dataset and found the two groups had good similarity with respect to almost all profiles, except for active smoking during pregnancy (p=0.006) and sitting hours per day during pregnancy (p=0.029), which demonstrated that our simple random allocation method was reasonable (data not shown).

Risk score development

The training dataset had 979 or 7.6% GDM cases (n=12 887). The selected predictors, their regression coefficients (β), and ORs for the first risk score and the second one were listed, respectively, in model 1 and model 2 of table 2.

Table 2

Parameter estimates of the risk score models for the prediction of GDM in the training dataset

Among the potential predictors collected at the first antenatal care visit, non-AB blood type, active smoking before pregnancy, additive interactions between overweight and high ALT, and additive interactions between overweight and height were no longer significant in multivariate analysis and thus not included in the first risk score. Waist circumference was also not recruited in the final model because of its collinearity with BMI at registration. Consequently, the first risk score consisted of six baseline predictors: maternal age, BMI at registration, body height, SBP, ALT, and family history of diabetes in first-degree relatives. Their β coefficients were shown in model 1 of table 2.

Based on the first risk score, we further tested the predictive values of the variables collected at GCT, like physical activity during pregnancy, sitting time at home during pregnancy, weight gain from registration to GCT, passive smoking during pregnancy, sleeping quality and sleeping time during pregnancy, as well as the additive interaction between overweight and passive smoking. We found that sleeping quality and sleep time during pregnancy and additive interaction between overweight and passive smoking were not significant in multivariate analysis. Consequently, the second risk score consisted of physical activity during pregnancy, sitting time at home during pregnancy, weight gain from registration to GCT, passive smoking during pregnancy, as well as the predictors in the first risk score. Their β coefficients were shown in model 2 of table 2.

The shrinkage factor of model 1 and model 2 was 0.985 and 0.968, respectively, significantly higher than the value of overfitting criteria (λ=0.85), indicating that the performance of the two models was only overestimated by 1.5% and 3.2%, respectively. Thus, it was not vitally necessary for our data to make adjustment of parameters by shrinkage factor. Based on the unadjusted values of β coefficients, the final risk score of GDM in model 1 and model 2 were constructed as follows:

Model 1: GDM risk score=0.0941×maternal age (year)+0.1278×BMI at registration (kg/m2)+0.0093×SBP (mm Hg)+0.6816×Log10(ALT) (U/L)+0.5129×family history of diabetes (1 if yes, 0 if no)−0.0270×body height (cm)−5.7469.

Model 2: GDM risk score=0.0978×maternal age (year)+0.1366×BMI at registration (kg/m2)+0.013×SBP (mm Hg)+0.7004×Log10(ALT) (U/L)+0.4909×family history of diabetes (1 if yes, 0 if no)−0.0215×body height (cm)−0.2374×physical activity during pregnancy (1 if middle-to-high level, 0 if low level)+0.1825×sitting time at home (1 if <2 hours/day, 2 if 2–4 hours/day, 3 if >4 hours/day)+0.0351×weight gain (kg)+0.3058×passive smoking during pregnancy (1 if yes, 0 if no)−8.0732.
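The two equations above can be evaluated with the following Python sketch. The coefficients and constants are copied from the equations; the dictionary keys, function names, and the example woman are hypothetical. Converting the score to a probability with the logistic function is an assumption that follows from the logistic regression described in the methods, and whether the cut-off values in table 3 refer to exactly this scale is not confirmed by the text reproduced here.

```python
import math

MODEL1 = {"age": 0.0941, "bmi": 0.1278, "sbp": 0.0093, "log10_alt": 0.6816,
          "family_history": 0.5129, "height_cm": -0.0270}
MODEL1_CONSTANT = -5.7469

MODEL2 = {"age": 0.0978, "bmi": 0.1366, "sbp": 0.013, "log10_alt": 0.7004,
          "family_history": 0.4909, "height_cm": -0.0215,
          "physical_activity": -0.2374,  # 1 if middle-to-high level, 0 if low
          "sitting_category": 0.1825,    # 1: <2 h/day, 2: 2-4 h/day, 3: >4 h/day
          "weight_gain_kg": 0.0351,
          "passive_smoking": 0.3058}     # 1 if yes, 0 if no
MODEL2_CONSTANT = -8.0732


def risk_score(values: dict, coefs: dict, constant: float) -> float:
    """Linear combination of predictors as written in the equations."""
    return sum(coefs[name] * values[name] for name in coefs) + constant


def predicted_probability(score: float) -> float:
    """Assumed logistic transform of the score into a probability."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical woman (not a study participant)
woman = {"age": 30, "bmi": 24.0, "sbp": 110, "log10_alt": math.log10(20),
         "family_history": 1, "height_cm": 160,
         "physical_activity": 1, "sitting_category": 2,
         "weight_gain_kg": 8.0, "passive_smoking": 0}

score1 = risk_score(woman, MODEL1, MODEL1_CONSTANT)
score2 = risk_score(woman, MODEL2, MODEL2_CONSTANT)
print(round(score1, 3), round(predicted_probability(score1), 3))
print(round(score2, 3), round(predicted_probability(score2), 3))
```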

Risk scores validation

The test dataset had 506 or 7.9% GDM cases (n=6444). The first risk score (model 1) had an acceptable calibration, with the predicted probabilities of GDM being similar to the observed probabilities (χ2 for Hosmer-Lemeshow test=10.052, p>0.25). In the same way, the second risk score (model 2) had similar predicted probabilities of GDM with the observed ones (χ2 for Hosmer-Lemeshow test=7.995, p>0.25) (figure 1).
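As a rough consistency check, the reported Hosmer-Lemeshow statistics can be converted to p values using the df=8 stated in the methods. The sketch below uses the closed-form upper tail of the χ2 distribution that holds for even degrees of freedom; the function name is hypothetical.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


p_model1 = chi2_sf_even_df(10.052, 8)  # statistic reported for model 1
p_model2 = chi2_sf_even_df(7.995, 8)   # statistic reported for model 2
print(round(p_model1, 3), round(p_model2, 3))
```

Both values come out above 0.25 (approximately 0.26 and 0.43), in line with the p>0.25 reported for each model.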

Figure 1

The predicted and observed probability of gestational diabetes mellitus (GDM) based on model 1 and model 2 in the test dataset.

The first risk score achieved an AUC of 0.710 (95% CI: 0.680 to 0.741). After further including physical activity during pregnancy, sitting time at home during pregnancy, weight gain from registration to GCT, and passive smoking during pregnancy, the discrimination of the second risk score improved slightly (AUC: 0.712, 95% CI: 0.682 to 0.743), but the improvement was not statistically significant (p=0.9273) (figure 2).

Figure 2

Area under receiver operating characteristic (ROC) curve (AUC) of model 1 and model 2 in test dataset. Se, sensitivity; Sp, specificity.

In the second risk score, we also evaluated the predictive value of GWGR by recruiting GWGR, instead of weight gain from registration to GCT, into the second risk score. We found that the proportion of inadequate, adequate, or excessive weight gain in our whole population were 17.8%, 24.9%, and 53.7%, respectively. In univariate analysis, excessive GWG might increase the risk of GDM relative to adequate GWG, with the OR (95% CI) of 1.26 (1.07 to 1.48) (p=0.006), but this significance disappeared in the multivariate analysis (model 2), with the OR (95% CI) of 0.95 (0.76 to 1.18) (p=0.641) (data not shown). Besides, recruiting GWGR in the model 2 did not improve the AUC of model 2 (0.706, 95% CI: 0.684 to 0.729) when compared with including weight gain into the model (AUC: 0.712, 95% CI: 0.682 to 0.743). Based on all these results, and considering the simplicity of the model, we still recommended using weight gain, rather than GWGR, as a GDM predictor in the second risk score.

Sensitivity, specificity, PPV, and NPV of the risk scores at different cut-off points were summarized in table 3. To avoid missed diagnosis of GDM (false negatives), relatively low cut-off values of the risk score should be recommended. For example, at the cut-off point of 2.50 for the first risk score, sensitivity, specificity, PPV, and NPV were 93.3%, 25.1%, 9.6%, and 97.8%, respectively. For the second risk score, at the cut-off point of 4.7, sensitivity, specificity, PPV, and NPV were 93.5%, 20.9%, 9.2%, and 97.4%, respectively. Using these two cut-off values, more than 93% of patients with GDM could be identified, with a missed diagnosis rate of less than 7%. If we applied a risk score of 2.80 in model 1 as a threshold to identify women “at high risk” for GDM, 57.4% of all women would need to undergo OGTT or receive preventive intervention. The corresponding PPV and NPV were 11.2% and 96.7%, respectively. For the second risk score, if the threshold to proceed to diagnostic test was set at, for example, 5.1, 54.7% of all women would be subjected to OGTT. The PPV and NPV were 11.1% and 96.6%, respectively.
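The relationship between these four measures can be checked with a short Python sketch that derives PPV and NPV from sensitivity, specificity, and the prevalence of GDM in the test dataset (506 of 6444). The function name is hypothetical; the inputs are the figures quoted above for the lower cut-off of each model.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


prevalence = 506 / 6444  # GDM cases in the test dataset

# Model 1 at cut-off 2.50: sensitivity 93.3%, specificity 25.1%
ppv1, npv1 = predictive_values(0.933, 0.251, prevalence)
print(f"PPV={ppv1:.1%}, NPV={npv1:.1%}")  # prints "PPV=9.6%, NPV=97.8%"

# Model 2 at cut-off 4.7: sensitivity 93.5%, specificity 20.9%
ppv2, npv2 = predictive_values(0.935, 0.209, prevalence)
print(f"PPV={ppv2:.1%}, NPV={npv2:.1%}")  # prints "PPV=9.2%, NPV=97.4%"
```

The low PPV alongside a high NPV reflects the modest prevalence of GDM: at these cut-offs most women flagged as high risk do not develop GDM, while few women below the cut-off do.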

Table 3

Sensitivity, specificity, predictive values at selected risk score and the predicted probability of GDM in test dataset

Moreover, sensitivity analysis was conducted to account for the missing OGTT data (1163 women who did not undergo GCT and 851 women who had a positive GCT but did not undergo OGTT). First, basic characteristics (variables in table 1) of this population were compared with those of women who had completed the two-step GDM screening procedure and were included in our analysis (n=19 331). Almost all profiles were similar between the two groups, except for education and parity. Women who did not receive GCT or OGTT were more likely to be multiparous (7.5% vs 3.5%) or to have less than 12 years of education (23.5% vs 16.7%) than their counterparts. Second, multiple imputation was conducted,39 and the missing OGTT measurements were estimated on the basis of the results of the GCT tests as well as the characteristics of the participants (variables in table 1). We found that the predictors included in the two risk score models based on the imputed data were the same as those based on the data without imputation, and no significant changes in calibration and discrimination were observed. Besides, excluding 686 women who had given birth at least once did not influence the performance of the two models (data not shown).

Discussion

Our study developed and validated a set of early pregnancy risk scores for the prediction of GDM using a representative sample of 19 331 pregnant women in Tianjin, China. We found that six predictors collected at the first antenatal care visit (maternal age, BMI, height, systolic BP, ALT, and family history of diabetes in first-degree relatives) and four during-pregnancy modifiable risk factors (physical activity, sitting time at home, passive smoking, and weight gain from registration to GCT) were associated with the risk of GDM. The first risk score including only baseline variables and the second risk score including both baseline and during-pregnancy variables had similar and acceptable calibration (both p for Hosmer-Lemeshow test >0.25) and discrimination (AUC for the first and the second risk score was 0.710 (95% CI: 0.680 to 0.741) and 0.712 (95% CI: 0.682 to 0.743), respectively).

Although numerous risk factors for GDM have been identified, the ability to accurately identify women before or early in pregnancy who are at high risk of GDM and could benefit most from interventions remains limited. Only a few studies have summarized their results and developed prediction models or scoring systems to estimate the risk of GDM individually. Recently, Lamain-de Ruiter et al16 performed an external validation of 12 published GDM prediction models. They found that most of the published models showed acceptable discrimination and calibration, with AUCs ranging from 0.67 to 0.78. Of these 12 models, 2 models19 24 were assessed by another researcher in a cohort of 510 Finnish women.40 However, the results showed that both models underestimated the GDM incidence in this population. These inconsistent results suggested marked heterogeneity of GDM in different populations.

Our risk scores based on a Chinese population achieved similar calibration and discrimination to those based on European or North American populations. Some similar predictors16 30 40–45 have been identified in our analysis, like maternal age, maternal BMI, family history of diabetes, systolic BP, and ALT level. However, our study did not observe a significant association of GDM with history of GDM or ethnicity. The non-significant association of GDM with GDM history was partly due to the overwhelming proportion of nulliparous women (95.9%) in our cohort, who had no previous pregnancy and thus no chance to have had GDM. Hence, based on our data, the role of GDM history could not be assessed thoroughly. As for ethnicity, the ethnic heterogeneity of our study was lower than that of Lamain-de Ruiter’s report, which included Caucasian, African, Asian, mixed, and other ethnicities. This might partly explain the non-significant association between ethnicity and GDM risk in our data. External validation is needed to test whether our risk score could be generalized to other Asian populations, as well as to non-Asian ethnicities.

Short body height was observed to be associated with an increased risk of GDM in our data. A similar conclusion was also drawn from a Korean study.41 Some scholars46 suspected that the association between short height and increased risk of GDM was seen particularly among Asians and may not have sufficient biological plausibility to warrant use as a GDM predictor. In our opinion, even though body height might not have a causal association with GDM, the significant improvement in the performance of the risk score after inclusion of height in the model convinced us that keeping this variable in the model was reasonable. The role of height in predicting GDM among non-Asian populations should be further studied.

Moreover, four during-pregnancy modifiable risk factors (physical activity, sitting time at home, passive smoking, and weight gain from registration to GCT) were found to be associated with the risk of GDM. Although adding these four indicators into the risk score model did not increase the AUC significantly, their value for clinical intervention was potentially large. Lifestyle interventions such as increasing physical activity, decreasing sitting time, keeping reasonable weight gain, and avoiding passive smoking should be promoted during pregnancy. Nevertheless, it remains unclear whether women at high risk of GDM could benefit more from early intervention than those at low risk. Further studies are needed to clarify which kind of intervention strategy (whole-population strategy or high-risk population strategy) is more cost-effective.

In our study, two risk scores and their corresponding cut-off values were developed. To simplify the utility of risk scores in clinical practice, the first model was preferentially recommended with respect to its acceptable validation and relative simplicity (only including six easily detected variables). The cut-off value of 2.80 or 3.00 could be used before the 15th GW to identify the high-risk pregnant women. However, these recommended threshold values were arbitrary. To determine the optimal threshold applied to diagnostic testing, more information should be obtained, such as the feasibility of the model in practice, the preferences of obstetricians, the incidence of GDM, and the costs and the availability of diagnostic testing.19

A strength of our study is that the risk score models were developed and evaluated in a prospective cohort with a large sample size and a sufficient number of GDM cases. Because the sample was an unselected population of pregnant women registered in the three-tiered antenatal care network, it is likely to be well representative.

However, our study has some limitations. First, our risk score was derived and validated in a population of pregnant women in whom GDM was identified using a two-step procedure. Further validation studies in other care settings, such as those with different antenatal care systems and different GDM identification procedures, are warranted. Presumably, the absolute risk of GDM would need to be recalibrated upward in places where one-step GDM identification procedures are in use. Second, diet information before and during pregnancy was not collected, in consideration of the feasibility of the survey and the simplicity of the models. Instead, BMI at registration and gestational weight gain were selected as predictors based on the present evidence that these two anthropometric measures are closely associated with diet quality.47 48 To our knowledge, only one risk score model has included diet as a predictor of GDM.23 Further studies are needed to identify sensitive and valid diet-related items for GDM prediction. Moreover, external validation of our risk scores is required to evaluate the generalizability and applicability of our findings in other populations and settings.
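The upward recalibration mentioned under the first limitation could take several forms. One simple option, sketched below in Python, shifts each predicted risk on the logit scale by the difference in log-odds between the GDM prevalence in the new setting and that in the development setting, leaving the predictor coefficients unchanged. This intercept-only adjustment is an assumption made for illustration, not a method used in this study, and the function name `recalibrate_risk` and the prevalence values are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def recalibrate_risk(
    predicted_risk: float, source_prevalence: float, target_prevalence: float
) -> float:
    """Shift a predicted risk by the difference in prevalence log-odds.

    Only the model intercept is (implicitly) changed; the relative
    contribution of each predictor is left as it was.
    """
    shift = logit(target_prevalence) - logit(source_prevalence)
    return inv_logit(logit(predicted_risk) + shift)


# Hypothetical prevalences: 8% under two-step and 15% under one-step identification.
for risk in (0.05, 0.08, 0.20):
    adjusted = recalibrate_risk(risk, source_prevalence=0.08, target_prevalence=0.15)
    print(f"original risk {risk:.2f} -> recalibrated risk {adjusted:.3f}")
```

A woman whose predicted risk equals the source prevalence is mapped to the target prevalence, and all other risks move upward in proportion on the odds scale. Whether such a simple shift is adequate would itself need to be checked in a validation sample from the new setting.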

In conclusion, we have developed a set of clinical risk scores for the prediction of GDM among pregnant women before the 15th GW and at the time of GCT. The performance of the two risk scores was adequate, with good calibration and moderate discrimination. Further validation is needed to evaluate the performance of the risk scores in other populations. In addition, RCTs are urgently required to verify whether women at high risk of GDM identified by our models benefit more from early lifestyle intervention than those at low risk, so that lifestyle intervention can be delivered in a more cost-effective manner.

Acknowledgments

We thank all doctors, nurses and research staff at 65 primary care hospitals, 6 district-level women and children’s health centers (WCHCs), and Tianjin Women and Children’s Health Center (TWCHC) for their participation in this study.

References

Footnotes

  • SG, JL and XY contributed equally.

  • Contributors XY, HL, GH, JCNC and ZY conceived and designed the study; all authors except GH, ZY and JCNC contributed to the collection of the data. SG and JL analyzed the data and wrote the first draft; XY revised the draft critically for important intellectual content. All authors gave critical comments and contributed to the writing of the manuscript; SG, JL, XY and HZ take full responsibility for the work as a whole, including the study design, access to data, and the decision to submit and publish the manuscript.

  • Funding This work was supported by the National Key Research and Development Program of China (Grant nos: 2018YFC1313900 and 2018YFC1313903) and the National Natural Science Foundation of China (Grant nos: 81870549, 81602922, and 81900724).

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval The study was approved by the Ethics Committee for Clinical Research of TWCHC. Written informed consent was obtained from all pregnant women before data collection.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement The raw data were generated at TWCHC. Data supporting the findings of this study are available from the corresponding author on reasonable request.