
Performance and costs of multiple screening strategies for type 2 diabetes: two population-based studies in Shanghai, China

Yanyun Li1, Huiru Jiang2,3, Minna Cheng1, Weiyuan Yao2,3, Hua Zhang2,3, Yan Shi1, Wanghong Xu2,3

1. Shanghai Municipal Center for Disease Control and Prevention, Shanghai, China
2. School of Public Health, Fudan University, Shanghai, China
3. Key Lab of Health Technology Assessment (National Health Commission), Fudan University, Shanghai, China

Correspondence to Dr Wanghong Xu; wanghong.xu{at}fudan.edu.cn; Ms Yan Shi; shiyan{at}scdc.sh.cn

## Abstract

Introduction To compare the performance and the costs of various assumed screening strategies for type 2 diabetes mellitus (T2DM) among Chinese adults, and identify an optimal one for the population.

Research design and methods Two multistage-sampling surveys were conducted in Shanghai, China, in 2009 and 2017. All participants were interviewed and underwent anthropometric measurement and assays of fasting plasma glucose (FPG), hemoglobin A1c (A1c) and/or postprandial glucose. The 1999 WHO diagnostic criteria were used to identify undiagnosed T2DM. A previously developed Chinese risk assessment system and a specific risk assessment system developed in this study were applied to calculate diabetes risk scores (DRS) 1 and 2. Optimal screening strategies were selected based on the sensitivity, Youden index and costs, using the 2009 survey data as the training set and the 2017 survey data as the validation set. A twofold cross-validation was also performed.

Results Of numerous assumed strategies, FPG ≥5.6 mmol/L alone performed well (Youden index of 71.8%) and cost the least (US$18.4 for each case detected), followed by the strategy of DRS2 ≥8 combined with FPG ≥5.6 mmol/L (Youden index of 71.7% and US$20.2 per case detected) and the strategy of DRS1 ≥17 combined with FPG ≥5.6 mmol/L (Youden index of 72.0% and US$21.6 per case detected). However, FPG alone resulted in more subjects requiring an oral glucose tolerance test (OGTT) than did the strategies combining FPG with DRS. The strategy of FPG ≥5.6 mmol/L combined with A1c ≥4.7% achieved a Youden index of 72.1%, but cost as much as US$48.8 for each case identified. Twofold cross-validation also supported the use of FPG alone, but with an optimal cut-off of 6.1 mmol/L.

Conclusions Our results support the use of FPG alone in T2DM screening in Chinese adults. DRS may be used in combination with FPG in populations with available electronic health records to reduce the number of OGTTs and save screening costs.

• diabetes mellitus, type 2
• early diagnosis
• risk assessment
• glucose tolerance test

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

### Significance of this study

• Almost half of Chinese patients with undiagnosed diabetes have a normal fasting plasma glucose (FPG) but an elevated postprandial glucose level, making the oral glucose tolerance test (OGTT) essential in type 2 diabetes mellitus (T2DM) screening in China.

• Our previous study suggests that combined use of FPG and hemoglobin A1c (A1c) has the potential to replace OGTT in T2DM screening in Chinese adults; however, the strategy is not the best choice because of the high cost of the A1c assay.

• Risk assessment has been used as a complement to FPG or A1c assay in T2DM screening to reduce the number of subjects requiring blood tests and decrease costs of screening, showing its potential in T2DM screening in Chinese adults.

• So far, however, very few studies have focused on both the performance and the costs of screening strategies for T2DM in the Chinese population.

#### What are the new findings?

• A specific risk score system developed in this study includes age, sex, body mass index, waist circumference, systolic blood pressure and family history of T2DM, which are similar to the variables in the Chinese risk score system developed based on a nationwide survey in China.

• Assumed screening strategies are established using a stepwise approach for the first time, with the risk score as the first stage (by each score), followed by FPG (by 0.1 mmol/L) or A1c level (by 0.1%); or with FPG (by 0.1 mmol/L) as the first stage, followed by A1c level (by 0.1%).

• The strategies of risk score followed by a blood test are not superior to FPG alone in screening T2DM among Chinese adults when taking both performance and costs into consideration, but may reduce the number of subjects requiring OGTT.

#### How might these results change the focus of research or clinical practice?

• FPG assay, the currently used screening strategy for T2DM in China, may be a good choice for Chinese adults under the WHO diagnostic criteria for T2DM.

• Combined use of risk score and FPG level may be recommended in populations with available electronic health records to reduce the workload of OGTT and save screening costs.

## Introduction

Type 2 diabetes mellitus (T2DM) is a major global health problem affecting over 451 million individuals worldwide.1 Over the past decades, a continuous increase in prevalence of T2DM has been observed in both high-income2 and low-income and middle-income countries.3 In Chinese adults, the prevalence of T2DM was 9.7% in 2007–2008 according to the 1999 WHO diagnostic criteria,4 and reached 11.6% in 2010 based on the 2010 American Diabetes Association diagnostic criteria.5 Due to lack of specific symptoms at early stage of T2DM, 60.7% of patients with diabetes remained undiagnosed in China,4 and a considerable proportion of patients suffered from at least one complication at first diagnosis.6–8 T2DM screening in general population aiming at early diagnosis is of great significance in clinic and public health in China.

Usually, T2DM screening is based on biochemical glycemic assays such as fasting plasma glucose (FPG), glycosylated hemoglobin A1c (A1c) and the 2-hour oral glucose tolerance test (OGTT). With high sensitivities and specificities, these invasive measurements have been used either alone or combined to identify T2DM in high-income countries and areas.9 10 As almost half of undiagnosed patients with diabetes in China had a normal FPG but an elevated postprandial glucose level,11 12 the OGTT, an uncomfortable, inconvenient and time-consuming test, is still widely used for screening and diagnosis in the country. In our previous report, we found that combined use of FPG and A1c performed well and had the potential to replace OGTT in the Chinese population.13 Because of the high cost of the A1c assay, however, the strategy may not be the best choice for a country with a huge population but limited healthcare resources.

Relative to blood tests, risk assessment using a score system is a cheap, convenient and non-invasive approach to identify patients with T2DM. Several risk score systems, such as the Finnish,14 Danish,15 Canadian,16 Oman17 and Thailand18 systems, have been developed and widely used for diabetes screening. The diversity of risk score systems across populations and their low sensitivities and specificities have greatly limited their utility. However, risk assessment can be used as a complement to FPG or A1c assay to reduce the number of subjects requiring blood tests and remarkably decrease the costs of screening.19–21 In China, a diabetes risk score (DRS) system was established based on a nationwide diabetes survey. The performance of the DRS system was externally validated in two populations in Qingdao, achieving an area under the receiver operating characteristic curve (AUC) >0.70 in predicting incidence of T2DM in both exploratory and validation settings.22 So far, only one study has evaluated the performance and the costs of screening tests for undiagnosed diabetes in Chinese adults, and it focused only on the fasting capillary glucose test alone and the Chinese DRS system alone.23

In this study, taking advantage of two population-based diabetes surveys conducted in Shanghai, China, we established a new risk assessment system, and evaluated the performance and costs of multiple assumed screening strategies based on two DRS systems, FPG level and A1c level in identifying patients with T2DM. We aimed to seek a valid and economical screening strategy for T2DM in Chinese community settings.

## Methods

### Study design and populations

Two different population-based diabetes surveys were conducted among Chinese adults in Shanghai, China in 2009 and 2017. Both surveys were based on a multistage sampling process. The survey in 2009 was described in our previous report.13 Briefly, 4 districts and 2 counties were randomly selected from all 12 districts and 7 counties. Then one to two subdistricts or towns were randomly selected from each selected district or county. And then, one to two communities or villages were randomly selected from each selected subdistrict or town. Finally, 1000–2000 eligible subjects (permanent residents of Shanghai, 35–74 years of age and having resided in the city for at least 5 years) were randomly selected from each selected community or village and invited to participate in the survey (online supplementary figure 1A). A similar sampling process was conducted in the 2017 survey (online supplementary figure 1B). A total of 2500 participants of the 2009 survey were included in the 2017 survey.

Pregnant women, individuals with type 1 diabetes and those physically or mentally disabled were excluded from participation. In the survey of 2009, of 11 844 eligible adults, 7964 participated in the survey, resulting in a response rate of 67.2%. In the 2017 survey, a total of 23 993 subjects aged 35 years or above were recruited and 22 246 subjects (8898 men and 13 348 women) participated in the survey.

### Data collection

An in-person interview was conducted to collect information on demographic characteristics, lifestyle factors and previous diagnosis of diabetes in both surveys. At the interview, body weight, standing height, waist circumference (WC) and blood pressure were measured for each participant according to a standard protocol, as described previously.12 Two measurements were taken and the mean value was used in the analyses. Body mass index (BMI) was calculated as weight in kilograms divided by the square of height in meters (kg/m2). Hypertension was defined as systolic/diastolic blood pressure (SBP/DBP) ≥140/90 mm Hg or use of antihypertensive drugs.
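The derived variables in this paragraph can be illustrated with a minimal Python sketch. The function names are hypothetical, and reading "≥140/90 mm Hg" as "either threshold met" is an assumption rather than something the text states explicitly.

```python
def mean_of_two(first: float, second: float) -> float:
    """Mean of the two repeated measurements taken at the interview."""
    return (first + second) / 2


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def has_hypertension(sbp: float, dbp: float, on_antihypertensives: bool = False) -> bool:
    """SBP/DBP >= 140/90 mm Hg or antihypertensive drug use.

    Assumption: meeting either the systolic or the diastolic threshold
    is enough to count as hypertensive.
    """
    return on_antihypertensives or sbp >= 140 or dbp >= 90
```

For example, a participant weighing 70 kg at 1.75 m has a BMI of about 22.9 kg/m2.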

### Laboratory assays

After at least 10 hours overnight fasting, a 1.0~1.5 mL venous blood sample was collected for each subject in a vacuum tube containing sodium fluoride to measure FPG level. For those with an FPG <7.0 mmol/L, a standard 75 g glucose load was given and a second blood sample was collected to measure 2-hour postprandial blood glucose (2hPG).

Biochemical assay was conducted in one lab according to a standardized protocol. Plasma glucose was tested using glucose oxidase-peroxidase method. A1c level was assayed using high-performance liquid chromatography, which was recommended by the National Glycohemoglobin Standardization Programme.24

### Costs of screening

Individuals with T2DM were diagnosed according to the 1999 WHO diagnostic criteria. Costs of screening were estimated based on the costs of risk assessment, FPG, A1c and OGTT assay, which were determined as US$0.2, US$0.9, US$9.3 and US$1.4 (¥1.5, ¥6, ¥65 and ¥10) per assay, respectively, according to the charge standard for biochemistry assay in Shanghai, China (http://wsjkw.sh.gov.cn/ylsfbz/index.html, access date: August 22, 2019). Costs per case detected were calculated as the total costs divided by the number of patients with diabetes detected.
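As a rough illustration of the cost calculation, the sketch below applies the unit costs quoted above to the number of subjects receiving each assay. The function name and the example counts are hypothetical and are not taken from the survey data.

```python
# Unit costs per assay in US$, as stated in the text.
UNIT_COST = {"risk": 0.2, "fpg": 0.9, "a1c": 9.3, "ogtt": 1.4}


def cost_per_case(n_tested: dict, cases_detected: int) -> float:
    """Total assay cost divided by the number of cases detected.

    n_tested maps an assay name ("risk", "fpg", "a1c", "ogtt") to the
    number of subjects who received that assay.
    """
    if cases_detected <= 0:
        raise ValueError("at least one detected case is required")
    total = sum(UNIT_COST[assay] * n for assay, n in n_tested.items())
    return total / cases_detected


# Hypothetical example: 1000 FPG assays, 200 OGTTs, 50 cases detected.
example = cost_per_case({"fpg": 1000, "ogtt": 200}, 50)
```

Because the A1c assay costs roughly ten times as much as FPG, any strategy that sends many subjects to A1c will show a much higher cost per case under this formula.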

### Statistical analyses

A validated Chinese DRS system with a summary risk score ranging from 0 to 51 (reference 22) was used as a risk assessment tool in this study (DRS1). A specific risk assessment system was also developed in this population with a specific risk score (DRS2). First, univariate regression was used to identify potential risk factors for T2DM based on the 2009 survey data. All variables with p value <0.10 were included in a multivariable logistic regression model. After excluding variables with p value >0.05, we established a final model and estimated the β-coefficient for each variable. The accuracy and fit of the model were evaluated using the Brier score. DRS2 was calculated by multiplying the β-coefficients by 10 and rounding to the nearest integer,22 and was validated using the 2017 survey data. The optimal cut-off points of DRS1 and DRS2 were identified based on the Youden index (sensitivity + specificity − 1), taking the cut-off at which the index reached its maximum (reference 25).
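The two mechanical steps in this paragraph, turning β-coefficients into score points and choosing a cut-off by the Youden index, can be sketched as follows. This is an illustrative sketch only: the function names and the coefficient values in the usage note are hypothetical, and a score at or above the cut-off is assumed to count as screen-positive.

```python
def points_from_betas(betas: dict) -> dict:
    """Score points: each beta-coefficient times 10, rounded to an integer.

    Note: Python's round() rounds exact .5 ties to the nearest even
    integer, which may differ from the rounding used in the study.
    """
    return {name: int(round(beta * 10)) for name, beta in betas.items()}


def youden_index(scores, has_t2dm, cutoff) -> float:
    """Sensitivity + specificity - 1 for the rule 'score >= cutoff'."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_t2dm):
        positive = score >= cutoff
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn) + tn / (tn + fp) - 1


def optimal_cutoff(scores, has_t2dm):
    """Observed score value that maximizes the Youden index.

    Ties are resolved in favor of the lowest cut-off.
    """
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: youden_index(scores, has_t2dm, c))
```

With hypothetical coefficients such as `{"family_history": 0.62, "male_sex": 0.27}`, `points_from_betas` would assign 6 and 3 points; a participant's summary score is then the sum of the points for the categories that apply.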

Non-linear relationships of the DRS1 or DRS2 with the levels of FPG, 2hPG or A1c were evaluated by restricted cubic splines using the 5th, 25th, 75th and 95th percentiles as fixed knots.

A stepwise approach was used to establish combined screening strategies, with DRS1 or DRS2 (in steps of one score point) as the first stage to identify individuals at high risk, followed by FPG (in steps of 0.1 mmol/L) or A1c assay (in steps of 0.1%); or with FPG level (in steps of 0.1 mmol/L) as the first stage, followed by A1c level (in steps of 0.1%). Only subjects who were positive in the last step and had FPG <7.0 mmol/L (if available) would take an OGTT as a diagnostic test (figure 1). The performance of assumed screening strategies using DRS or blood tests alone or combined was tested in participants of the 2009 survey and validated in participants of the 2017 survey. Of hundreds of assumed stepwise strategies, we focused only on those with sensitivity ≥85% and Youden index ≥70%. We also performed a twofold cross-validation by using a randomly selected half of the participants of the two surveys as the training set and the remaining half as the validation set. Bootstrap resampling was repeated 100 times to obtain CIs for all related estimates.
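The text does not say which bootstrap interval was used, so the following is only a sketch of one common choice, the percentile method, with 100 resamples as in the study. The function name, the seed and the 95% level are assumptions.

```python
import random


def bootstrap_ci(records, statistic, n_boot=100, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for statistic(records).

    records   : list of per-participant records
    statistic : function mapping a list of records to a number
    """
    rng = random.Random(seed)
    n = len(records)
    estimates = []
    for _ in range(n_boot):
        # Resample participants with replacement, keeping the sample size.
        resample = [records[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

Any estimate discussed here (sensitivity, Youden index, cost per case) could be passed as `statistic`, provided it is computed from a list of participant records.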

Figure 1

Diabetes screening strategies and costs at each step. A1c, hemoglobin A1c; FPG, fasting plasma glucose; OGTT, oral glucose tolerance test
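The stepwise flow in figure 1 can be sketched for one family of strategies, DRS followed by FPG. This is a simplified illustration under stated assumptions: the `Person` record and function name are hypothetical, a screen-positive subject with true T2DM is counted as a detected case, and only the unit costs given in the Costs of screening section are included.

```python
from dataclasses import dataclass

# Unit costs in US$ per assay, as given in the Costs of screening section.
COST_RISK, COST_FPG, COST_OGTT = 0.2, 0.9, 1.4


@dataclass
class Person:
    drs: int        # risk score (DRS1 or DRS2)
    fpg: float      # fasting plasma glucose, mmol/L
    has_t2dm: bool  # reference diagnosis (1999 WHO criteria)


def evaluate_drs_then_fpg(people, drs_cut, fpg_cut):
    """Evaluate the strategy 'DRS >= drs_cut, then FPG >= fpg_cut'."""
    n_fpg = n_ogtt = tp = fp = fn = tn = 0
    for p in people:
        passed_stage1 = p.drs >= drs_cut
        if passed_stage1:
            n_fpg += 1  # only high-risk subjects receive the FPG assay
        screen_positive = passed_stage1 and p.fpg >= fpg_cut
        if screen_positive and p.fpg < 7.0:
            n_ogtt += 1  # subjects with FPG >= 7.0 mmol/L skip the OGTT
        if screen_positive and p.has_t2dm:
            tp += 1
        elif screen_positive:
            fp += 1
        elif p.has_t2dm:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    total_cost = COST_RISK * len(people) + COST_FPG * n_fpg + COST_OGTT * n_ogtt
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": sensitivity + specificity - 1,
        "n_fpg": n_fpg,
        "n_ogtt": n_ogtt,
        "cost_per_case": total_cost / tp if tp else float("inf"),
    }
```

Scanning `drs_cut` over every score point and `fpg_cut` in 0.1 mmol/L steps, then keeping the combinations that meet the sensitivity and Youden thresholds, would reproduce the kind of grid of assumed strategies described above.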

Diagnostic accuracy of screening strategies was assessed using AUC.26 An AUC >0.9 indicates a high diagnostic value, 0.7<AUC≤0.9 indicates a moderate diagnostic value and 0.5<AUC≤0.7 indicates a low diagnostic value.
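For readers who want to see how an AUC maps onto these bands, here is a small sketch that computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen non-case (ties counted as one half). The function names are hypothetical, and the label for AUC ≤0.5 is an assumption because the text does not define that range.

```python
def auc(scores, has_t2dm) -> float:
    """AUC via pairwise comparison of cases and non-cases."""
    cases = [s for s, d in zip(scores, has_t2dm) if d]
    non_cases = [s for s, d in zip(scores, has_t2dm) if not d]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties contribute half a concordant pair
    return wins / (len(cases) * len(non_cases))


def diagnostic_value(auc_value: float) -> str:
    """Map an AUC onto the bands defined in the text."""
    if auc_value > 0.9:
        return "high"
    if auc_value > 0.7:
        return "moderate"
    if auc_value > 0.5:
        return "low"
    return "none"  # assumption: the text gives no label for AUC <= 0.5
```

The pairwise loop is quadratic in sample size, which is fine for illustration; a rank-based formula would be preferable for cohorts of thousands.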

We used SAS V.9.2 for all statistical analyses, and considered p<0.05 being statistically significant for a two-sided test.

## Results

### Characteristics of the participants

After excluding subjects with incomplete questionnaires, aged 75 years or above, with a prior history of T2DM or with missing values of A1c or BMI, 6649 subjects (3050 men and 3599 women) of the 2009 survey and 16 103 subjects (6253 men and 9850 women) of the 2017 survey were finally included in this analysis (online supplementary figure 1A,B).

Table 1 presents characteristics of study participants in the 2009 and the 2017 surveys by sex. The subjects in the 2017 survey were older than those in the 2009 survey, with an average age of 60.6 years in men and 59.9 years in women vs 54.3 years in men and 54.2 years in women. Compared with the participants of the 2009 survey, the subjects of the 2017 survey obtained more education, had higher levels of BMI, WC, SBP, DBP, and were more likely to drink alcohol and have a family history of T2DM (all p<0.001). The women in the 2017 survey were less likely to smoke while the men were more likely to smoke than those in the 2009 survey (p<0.0001).

Table 1

Demographic and lifestyle characteristics of study participants

### DRS in subjects with diabetes and without diabetes

Online supplementary table 1 shows the risk score for each significant variable in the specific risk assessment system developed based on the 2009 survey data. The summary risk score, namely DRS2 in this study, ranged from 0 to 49 and had an optimal cut-off point of 22 in subjects of the 2009 survey and 24 in participants of the 2017 survey.

Table 2 shows the comparison of risk score and related risk factors between subjects with diabetes and without diabetes. In the 2009 survey, a total of 454 patients with diabetes were identified, with a prevalence of T2DM of 6.8%. In the 2017 survey, the prevalence reached 11.6%. Generally, patients with diabetes were older, tended to have a family history of diabetes and had higher average levels of BMI, WC, SBP, DBP and DRS compared with those without diabetes in both surveys (all p values <0.05). The mean DRS1 was 5–6 points higher in patients with diabetes than in those without in the 2009 survey, but only about 3 points higher in the 2017 survey. Similarly, the mean DRS2 was 6–7 points higher in patients with diabetes than in those without in the 2009 survey, and about 4–5 points higher in the 2017 survey.

Table 2

Comparison of risk scores and related factors between diabetes and non-diabetes

### Diagnostic value of the DRS

A non-linear dose-response relationship was observed between DRS and FPG, 2hPG and A1c levels, with all p values for non-linear association <0.0001. As shown in online supplementary figure 2A,B, the levels of FPG, 2hPG and A1c increased with increasing DRS1 in both surveys in a non-linear pattern. A similar non-linear dose-response relationship was observed for FPG, 2hPG and A1c levels with DRS2 (online supplementary figure 2C,D).

FPG level ranked first in AUC in diagnosis of T2DM, followed by A1c, DRS2 and DRS1 in both surveys (figure 2A,B). The optimal cut-off point was 34 for DRS1, 22 for DRS2, 5.7 mmol/L for FPG and 6.1% for A1c. Generally, the two risk score systems had a moderate diagnostic value in this population, with an AUC of 0.728 and 0.671 for DRS2 in the 2009 and the 2017 surveys, respectively, slightly higher than 0.678 and 0.633 for DRS1 in the two surveys.

Figure 2

AUC of DRS1, DRS2, FPG and A1c level in the 2009 survey (A) and the 2017 survey (B). DRS1 was the risk score calculated based on the Chinese diabetes risk score system; DRS2 was the risk score calculated based on the specific system developed in this study. A1c, glycosylated hemoglobin A1c; AUC, area under receiver operating curve; FPG, fasting plasma glucose; DRS, diabetes risk score.

### Performance and costs of assumed screening strategies

Online supplementary table 2 lists the performance of hundreds of assumed screening strategies in identifying T2DM in the 2009 survey. When used alone, DRS1, DRS2, FPG and A1c achieved a Youden index of 27.4%, 35.3%, 72.5% and 64.9%, respectively. Among combined screening strategies, those including A1c cost much more than those without.

We focused only on screening strategies with sensitivity ≥85% and Youden index ≥70%. For screening strategies using the same indices in each step, the one with the lowest cost was selected. As shown in table 3, FPG ≥5.6 mmol/L alone performed well with a Youden index of 71.8% and cost the least (US$18.4 for each case detected), followed by the strategy of DRS2 ≥8 followed by FPG ≥5.6 mmol/L (Youden index of 71.7% and US$20.2 per case detected) and the strategy of DRS1 ≥17 followed by FPG ≥5.6 mmol/L (Youden index of 72.0% and US$21.6 per case detected). The strategy of FPG ≥5.6 mmol/L followed by A1c ≥4.7% performed well but had a higher cost than the strategies without A1c. The strategy of DRS2 ≥8 followed by FPG ≥5.6 mmol/L had the fewest subjects taking OGTT compared with other screening strategies.

Table 3

Validity and costs of screening strategies in the 2009 survey (as the training set) and in the 2017 survey (as a validation set)

Twofold cross-validation was further performed using a randomly selected half of the subjects in the 2009 and the 2017 surveys as the training set and the remaining half as the validation set. Of screening strategies with sensitivity ≥80% and Youden index ≥70%, we identified the four screening strategies with the lowest cost, as shown in table 4. Similarly, FPG alone performed well and cost the least, but with an optimal cut-off point of 6.1 mmol/L.

Table 4

Performance and costs of screening strategies in the training set (randomly selected half subjects in the 2009 and the 2017 survey) and the validation set (the remaining half subjects in the 2009 and the 2017 survey)
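The random half split behind table 4 can be sketched as below. This reads the description as drawing half of the subjects from each survey separately and pooling them, which is an interpretation rather than a stated detail; the function names and seeds are hypothetical.

```python
import random


def split_half(ids, seed=0):
    """Randomly split a collection of subject IDs into two halves."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def pooled_twofold_split(ids_2009, ids_2017, seed=0):
    """Training set: a random half of each survey; validation set: the rest.

    Assumption: the split is made within each survey and then pooled.
    """
    train_2009, valid_2009 = split_half(ids_2009, seed)
    train_2017, valid_2017 = split_half(ids_2017, seed + 1)
    return train_2009 + train_2017, valid_2009 + valid_2017
```

Cut-offs would then be chosen on the training half and their sensitivity, Youden index and cost per case recomputed on the validation half.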

## Discussion

In this study of Chinese adults randomly selected from community settings of Shanghai in 2009 and in 2017, we established a DRS system to detect undiagnosed T2DM and compared the performance and costs of various assumed screening strategies. We did not find that strategies of DRS followed by a blood test were superior to FPG alone in screening diabetes among Chinese adults when taking both validity and costs into consideration. To the best of our knowledge, this is the first large-scale study to evaluate the validity and costs of diabetes screening strategies in Chinese population.

Risk assessment has been widely used to identify individuals at high risk of T2DM.27 28 A proper risk score system may reduce screening costs without loss of much validity.14 In this study, we established a specific diabetes risk score system including age, sex, BMI, WC, SBP and family history of T2DM, which was consistent with the Chinese diabetes risk score system developed based on a nationwide survey in China.22 We found that both DRS were positively associated with FPG, 2hPG and A1c levels in a non-linear fashion. However, the Chinese DRS system achieved an AUC <0.700 in our population, much lower than those in the original exploratory Qingdao population (0.748) and the two validation populations (0.725 and 0.717).22 The optimal cut-off point of DRS1, which was based on the Chinese diabetes risk score system, was 34 in the 2009 survey, much higher than 25 in the Qingdao population, and had a sensitivity of only 63.7% and 64.8% in the 2009 and the 2017 surveys, and a specificity of only 63.8% and 53.7%, respectively. The specific DRS system developed in this study performed better than the Chinese DRS system, but with an AUC of only 0.728 in the 2009 survey and 0.671 in the 2017 survey. Our results did not support the application of DRS alone in diabetes screening. As a non-invasive tool, it may be used as the first step in stepwise screening strategies combined with blood tests to balance validity and costs.

A number of previous studies have used simulation models to evaluate the cost-effectiveness of screening strategies for T2DM.14–16 However, none of these studies was conducted in China, the country holding approximately half of the world’s diabetes population.4,5 In this study, we found that FPG ≥5.6 mmol/L alone or ≥6.1 mmol/L alone performed well and cost the least in detecting T2DM, providing supportive evidence for the screening strategy currently used in China.29 When using the Chinese DRS system, the strategy of DRS1 ≥17 followed by FPG ≥5.6 mmol/L had comparable performance and costs, and spared 148 (2.23%) subjects from invasive blood tests and 6 (0.09%) from OGTT relative to the strategy of FPG ≥5.6 mmol/L alone. By comparison, the strategy of DRS2 ≥8 followed by FPG ≥5.6 mmol/L spared 861 (13.0%) subjects from blood tests and 58 (0.87%) from further OGTT. Considering that many cities in China have established electronic health record (eHR) systems,30–33 risk assessment based on an eHR system may save enormous costs in manpower and materials when identifying the high-risk population. In this case, combined use of DRS and FPG assay can be applied in Chinese adults. Otherwise, FPG alone may be good enough for diabetes screening in the population.

The strengths of the study include large sample size, random sampling process and various screening strategies established at the level of 1.0 for risk score, 0.1 mmol/L for FPG and 0.1% for A1c. However, the response rate was relatively low (67.2%) in the 2009 survey, and lack of information on non-participants’ characteristics limited our ability to evaluate the potential selection bias. Moreover, the accuracy of DRS, and FPG and A1c measurements was the key for stepwise screening. In the real world, the risk assessment and blood tests would be conducted in different labs. The possible interlab bias in measurements may influence the effect of related screening strategies. Finally, the screening costs were estimated mainly based on the costs of laboratory assay. It may lead to underestimation of screening costs, because the costs for manpower, phone calls and transportation were not taken into account in our analyses.

## Conclusions

In conclusion, in view of performance and costs, the strategy of FPG assay alone is comparable to the strategies of non-invasive risk assessment combined with a blood test in Chinese adults. Our results support the use of FPG alone, the currently used strategy in China, and suggest that combined use of risk score and FPG level may be a good choice for Chinese adults with an available eHR system. Further epidemiological and clinical studies are warranted to validate our results.

## Acknowledgments

The authors would like to thank study participants of the two cross-sectional surveys and the healthcare workers in all communities involved. The authors would also like to thank the research assistants for their great contributions to data collection and data entry.

## Footnotes

• YL and HJ contributed equally.

• Contributors YL and HJ drafted the manuscript. WX and YS coordinated the study and contributed to study design and statistical analysis; YL and MC contributed to data acquisition; HZ contributed to data cleaning and statistical analysis. All authors contributed to the interpretation of data and revision of the manuscript. WX and YS are the guarantors of this work, had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

• Funding This study was funded by the Three-year Action Plan on Public Health, Phase IV, Shanghai, China (15GWZK0801).

• Competing interests None declared.

• Patient consent for publication Not required.

• Ethics approval The Institutional Review Board (#IORG0000630) of the Shanghai Municipal Center of Disease Control and Prevention approved both surveys (#2016-25 for the 2017 survey). Informed written consent was obtained from each participant before interview and bio-specimen collection.

• Provenance and peer review Not commissioned; externally peer reviewed.

• Data availability statement Data are available on reasonable request. De-identified data collected for this study and a data dictionary are available from the corresponding author on reasonable request.
