Accuracy of breath test for diabetes mellitus diagnosis: a systematic review and meta-analysis
  1. Wenting Wang1,
  2. Wenzhao Zhou2,
  3. Sheng Wang3,
  4. Jinyu Huang1,
  5. Yanna Le4,
  6. Shijiao Nie1,
  7. Weijue Wang5,
  8. Qing Guo3,5
  1. 1Affiliated Hangzhou First People’s Hospital Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
  2. 2Department of Biology and Chemistry, Zhejiang Institute of Metrology, Hangzhou, China
  3. 3Department of Medicine, Hangzhou Normal University, Hangzhou, Zhejiang, China
  4. 4Hangzhou Medical Association, Hangzhou, China
  5. 5School of Humanities and Management, Zhejiang Chinese Medical University, Hangzhou, Zhejiang, China
  1. Correspondence to Dr Jinyu Huang; jinyu_h{at}; Professor Qing Guo; louisguoqing{at}


The review aimed to investigate the accuracy of breath tests in the diagnosis of diabetes mellitus, identify the exhaled volatile organic compounds with the most evidence as potential biomarkers, and summarize the prospects and challenges of diabetic breath tests. Medline, PubMed, EMBASE, Cochrane Library and Science Citation Index Expanded were searched. Human studies describing diabetic breath analysis with more than 10 subjects in each of the patient and control groups were included. Population demographics, breath test conditions, biomarkers, analytical techniques and diagnostic accuracy were extracted. Quality assessment was performed with the Standards for Reporting Diagnostic Accuracy and a modified QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies 2). Forty-four studies with 2699 patients with diabetes were included for qualitative data analysis, and 14 eligible studies were used for meta-analysis. Pooled analysis of type 2 diabetes breath tests exhibited sensitivity of 91.8% (95% CI 83.6% to 96.1%), specificity of 92.1% (95% CI 88.4% to 94.7%) and area under the curve (AUC) of 0.96 (95% CI 0.94 to 0.97). Isotopic carbon dioxide (CO2) showed the best diagnostic accuracy with pooled sensitivity of 0.949 (95% CI 0.870 to 0.981), specificity of 0.946 (95% CI 0.891 to 0.975) and AUC of 0.98 (95% CI 0.97 to 0.99). As the most widely reported biomarker, acetone showed moderate diagnostic accuracy with pooled sensitivity of 0.638 (95% CI 0.511 to 0.748), specificity of 0.801 (95% CI 0.691 to 0.878) and AUC of 0.79 (95% CI 0.75 to 0.82). Our results indicate that the breath test is a promising approach with acceptable diagnostic accuracy for diabetes mellitus and that isotopic CO2 is the optimal breath biomarker. Even so, further validation and standardization in subject control, breath sampling and analysis are still required.

  • diabetes mellitus
  • experimental
  • diagnostic techniques and procedures
  • biomarkers
  • meta-analysis

Data availability statement

All data relevant to the study are included in the article or uploaded as supplemental information.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:



Diabetes mellitus is a common metabolic disease with pathologically high blood glucose levels, causing damage to various organs and nerves. According to the latest International Diabetes Federation Diabetes Atlas, about 463 million people suffer from diabetes mellitus all over the world; however, more than half of them are undiagnosed and unaware of their status.1 Currently, the criteria for diagnosis of diabetes mellitus, including oral glucose tolerance test, fasting plasma glucose (FPG) test and glycosylated hemoglobin A1c (HbA1c) test, are all invasive blood-based assays,2 3 which limits screening for the disease.

In human exhaled breath, a wide variety of volatile organic compounds (VOCs) are observed and associated with health conditions.4 Analysis of breath VOCs provides a non-invasive approach to the diagnosis of some diseases and the monitoring of physiological effects and therapeutic efficacy,5–7 which improves patient acceptance of disease screening. To date, breath tests such as the 13C urea test (for Helicobacter pylori), hydrogen-methane test (for gastrointestinal diseases), exhaled nitric oxide test (for asthma), Heartsbreath test (for heart transplant rejection) and breath carbon monoxide test (for neonatal jaundice) have been applied in clinical practice.8 The association between diabetes mellitus and breath VOCs has been observed since the 1940s9 and continues to be reported.10 Despite numerous studies, the applicability and diagnostic accuracy of breath tests for diabetes mellitus remain controversial.

In this work, we systematically reviewed studies that described the use of breath tests for diagnosis of diabetes mellitus, summarized exhaled characteristic VOCs in diabetes mellitus and assessed their diagnostic accuracy. The aim of this review is to identify breath VOCs with the most evidence as potential biomarkers in breath tests and investigate the accuracy of these VOCs in the diagnosis of diabetes mellitus.


Search strategy and selection criteria

This systematic review was conducted in accordance with the guidelines of the Preferred Reporting Items for Systematic Review and Meta-Analysis of Diagnostic Test Accuracy studies. The review was registered in PROSPERO (International Prospective Register of Systematic Reviews; registration ID: CRD42020222249).

The search was performed in the Medline, PubMed, Cochrane Library, EMBASE and Science Citation Index Expanded databases up to November 23, 2020. The keywords diabetes, diabetes mellitus, breath, exhaled, expired gas and expired air were used. All search results, including titles and abstracts, were checked by two reviewers independently (WZ, WtW). The detailed search strategy is provided in online supplemental table S1.

Studies were included according to the following selection criteria: (1) human studies describing breath tests for the diagnosis or discrimination of diabetes mellitus; (2) all participating patients with diabetes mellitus were diagnosed by a gold standard assay; and (3) at least two cohorts, patients and controls, with more than 10 subjects in each group were studied and compared. Moreover, studies were included only if the full text was available in English. We excluded studies meeting the following criteria: (1) review articles, conference abstracts, comments, case reports, viewpoints and editorials; (2) studies focusing on diabetic complications (eg, diabetic nephropathy or ketoacidosis) rather than diabetes mellitus itself; and (3) studies in which the analyte was exhaled breath condensate.

Data extraction and quality assessment

Data extraction and assessment of research quality were performed by two reviewers (WZ, WtW) independently, with a third reviewer (JH) adjudicating controversial studies. Information including authors, country, year of publication, population characteristics (diabetes type, number of patients and controls, age, gender), analytical method, breath biomarkers and outcomes was collected. When a study reported several breath biomarkers, all were listed. Some data were calculated, converted or corrected based on the original data provided in the studies. The Standards for Reporting Diagnostic Accuracy (STARD) and a modified QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies 2) appropriate for phase I studies on biomarker discovery were used to assess the quality of included studies and risk of bias. This modified QUADAS-2 was based on the one used in Hanna et al’s study,11 with one signaling question further modified to suit diabetes mellitus.

Statistical analysis

To estimate the diagnostic accuracy of the included breath tests, a random-effects meta-analysis was performed to generate pooled sensitivity, specificity, diagnostic OR (DOR), positive likelihood ratio (PLR), negative likelihood ratio (NLR) and area under the summary receiver operating characteristic (ROC) curve. Subgroup analyses by diabetes type and biomarker were also conducted. Breath biomarkers that were reported at least over two times were analyzed, and pooled sensitivity, specificity and 95% CIs were assessed. Heterogeneity was assessed using I2 and was considered significant when I2 >50% or p<0.05. Leave-one-out sensitivity analysis was also conducted to identify sources of heterogeneity. Publication bias was assessed using Begg’s and Egger’s tests. The diagnostic threshold effect was assessed via Spearman’s correlation test. All data were analyzed in STATA V.12.0 software (Midas module).
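
At the level of a single study, the accuracy measures named above are related by simple formulas. The following minimal Python sketch computes them from a 2×2 table. The class name, function name and counts are hypothetical, and the sketch does not reproduce the random-effects pooling performed by the Midas module.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 table of a breath test against the reference standard."""
    tp: int  # patients with diabetes, breath test positive
    fn: int  # patients with diabetes, breath test negative
    fp: int  # controls, breath test positive
    tn: int  # controls, breath test negative


def accuracy_measures(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PLR, NLR and DOR for one study.

    Assumes no cell is zero; a continuity correction would be needed otherwise.
    """
    sens = c.tp / (c.tp + c.fn)
    spec = c.tn / (c.tn + c.fp)
    plr = sens / (1 - spec)   # positive likelihood ratio
    nlr = (1 - sens) / spec   # negative likelihood ratio
    dor = plr / nlr           # diagnostic odds ratio, equal to (tp*tn)/(fp*fn)
    return {"sensitivity": sens, "specificity": spec,
            "PLR": plr, "NLR": nlr, "DOR": dor}


# Hypothetical counts, not taken from any included study.
example = ConfusionCounts(tp=45, fn=5, fp=4, tn=46)
measures = accuracy_measures(example)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Because DOR is the ratio of PLR to NLR, a test with high sensitivity and specificity yields a large DOR, which is the pattern reported for the pooled breath test results.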


Study selection and characteristics

As depicted in figure 1, a total of 6229 records were identified from the database search and other sources, and 2648 duplicates were removed first. Then, 1394 records were excluded by type of literature, and a further 2015 were excluded after screening by title and abstract. The remaining 172 full-text articles were assessed, of which 44 studies were included for qualitative data analysis and 14 eligible studies were used for meta-analysis.

Figure 1

Flow diagram of study search and selection process. SCIE, Science Citation Index Expanded.

The characteristics of the included studies are presented in online supplemental table S2. A total of 2699 patients with diabetes mellitus from 14 countries were studied, including 265 cases of type 1 diabetes (T1D), 1376 cases of type 2 diabetes (T2D) and pre-diabetes (PD), 49 cases of chemical diabetes (a term no longer in use), and 1009 cases in which the type of diabetes was not distinguished. Overall, participants were aged 4–91 years, and the subjects in the 14 studies eligible for meta-analysis were all adults. Detailed information on the population, including body mass index, FPG and HbA1c, is presented in online supplemental table S3. Among the 44 studies, 16 independent compounds were reported for diabetes mellitus diagnosis, and 4 of these biomarkers, namely acetone (n=19), isotopic carbon dioxide (CO2) (n=11), isopropanol (n=5) and dimethyl sulfide (n=2), were reported at least twice. Note that isotopic CO2 is an exogenous metabolite derived from ingested 13C-glucose or normal glucose. The majority of these breath VOCs were increased in exhaled breath, except 13CO2 and m-xylene, which were decreased. Moreover, acetone was also decreased in the breath of patients with diabetes after dialysis. The analytical methods used for diabetic breath analysis in these studies involved spectroscopic, chromatographic, mass spectrometric and sensor-based methods. Sensors (n=14) were the most frequently used technique for diabetic breath tests, and gas chromatography-mass spectrometry (GC-MS) (n=7) was the most commonly used method capable of qualitatively identifying diabetic biomarkers in exhaled breath.

Quality assessment

The results of quality assessment using QUADAS-2 are presented in online supplemental figures S1 and S2. Detailed information on the modified QUADAS-2 is provided in online supplemental table S4. The STARD scores of each study are listed in online supplemental table S5; a mean value of 11.3 and SD of 5.0 were obtained.

Data analysis

To assess the overall diagnostic accuracy of breath tests for diabetes mellitus, 14 studies were included in the meta-analysis, and the highest diagnostic accuracy reported in each study was used. Pooled PLR of 11.531 (95% CI 7.165 to 18.558), NLR of 0.101 (95% CI 0.056 to 0.183), DOR of 114.333 (95% CI 42.083 to 310.626), sensitivity of 90.7% (95% CI 83.8% to 94.8%) and specificity of 92.1% (95% CI 87.9% to 95.0%) were obtained. Summary ROC analysis gave an AUC of 0.97 (95% CI 0.95 to 0.98) (online supplemental figure S3). These results show the potential of breath tests in diabetes mellitus diagnosis. Nevertheless, substantial heterogeneity was also observed. Therefore, these studies were further grouped according to type of diabetes and breath biomarker.

T2D was the most widely investigated in these studies; 10 studies including T2D and PD (the precursor of T2D) were analyzed. Yan’s12 study was also included since only 1 case of T1D was mixed with 86 T2D samples. The diagnostic accuracy for T2D was elevated and the specificity showed no significant heterogeneity; however, heterogeneity in sensitivity remained substantial. Leave-one-out sensitivity analysis was carried out and three studies (Li et al,13 Yatscoff et al14 and Zhou et al15) which mainly contributed to heterogeneity were excluded. After removing these three studies, heterogeneity was remarkably reduced and no significant changes were observed in sensitivity and specificity before and after exclusion. As shown in figure 2, the diagnostic accuracy for T2D was 11.591 (95% CI 7.579 to 17.725) for PLR, 0.089 (95% CI 0.042 to 0.187) for NLR, 130.461 (95% CI 45.054 to 377.770) for DOR, 91.8% (95% CI 83.6% to 96.1%) for sensitivity, 92.1% (95% CI 88.4% to 94.7%) for specificity and 0.96 (95% CI 0.94 to 0.97) for AUC.
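
To illustrate the idea behind leave-one-out analysis, the sketch below recomputes I2 for sensitivity with each study omitted in turn. It is a simplified univariate illustration that assumes logit-transformed sensitivities, inverse-variance weights and a 0.5 continuity correction. It is not the bivariate model fitted by the Midas module, and the study labels and counts are hypothetical.

```python
import math


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance (0.5 continuity correction)."""
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1 / a + 1 / b


def i_squared(effects, variances):
    """Higgins' I^2 in percent, from Cochran's Q with inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if df <= 0 or q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def leave_one_out(studies):
    """Map each study label to the I^2 obtained when that study is omitted."""
    results = {}
    for omitted in studies:
        rest = [counts for label, counts in studies.items() if label != omitted]
        effects, variances = zip(
            *(logit_and_variance(tp, tp + fn) for tp, fn in rest)
        )
        results[omitted] = i_squared(effects, variances)
    return results


# Hypothetical (true positives, false negatives) per study; study D is an outlier.
studies = {"A": (45, 5), "B": (46, 4), "C": (44, 6), "D": (25, 25)}
loo = leave_one_out(studies)
for label, value in loo.items():
    print(f"omit {label}: I^2 = {value:.1f}%")
```

In this toy data set, I2 falls below the 50% threshold only when study D is omitted, which is the kind of pattern used to flag a study as a main contributor to heterogeneity.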

Figure 2

Forest plot of (A) sensitivity, (B) specificity and (C) SROC for the type 2 diabetes subgroup. Reference details provided in online supplemental file 1. AUC, area under the curve; SENS, sensitivity; SPEC, specificity; SROC, summary receiver operating characteristic curve.

When the studies were grouped by biomarker, isotopic CO2, including 13CO2 and C18O2, was the most frequently used in diabetic breath tests. Among the seven studies using isotopic CO2 as biomarker, heterogeneity in sensitivity was observed. By leave-one-out analysis, Yatscoff et al’s14 study was excluded; heterogeneity was thereby reduced to an acceptable level (I2 <50%, p>0.05) and no significant changes in diagnostic accuracy were observed. As shown in figure 3A, isotopic CO2 had PLR of 17.716 (95% CI 8.376 to 37.473), NLR of 0.054 (95% CI 0.020 to 0.144), DOR of 328.275 (95% CI 77.049 to 1398.643), sensitivity of 0.949 (95% CI 0.870 to 0.981), specificity of 0.946 (95% CI 0.891 to 0.975) and AUC of 0.98 (95% CI 0.97 to 0.99). Acetone was also commonly used as a diabetic breath biomarker. The five included studies exhibited heterogeneity in diagnostic sensitivity and specificity. With the leave-one-out approach, heterogeneity was reduced after exclusion of Zhou et al’s study; however, sensitivity and specificity also changed markedly. In that study, patients and healthy volunteers were recruited from different provinces of China (patients with diabetes were from Jilin Province and volunteers were from Sichuan Province), with visible differences in location, climate, environment and diet, all of which have an impact on VOC levels in exhaled breath. These differences between patients and healthy volunteers may amplify the distinction in breath VOCs between the two groups, which is likely to inflate discriminant accuracy. Thus, this study was excluded to obtain a more reliable result. Acetone had pooled PLR of 3.199 (95% CI 2.152 to 4.756), NLR of 0.453 (95% CI 0.341 to 0.601), DOR of 7.069 (95% CI 4.191 to 11.922), sensitivity of 0.638 (95% CI 0.511 to 0.748), specificity of 0.801 (95% CI 0.691 to 0.878) and AUC of 0.79 (95% CI 0.75 to 0.82) (figure 3B).

Figure 3

Forest plot of sensitivity, specificity and SROC for (A) isotopic carbon dioxide and (B) acetone. Reference details provided in online supplemental file 1. AUC, area under the curve; SENS, sensitivity; SPEC, specificity; SROC, summary receiver operating characteristic curve.

Publication bias

Begg’s and Egger’s tests were applied to assess publication bias, as presented in online supplemental figure S4. No publication bias was suggested in the T2D group, isotopic CO2 group or acetone group (Egger’s test p=0.931, p=0.300 and p=0.888, respectively).


According to the overall diagnostic accuracy results, breath test is a promising approach to non-invasive diagnosis of diabetes mellitus with prominent performance, although most studies showed risk of bias, which may overestimate diagnostic accuracy. By subgroup analysis, it was found that breath test is suitable for T2D diagnosis and isotopic CO2 was the most discriminant breath biomarker. Although the T1D breath analysis has been investigated by a number of studies, only one study reported the diagnostic accuracy, which is not sufficient to draw a conclusion.

CO2 is a common component of human exhaled breath and comes mainly from the oxidation of glucose. Glucose is first converted to pyruvate via glycolysis and then oxidized by O2 to generate CO2. During this process, ATP is synthesized to provide essential energy. As a metabolic disease, diabetes mellitus may change the production of cellular energy and thereby alter breath CO2 in turn. After ingestion of 13C-glucose, 13CO2 was detected at reduced concentrations in diabetic breath. By contrast, C18O2 was increased in diabetic breath after ingestion of normal glucose. This difference is mainly due to the different sources of the isotopes. The 13C isotope in 13CO2 came directly from the metabolism of 13C-glucose, whereas the 18O isotope in C18O2 originated from H218O in the human body through a conversion catalyzed by carbonic anhydrase.16 For diabetes mellitus diagnosis, evaluation indexes such as the delta over baseline (δDOB),14 insulin sensitivity index (ISI0,120)17 and changes in carbonic anhydrase activities (ΔCA)18 were proposed to estimate isotopic CO2 changes over time.

Acetone is an attractive breath biomarker which has been reported to correlate with various diseases.19 This metabolite can be derived from decarboxylation of acetoacetate and dehydrogenation of isopropanol.20 The relationship between diabetes mellitus and breath acetone has been investigated since the 1940s.9 However, the evidence for its validity in diabetes mellitus diagnosis is still limited. Pooled analysis showed acetone to have moderate diagnostic accuracy when used as an independent biomarker, and its sensitivity is especially limited. In addition, during literature screening, we found that a number of studies claimed that sensors have potential in the diagnosis of diabetes mellitus by directly citing an acetone concentration of 1.8 parts per million (ppm) as the criterion;21–25 however, the sources they cited do not reliably support this conclusion. Breath acetone concentrations reported in the included studies are listed in table 1, with all units converted to ppm. As can be seen, breath acetone is significantly influenced by various factors, such as gender, age, diabetes type, diet, exercise and drug treatment. Based on available studies, the threshold of breath acetone for diabetes mellitus diagnosis is still inconclusive. It is therefore inappropriate to cite 1.8 ppm acetone as a diagnostic criterion.
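
Converting reported concentrations to ppm depends on the gas conditions assumed. The sketch below shows one way such a conversion can be done with the ideal gas law. The reference conditions of 25 °C and 1 atm, the function names and the example value are assumptions for illustration, since the conditions used for table 1 are not stated here.

```python
R = 8.314462618              # molar gas constant, J/(mol*K)
ACETONE_MOLAR_MASS = 58.08   # g/mol


def molar_volume_l(temp_k=298.15, pressure_pa=101325.0):
    """Ideal-gas molar volume in L/mol at the given temperature and pressure."""
    return R * temp_k / pressure_pa * 1000.0  # m^3/mol -> L/mol


def nmol_per_l_to_ppm(c_nmol_l, temp_k=298.15, pressure_pa=101325.0):
    """Convert a molar concentration in nmol per litre of gas to ppm by volume."""
    mol_per_l = c_nmol_l * 1e-9
    return mol_per_l * molar_volume_l(temp_k, pressure_pa) * 1e6


def ug_per_l_to_ppm(c_ug_l, molar_mass=ACETONE_MOLAR_MASS,
                    temp_k=298.15, pressure_pa=101325.0):
    """Convert a mass concentration in ug per litre of gas to ppm by volume."""
    nmol_l = c_ug_l / molar_mass * 1000.0  # ug/L -> umol/L -> nmol/L
    return nmol_per_l_to_ppm(nmol_l, temp_k, pressure_pa)


def ppb_to_ppm(c_ppb):
    """Convert parts per billion by volume to parts per million by volume."""
    return c_ppb / 1000.0


# Assumed example: 1 ug/L of acetone at 25 C and 1 atm is roughly 0.42 ppm.
print(f"{ug_per_l_to_ppm(1.0):.3f} ppm")
```

A mix-up between ppb and ppm, or between mass and volume units, shifts a value by orders of magnitude, which is why unit handling matters when comparing acetone levels across studies.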

Table 1

Breath acetone concentrations in the included studies

$$C_{\mathrm{acetone}} = \frac{\sum \left( C_x \times n \right)}{N} \tag{1}$$

According to equation (1), the average concentrations of breath acetone (Cacetone) in T1D (n=205), T2D (n=738) and healthy subjects (n=417) were calculated to be 7.86 ppm, 1.66 ppm and 0.68 ppm, respectively. Cx and n represent the mean acetone concentration and the sample size in each study, and N is the overall sample size. Data from Yu et al’s26 study were excluded, since they showed markedly higher acetone concentrations in both healthy and diabetic breath than those observed in other studies (by a factor of almost 100), which may be due to misuse of gas concentration units. The weighted mean difference (WMD) was also used as the effect size to show differences in breath acetone between the three groups (T1D vs non-diabetes (ND), T2D vs ND, and T1D vs T2D). Compared with the ND group, breath acetone was significantly higher in both the T1D (WMD=1.374, 95% CI 0.986 to 1.762, z=6.94, p<0.001) and T2D (WMD=0.845, 95% CI 0.605 to 1.085, z=6.91, p<0.001) groups, while there was no significant difference between T1D and T2D (p=0.162). Detailed information is depicted in online supplemental figure S5.
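
Equation (1), as described by the definitions above, is a sample-size-weighted mean of the per-study mean concentrations. A minimal Python sketch follows. The function name and the study values are hypothetical and are not the data in table 1.

```python
def weighted_mean_concentration(studies):
    """Sample-size-weighted mean: sum(C_x * n) / N.

    `studies` is a list of (mean concentration in ppm, sample size) pairs,
    one pair per study.
    """
    total_n = sum(n for _, n in studies)
    if total_n == 0:
        raise ValueError("overall sample size N must be positive")
    return sum(c_x * n for c_x, n in studies) / total_n


# Hypothetical per-study means (ppm) and sample sizes.
example_studies = [(0.4, 20), (0.6, 30), (1.0, 50)]
c_acetone = weighted_mean_concentration(example_studies)
print(f"weighted mean: {c_acetone:.2f} ppm")
```

Weighting by sample size means that a large study contributes proportionally more to the average than a small one, so a single small study with extreme values has limited influence.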

Isopropanol was used as a potential biomarker in diabetic breath analysis in five studies and exhibited higher levels in the breath of patients with diabetes. This compound is mainly metabolized from propanoates in the human body. It is also a substrate for acetone synthesis, catalyzed by the enzyme isopropanol dehydrogenase.27 Recently, research showed that alcohol dehydrogenase in the liver is capable of converting acetone back to isopropanol in some abnormal conditions.5 Based on available research, this compound showed moderate diagnostic accuracy, with higher specificity than sensitivity. Unfortunately, only two studies provided the diagnostic accuracy of isopropanol, so pooling the data for further analysis would not be meaningful.

Dimethyl sulfide was also reported twice. It is considered a metabolite of microbial activity in the human body.28 Variations of this compound in diabetic breath may be due to alterations in the gut microbiota induced by insulin deficiency. Hence, this compound may not be suitable as a stable biomarker.

In terms of methods, sensors are the most frequently used technique for diabetic breath analysis. They offer a fast, convenient and cost-effective approach to VOC detection. However, this technique cannot identify individual exhaled components and so cannot pick out characteristic biomarkers. Therefore, the principle by which a sensor array diagnoses diabetes mellitus is difficult to interpret. GC-MS is another common analytical technique, which enables qualitative and quantitative analyses of breath VOCs, but it is a costly laboratory instrument with a time-consuming analytical process. In breath analysis, spectroscopic methods were usually applied for detection of specific compounds such as CO2 and acetone. In addition, proton transfer reaction-mass spectrometry, an online mass spectrometric method, was also used for diabetic breath analysis. This method is quick and sensitive to VOCs with strong proton affinity; however, its applicability is relatively limited and it is inadequate for qualitative analysis.

Prospects and challenges

Despite the promising results shown by diabetic breath tests, there are still some problems to be solved before these can be applied to clinical practice. First, most available studies were initial open trials, which are helpful in identifying diabetic biomarkers but increase the risk of bias. More blinded tests should be carried out in future work for further validation.

Second, standardization of diabetic breath analysis, including subject control, breath sampling and detection, is essential, since each of these has a significant impact on the test results. Because breath VOCs can be influenced by various factors, subject control is the first step before breath tests. Available studies demonstrated that diet control is a crucial condition, and overnight fasting, which is also a standard condition for the FPG test, was the most widely used. Potential biomarkers such as acetone and CO2 were significantly affected by food intake,29 30 and overnight fasting effectively reduced interference from food. Accordingly, diet control is a requisite for diabetic breath tests unless the adopted biomarkers can be proven to be unaffected by food ingestion. Other control conditions, such as restraining physical exercise and conducting gas exchange before the test, have also been applied in some breath analyses.13 The specific control conditions should take into account the metabolic properties of the biomarker used and the factors that influence it. Breath sampling is the second important step and depends on the detection technique. Conventional analytical techniques such as GC-MS and gas chromatography-flame ionization detection are typical offline methods that require collection of breath in a container, such as a sampling bag or a sampling bottle. Breath collection for offline analysis should ensure that the container is filled with enough breath. Moreover, the period over which the sampled breath remains valid and possible contamination from the container should be considered. Some sensors, online mass spectrometry and spectroscopic methods allow direct analysis of exhaled breath in real time. For online breath sampling, the key is standardization of exhalation, which should be carried out in a standard way to guarantee comparability of test results. Thus, training of subjects seems necessary before an online breath test. For breath detection, analytical methods ought to be further optimized and tested to obtain reliable stability and reproducibility for clinical practice.


In this systematic review and meta-analysis, breath tests for diabetes mellitus diagnosis were investigated and the diagnostic accuracy of breath biomarkers was estimated. The breath test was shown to be applicable to T2D diagnosis, with high diagnostic accuracy. Among the included biomarkers, isotopic CO2 exhibited optimal sensitivity and specificity, while the diagnostic accuracy of acetone and isopropanol was moderate. Our results suggest that the breath test is a promising approach to non-invasive diagnosis of diabetes mellitus and is especially appropriate for large-scale preliminary screening. Even so, before it can enter clinical practice, much work remains on the standardization of breath tests, including subject control, breath sampling and analysis.



  • WtW and WZ are joint first authors.

  • Contributors WZ and WtW were involved in the study design, data extraction, data analysis and drafting of the manuscript. JH and QG were responsible for data screening and results checking. SW, SN and WjW contributed to the statistics. All authors were involved in data interpretation and manuscript revision.

  • Funding This study was funded by Zhejiang Provincial Natural Science Foundation of China (no. LQ20B070001), National Natural Science Foundation of China (no. 71774147), Hangzhou Agricultural and Social Development Research Project (no. 20201203B178) and Hangzhou Soft Science Project (no. 20200834M23).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.