Non-invasive wearables for remote monitoring of HbA1c and glucose variability: proof of concept
  1. Brinnae Bent1,
  2. Peter J Cho1,
  3. April Wittmann2,
  4. Connie Thacker2,
  5. Srikanth Muppidi3,
  6. Michael Snyder3,
  7. Matthew J Crowley2,
  8. Mark Feinglos2,
  9. Jessilyn P Dunn1,4
  1. 1Biomedical Engineering, Duke University, Durham, North Carolina, USA
  2. 2Endocrinology, Duke University Health System, Durham, North Carolina, USA
  3. 3Department of Medicine, Stanford University, Stanford, California, USA
  4. 4Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
Correspondence to Dr Jessilyn P Dunn; jessilyn.dunn{at}duke.edu

Abstract

Introduction Diabetes prevalence continues to grow, and a significant diagnostic gap remains among the one-third of the US population that has pre-diabetes. Innovative, practical strategies to improve monitoring of glycemic health are desperately needed. In this proof-of-concept study, we explore the relationship between non-invasive wearables and glycemic metrics and demonstrate the feasibility of using non-invasive wearables to estimate glycemic metrics, including hemoglobin A1c (HbA1c) and glucose variability metrics.

Research design and methods We recorded over 25 000 measurements from a continuous glucose monitor (CGM) with simultaneous wrist-worn wearable (skin temperature, electrodermal activity, heart rate, and accelerometry sensors) data over 8–10 days in 16 participants with normal glycemic state and pre-diabetes (HbA1c 5.2–6.4). We used data from the wearable to develop machine learning models to predict HbA1c recorded on day 0 and glucose variability calculated from the CGM. We tested the accuracy of the HbA1c model on a retrospective, external validation cohort of 10 additional participants and compared results against CGM-based HbA1c estimation models.

Results A total of 250 days of data from 26 participants were collected. Of the 27 models of glucose variability metrics that we developed using non-invasive wearables, 11 achieved high accuracy (<10% mean average per cent error, MAPE). Our HbA1c estimation model using non-invasive wearables data achieved MAPE of 5.1% on an external validation cohort. The ranking of the wearable sensors’ importance in estimating HbA1c was skin temperature (33%), electrodermal activity (28%), accelerometry (25%), and heart rate (14%).

Conclusions This study demonstrates the feasibility of using non-invasive wearables to estimate glucose variability metrics and HbA1c for glycemic monitoring and investigates the relationship between non-invasive wearables and the glycemic metrics of glucose variability and HbA1c. The methods used in this study can be used to inform future studies confirming the results of this proof-of-concept study.

  • algorithms
  • biomedical technology
  • pre-diabetic state
  • diabetes mellitus, type 2

Data availability statement

Data are available upon reasonable request. The data sets generated during and/or analyzed during the current study (the prospective cohort) will be made available 1 year from the date of publication to a public repository that is linked to the Digital Biomarker Discovery Pipeline (DBDP.org)[58].

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.

Significance of this study

What is already known about this subject?

  • Several studies report that wearables can non-invasively capture metrics reflecting autonomic nervous system activity, which is a demonstrated correlate to glycemic health.

What are the new findings?

  • Glycemic variability metrics can be estimated with high accuracy using non-invasive wearables.

  • Non-invasive wearables can be used to predict hemoglobin A1c (HbA1c) with similar accuracy to a continuous glucose monitor.

How might these results change the focus of research or clinical practice?

  • Our findings from this proof-of-concept study suggest that wearables could potentially be used as part of a strategy to remotely monitor diabetes and detect undiagnosed pre-diabetes. Because wearables are so prevalent in the general population, leveraging these ubiquitous devices for purposes including glycemic monitoring and pre-diabetes detection and monitoring could represent a major advance in clinical pre-diabetes care.

Introduction

Pre-diabetes affects over one-third of people in the USA.1 Up to 70% of individuals with pre-diabetes eventually develop type 2 diabetes (T2D), which is 1 of the 10 leading causes of death globally2 and is associated with comorbidities including cardiovascular disease, nephropathy, neuropathy, and retinopathy.

While pre-diabetes is highly prevalent and has serious consequences, it is also severely underdiagnosed and mismanaged: only 10% of those with pre-diabetes are aware that they have the condition.3 For those who have been diagnosed, pre-diabetes is often poorly managed.4–6 Pre-diabetes can be diagnosed using serum glucose measurement, hemoglobin A1c (HbA1c), fasting glucose, and/or oral glucose tolerance testing,7 none of which is ideally suited for efficient pre-diabetes screening in the general population because each requires a blood draw; moreover, screening criteria are wholly insufficient. In a nationally representative sample, fewer than half of those who met the American Diabetes Association (ADA) screening criteria were screened.8 This has led to a diagnostic gap in pre-diabetes. Further, management of pre-diabetes is limited to point-of-care clinical visits, and there is currently no way of assessing day-to-day or week-to-week progression of the condition. Diagnosis is likewise limited to point-of-care clinical visits, which makes it extremely challenging to obtain a diagnosis of pre-diabetes for the 20% of Americans who are uninsured9 and the 57 million Americans who live in remote, rural areas10 with limited access to such visits. The economic burden of diabetes and pre-diabetes is growing rapidly, requiring the adoption of more comprehensive screening approaches as well as better prevention and treatment strategies.11

Diagnosing and treating pre-diabetes early can prevent the progression to T2D and mitigate tissue damage resulting from chronic hyperglycemia. Progression toward T2D is reversible at the pre-diabetic stage with lifestyle changes; therefore, proper management of pre-diabetes is critical. In fact, the Finnish Diabetes Prevention Study found a 58% reduction in conversion to T2D with dietary, weight loss, and physical activity interventions during the pre-diabetic stage.12 Innovative, practical strategies to improve detection, monitoring and management of pre-diabetes before conversion to T2D are desperately needed.

Non-invasive wrist-worn biometric sensors, often referred to as ‘wearables,’ are becoming nearly ubiquitous in the USA, with 117 million currently in use and an expected 100% growth in the next 3 years.13 Because of this widespread use, wearables have important potential to aid in the development of digital biomarkers which will facilitate detection and monitoring of chronic diseases.14 Digital biomarkers are digitally collected data (eg, heart rate measurements from a wearable) that may be used as indicators of health outcomes (eg, pre-diabetes). Digital biomarker algorithms enable the aggregation of high-resolution, intraindividual data into summary metrics that are interpretable and actionable. One hundred seventeen million individuals already have these non-invasive, wrist-worn wearables;13 by leveraging these data already being collected, we may be able to improve recognition of pre-diabetes and health outcomes.

Using wearables to generate digital biomarkers capable of monitoring glycemic metrics in people with pre-diabetes would represent a major advance in T2D prevention. Glycemic health has been shown to be correlated with glucose variability metrics15–22 and HbA1c.15–17 Wearables can non-invasively capture metrics reflecting autonomic nervous system activity,22–26 including heart rate variability, electrodermal (sweat) activity, and skin temperature. The known association between glycemic variability and metrics of autonomic neuropathy27–30 that can be measured using wearables22–25 31 provides a strong rationale for attempting to develop digital biomarkers to monitor pre-diabetes based on non-invasive data from wearables. In order to develop digital biomarkers, it is critical to explore the relationships between features that can be derived from non-invasive wearables and measures of glycemic health.

This study sought to determine the feasibility of using digital biomarkers from wearables to estimate glucose variability metrics and HbA1c among patients with pre-diabetes and high-normal blood glucose (figure 1). Additionally, this study aims to investigate the relationship between glycemic metrics, including glucose variability and HbA1c, and features that can be derived from non-invasive wearable sensors. This study is a proof-of-concept study designed to establish the feasibility of using non-invasive wearables to estimate glucose variability metrics and HbA1c and to develop methods to inform larger cohort studies.

Figure 1

Graphical abstract of study. Our prospective cohort consisted of 16 participants who wore the CGM and wrist-worn wearable simultaneously for 8–10 days after a clinical HbA1c was measured. Our retrospective validation cohort was our external test set and consisted of 10 participants who wore the wrist-worn wearable for up to 10 days after a clinical HbA1c was measured. We developed a random forest model estimating HbA1c (Watch eA1c) and random forest models estimating each of the 27 glucose variability metrics. We compared our Watch eA1c model with three comparison models: ADA CGM eA1c, CGM eA1c, and Linear Watch eA1c. We were able to obtain a mean average per cent error (MAPE) of 5.1% on the external test-validated Watch eA1c and a MAPE of <10% on 11 of 27 metrics of glucose variability. ADA, American Diabetes Association; CGM, continuous glucose monitor; eA1c, estimated A1c; HbA1c, hemoglobin A1c.

Materials and methods

Experimental design

In this study, we sought to determine the feasibility of using digital biomarkers from wearables to estimate both glucose variability metrics and HbA1c among patients with pre-diabetes and high-normal blood glucose (figure 1). Given the body of evidence suggesting that autonomic nervous system metrics, measurable with non-invasive, wrist-worn wearables, are associated with variability in blood glucose,27 29 we hypothesized that we could develop models using features engineered from non-invasive wearables data to estimate blood glucose variability and HbA1c, previously only measurable through continuous glucose monitoring and clinical blood tests, respectively.

Study population

Patients were recruited for the prospective study from the Duke Endocrinology Clinic through medical record review. Included patients were between the ages of 35 and 65 years with high-normal blood glucose (HbA1c 5.2–5.6) or pre-diabetes (HbA1c 5.7–6.4). Participants were excluded if they had cancer, chronic obstructive pulmonary disease, cardiovascular disease, food allergies, or were taking any antidiabetic drugs.

For the retrospective, external validation cohort, study participants with data collected between 2017 and 2018 in the Integrated Personal Omics Profiling Study cohort were included if they had high-normal blood glucose or pre-diabetes (HbA1c 5.2–6.4) and non-invasive wrist-worn wearable (Empatica E4) data during the 10 days following their clinic visit. The retrospective, external validation cohort did not have continuous glucose monitoring.

Study protocol

Prospective study participants (N=16) had HbA1c measured in the clinic on day 0. Participants wore a Dexcom G6 continuous glucose monitor (CGM) and a wearable wrist-based device (Empatica E4) 24 hours a day for 8–10 days after day 0 (figure 1). High glycemic meals were used to induce glucose variations over the course of the study. Specifically, standardized breakfast meals (1.5 cups of frosted flakes and 1 cup lactaid 2% milk) were ingested every other morning prior to ingesting any other food, drink, or medication. The standardized breakfast meals were used to induce hyperglycemia regularly and repeatedly in order to have repeated measures from the same individual. Participants were requested to not make any changes to their typical diet other than the standardized breakfast meals. Comprehensive diet logging of all meals and snacks, including beverages, was done with a food diary.

For the retrospective external validation cohort (N=10), we used data from wrist-worn wearables (Empatica E4) worn 24 hours a day for up to 10 days following the HbA1c measurement (figure 1). Patients in the retrospective external validation cohort did not have simultaneous data from a CGM and did not have dietary interventions or diet logging.

The Dexcom G6 records interstitial glucose concentration (mg/dL) every 5 min. The Empatica E4 contains four sensors: photoplethysmography (optical heart rate), electrodermal activity (galvanic skin response, related to sweat activity), skin temperature, and triaxial accelerometry. Heart rate is recorded once per second (calculated from photoplethysmography sampled at 64 Hz), electrodermal activity and skin temperature are recorded at 4 Hz, and accelerometry is recorded at 32 Hz. The accelerometry data were preprocessed and the vector magnitude of the three axes was used in this study.32
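
As an illustration of the accelerometry step, the vector magnitude of a triaxial signal can be computed as the Euclidean norm of each sample. The sketch below is a minimal example under that assumption; the function name and the sample values are hypothetical, and the preprocessing applied before this step is not specified in the text and is therefore omitted.

```python
import numpy as np


def vector_magnitude(acc_xyz):
    """Return the Euclidean norm of each triaxial accelerometry sample.

    acc_xyz: array-like of shape (n_samples, 3) holding the x, y, z axes.
    """
    acc_xyz = np.asarray(acc_xyz, dtype=float)
    if acc_xyz.ndim != 2 or acc_xyz.shape[1] != 3:
        raise ValueError("expected an array of shape (n_samples, 3)")
    return np.sqrt((acc_xyz ** 2).sum(axis=1))


# Hypothetical samples in raw device units (not study data).
samples = np.array([[3.0, 4.0, 0.0], [0.0, 0.0, 64.0]])
magnitudes = vector_magnitude(samples)
```

Collapsing the three axes into one magnitude makes the signal independent of how the device is oriented on the wrist, at the cost of discarding directional information.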

Sample size

For this pilot feasibility study, a sample size of 16 participants was targeted and achieved, including 8 with pre-diabetes and 8 with high-normal glucose levels.

Model development

Interstitial glucose summary and glucose variability metrics were calculated from the continuous glucose monitoring data (online supplemental table 1).16 17 19 33–39 Metrics summarizing the non-invasive wearable sensor data were developed and calculated for each participant (online supplemental table 2). In total, 84 metrics from four non-invasive wearable sensors (21 features per sensor) were calculated.
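
To make the idea of CGM-derived summary metrics concrete, the following sketch computes a handful of them with pandas. It is not the cgmquantify implementation: the function name, the metric keys, and the 70–180 mg/dL range bounds are assumptions chosen for illustration, and the definitions used in the study are those of online supplemental table 1.

```python
import numpy as np
import pandas as pd


def glucose_summary(cgm, low=70.0, high=180.0):
    """Summarize a glucose series (mg/dL) indexed by timestamp.

    low/high are hypothetical range bounds, not values taken from the study.
    """
    # Group readings by calendar day to obtain within-day variability.
    daily_sd = cgm.groupby(cgm.index.normalize()).std()
    inside = (cgm >= low) & (cgm <= high)
    return {
        "mean_glucose": cgm.mean(),
        "sd_glucose": cgm.std(),
        "cv_percent": 100 * cgm.std() / cgm.mean(),
        "mean_of_intraday_sd": daily_sd.mean(),
        "percent_inside_range": 100 * inside.mean(),
    }


# Hypothetical readings spanning two days (not study data).
timestamps = pd.to_datetime(
    ["2020-01-01 08:00", "2020-01-01 20:00", "2020-01-02 08:00", "2020-01-02 20:00"]
)
cgm = pd.Series([100.0, 120.0, 90.0, 190.0], index=timestamps)
summary = glucose_summary(cgm)
```

The same pattern (one summary value per participant per metric) yields the per-participant targets that the wearable-based models are then asked to estimate.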

The objective of this study was to use non-invasive wearables to estimate glucose variability metrics and HbA1c, which have been shown to be indicative of glycemic health.15 17 To accomplish this, we built separate random forest models estimating each of the 27 glucose variability metrics (Watch RF eGluVar) and estimating HbA1c (Watch RF estimated A1c (eA1c)). Models were tuned using leave-one-person-out cross-validation (LOOCV) on the prospective cohort by removing, in each training fold, all features whose impurity-based variable importance fell within the cut-off range (0 to <0.05). Mean squared error (MSE) was used as the stopping condition.40 Each of the 27 Watch RF eGluVar models was evaluated using LOOCV on the prospective cohort (N=16). Because we had a retrospective, external validation cohort with HbA1c and wearable sensor data only, we could evaluate the Watch RF eA1c model by two separate methods: (1) using LOOCV on the prospective cohort (N=16) and (2) using a retrospective external validation out-of-sample cohort (N=10). An example of a decision tree from the Watch RF eA1c model is shown in online supplemental figure 1.
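
The cross-validation scheme described above can be sketched with scikit-learn as follows. This is a minimal illustration rather than the study code: it assumes one feature row per participant (so that leaving one row out is leaving one person out), reads the cut-off as dropping features with importance below 0.05, and uses hypothetical function names, hyperparameters, and synthetic data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneOut


def loocv_random_forest(X, y, importance_cutoff=0.05, n_estimators=200, seed=0):
    """Leave-one-person-out predictions with per-fold feature filtering."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    predictions = np.empty_like(y)
    for train_idx, test_idx in LeaveOneOut().split(X):
        X_train, y_train = X[train_idx], y[train_idx]
        # Rank features using the training participants only, so the
        # held-out participant never influences feature selection.
        ranker = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
        ranker.fit(X_train, y_train)
        keep = ranker.feature_importances_ >= importance_cutoff
        if not keep.any():
            # Fall back to the single most important feature.
            keep[np.argmax(ranker.feature_importances_)] = True
        # Refit on the retained features and predict the held-out participant.
        model = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
        model.fit(X_train[:, keep], y_train)
        predictions[test_idx] = model.predict(X[test_idx][:, keep])
    return predictions


# Synthetic stand-in for 16 participants and 8 wearable features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(16, 8))
y_demo = 5.7 + 0.3 * X_demo[:, 0] + rng.normal(scale=0.1, size=16)
preds = loocv_random_forest(X_demo, y_demo, n_estimators=20)
```

Performing the importance filter inside each fold, rather than once on the full cohort, keeps the held-out estimate free of information leakage from the test participant.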

For all models, the root MSE (RMSE) and mean average per cent error (MAPE) were used to assess model performance.41 While an acceptable MAPE has been widely debated, a MAPE of less than 10% is generally accepted as highly accurate.42
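
For reference, both error measures can be written in a few lines. The sketch below assumes that MAPE is computed as the mean of the absolute percentage errors, which is the usual definition; the function names and the example values are hypothetical.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observed values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


def mape(observed, predicted):
    """Mean of absolute percentage errors; observed values must be non-zero."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(100 * np.mean(np.abs((observed - predicted) / observed)))


# Hypothetical HbA1c values (%), not study data.
observed_a1c = [5.0, 6.0]
predicted_a1c = [5.5, 5.7]
example_rmse = rmse(observed_a1c, predicted_a1c)
example_mape = mape(observed_a1c, predicted_a1c)
```

Because MAPE scales each error by the observed value, a narrow target range such as HbA1c 5.2–6.4 tends to produce small percentage errors even for a model that predicts close to the cohort mean, which is why comparison against a mean model is informative.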

We developed two additional models estimating HbA1c to compare with the Watch RF eA1c model using either a different model structure with input features from the same data source, or the same model structure with different input variables calculated from a more invasive wearable sensor (the CGM). We first developed a multiple regression model to estimate HbA1c (Watch LM eA1c) using the same features as the Watch RF eA1c model (fitting algorithm: restricted maximum likelihood, online supplemental equation 1). Second, we built a random forest model to predict HbA1c (CGM RF eA1c) using interstitial glucose summary and glucose variability metrics that we calculated from the CGM data (online supplemental table 1). We also compared the Watch RF eA1c model to the ADA estimated HbA1c linear regression model that is based on mean glucose measurements from CGM (ADA CGM LM eA1c).38 While the Glucose Management Indicator (GMI) has been recently used as an alternative to the ADA eA1c,37 in our population the eA1c had lower error than the GMI in estimating HbA1c (mean error of 0.312±0.215 for eA1c and mean error of 0.369±0.227 for GMI), so we used eA1c for model comparisons. Ultimately, we evaluated all of the eA1c models against one another (figure 1).
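
The two CGM-based linear estimates mentioned above can be illustrated as follows. The sketch assumes that the ADA model corresponds to the published ADAG regression (estimated average glucose in mg/dL = 28.7 × HbA1c − 46.7, inverted here to solve for HbA1c) and that GMI follows its published formula (3.31 + 0.02392 × mean glucose in mg/dL); the function names and the example mean glucose are hypothetical.

```python
def ada_ea1c(mean_glucose_mgdl):
    """Estimated HbA1c (%) from mean glucose via the inverted ADAG regression."""
    return (mean_glucose_mgdl + 46.7) / 28.7


def gmi(mean_glucose_mgdl):
    """Glucose Management Indicator (%) from mean glucose in mg/dL."""
    return 3.31 + 0.02392 * mean_glucose_mgdl


# Hypothetical mean CGM glucose (mg/dL), not study data.
example_mean_glucose = 154.0
example_ea1c = ada_ea1c(example_mean_glucose)
example_gmi = gmi(example_mean_glucose)
```

Both formulas depend on mean glucose alone, which is the contrast with the random forest models: those can draw on variability features in addition to central tendency.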

All models were developed using Python V.3.8.3. Specific Python packages used for data handling and modeling include numpy, pandas, statsmodels, and sklearn.

Statistical analyses

In order to compare Watch RF eA1c with Watch LM eA1c and the CGM-based models, we calculated the RMSE and MAPE for each model. We then performed t-tests on the model RMSE calculated on each training fold, which are outlined in online supplemental table 6. Paired t-tests performed on the RMSE from the same held-out observation were done between ADA CGM LM eA1c and CGM RF eA1c, ADA CGM LM eA1c and Watch RF eA1c (LOOCV), CGM RF eA1c and Watch RF eA1c (LOOCV), and Watch RF eA1c (external validation test) and Watch LM eA1c. Because there was a different number of participants in the retrospective, external validation cohort, we used a two-sided t-test for two independent samples between ADA CGM LM eA1c and Watch RF eA1c (external validation test), CGM RF eA1c and Watch RF eA1c (external validation test), Watch RF eA1c (LOOCV) and Watch RF eA1c (external validation test), ADA CGM LM eA1c and Watch LM eA1c, CGM RF eA1c and Watch LM eA1c, and Watch RF eA1c (LOOCV) and Watch LM eA1c. We used a Bonferroni multiple hypothesis corrected significance cut-off p value of 0.005 (p=0.05/10 different analyses performed). We used R² between observed and expected values to determine what per cent of the variance is explained by our Watch RF eA1c model. We also performed a Bland-Altman analysis for both the Watch RF eA1c (LOOCV) and the Watch RF eA1c (external validation test).

Statistical analyses were performed using Python V.3.8.3 using the statsmodels and scipy libraries.
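
The comparison procedure can be sketched with scipy as shown below. This is an illustrative outline only: the helper name and the error arrays are hypothetical, and the independent-samples test uses scipy's default equal-variance assumption, which the text does not specify.

```python
import numpy as np
from scipy import stats

ALPHA = 0.05
N_COMPARISONS = 10
# Bonferroni correction: divide the family-wise alpha by the number of tests.
bonferroni_alpha = ALPHA / N_COMPARISONS


def compare_errors(errors_a, errors_b, paired):
    """Return (p value, significant at the corrected cut-off?) for two error sets."""
    if paired:
        # Same held-out participants under both models.
        result = stats.ttest_rel(errors_a, errors_b)
    else:
        # Cohorts of different size; two-sided test for independent samples.
        result = stats.ttest_ind(errors_a, errors_b)
    return float(result.pvalue), bool(result.pvalue < bonferroni_alpha)


# Synthetic per-participant errors (not study data).
rng = np.random.default_rng(1)
errors_loocv_model_a = np.abs(rng.normal(0.30, 0.20, size=16))
errors_loocv_model_b = np.abs(rng.normal(0.30, 0.20, size=16))
errors_external = np.abs(rng.normal(0.35, 0.20, size=10))

p_paired, sig_paired = compare_errors(errors_loocv_model_a, errors_loocv_model_b, paired=True)
p_indep, sig_indep = compare_errors(errors_loocv_model_a, errors_external, paired=False)
```

With only 16 and 10 participants, a corrected threshold of 0.005 leaves little power, so a non-significant difference between models should be read as an absence of evidence rather than as evidence of equivalence.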

Methods used in this study have been made publicly available in the Digital Biomarker Discovery Pipeline (DBDP) to promote reproducibility.43 Methods for extracting glucose variability metrics from a CGM are available in both Python and R in the cgmquantify module of the DBDP. The feature engineering methods for wearable sensors can be found in the wearablevar module of the DBDP. Preprocessing, exploratory data analysis, and machine learning methods are also available in the DBDP.

Results

For the prospective cohort, 16 participants (mean age 54.7 years; 9 women; 4 African American, 11 Caucasian-white, 1 multiracial; 8 participants with pre-diabetes, 8 participants with high-normal glucose; body mass index (BMI) 32.67±5.68) were recruited (figure 1). For the retrospective external validation cohort, 10 participants met our inclusion criteria (five women, five men, mean age=55.4, racial breakdown: four African American, six Caucasian-white; five participants with pre-diabetes, five participants with high-normal glucose). Demographics for both the prospective and the retrospective, external validation cohorts are provided in online supplemental table 3.

Glucose variability estimation models

We developed models to estimate each of the 27 CGM-based glucose variability metrics (Watch RF eGluVar) and clinically measured HbA1c (Watch RF eA1c) using non-invasive wearables.

The Watch RF eGluVar performance metrics, RMSE and MAPE, are summarized in table 1 and depicted in figure 2. Of the 27 Watch RF eGluVar models built, 11 estimated their glucose variability metric with high performance (MAPE <10%) (table 1). These metrics include the GMI, Interday Mean Glucose, Interday Median Glucose, Interday Quartile 1 Glucose, Interday Quartile 3 Glucose, Mean of Glucose Excursions (MGE), Mean of Intraday SD, SD of Intraday SD, Time Inside Range (TIR), Per cent Time Inside Range, and Mean of Normal Glucose. We found that 10 of 27 models outperformed the mean model and 11 of 27 models outperformed the median model (online supplemental table 4). The variance of each glucose variability metric explained by each of the Watch RF eGluVar models (R²) is shown in online supplemental table 5. We examined the relative importance of each individual wearable sensor (optical heart rate, accelerometry, skin temperature, and electrodermal activity) in the Watch RF eGluVar models (table 2, online supplemental table 6) and found that while sensor importance varied by the glucose variability metric being modeled, every metric required the use of all four sensors to achieve high model performance.

Table 1

Results of wearable sensor estimation of glucose and glucose variability RF-LOOCV models

Table 2

Relative importance of wearable sensors for each of the 11 glucose variability models with high accuracy

Figure 2

Accuracy of models estimating glucose variability using a non-invasive, wrist-worn wearable sensor. Models shown in terms of mean average per cent error (MAPE). Eleven models achieved a MAPE of less than 10%. (Not pictured: LBGI and HBGI MAPE due to >100% MAPE). ADRR, Average Daily Risk Range; CONGA24, Continuous overall net glycemic action for 24 hours; CV, Interday Coefficient of Variation; GMI, Glucose Management Indicator; HBGI, High Blood Glucose Index; iCV Mean, Mean of Intraday Coefficient of Variation; iCV Median, Median of Intraday Coefficient of Variation; iCV SD, SD of Intraday Coefficient of Variation; iSD Mean, Mean of Intraday SD; iSD Median, Median of Intraday SD; iSD SD, SD of Intraday SD; LBGI, Low Blood Glucose Index; Maximum, Interday Maximum Glucose; Mean, Interday Mean Glucose; Median, Interday Median Glucose; Minimum, Interday Minimum Glucose; MGE, Mean of Glucose Excursions; MGN, Mean of Normal Glucose; MODD, Mean of Daily Differences; PIR, Per cent Inside Range; POR, Per cent Outside Range; Q1G, Interday Quartile 1 Glucose; Q3G, Interday Quartile 3 Glucose; TIR, Time Inside Range; TOR, Time Outside Range.

HbA1c estimation model

The Watch RF eA1c model estimating the clinically measured HbA1c, validated using LOOCV on the prospective cohort, achieved RMSE: 0.281 with MAPE: 4.87%. The variance of HbA1c explained by the Watch RF eA1c model (R²) was 26.0%. The retrospective external validation of the Watch RF eA1c model, using the prospective cohort as the training set and the retrospective external validation cohort of 10 people as the test set, resulted in RMSE: 0.357 and MAPE: 5.12% (R²=4.31%).

The performance of the Watch RF eA1c model did not exceed the performance of the mean model, while the CGM RF eA1c model did exceed the performance of the mean model (online supplemental figure 2). Comparison of the Watch RF eA1c (both the LOOCV and the external validation) with the ADA CGM LM eA1c (RMSE: 0.379; MAPE: 5.39%, R²=12.1%) and CGM RF eA1c (RMSE: 0.245±0.237; MAPE: 4.22±3.89%, R²=0.036%) models showed no significant difference between model errors (figure 3, online supplemental table 7). In addition to comparing the Watch RF eA1c model with the models built using CGM data, we also compared the Watch RF eA1c model with a linear model, Watch LM eA1c. We found that while there was no significant difference between the model errors at our Bonferroni multiple hypothesis-corrected p value, the error of the Watch LM eA1c model (RMSE: 0.690; MAPE: 9.56%; R²=14.4%) was double that of the Watch RF eA1c model (RMSE: 0.357; MAPE: 5.12%) (online supplemental figure 2, online supplemental table 7).

Figure 3

Comparison of HbA1c estimation models. Models: American Diabetes Association (ADA) estimated A1c (eA1c), our model estimating A1c using glucose metrics from CGM (LOOCV, tuned), our model using non-invasive wearable sensors (LOOCV, tuned), our model using non-invasive wearable sensors (tested on external test set). Models were compared using t-tests. As shown, the models are not significantly different from one another. CGM, continuous glucose monitor; HbA1c, hemoglobin A1c; LOOCV, leave-one-person-out cross-validation; MAPE, mean average per cent error; RMSE, root mean squared error.

We examined the importance of each individual wearable sensor in estimating HbA1c in the Watch RF eA1c model. In descending order, the importance of each wearable sensor in estimating HbA1c was skin temperature (33%), electrodermal activity (28%), accelerometry (25%), and heart rate (14%) (online supplemental figure 3). We developed Watch RF eA1c models using each individual sensor alone (accelerometry, heart rate, electrodermal activity, or temperature) and found that the accuracy of the models with a single sensor did not exceed that of the multimodal sensor model (online supplemental table 8).
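
One plausible way to obtain a per-sensor importance from a fitted random forest is to sum the impurity-based importances of all features derived from each sensor. The sketch below illustrates that aggregation; it is an assumption about the procedure rather than a description of it, and the feature-naming convention (sensor prefix before the first underscore) and the numeric values are hypothetical.

```python
def sensor_importance(feature_names, importances):
    """Sum feature importances by sensor prefix and express them as percentages."""
    totals = {}
    for name, value in zip(feature_names, importances):
        # Hypothetical convention: "<sensor>_<statistic>", e.g. "temp_mean".
        sensor = name.split("_", 1)[0]
        totals[sensor] = totals.get(sensor, 0.0) + float(value)
    grand_total = sum(totals.values())
    return {sensor: 100 * value / grand_total for sensor, value in totals.items()}


# Illustrative values only (not study results).
feature_names = ["temp_mean", "temp_sd", "eda_mean", "acc_mean", "hr_mean"]
importances = [0.30, 0.10, 0.25, 0.20, 0.15]
by_sensor = sensor_importance(feature_names, importances)
```

With scikit-learn, the `importances` argument would typically come from the `feature_importances_` attribute of a fitted `RandomForestRegressor`, which already sums to 1 across features.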

We performed a Bland-Altman analysis for the Watch RF eA1c model for both validation methods and found that model bias was increased at higher HbA1c (figure 4). On the Watch RF eA1c (external validation test), all values were within the limits of agreement. On the Watch RF eA1c (LOOCV), only one individual exceeded the accepted limits of agreement (figure 4). We also performed a Bland-Altman analysis for the CGM RF eA1c model (online supplemental figure 4) and found that all except one individual were within the accepted limits of agreement.

Figure 4

Bland-Altman plots for the Watch RF model. (A) Validated on external validation cohort; (B) validated using LOOCV on the prospective cohort. The mean difference is shown with a solid line and limits of agreement are shown with dashed lines. LOOCV, leave-one-person-out cross-validation.
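
The quantities plotted in a Bland-Altman analysis can be computed as in the sketch below. It assumes the conventional limits of agreement of mean difference ± 1.96 SD, which the text does not state explicitly; the function name and the example HbA1c values are hypothetical.

```python
import numpy as np


def bland_altman(measured, estimated, z=1.96):
    """Return per-pair means and differences, bias, limits of agreement, outliers."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    pair_means = (measured + estimated) / 2  # x axis of the plot
    differences = estimated - measured       # y axis of the plot
    bias = differences.mean()                # solid line
    sd = differences.std(ddof=1)
    lower, upper = bias - z * sd, bias + z * sd  # dashed lines
    outside = (differences < lower) | (differences > upper)
    return pair_means, differences, bias, (lower, upper), outside


# Hypothetical HbA1c values (%), not study data.
measured_a1c = [5.2, 5.6, 6.0, 6.4]
estimated_a1c = [5.4, 5.6, 5.9, 6.1]
pair_means, differences, bias, limits, outside = bland_altman(measured_a1c, estimated_a1c)
```

A trend in the differences across the range of pair means, such as increasingly negative differences at higher HbA1c, indicates proportional bias even when every point lies within the limits of agreement.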

Discussion

This study sought to evaluate a novel approach for pre-diabetes detection and monitoring: the use of digital biomarkers from non-invasive wrist-worn wearables to estimate HbA1c and glucose variability. The primary goals of this study were to provide a proof of concept and demonstrate methods to inform future studies using digital biomarkers from non-invasive wearables for pre-diabetes and diabetes screening and monitoring. Furthermore, this study sought to explore the relationship between features extracted from non-invasive wearables and glycemic variability and HbA1c. In this feasibility study, we found that we can estimate 11 glucose variability metrics with high accuracy (<10% MAPE) and HbA1c with high accuracy (RMSE: 0.357; MAPE: 5.1%) using these non-invasive wearable devices. The HbA1c estimation model that we developed from the non-invasive wrist-worn wearables was as accurate as the invasive CGM-based ADA eA1c. Because the ADA model was developed for patients with type 1 diabetes and may therefore be limited in patients with pre-diabetes and T2D, we have attempted to corroborate the eA1c model using a model we developed to estimate HbA1c in our population based on CGM data. Our non-invasive wearable HbA1c model also performed comparably with our CGM-based model.

This study shows the feasibility of using non-invasive, wrist-worn wearables to estimate HbA1c and glucose variability in pre-diabetes, an approach which could be used in the future for remote detection of pre-diabetes and could potentially be extended to monitoring and management of pre-diabetes. Leveraging wearables for non-invasive, remote detection of pre-diabetes could represent a groundbreaking strategy for closing the current diagnostic gap in pre-diabetes, which leaves individuals with unrecognized pre-diabetes at a high but preventable risk for developing T2D and its complications. Recent studies have shown the potential for using these wearable devices for the detection of other chronic disease states and acute diseases, such as infection.25 26 44 While larger studies will be needed to provide additional validation of our models, the present findings explore the relationship between non-invasive wearables and glycemic variability and HbA1c and show the feasibility of using non-invasive wrist-worn wearables for pre-diabetes detection and monitoring.

There is a well-established acute and chronic impact of glucose variability on the autonomic nervous system.27–30 Fluctuations in autonomic nervous system metrics can be measured non-invasively and remotely using wearable devices,22–25 31 which begets the possibility of monitoring glycemic variability non-invasively using autonomic nervous system measurements as a proxy.

While all of the sensors used in this study (accelerometry, heart rate, electrodermal activity, and skin temperature) were important for the estimation of glucose variability metrics and HbA1c, we found that electrodermal activity and skin temperature are the most important sensors when estimating HbA1c and heart rate was among the most important indicators in many of the glucose variability models, including Interday Mean Glucose, MGE, and TIR. Overall, the glucose variability metric models had higher importance of heart rate (19.7%–52%), which has been shown previously to be indicative of glucose health.44 The importance of skin temperature and electrodermal activity in the HbA1c estimation model is possibly due to the strong associations between these sensors and autonomic nervous system function,23 30 45 46 which is very sensitive to fluctuations of glucose, specifically high blood glucose (hyperglycemia) and low blood glucose (hypoglycemia).47 Sudomotor dysfunction, defined as decreased sudomotor activity and measured with electrodermal activity and skin temperature, is the earliest clinically detectable stage of autonomic neuropathy,45 so the fact that these sensors are predictive of HbA1c and glucose variability is rooted in known physiology. Furthermore, preliminary studies have demonstrated that the thermal effects of food, and the corresponding increase of metabolic rate, may be detected through skin temperature measurements.48 This provides additional physiological evidence that skin temperature as measured with a wearable sensor may be predictive of factors that impact HbA1c and glucose variability.

While we demonstrate with a Bland-Altman analysis that all but one individual were within the limits of agreement across both the Watch RF eA1c (external validation test) and the Watch RF eA1c (LOOCV) (figure 4), this analysis does demonstrate bias for higher HbA1c values. This is potentially due to the low number of participants in the cohort and the very small number of participants with HbA1c ≥5.7 (eight participants). However, this may also be due to irregular patterns among participants with higher HbA1c values. This is an area that should be explored in future studies. If a larger cohort confirms that patterns are more irregular among participants with higher HbA1c values, this irregularity could be harnessed as a feature in the model to better inform future predictions.

If future research supports the approach of using non-invasive wearables for pre-diabetes detection, the goal would be to leverage data from smartwatches that 117 million people already have,13 with many of these devices containing the same sensors used in this study, including heart rate, accelerometry, and electrodermal activity. This would enable us to take advantage of data already being collected to add to the current screening approaches to pre-diabetes. Layering additional detection from wearables on top of current methods could increase recognition of pre-diabetes and allow intervention on currently undiagnosed patients. If non-invasive wearables can be confirmed to predict HbA1c and glucose variability with accuracy similar to that of the CGM, the use of wearables for pre-diabetes detection and monitoring would have dramatically improved prospects for scalability.

Limitations and future directions

This study has several limitations. First, HbA1c was measured in our cohorts prior to collection of the wrist-worn wearables data, meaning that any HbA1c variation occurring during the 10-day monitoring period would not be reflected in our models. However, because HbA1c is unlikely to change substantially over a 10-day period in the population with pre-diabetes, this limitation is unlikely to have significantly impacted our findings. While we examined data over 10 days in our proof-of-concept study, in future studies, examining data over a longer monitoring period would be beneficial to uncover the minimum duration of data that is needed to develop robust models of HbA1c.

Second, our cohorts included a small number of participants and our Watch RF eA1c model did not exceed the performance of the mean model, potentially due to the narrow range of HbA1c values in this dataset. The coefficients of determination in this model were low, which we hypothesize results from the low variance in the data; this is a demonstrated limitation of using coefficients of determination for model evaluation in small cohort studies.49 This further supports the need for adequately powered future studies beyond this proof of concept. Other sources of biological variability that we are not able to measure with a wearable may also be present. While we did use an independent cohort for validation of the wrist-worn wearable data-based HbA1c prediction model, a larger study with a broader range of HbA1c values and more racial and ethnic diversity will be needed in order to continue developing these putative digital biomarkers. Further, external validation of the glucose variability models will be necessary in follow-up studies. Small cohorts have inherent limitations, due to low power and high margins of error, regardless of the validation methods used. It is therefore important to note that this is a proof-of-concept study that proposes the feasibility of a new research area and provides methods that can inform and support future studies in this space that are adequately powered and will generalize to new populations.
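The interaction between a narrow outcome range, the mean model, and the coefficient of determination can be illustrated with a short sketch. The HbA1c values below are hypothetical, not study data, and the helper names `rmse`, `mape`, and `r_squared` are ours. The sketch uses an in-sample mean as the baseline for simplicity, which is an assumption and not necessarily how the mean model in this study was evaluated.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in per cent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical HbA1c values spanning a narrow range (illustration only).
y = np.array([5.2, 5.4, 5.5, 5.6, 5.8, 6.0, 6.2, 6.4])

# Mean model: predict the cohort mean for every participant.
mean_pred = np.full_like(y, y.mean())

print("mean model RMSE:", rmse(y, mean_pred))
print("mean model MAPE (%):", mape(y, mean_pred))
print("mean model R^2:", r_squared(y, mean_pred))
```

Because the values span such a narrow range, predicting the mean for everyone already gives a MAPE below 10% while its coefficient of determination is zero by construction. Any model evaluated on similar data has little total variance left to explain, so a low coefficient of determination can coexist with small absolute errors.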

Third, the wrist-worn wearables used in this study were research-grade devices that contain more sensors than many commercial-grade devices (including electrodermal activity and skin temperature sensors). Wrist-worn wearables with a similar array of biometric sensors are becoming increasingly popular, and with the recent release of electrodermal activity and skin temperature sensors on the Fitbit wearable,50 future work should explore pre-diabetes prediction using such commercial-grade devices. Additionally, while our models using data from individual sensors did not perform as well as the composite model with all sensors, teasing apart the contribution of individual sensors to model performance is an area for future research in larger cohorts.

Finally, the dietary intervention in our study could potentially limit applicability of the glucose variability models on external populations. The HbA1c estimation model was tested on an external cohort without a dietary intervention and showed similar efficacy to the LOOCV model on the population with the dietary intervention.

Other future directions include examining the relationship between HbA1c and glycemic variability, which can vary with comorbidities and differing physiology; the effect of this relationship on the accuracy of the models presented herein should be examined in larger cohorts.51 Hypertension may be a risk factor for pre-diabetes. Though not included as part of this study, hypertension status should be recorded and included as a covariate in follow-up studies. Individuals with higher BMI have been demonstrated to have higher glycemic variability.52 53 Thus, larger follow-up studies involving a wider range of BMI and/or body fat percentage should explore the role of these measures as covariates.

Here, we have explored the relationship between non-invasive wearables and measures of glycemic health, including glucose variability and HbA1c. This proof-of-concept study should be followed up by rigorous, adequately powered studies in order to generalize to larger populations with wider ranges of HbA1c values. One of the primary objectives of this study was to provide methods for future studies that aim to develop digital biomarkers from non-invasive wearables for pre-diabetes and diabetes screening and monitoring. To support this, we have developed two open-source packages for calculating the glucose variability metrics and the wearable-derived features used in this study. These packages are available in the DBDP.43

Conclusions

This study is a proof of concept that investigates the relationship between non-invasive wearables and glucose variability and HbA1c. It supports the feasibility of using non-invasive, wrist-worn wearables to estimate HbA1c and glucose variability among patients with normal blood glucose and pre-diabetes. The findings in this study support the development of future studies examining the use of digital biomarkers for pre-diabetes and diabetes screening and monitoring. Specifically, we were able to estimate HbA1c with high accuracy (RMSE: 0.357; MAPE: 5.1%) and 11 glucose variability metrics with high accuracy (<10% MAPE) using these non-invasive wearable devices. Our findings from this proof-of-concept study suggest that wearables could potentially be used as part of a strategy to remotely monitor diabetes and detect undiagnosed pre-diabetes. Because wearables are so prevalent in the general population, leveraging these ubiquitous devices for purposes including glycemic monitoring and pre-diabetes detection could represent a major advance in clinical diabetes care.

Data availability statement

Data are available upon reasonable request. The data sets generated and/or analyzed during the current study (the prospective cohort) will be made available 1 year from the date of publication to a public repository that is linked to the Digital Biomarker Discovery Pipeline (DBDP.org).58

Ethics statements

Ethics approval

The prospective cohort study was approved by the Duke University Health System (DUHS) Institutional Review Board (Pro00101398). Written informed consent was obtained from all participants, who were compensated a total of $150 for their participation. Data analysis on the retrospective, external validation cohort study (data collected under Stanford University Institutional Review Board Protocol Numbers 23602 and 34907) was approved by the DUHS Institutional Review Board (Pro00102307).

Acknowledgments

This publication is written in memory of Dr Mark Feinglos, our collaborator, mentor, and friend. His insights and ideas were integral to this study and we are grateful for the time we had to learn and work together.

Footnotes

  • Contributors BB was involved in study design, data collection, data analysis and interpretation, model development, and manuscript preparation. PJC and AW were involved in data collection. CT and AW were involved in study design. SM and MS provided data for the prospective cohort. MF was involved in concept development, funding, and study design. MJC was involved in data interpretation and manuscript preparation. JPD was involved in concept development, funding, study design, data interpretation, and manuscript preparation.

  • Funding This work was supported by Duke MEDx. BB is a Duke Forge predoctoral fellow. JPD is a MEDx investigator and a Whitehead scholar. This work was funded in part by the Chan-Zuckerberg DAF, an advised fund of Silicon Valley Community Foundation (grant number 2020-218599).

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.
