Epidemiology/Health services research

Determinants of good metabolic control without weight gain in type 2 diabetes management: a machine learning analysis

Abstract

Introduction The aim of this study was to investigate the factors (clinical, organizational or doctor-related) involved in a timely and effective achievement of metabolic control, with no weight gain, in type 2 diabetes.

Research design and Methods Overall, 5.5 million HbA1c measurements and corresponding weight values were studied in the Associazione Medici Diabetologi Annals database (2005–2017 data from 1.5 million patients of the Italian diabetes clinics network). Logic learning machine, a specific type of machine learning technique, was used to extract and rank the most relevant variables and to create the best model underlying the achievement of HbA1c<7% and no weight gain.

Results The combined goal was achieved in 37.5% of measurements. High HbA1c and fasting glucose values and a slow drop of HbA1c have the greatest relevance and emerge as the first and main obstacles the doctor has to overcome. However, as a second line of negative factors, markers of insulin resistance, microvascular complications and years of observation (a proxy of disease duration) appear to be important determinants. Quality of assistance provided by the clinic plays a positive role. Almost all the available oral agents are effective, whereas insulin use shows a positive impact on glucometabolism but a negative one on weight containment. We also analyzed the contribution of each component of the combined endpoint; we found that weight gain was less frequently the reason for not reaching the endpoint and that HbA1c and weight have different determinants. Of note, use of glucagon-like peptide-1 receptor agonists (GLP1-RA) and gliflozins improves weight control.

Conclusions Treating diabetes as early as possible with the best quality of care, before beta-cell deterioration and the occurrence of microvascular complications, makes it easier to bring patients to good metabolic control. This message is a warning against clinical inertia. All medications play a role in goal achievement, but use of GLP1-RAs and gliflozins contributes to overweight prevention.

Significance of this study

What is already known about this subject?

  • No study has examined in the real world which factors play a role in achieving the combined goal “HbA1c at target and no weight gain” in type 2 diabetes management.

What are the new findings?

  • We investigated the topic with an artificial intelligence technique in a database of 5.5 million measurements.

  • Elevated HbA1c and fasting glucose values and a slow drop of HbA1c emerge as the first and main obstacles to goal achievement. As a second line of negative factors, markers of insulin resistance, microvascular complications, years of observation (a proxy of disease duration) and low quality of assistance appear to be important determinants.

  • Almost all the available diabetes treatments are effective, but use of GLP1-RA and gliflozins stands out in weight control.

How might these results change the focus of research or clinical practice?

  • To achieve the best results, diabetes should be treated as early as possible with the best quality of care, probably before beta-cell decline and harmful hyperglycemic exposure that lead to microvascular complications. This is a warning against clinical inertia.

  • All medications play a crucial role in goal achievement, but the most remarkable difference is the favorable role that GLP1-RA and gliflozins show in overweight prevention.

Introduction

Achieving the combined goal of HbA1c within the target value with no weight gain is the primary (although not the only) objective of the everyday activity of physicians, especially diabetologists.1 A large body of literature indicates that this is far from easy: many series worldwide report that only about 40%–50% of the diabetic population attains the HbA1c goal.2 3

While many studies have examined the effect of drugs in reducing blood glucose and HbA1c in patients with diabetes mellitus, none, to our knowledge, has examined in the real world which factors are more likely to be associated with the combined target (HbA1c and weight).

In Italy, a continuous improvement effort implemented by a network of diabetes clinics, the AMD (Associazione Medici Diabetologi) Annals initiative, has been in place since 2006.4 5 Twelve years after the launch of the initiative, half of the diabetes clinics in Italy participated in it, caring for over one-sixth of all diagnosed patients. Process and intermediate outcome measures consistently improved, in parallel with a more intensive and appropriate use of pharmacological treatments.6

AMD, considering the unique knowledge contained in more than 12 years of the AMD-Annals database, decided to exploit the huge potential offered by artificial intelligence (AI) and machine learning (ML). The benefits of these methods are reported in many published articles on the topic, including some with a specific focus on diabetes.7 A “clear box”, explainable AI algorithm, namely the logic learning machine (LLM), was chosen for this analysis to overcome the lack of transparency typical of “black box” AI.8 LLM solves classification problems by producing sets of intelligible rules, with an accuracy comparable or superior to that of the best ML algorithms.9

In brief, the aim of this AI analysis was to identify, in a specialist setting, the factors (clinical, organization-related or doctor-related) capable of predicting rapid and effective achievement of metabolic control in patients with type 2 diabetes while simultaneously avoiding weight gain.

Methods

Characteristics of the LLM (logic learning machine)

When dealing with biomedical data, doctors usually ask experts in conventional statistical techniques to test specific hypotheses about a pathologic or biological phenomenon of interest, starting from a sample of previously collected observations. ML, by contrast, can perform an analysis without making any a priori assumption and can also reveal unknown aspects of the situation under analysis.

A specific type of ML technique, “rule generation methods,” builds models described by a set of intelligible rules, making it possible to extract important knowledge about the variables included in the analysis and about their relationships with the target attribute. Two different paradigms have been proposed in the literature to perform rule generation: decision trees,10 which adopt a divide-and-conquer approach for generating the final model, and methods based on Boolean function reconstruction,11 12 which follow an aggregative procedure for building the set of rules.

LLM is an original proprietary algorithm that efficiently implements the switching neural network model13 and solves classification problems by producing sets of intelligible rules expressed in the form:

“if premise …, then consequence …,” where “premise” refers to a conjunction (logical AND) of conditions (conditional clauses) on input variables, and “consequence” contains information about the target function (yes or no).

The LLM rule generation technique produces a subset of relevant variables associated with a specific outcome and informs us of explicit intelligible conditions related to a particular outcome: relevant thresholds are identified for each input variable (eg, if triglycerides>110 AND Fasting blood sugar >132 AND high density lipoprotein (HDL) Cholesterol <52, then TARGET=NO), which represent valuable information to better understand the phenomenon under study.
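The structure of such a rule can be illustrated with a short sketch. This is not the proprietary LLM algorithm, which learns the rules from data; it only shows how an already generated “if premise, then consequence” rule could be represented and checked against a patient record. The class names, field names and patient values are hypothetical; the thresholds are those of the example rule above.

```python
from dataclasses import dataclass
import operator

# Comparison operators allowed in a condition (illustrative choice).
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


@dataclass(frozen=True)
class Condition:
    variable: str      # name of the input variable (hypothetical field names)
    op: str            # one of the keys of OPS
    threshold: float   # threshold value taken from the rule

    def holds(self, record: dict) -> bool:
        return OPS[self.op](record[self.variable], self.threshold)


@dataclass(frozen=True)
class Rule:
    premise: tuple     # conditions combined with a logical AND
    consequence: str   # "YES" or "NO" for the target

    def fires(self, record: dict) -> bool:
        # The premise is satisfied only if every condition holds.
        return all(condition.holds(record) for condition in self.premise)


# The example rule quoted in the text.
example_rule = Rule(
    premise=(
        Condition("triglycerides", ">", 110),
        Condition("fasting_glucose", ">", 132),
        Condition("hdl_cholesterol", "<", 52),
    ),
    consequence="NO",
)

# Two invented patient records, used only to exercise the rule.
patient_a = {"triglycerides": 150, "fasting_glucose": 140, "hdl_cholesterol": 45}
patient_b = {"triglycerides": 150, "fasting_glucose": 120, "hdl_cholesterol": 45}

if example_rule.fires(patient_a):
    print("patient_a: TARGET=" + example_rule.consequence)
```

In this sketch the rule fires for patient_a, who satisfies all three conditions, and does not fire for patient_b, whose fasting glucose is below the threshold.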

LLM is able to achieve accurate results, comparable or superior to those of the best ML methods.9 More specifically, its application to the biomedical datasets included in the Statlog benchmark14 illustrates the very good results obtained by this innovative analysis method.15

Investigation modes

This was an observational, longitudinal, retrospective study. The data were derived from the AMD-Annals database, that is, from electronic medical records of all patients attending 235 diabetes clinics between 2005 and 2017.6 After exclusion of type 1 diabetes, patients under 30 years of age and diabetes in pregnancy, the final database contained 2.3 billion data points corresponding to information on over 1 300 000 patients with type 2 diabetes.

These patients were followed over time with scheduled periodic checks of the HbA1c value. The average time between two checks was 0.6 years, and 91 variables were recorded at each check. The data flow (online supplementary figure 1) comprises these main steps:

  • Keep, for each variable, only the measurements with values within a reasonable range; discard the others.

  • The time interval between two HbA1c measurements must be at least 2 months; measurements taken within a shorter interval are discarded.

  • For each HbA1c measurement, keep track of the value of the “clinical factors” (eg, systolic blood pressure (BP)) closest in time, within a maximum of 4 months before or after the date of the measurement.

  • For each HbA1c measurement, keep track of the permanent factors (such as acute myocardial infarction), looking back up to the date of their first detection.

  • For each HbA1c measurement, keep track of the drugs related to the previous measurement: we assumed that the achievement of that specific target depends on the treatments followed in the period preceding the date of HbA1c measurement.

In this way, 5 564 822 HbA1c measurements and related weight variations were consolidated to be used for our analysis. This figure corresponds to 802 348 patients.
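The data-flow steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline actually used: “2 months” and “4 months” are approximated as 60 and 120 days, the minimum gap is measured from the last retained measurement, the first visit (which has no preceding therapy) is dropped, and all function names are hypothetical.

```python
from datetime import date, timedelta

MIN_GAP = timedelta(days=60)      # "at least 2 months", approximated (assumption)
MAX_WINDOW = timedelta(days=120)  # "maximum 4 months", approximated (assumption)


def enforce_min_gap(dates, min_gap=MIN_GAP):
    """Keep only measurements taken at least min_gap after the last kept one."""
    kept = []
    for d in sorted(dates):
        if not kept or d - kept[-1] >= min_gap:
            kept.append(d)
    return kept


def closest_value(target_date, observations, max_window=MAX_WINDOW):
    """Return the clinical-factor value closest in time to target_date.

    observations is a list of (date, value) pairs; values recorded more
    than max_window before or after target_date are ignored.
    """
    candidates = [
        (abs(obs_date - target_date), value)
        for obs_date, value in observations
        if abs(obs_date - target_date) <= max_window
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[0])[1]


def attach_previous_therapy(visits):
    """Pair each HbA1c date with the therapy recorded at the previous visit.

    visits is a date-sorted list of (date, therapy) pairs.
    """
    return [
        (cur_date, prev_therapy)
        for (_, prev_therapy), (cur_date, _) in zip(visits, visits[1:])
    ]


hba1c_dates = [date(2010, 1, 1), date(2010, 2, 1), date(2010, 4, 1), date(2010, 4, 20)]
systolic_bp = [(date(2010, 1, 15), 130), (date(2010, 3, 20), 125), (date(2010, 9, 1), 140)]

print(enforce_min_gap(hba1c_dates))
print(closest_value(date(2010, 4, 1), systolic_bp))
```

With these invented dates, the measurements of 1 February and 20 April are discarded because they fall less than 60 days after a retained measurement, and the blood pressure value associated with the 1 April HbA1c is the one recorded on 20 March.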

To avoid the interference of different goals of HbA1c set by diabetologists for the geriatric population, we focused our analysis on patients under 75 years.

To simplify the huge number of drug combinations (about 800), we grouped the drugs into eight diabetes therapies as reported in online supplementary figure 2; these therapies were administered in 18 combinations, as reported in table 1. Similarly, to ensure a robust estimate of comorbidities, taking into account the possibility that a specific diabetes complication could have been reported in different fields of the electronic medical record, we regrouped the scattered information as reported in table 2. This solution gave us a more manageable classification of diabetes therapies and patients’ comorbidities.

Table 1 Therapy combinations (mutually exclusive)

Table 2 Comorbidity groups

Main descriptive variables also included age, sex, years of clinical observation (considered a proxy of duration of diabetes), smoking, weight, body mass index (BMI), HbA1c, blood pressure, serum uric acid, lipid profile, serum alanine aminotransferase (ALT), aspartate aminotransferase (AST), gamma-glutamyl transferase (GGT), estimated glomerular filtration rate (Chronic Kidney Disease Epidemiology Collaboration formula), albuminuria, diabetes treatment scheme, antihypertensive treatment, lipid-lowering and antiplatelet treatment.

Three important “derived” variables were calculated: distance from HbA1c target, that is, HbA1c minus 7%; HbA1c drop speed, that is, the speed of HbA1c reduction between two different measurements; and mean interval between visits. We also added, besides the AMD database indicators, a quality of care summary score (Q-score) calculated for each year of observation. The Q-score was developed and validated in two previous studies.16 17 The score is based on a combination of organization and clinical outcome indicators related to HbA1c, blood pressure, low density lipoprotein (LDL) cholesterol and microalbuminuria. The score ranges between 0 and 40, with a higher value indicating a better quality of care. In the two previous studies,16 17 the Q-score was closely related to long-term outcomes: the risk of developing a new cardiovascular event was 80% higher in patients with a score <15 and 20% higher in those with a score between 15 and 25, compared with patients with a score >25.
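A minimal sketch of how the three derived variables and the Q-score bands could be computed is given below. The exact formulas are not stated in the text, so the definition of drop speed (previous minus current HbA1c, divided by the elapsed time in years) and the handling of the band boundaries are assumptions, and all function names are hypothetical.

```python
HBA1C_TARGET = 7.0  # HbA1c target in %, as used in the text


def distance_from_target(hba1c):
    """Distance from HbA1c target: HbA1c minus 7%."""
    return hba1c - HBA1C_TARGET


def hba1c_drop_speed(prev_hba1c, curr_hba1c, years_between):
    """Assumed definition: HbA1c reduction per year between two measurements.

    Positive values mean HbA1c is falling; negative values mean it is rising.
    """
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return (prev_hba1c - curr_hba1c) / years_between


def mean_visit_interval(visit_times_years):
    """Mean interval between consecutive visits, with times given in years."""
    gaps = [b - a for a, b in zip(visit_times_years, visit_times_years[1:])]
    if not gaps:
        raise ValueError("at least two visits are required")
    return sum(gaps) / len(gaps)


def q_score_band(q_score):
    """Map a Q-score (0 to 40) to the three bands quoted in the text."""
    if not 0 <= q_score <= 40:
        raise ValueError("Q-score must be between 0 and 40")
    if q_score < 15:
        return "<15"
    if q_score <= 25:
        return "15-25"
    return ">25"


print(distance_from_target(8.5))           # 1.5 points above target
print(hba1c_drop_speed(9.0, 8.0, 0.5))     # 2.0 points per year
print(mean_visit_interval([0.0, 0.5, 1.5]))
print(q_score_band(20))
```

Expressing drop speed per year keeps measurements taken at different intervals comparable, which is one plausible reason for deriving a speed rather than using the raw HbA1c difference.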

It is worth noticing that the task of including/excluding and grouping variables, in addition to the choices to derive new variables, was driven by the LLM modeling outputs.

The quality of a proposed model depends on its accuracy, that is, how well the model represents the analyzed phenomenon. For example, an accuracy of 75% indicates that the model is able to predict correctly 75% of the outcomes. The relevance is also very important: it has a value between 0 and 1 and measures the weight of a variable in determining the outcome.
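Accuracy as described here is simply the fraction of outcomes predicted correctly. A small sketch with invented labels follows; the relevance measure is specific to LLM and its formula is not given in the text, so it is not reproduced.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the observed outcome."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Invented example: three of four outcomes are predicted correctly.
observed = ["YES", "NO", "NO", "YES"]
predicted = ["YES", "NO", "YES", "YES"]
print(accuracy(observed, predicted))  # 0.75
```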

In our figures and tables, the following wording was used: “TARGET=YES” if the combined goal is achieved, “TARGET=NO”, if the combined goal is not achieved.

Results

The primary goal of the research, HbA1c≤7% and weight variation either negative or ≤2%, was achieved in 37.5% of measurements in the time range 2005–2017 (2017 first quarter only). Within the same period, the HbA1c≤7% target alone was achieved in 47.5% of measurements. These two percentages steadily increased from 2005 to 2017.
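The combined endpoint can be written as a simple labeling function. The sketch below assumes that weight variation is expressed as a percentage of the weight recorded at the previous measurement, which the text does not state explicitly; the function names and patient values are hypothetical.

```python
def weight_change_pct(prev_weight_kg, curr_weight_kg):
    """Weight variation as a percentage of the previous weight (assumption)."""
    return 100.0 * (curr_weight_kg - prev_weight_kg) / prev_weight_kg


def combined_target(hba1c, prev_weight_kg, curr_weight_kg,
                    hba1c_limit=7.0, max_gain_pct=2.0):
    """Return "YES" if HbA1c<=7% and weight variation is negative or <=2%."""
    reached = (
        hba1c <= hba1c_limit
        and weight_change_pct(prev_weight_kg, curr_weight_kg) <= max_gain_pct
    )
    return "YES" if reached else "NO"


print(combined_target(6.8, 80.0, 81.0))  # HbA1c at target, weight +1.25%
print(combined_target(6.8, 80.0, 83.0))  # HbA1c at target, weight +3.75%
print(combined_target(7.4, 80.0, 78.0))  # weight loss, HbA1c above target
```

Only the first case meets both conditions; the other two show the two distinct ways in which the combined goal can be missed.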

The LLM engine identified 19 variables out of 93 (online supplementary figure 3) as worthy of deeper analysis, whereas the other database variables were discarded. In this process, LLM showed that BMI influenced the primary goal only in the range between 30 and 32 and was irrelevant outside this interval; this information was therefore incorporated in the models, but the variable itself was excluded. Age of patients and size of the diabetes clinics (number of patients) were also discarded as irrelevant. Six modeling cycles were performed (learning set=70% and test set=30%) to analyze the various facets of this phenomenon. The model supporting the objective of our analysis, shown in table 3, emerged as the best and was characterized by an accuracy of 0.75.

Table 3 Best and main model

In brief, glucometabolic factors such as high HbA1c and fasting glucose and a slow drop of HbA1c have the greatest relevance values and emerge as the first and main obstacles the doctor has to address and overcome. However, as a second line of negative factors, a certain degree of insulin resistance, presence of complications and years of observation (a proxy of disease duration) interestingly appear to be important determinants. Of note, the Q-score, indicating the quality of assistance provided to every patient, plays a significant role in the model. Male gender also stands out as a favorable factor for goal achievement. Finally, as regards medications, treatment with almost all the available oral agents appears in the models as effective; the use of insulin, in addition or alone, is the only negative factor.

We then analyzed the contribution of each component of the combined endpoint, HbA1c and weight, to the achievement of the goal, and found that weight gain was less frequently the reason for not reaching the endpoint. The accuracy of the model, and therefore its ability to represent the phenomenon under study, is reduced when the endpoint is not reached because of weight. Interestingly, a model with only HbA1c was tested and resulted in a pattern with accuracy and relevance of variables very similar to that of the combined endpoint (table 4). By contrast, a model tailored only to study weight control (table 5) revealed that pre-existing obesity and use of the innovative medications glucagon-like peptide-1 receptor agonists (GLP1-RA) and sodium-glucose cotransporter-2 inhibitors (SGLT2-i), but not insulin alone, are predictors of weight control. Insulin alone played a favorable role in the HbA1c model, whereas it played the opposite role in the weight model: an apparent demonstration that its negative contribution to achieving the combined endpoint was due to weight gain.

Table 4 HbA1c-only model

Table 5 Weight reduction-only model

Discussion

Achieving the combined goal of “HbA1c at target and no weight gain” is a primary objective of the everyday activity of diabetologists and general practitioners (GPs). However, many series worldwide report that only about 40%–50% of the diabetic population attains the HbA1c goal. With this step-by-step analysis, we tried to explore the factors that play a role in this medical process. We realized that there are different areas to be addressed, corresponding to progressive levels of difficulty. The first area is the degree of decompensation of the patient: when faced with high fasting glucose and/or high HbA1c, which require a rapid drop in HbA1c, physicians encounter the greatest obstacles, as witnessed by the high relevance values of these factors. This is somewhat intuitive, as these are biological characteristics that make metabolic compensation difficult. On a lower level, typical insulin-resistance characteristics such as triglycerides, low HDL cholesterol, hypertension and hepatic steatosis emerge as negative predictors of good HbA1c and weight control. Very informative is the fact that the presence of almost all established microvascular complications (diabetic kidney disease, albuminuria, retinopathy) acts against an easy achievement of goals, as do other indicators of late intervention such as duration of diabetes and intervals of diabetes clinic referral. As a whole, these findings highlight that late intervention is deleterious for a quick achievement of good metabolic control.18

The average quality of care provided by the clinic, as expressed by the Q-score, reveals that it is easier to overcome the obstacles if the clinic performs well in terms of guideline adherence. In other words, professional performance pays off. The finding that male gender favors diabetes metabolic control is not new, although the precise underlying mechanism is still poorly understood.19

HbA1c goal achievement and weight control appear to be two scarcely correlated phenomena, and this is one of the main findings of our research. It seems easier to act on HbA1c reduction, where traditional medications play an important role. Intriguing is the role of insulin use, which favors HbA1c drop but shows an opposite effect on weight control. Innovative medications such as GLP-1 receptor agonists and SGLT2 inhibitors, despite their limited use (less than 5% after 2009), emerge as promoters of better weight control as compared with other treatments. This fact deserves attention and may have important clinical consequences.

These findings suggest that, to achieve the best results, an effort should be made to treat diabetes as early as possible with the best quality of care, probably before beta-cell decline and harmful hyperglycemic exposure that lead to microvascular complications. All medications play a crucial role in goal achievement, but the most outstanding difference is the favorable role that GLP1-RA and SGLT2 inhibitors show in overweight prevention. In terms of implications for clinical practice, these messages of timely intervention translate into a warning against clinical inertia.18 20

To the best of our knowledge, this is the first study assessing and ranking factors which oppose rapid HbA1c and weight normalization and that physicians have to overcome.

Our study has limitations. First is the lack of information on hypoglycemia episodes, which are important adverse effects capable of limiting goal achievement. Second, the study population was selected based on the availability of a minimum number of measurements during follow-up for each of the parameters of interest. In other words, the study population possibly represents a “compliant” patient group. Third, information on medication adherence was not available. Adherence could at least partially explain the failure to reach the goals in some cases.

Finally, on a positive note, given the high numbers of patients and measurements, the results regarding the relevance and accuracy of the model created by LLM would likely also be highly statistically significant with classical statistics.

Additional research and training on AI platforms are needed to explain the 25% of the phenomenon that remained unexplained by this first approach.