Discussion
In our study, we employed a novel unsupervised clustering technique and used data from a multiethnic, nationally representative cohort to identify distinct subgroups of US adults with varying levels of cardiometabolic risk. We found heterogeneity in cardiometabolic risk profiles, with different clustering patterns of SBDH and metabolic risk factors. Specifically, we identified three distinct subpopulations with different degrees of risk for undiagnosed diabetes and pre-diabetes. The subpopulations labeled as ‘Middle-aged adults with multiple metabolic risk factors’ and ‘Elderly adults with chronic conditions and low physical activity levels’ exhibited significantly elevated risk compared with the ‘Healthy socioeconomically vulnerable young adults’ group. Furthermore, the association between subpopulation membership and pre-diabetes varied significantly by racial/ethnic group. These findings underscore the importance of comprehensive assessments of SBDH and objective measures of metabolic risk factors in evaluating diabetes risk. This approach could inform the development of more precise and effective screening and prevention strategies that are tailored to the unique needs of at-risk subpopulations.
As the availability of large-scale health data continues to expand, advanced data analytic techniques become increasingly crucial for uncovering complex patterns and relationships among multiple risk factors.32 In this regard, our approach, which incorporates social, behavioral and metabolic risk factors, offers a promising new tool for improving the accuracy of cardiometabolic risk classification, particularly in populations characterized by diverse social determinants of health. However, further research is needed to elucidate the underlying mechanisms, including potential genetic, environmental, and behavioral factors, that contribute to these clustering patterns. By leveraging large-scale health data and advanced analytical techniques, we can gain deeper insights into the complex interplay between these factors and ultimately develop more effective strategies for preventing cardiometabolic disease.
Our study reveals the variability in cardiometabolic risk profiles among US adults without a prior diagnosis of diabetes, highlighting the critical role of SBDH in the development of pre-diabetes and undiagnosed diabetes. While the importance of SBDH has been recognized in previous research,33 34 our study confirms their essential role in defining the subpopulations at highest risk of developing pre-diabetes and diabetes. Incorporating SBDH could improve the phenotyping of pre-diabetes35 and the stratification of cardiometabolic risk at the population level, with significant implications for the prevention and management of diabetes and other chronic diseases.36 Future research should focus on elucidating the underlying biological, social, and behavioral mechanisms contributing to the observed clustering patterns and differential risks of pre-diabetes and undiagnosed diabetes in these subpopulations.
Our study identified two subpopulations at high risk of undiagnosed diabetes: ‘Cluster 1: Middle-aged adults with multiple metabolic risk factors’ and ‘Cluster 3: Elderly adults with chronic conditions and low physical activity levels’. These findings reveal a gradient of increasing risk: younger participants in cluster 2 were at lower risk than middle-aged individuals in cluster 1, who in turn were at lower risk than older adults in cluster 3. This trend is consistent with the well-established association between advanced age and increased risk of diabetes.1 37 38 The differences in age across the three clusters may contribute to variations in sociodemographic and metabolic characteristics. Therefore, elucidating the interplay between age and other factors is crucial for public health organizations to develop precise prevention and control programs for pre-diabetes and diabetes.37 Our analysis extends beyond age as the sole risk factor, highlighting the additional significance of factors such as physical activity levels and comorbidity burden in identifying subpopulations at an even higher risk of undiagnosed diabetes. Moreover, access to healthcare and health insurance coverage emerged as important factors in detecting undiagnosed diabetes. These findings underscore the need for improved diabetes awareness, education, and preventive healthcare, particularly for less physically active older men who may be at risk of suboptimal diabetes testing despite being eligible for screening according to current guidelines.
In our study, we observed that older adults in cluster 3 had the longest sleep duration among the three clusters, which may contribute to the heightened risk of undiagnosed diabetes and pre-diabetes in this group. Notably, both long and short self-reported sleep durations have been associated with an increased risk of type 2 diabetes.39 40 While existing research has primarily focused on the detrimental effects of short sleep duration on cardiometabolic health,41 42 emerging evidence indicates that long sleep duration is also related to an elevated risk of diabetes.39 The increased diabetes risk among older adults in cluster 3 could be attributed to factors such as poor sleep quality: with age, the proportion of stage N1 and N2 sleep increases, stage N3 (deep, slow-wave) sleep decreases, and time awake after sleep onset tends to rise.42 Moreover, obstructive sleep apnea, which is more prevalent in individuals with long sleep duration, is known to be associated with an increased risk of incident diabetes.43 44 Additionally, altered levels of leptin and ghrelin, and their impact on appetite and glycemic control, may contribute to the heightened type 2 diabetes risk in cluster 3, particularly as these older adults were the least physically active.43 45 While randomized controlled trials are essential to elucidate the mechanisms linking long sleep to diabetes risk, our findings further support the significance of encouraging appropriate sleep duration in delaying or preventing diabetes.
The identification of a specific subgroup characterized by clustered behavioral determinants of health, labeled as ‘Middle-aged adults with multiple metabolic risk factors’, provides valuable insights into the relationship between health-related behaviors and diabetes risk. Emerging evidence suggests that multiple unhealthy behaviors have a significant impact on the incidence of diabetes.39 In our study, after controlling for baseline metabolic risk, we found that ‘Middle-aged adults with multiple metabolic risk factors’ had a nearly 1.5-fold increase in pre-diabetes risk compared with ‘Healthy socioeconomically vulnerable young adults’. Similar observations have been made in other studies, including a population-based cohort study of Chinese adults, in which a distinct cluster characterized by smoking, heavy drinking, physical inactivity, and insufficient sleep was associated with a higher likelihood of diabetes.46 These findings suggest that identifying adults with multiple unhealthy behaviors holds potential for predicting an increased risk of cardiometabolic disease. Furthermore, lifestyle modification or medications have been shown to prevent the progression from pre-diabetes to diabetes.47 Therefore, cluster analyses capturing health-related behavior patterns in the adult population could provide a more effective way of identifying subgroups at risk of developing diabetes than conventional prediction methods.
Interestingly, our study found that the ‘Healthy socioeconomically vulnerable young adults’ subpopulation had a lower risk of both pre-diabetes and undiagnosed diabetes, which contrasts with previous research suggesting that individuals with lower socioeconomic status are more susceptible to cardiometabolic disease.8 48 This discrepancy may be attributed to the younger age of this subpopulation in our study, as the prevalence of pre-diabetes and diabetes typically increases with age.49 Moreover, inconsistencies in previous studies examining the inverse association between socioeconomic status and cardiometabolic disease50 51 may stem from variations in how socioeconomic status is measured.52 Additional factors, such as behavioral and psychosocial differences among specific racial and ethnic groups living in disadvantaged neighborhoods, may account for these disparities.51 In our study, the ‘Healthy socioeconomically vulnerable young adults’ subpopulation exhibited the lowest BMI and waist circumference, as well as the highest diet quality and physical activity levels, which likely contributed to their lower risk of pre-diabetes and diabetes. The high proportion of Hispanics in this subpopulation could also be a contributing factor, as Hispanics in the USA have been found to have comparable or better health outcomes than their non-Hispanic white counterparts despite facing higher rates of poverty and lower health insurance coverage.53 Migration history, including the duration of time spent in the USA, may play a role in the cardiometabolic risk profile of Hispanics.54 Our findings suggest that incorporating SBDH and migration factors into traditional metabolic risk assessments could help provide a more nuanced understanding of how these factors intersect with race/ethnicity in the development of cardiometabolic disease.
This study has promising implications for improving cardiometabolic health by addressing SBDH factors. Our findings suggest the potential use of unsupervised cluster analysis to enhance risk stratification at the population level by incorporating SBDH into the classification of cardiometabolic risk. Clustering analyses have been demonstrated to improve the accuracy of risk prediction compared with traditional risk models.55 Although we have not yet tested whether this approach improves the classification of cardiometabolic risk, our results indicate clinically relevant prognostic SBDH differences between subpopulations, providing a framework for developing machine learning algorithms to automatically identify individuals at increased risk of developing diabetes and cardiovascular disease. This approach may be particularly useful for developing tailored, cost-effective preventive strategies based on individuals’ SBDH profiles. Our study highlights the importance of addressing health-related social needs in addition to clinical factors when promoting health equity.56 It also provides guidance for the recruitment of participants into clinical trials involving diabetes screening, testing, or lifestyle interventions to reduce cardiometabolic risk, as certain subpopulations of adults are at greater risk of pre-diabetes and undiagnosed diabetes and may derive the greatest benefit from such interventions. These findings have implications for national organizations such as the American Diabetes Association8 and the American Heart Association,57 as they work toward better understanding and addressing social determinants of health in promoting cardiometabolic health.
Limitations
Our study has several limitations that need to be considered. First, using the K-prototype clustering algorithm to categorize the population into discrete clusters may have overlooked the continuous spectrum of health and disease progression. Additionally, the SBDH characteristics of the population may change over time, highlighting the need for further research on longitudinal changes in cluster membership. Despite this, our analysis supports the hypothesis that subpopulations with distinct combinations of SBDH and metabolic risk factors can be revealed using cluster analysis. Second, our SBDH characterizations were based on self-report, which may introduce subjectivity into the analysis, and other variables may be of greater significance in developing meaningful cardiometabolic risk clusters. Moreover, we did not include genetic information in our clustering, which limits our ability to assess the impact of Mendelian pathogenic variants predisposing individuals to cardiometabolic conditions.58 Nonetheless, our study benefits from a large, multiethnic, nationally representative sample of US adults, and our cardiometabolic risk clustering patterns overlap with those identified by another population study conducted in China.46 Third, while our decisions on the number of clusters and variable selection were informed by established methods, there are no well-validated techniques for finding the optimal number of clusters for K-prototype analysis. Replicating our findings in other population-based datasets is necessary, and our study results should be considered hypothesis-generating only. Last, non-Hispanic Asians and participants of mixed or other race/ethnicity were grouped together as ‘Others’ due to the limited sample size available in the dataset. Consequently, we were unable to investigate potential effect modification by these subgroups.
Further stratified analyses focusing specifically on non-Hispanic Asians could offer valuable insights into the observed absence of differences in pre-diabetes risk between cluster 3 and clusters among the ‘Others’ group. Existing evidence suggests that individuals of South Asian descent, in particular, may have a higher-risk cardiometabolic profile at lower BMI levels than non-Hispanic whites.59 60 Nevertheless, our study provides valuable insights into the potential for unsupervised cluster analysis to improve risk stratification and identify subpopulations at increased risk of developing diabetes.