Discussion
The growing prevalence of obesity necessitates the exploration of risk prediction and prevention strategies. Administrative claims databases can be comprehensive and inexpensive sources of RWD for epidemiological studies, as they capture the prevalence and incidence of various chronic diseases across large and demographically diverse populations.25 However, previous studies have shown that the use of these data sources may result in incorrect estimates of obesity prevalence due to underutilization of the relevant diagnosis codes.8 11 26 27
The current study assessed the validity of obesity diagnosis codes in an administrative claims database and found high PPV and low sensitivity, in concordance with previous studies.11 12 Ammann et al reported low specificity and high PPV of the ICD-9-CM and ICD-10-CM BMI-related diagnosis codes for identification of patients with overweight or obesity.11 Ammann et al also demonstrated higher sensitivity of ICD-10-CM coding compared with ICD-9-CM coding,11 which was confirmed by Suissa et al,28 a finding potentially attributable to improved coding practices and reimbursement requirements. Moreover, the accuracy, or PPV, of obesity diagnosis codes was higher among patients with obesity-related complications such as diabetes or hypertension,27 and the probability of having an obesity-related diagnosis code in claims data increased with comorbidity burden and hospitalization.11
In the current study, older individuals, those with severe obesity, and those with a higher disease burden were more likely to have an obesity diagnosis code recorded in the claims data. Possible explanations are that these individuals may be more likely to seek medical care, or that healthcare providers may preferentially code obesity in those with greater severity and burden because they consider obesity to be a driving diagnosis for the high disease burden. Indeed, people with an obesity diagnosis code were more likely to have increased healthcare utilization, including hospitalizations, emergency room visits, outpatient visits, and greater medication use, compared with those without obesity diagnosis codes.28 However, despite the increased use of obesity diagnosis codes, the true obesity prevalence was still underestimated in the claims database compared with EMR data in the current study. Furthermore, sensitivity was low, as only approximately 31% of people with a BMI indicative of obesity in the EMR had a corresponding diagnosis in claims. These results further emphasize the magnitude of obesity code underutilization and its impact on assessing obesity in claims data.
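To make these metrics concrete, the brief sketch below, which uses hypothetical counts rather than data from the current study, illustrates how sensitivity and PPV are typically derived from a cross-tabulation of claims-based obesity codes against EMR-recorded BMI indicative of obesity.

```python
# Illustrative only: hypothetical counts, not data from the current study.
# Reference standard: EMR BMI indicative of obesity; test: claims obesity code.
tp = 310   # code present, EMR BMI indicates obesity
fp = 25    # code present, EMR BMI does not indicate obesity
fn = 690   # code absent,  EMR BMI indicates obesity
tn = 975   # code absent,  EMR BMI does not indicate obesity

sensitivity = tp / (tp + fn)   # proportion of EMR-confirmed obesity captured by claims codes
ppv = tp / (tp + fp)           # proportion of coded patients whose EMR BMI confirms obesity
specificity = tn / (tn + fp)

print(f"sensitivity={sensitivity:.2f}, PPV={ppv:.2f}, specificity={specificity:.2f}")
# With these hypothetical counts, sensitivity is ~0.31 (cf. ~31% in the current study)
# while PPV remains high (~0.93), reproducing the high-PPV/low-sensitivity pattern.
```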
Previous studies showed that people with severe obesity were more likely to have a BMI-related diagnosis code in administrative data relative to those of normal weight.9 11 27 People with obesity diagnosis codes in the claims database were more likely to have class 3 obesity than those without obesity diagnosis codes, suggesting that diagnosis codes may not be recorded for people with class 1 obesity.27 Underutilization of obesity diagnosis codes could also result from other factors, such as physicians not considering obesity to be a disease, or the obesity diagnosis not being based on an objective measurement of BMI, thus only capturing cases of severe obesity.25 Taken together with the findings of the current study, these factors emphasize the importance of carefully considering the diagnosis codes used for inclusion criteria in observational studies using claims data, and of understanding how diagnosis codes may affect outcomes such as disease prevalence or incidence. Given the established association of obesity with several chronic diseases,25 better capture of obesity through diagnosis codes, which provide a “snapshot” for identifying target populations, is critical to improving public health surveillance and research studies that use these databases.
Recently, Wu et al8 developed two models applying an SLA to predict obesity in people of all age groups: model 1 used recorded baseline BMI values, whereas model 2 used demographic and clinical characteristics data, excluding baseline BMI values. Model 1 showed better performance than model 2, with a higher AUC ROC (88% vs 73%), accuracy (ranging from 87.9% to 92.8% vs 73.6% to 80.0%), and specificity (ranging from 91.8% to 94.7% vs 71.6% to 85.9%). However, a notable limitation of Wu et al8 is that the study interpolated BMI from claims data in which BMI is under-reported. In the current study, we tested the predictive performance of five ML models to differentiate people with and without obesity. Of all the models tested, the XGBoost model demonstrated moderate-to-strong performance in predicting obesity risk. The higher AUC ROC and lower Brier score of the XGBoost model translated into better discrimination between people with and without obesity; however, a low Brier score (0.17 in the current study) does not necessarily imply good calibration.29 To prevent poor calibration, we followed some of the strategies outlined by Van Calster et al.30 For example, we ensured a sufficient sample size for the number of predictors, used Lasso regression (a penalized regression technique), and employed a simpler model without too many interaction terms.
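As a minimal sketch of how discrimination and calibration can be examined side by side, the code below computes the AUC ROC, the Brier score, and a calibration curve on held-out data; it assumes a scikit-learn workflow with synthetic data and a simple classifier, and none of the objects come from the current study.

```python
# Minimal sketch under an assumed scikit-learn workflow with synthetic data;
# none of the objects below come from the current study.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, brier_score_loss
from sklearn.calibration import calibration_curve

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_prob = model.predict_proba(X_test)[:, 1]    # predicted probability of the outcome

auc = roc_auc_score(y_test, y_prob)           # discrimination between classes
brier = brier_score_loss(y_test, y_prob)      # mean squared error of the probabilities

# A low Brier score alone does not guarantee good calibration, so also compare
# observed event frequencies with mean predicted probabilities in probability bins.
obs_freq, mean_pred = calibration_curve(y_test, y_prob, n_bins=10)

print(f"AUC ROC={auc:.2f}, Brier score={brier:.2f}")
for pred, obs in zip(mean_pred, obs_freq):
    print(f"mean predicted {pred:.2f} -> observed {obs:.2f}")
```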
Furthermore, candidate variables that predict obesity risk were identified in the US administrative claims database using BMI recorded in the EMR. The XGBoost model ranked the corresponding features by their relative contribution to the model in terms of “gain”, calculated by averaging each feature’s contribution across the trees in the model. Features were listed in descending order of predictive power, with variables at the top contributing more to the model than those at the bottom. The highest-ranked predictors of obesity were relatively consistent across the ML algorithms used. Nevertheless, predictor importance did differ slightly between sensitivity analyses, such as different lengths of baseline periods (3 or 6 months) and BMI targets of >35 kg/m2 and >40 kg/m2. Interestingly, ML models that excluded baseline BMI/weight-related diagnoses were able to predict obesity status based on other risk factors for obesity, such as sleep apnea and use of antidiabetic medications. These findings could potentially fill the gap of missing obesity status in administrative claims databases.
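As an illustration of gain-based ranking, the sketch below uses the xgboost scikit-learn API on synthetic data; the feature names, hyperparameters, and data are assumptions for illustration and are not the study’s variables.

```python
# Sketch using the xgboost scikit-learn API on synthetic data; feature names,
# hyperparameters, and data are illustrative and not taken from the current study.
import pandas as pd
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(10)])

model = XGBClassifier(n_estimators=200, max_depth=4)
model.fit(X, y)

# "Gain" averages the improvement in the training objective over all splits in
# which a feature is used; a higher gain means a larger contribution to the model.
gain = model.get_booster().get_score(importance_type="gain")
ranked = pd.Series(gain).sort_values(ascending=False)
print(ranked)   # features in descending order of their contribution to the model
```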
In addition to the number of obesity diagnoses and diagnoses from inpatient settings being the most important predictors of obesity, the predictive contribution of OSA, hypertension, and use of antidiabetic or antihypertensive medications suggests that these factors are strongly associated with obesity and can be used to identify people at high risk for the condition. Interestingly, this study found that the presence of dermatological conditions such as acne and melanocytic nevi was negatively associated with predicted obesity risk. No epidemiological evidence is available suggesting a relationship between obesity and sebum production, and thus the pathogenesis of acne. In one of the largest risk factor studies on the prevalence of melanocytic nevi among children and adolescents in the Baltic countries, the condition was found to be associated with higher BMI.31 Aside from this, little to no evidence exists assessing the impact of these conditions on obesity risk.
This study has several limitations. First, not all potential predictors, such as race, region, and physical activity of the participants, were available in the databases investigated, and key variables such as obesity diagnosis codes may be subject to inaccuracies and misclassification. Second, the database consists of administrative healthcare data primarily for individuals insured by large employer sponsors, a convenience sample of the population across the USA. Third, while BMI is not the best indicator of obesity, it is a helpful tool for obesity screening and health assessment in clinical practice: it is a standardized metric used worldwide and a simple way to identify individuals who may be at risk for weight-related health problems, prompting further evaluation. Other anthropometric measures of adiposity, such as waist circumference or fat-to-muscle ratio, may provide a more complete picture of a patient’s obesity status. However, these measures are usually not captured in EMRs, and it would be difficult, if not impossible, to exclude specific populations, such as athletes with high BMI but without obesity,32 from the analyses. Therefore, the results may not be generalizable to all populations. Lastly, further research is warranted to externally validate the models using different databases.
Identifying potential candidate variables that predict obesity using RWD may help decision-makers understand the impact of obesity at the population level and identify appropriate levers for policy measures to mitigate risk. Furthermore, ML methods can help improve obesity status prediction and thus assist practitioners and payors in estimating the burden of the condition, investigating the potential unmet need for current treatment, and determining the economic value of new treatments at both the individual and population levels. Moreover, as obesity is a major risk factor for several chronic conditions, predicting its occurrence using RWD should improve the accuracy of risk estimates for morbidity and mortality associated with its comorbidities. Future research could focus on validating the ML model in other populations and evaluating predictive scores for obesity-related complications, such as CVD and mortality, in administrative claims data. For example, Njei et al recently used an explainable machine learning model for high-risk MASH prediction and compared its performance with well-established biomarkers such as MASLD fibrosis scores.33 The XGBoost model in that study had high sensitivity, specificity, AUC, and accuracy for identifying high-risk MASH, and BMI was one of its top five predictors. Future studies are also needed to assess how the predicted risk of obesity may change over time with obesity intervention or treatment.