Table 2

Performance of predictive models fitted to the training set (60%) and evaluated on the validation set (20%)

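| Model type | No. of features | AUC (%) | Average precision (%) | Brier score | PPV (%) | Sensitivity (%) | Precision (%) | Youden Index (%) | F1 score | NPV (%) | Specificity (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| XGBoost model | 479 | 79.4 | 75.6 | 0.174 | 73.5 | 61.7 | 68.8 | 43.0 | 0.651 | 76.1 | 81.4 |
| Reg logistic model | 458 | 76.5 | 71.5 | 0.187 | 71.5 | 60.2 | 65.7 | 39.2 | 0.628 | 74.8 | 79.0 |
| Logistic model | 481 | 76.4 | 71.5 | 0.187 | 71.4 | 60.2 | 65.5 | 39.0 | 0.627 | 74.8 | 78.8 |
| Random forest model | 478 | 76.1 | 72.3 | 0.198 | 71.9 | 57.5 | 67.4 | 39.0 | 0.621 | 74.2 | 81.4 |
| Lasso model | 58 | 75.3 | 70.9 | 0.194 | 72.0 | 54.0 | 69.3 | 38.0 | 0.607 | 73.3 | 84.1 |
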
  • Reg logistic model, logistic regression model. A higher AUC indicates better distinction between patients with and without obesity; a lower Brier score indicates greater accuracy of the predicted probabilities; higher recall (sensitivity) indicates better maximization of the number of true positives; higher precision indicates better minimization of false positives; a Youden Index of >50% and a higher F1 score indicate better model performance; higher specificity indicates better identification of negative results.

  • AUC, area under the curve; Lasso, Least Absolute Shrinkage and Selection Operator; NPV, negative predictive value; PPV, positive predictive value; XGBoost, extreme gradient boosting.
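
As a rough check on how two of the summary columns relate to the others, the sketch below recomputes the Youden Index and F1 score from the sensitivity, specificity, and precision reported in the table. It assumes the standard definitions (Youden Index = sensitivity + specificity − 100%, F1 = harmonic mean of precision and recall); the table does not state how the authors computed these values, and the function names are illustrative only.

```python
def youden_index(sensitivity_pct: float, specificity_pct: float) -> float:
    """Youden Index in percent: sensitivity + specificity - 100."""
    return sensitivity_pct + specificity_pct - 100.0


def f1_score(precision_pct: float, recall_pct: float) -> float:
    """F1 as a fraction (0 to 1): harmonic mean of precision and recall."""
    p = precision_pct / 100.0
    r = recall_pct / 100.0
    return 2 * p * r / (p + r)


# Values copied from the XGBoost row of the table.
xgb_sensitivity, xgb_specificity, xgb_precision = 61.7, 81.4, 68.8

j = youden_index(xgb_sensitivity, xgb_specificity)
f1 = f1_score(xgb_precision, xgb_sensitivity)
print(f"Youden Index: {j:.1f}%  F1 score: {f1:.3f}")
```

Under these assumptions the recomputed F1 score for XGBoost matches the tabulated 0.651, while the recomputed Youden Index is 43.1% against the tabulated 43.0%; a gap of that size would be expected if the published value was derived from unrounded sensitivity and specificity.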