Serum acylcarnitines and amino acids and risk of type 2 diabetes in a multiethnic Asian population
  1. Samuel H Gunther (1),
  2. Chin Meng Khoo (2),
  3. E-Shyong Tai (3),
  4. Xueling Sim (1),
  5. Jean-Paul Kovalik (4),
  6. Jianhong Ching (4),
  7. Jeannette J Lee (1),
  8. Rob M van Dam (1,5,6)

  1. Saw Swee Hock School of Public Health, National University of Singapore, Singapore
  2. Department of Medicine, Yong Loo Lin School of Medicine, National University Health System, Singapore
  3. Saw Swee Hock School of Public Health, National University of Singapore and National University Health System, Singapore
  4. Programme in Cardiovascular and Metabolic Disorders, Duke-NUS Medical School, Singapore
  5. Yong Loo Lin School of Medicine, National University of Singapore, Singapore
  6. Department of Nutrition, Harvard T.H. Chan School of Public Health, Harvard University, Boston, Massachusetts, USA

Correspondence to Samuel H Gunther; shgunther{at}


Introduction We evaluated whether concentrations of serum acylcarnitines and amino acids are associated with risk of type 2 diabetes and can improve predictive diabetes models in an Asian population.

Research design and methods We used data from 3313 male and female participants from the Singapore Prospective Study Program cohort who were diabetes-free at baseline. The average age at baseline was 48.0 years (SD: 11.9 years), and participants were of Chinese, Malay, and Indian ethnicity. Diabetes cases were identified through self-reported physician diagnosis, fasting glucose and glycated hemoglobin concentrations, and linkage to national disease registries. We measured fasting serum concentrations of 45 acylcarnitines and 14 amino acids. The association between metabolites and incident diabetes was modeled using Cox proportional hazards regression with adjustment for age, sex, ethnicity, height, and parental history of diabetes, and correction for multiple testing. Metabolites were added to the Atherosclerosis Risk in Communities (ARIC) predictive diabetes risk model to assess whether they could increase the area under the receiver operating characteristic curve (AUC).

Results Participants were followed up for an average of 8.4 years (SD: 2.1 years), during which time 314 developed diabetes. Branched-chain amino acids (HR: 1.477 per SD; 95% CI 1.325 to 1.647) and the alanine to glycine ratio (HR: 1.572; 95% CI 1.426 to 1.733) were most strongly associated with diabetes risk. Additionally, the acylcarnitines C4 and C16-OH, and the amino acids alanine, combined glutamate/glutamine, ornithine, phenylalanine, proline, and tyrosine were significantly associated with higher diabetes risk, and the acylcarnitine C8-DC and amino acids glycine and serine with lower risk. Adding selected metabolites to the ARIC model resulted in a significant increase in AUC from 0.836 to 0.846.

Conclusions We identified acylcarnitines and amino acids associated with risk of type 2 diabetes in an Asian population. A subset of these modestly improved the prediction of diabetes when added to an established diabetes risk model.

  • type 2 diabetes
  • risk predictors
  • Asia

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See:

Significance of this study

What is already known about this subject?

  • Metabolomics studies can identify potential biomarkers of insulin resistance and type 2 diabetes, but data from Asian populations are limited.

What are the new findings?

  • A panel of serum acylcarnitines (C4, C8-DC, and C16-OH) and amino acids (alanine, glutamate, glutamine, glycine, isoleucine, leucine, ornithine, phenylalanine, proline, tyrosine, valine, and the alanine to glycine ratio) was associated with type 2 diabetes risk in an Asian population.

  • A subset of these metabolites modestly improved diabetes risk prediction when added to an established risk function.

  • The associations between alanine and tyrosine and diabetes risk may be partly mediated by glucose metabolism, and those of glutamate and glutamine and diabetes risk by central adiposity.

How might these results change the focus of research or clinical practice?

  • Selected acylcarnitines and amino acids may play a role in the development of type 2 diabetes and could be potential therapeutic targets.

  • Further research is required to evaluate whether these novel biomarkers provide a clinically significant improvement in diabetes risk prediction.


Introduction

Type 2 diabetes mellitus is a significant public health challenge, with prevalence rates increasing globally as a result of urbanization, reduced physical activity, changing dietary patterns, and increasing obesity.1 Over 60% of all diabetes cases occur in Asia. Individuals of Asian ethnicity experience a higher risk of diabetes, as ethnic predisposition has been observed to interact with other known risk factors such as age and body mass index (BMI).2 3 However, prospective studies of incident diabetes often do not include participants of Asian ethnicity or consider ethnic differences.

Metabolomics, the targeted or untargeted analysis and profiling of components of the metabolic pathway, is emerging as a useful method to better understand the pathogenesis and improve the prediction of type 2 diabetes.4 Metabolomics studies have identified several compounds, including acylcarnitines and amino acids, that are associated with insulin resistance and incident type 2 diabetes.4 5 However, data from Asian populations are sparse. Previous studies using targeted metabolomics analysis have identified amino acid and acylcarnitine signatures associated with future risk of type 2 diabetes in Asian populations.6 7 In addition, a panel of amino acids was associated with insulin resistance independent of BMI in a population of ethnic Chinese and Indians.8

The aim of this study was therefore to conduct metabolomics analyses to identify risk factors for type 2 diabetes in a multiethnic Asian population. We measured a panel of circulating acylcarnitines and amino acids using targeted metabolomics and evaluated their association with incident diabetes. We also evaluated whether adding these biomarkers to the established Atherosclerosis Risk in Communities (ARIC) predictive diabetes model9 would improve the prediction of diabetes.

Research design and methods

Study population and design

The study population consisted of participants from the Singapore Prospective Study Program (SP2). The methodology of SP2, a population-based study conducted in Singapore between 2004 and 2007, has been previously described in detail.10 Briefly, recruited individuals had participated in any of four previous, population-based, cross-sectional surveys: the Thyroid and Heart Study 1982–1984,11 the National Health Survey 1992,12 the National University of Singapore Heart Study 1993–1995,13 and the National Health Survey 1998.14 Participants were contacted at their homes for an interview and health examination. The interview consisted of standardized questionnaires on lifestyle factors and medical history, while the health examination included physical measurements and collection of fasting blood samples. The health examination did not include thyroid or liver function tests, or measures of blood urea nitrogen. Of the 11 053 original participants of the four studies, 10 445 were eligible for SP2. The 517 original participants who had died, 6 who had emigrated, and 85 who had recorded an error in identity card during the original study were considered ineligible. There were 7742 participants who completed the questionnaire, of whom 5157 subsequently participated in the health examination. The main reason for non-response was being uncontactable during the study period (n=2673), with a smaller portion having refused to participate (n=30). For this study, relevant baseline data, including sociodemographic information, measures of glycemic control, and serum metabolite measurements, were available for 4451 participants.

Since the outcome of interest was incident diabetes, further exclusion of participants was based on diabetes status at baseline. Of the 4451 participants, 431 reported having diabetes at baseline. Additionally, 159 participants were classified to have undiagnosed diabetes based on their baseline fasting plasma glucose (FPG ≥7.0 mmol/L) and/or baseline glycated hemoglobin (HbA1c ≥6.5%; 48 mmol/mol) measurements.15 Furthermore, 419 participants did not provide consent for their diabetes status to be tracked over time, leaving 3442 participants available for the study at baseline. A follow-up examination was conducted between 2011 and 2016, which also consisted of both an interview at home and a health examination. Of the baseline participants, 129 were lost to follow-up, leaving 3313 participants with revisit and diabetes outcome data.
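
The exclusion counts reported above can be checked with a short tally. The following is a minimal sketch in Python; the variable names are illustrative and are not taken from the study's analysis code.

```python
# Participant flow as reported in the text; all names are illustrative.
baseline_with_data = 4451

exclusions = {
    "self-reported diabetes at baseline": 431,
    "undiagnosed diabetes (FPG and/or HbA1c criteria)": 159,
    "no consent for tracking of diabetes status": 419,
}

# Participants available at baseline after the three exclusions.
available_at_baseline = baseline_with_data - sum(exclusions.values())

# Participants with revisit and diabetes outcome data.
lost_to_follow_up = 129
analytic_cohort = available_at_baseline - lost_to_follow_up

print(available_at_baseline, analytic_cohort)  # prints: 3442 3313
```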

Assessment of incident diabetes

Diabetes assessment over the course of the follow-up period was based on one or more of the following criteria: self-reported physician diagnosis of diabetes during follow-up interview, FPG (≥7.0 mmol/L) or HbA1c (≥6.5%; 48 mmol/mol) concentrations, or diabetes diagnosis reported in a nationwide health database covering subsidized general practitioners, polyclinics, and public hospitals, and updated on a regular basis. The dates of diagnosis through any of these three criteria were linked with initial SP2 interview date to calculate time-to-event.
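
As an illustration of the case definition above, the sketch below flags a participant as an incident case when at least one of the three criteria is met. The function and parameter names are hypothetical, and the conversion of days to years (365.25 days per year) is an assumption; the text does not state how time-to-event was scaled.

```python
from datetime import date
from typing import Optional

FPG_CUTOFF_MMOL_L = 7.0      # fasting plasma glucose threshold from the text
HBA1C_CUTOFF_PERCENT = 6.5   # glycated hemoglobin threshold from the text


def meets_diabetes_criteria(
    self_reported_diagnosis: bool,
    registry_diagnosis: bool,
    fpg_mmol_l: Optional[float] = None,
    hba1c_percent: Optional[float] = None,
) -> bool:
    """Return True if at least one of the three criteria is met."""
    lab_positive = (
        (fpg_mmol_l is not None and fpg_mmol_l >= FPG_CUTOFF_MMOL_L)
        or (hba1c_percent is not None and hba1c_percent >= HBA1C_CUTOFF_PERCENT)
    )
    return self_reported_diagnosis or registry_diagnosis or lab_positive


def years_between(baseline_interview: date, diagnosis: date) -> float:
    """Time-to-event in years; the 365.25-day year is an assumption."""
    return (diagnosis - baseline_interview).days / 365.25
```

For example, a participant with no reported or registry diagnosis but a follow-up FPG of 7.0 mmol/L would be counted as a case under this sketch, whereas one with FPG 6.9 mmol/L and HbA1c 6.4% would not.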

Assessment of serum metabolites

For extraction of acylcarnitines and amino acids, 50 µL of serum was spiked with deuterium-labeled acylcarnitine and amino acid standards, including D3-C2, D3-C3, D3-C4, D9-C5, D3-C8, and D3-C16 carnitines (Cambridge Isotope Laboratories, Andover, Massachusetts, USA), and alanine-D4, arginine-D5, citrulline-D2, glutamic acid-D3, glycine-D2, histidine-D6, leucine-D3, methionine-D3, ornithine-D2, phenylalanine-D5, proline-D7, serine-D3, tryptophan-D5, and valine-D8 (Sigma Aldrich, USA). The mixture was extracted using methanol. The extracts were then derivatized with 3 M hydrochloric acid in methanol or butanol (Sigma Aldrich), respectively, dried, and reconstituted in methanol for analysis by liquid chromatography-mass spectrometry (LC-MS).

Acylcarnitine measurements were performed without a column, using flow injection tandem mass spectrometry on the Agilent 6430 Triple Quadrupole LC/MS System (Agilent Technologies, California, USA). The sample analysis was carried out at 0.4 mL/min of 80/20 methanol/water as mobile phase, and injection of 4 µL of sample. Data acquisition and analysis were performed on Agilent MassHunter Workstation V.B.06.00 software. Amino acids were separated using a C18 column (Phenomenex, 100×2.1 mm, 1.6 µm, Luna Omega) on an Agilent 1290 Infinity LC System (Agilent Technologies) coupled with quadrupole-ion trap mass spectrometer (QTRAP 5500, AB Sciex, DC, USA). Mobile phase A (water) and mobile phase B (acetonitrile), both containing 0.1% formic acid, were used for chromatography separation. The LC run was performed at a flow rate of 0.4 mL/min with initial gradient of 2% B for 0.8 min, then increased to 15% B in 0.1 min, 20% B in 5.7 min, 50% B in 0.5 min, and 70% B in 0.5 min, followed by re-equilibration of the column to the initial run condition of 2% B for 0.9 min. All compounds were ionized in positive mode using electrospray ionization. The chromatograms were integrated using MultiQuant V.3.0.3 software (AB Sciex).

Absolute quantification of both acylcarnitines and amino acids was done by comparing the ratios of the metabolites with their respective internal standards, against an external calibration curve. External calibration curves consisted of C2, C3, C4, C5, C6, C8, C10, C12, C14, C16, and C18 carnitines and all reported amino acids.
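
The quantification step above can be sketched as follows: the analyte-to-internal-standard response ratio is read against an external calibration curve. This is only an illustration under stated assumptions. A straight-line calibration fitted by ordinary least squares is assumed (the text does not specify the curve form), and all names and numbers are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(analyte_area: float, standard_area: float,
             slope: float, intercept: float) -> float:
    """Convert a peak-area ratio to a concentration via the calibration line."""
    ratio = analyte_area / standard_area
    return (ratio - intercept) / slope


# Hypothetical calibrators: known concentrations and measured response ratios.
calibrator_concentrations = [1.0, 2.0, 4.0, 8.0]
calibrator_ratios = [0.5, 1.0, 2.0, 4.0]
slope, intercept = fit_line(calibrator_concentrations, calibrator_ratios)

# Hypothetical sample: analyte peak area 300, internal standard peak area 200.
concentration = quantify(300.0, 200.0, slope, intercept)
```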

Quality control (QC) was performed on the metabolite data prior to statistical analysis. To reduce potential batch effect, data were log2-transformed and normalized in two steps: first by adjusting the intensity levels of each metabolite according to the QC sample runs in each batch, and second by equalizing the average log2 concentration across the batches. Next, all metabolites with a coefficient of variation (CV)% of more than 20% for duplicate measurements were excluded from further analysis. The proportion of values either missing or below the limit of detection was determined for each remaining metabolite, and all metabolites with more than 5% missing data were also excluded from further analysis. Finally, missing values for the remaining metabolites were imputed using k-nearest neighbor imputation, whereby missing values for a given metabolite were replaced with an average of the non-missing values for that metabolite from participants similar in terms of the other variables. See online supplemental table 1 for detailed QC metrics for each metabolite.
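
The filtering and imputation steps above can be sketched as below. This is a simplified illustration, not the pipeline used in the study: the CV% is computed from a single set of replicate values, missing values are represented as `None`, the imputer assumes that all other columns are complete, and the choice of k and of Euclidean distance are assumptions.

```python
import math
from statistics import mean, stdev
from typing import List, Optional, Sequence


def cv_percent(replicates: Sequence[float]) -> float:
    """Coefficient of variation (%) of replicate measurements."""
    return 100.0 * stdev(replicates) / mean(replicates)


def missing_fraction(values: Sequence[Optional[float]]) -> float:
    """Fraction of values that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def passes_qc(replicates: Sequence[float], values: Sequence[Optional[float]],
              max_cv: float = 20.0, max_missing: float = 0.05) -> bool:
    """Keep a metabolite only if CV% <= 20 and at most 5% of values are missing."""
    return cv_percent(replicates) <= max_cv and missing_fraction(values) <= max_missing


def _distance(a: Sequence[float], b: Sequence[float], skip: int) -> float:
    """Euclidean distance over all columns except the one being imputed."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(len(a)) if i != skip))


def knn_impute(rows: List[List[Optional[float]]], target: int, k: int = 2):
    """Replace None in column `target` with the mean of that column among the
    k rows that are closest in the remaining columns."""
    donors = [r for r in rows if r[target] is not None]
    filled = []
    for row in rows:
        new_row = list(row)
        if row[target] is None:
            nearest = sorted(donors, key=lambda d: _distance(row, d, target))[:k]
            new_row[target] = mean(d[target] for d in nearest)
        filled.append(new_row)
    return filled


# Hypothetical data: the last participant lacks a value in column 1.
data = [[1.0, 10.0], [2.0, 20.0], [9.0, 90.0], [1.5, None]]
imputed = knn_impute(data, target=1, k=2)
```

In this toy example the two nearest participants have values of 10.0 and 20.0, so the missing value is replaced by their mean.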

Prior to QC, the metabolite panel consisted of 59 metabolites: 45 acylcarnitine and 14 amino acid measurements. As no column separation was used for the acylcarnitine analysis, certain pairs of isomeric acylcarnitines were reported as the sum of both species. Isoleucine and leucine, and glutamate and glutamine, were similarly measured in aggregate. Following QC, 37 acylcarnitines and all 14 amino acids remained available for analysis. Five acylcarnitine measurements, namely the C12-OH/C10-DC, C18:1-OH/C16:1-DC, and C18-OH/C16-DC aggregate measures, in addition to C4-OH and C22, were removed due to having a CV% greater than 20%. Three more, namely C5, C8:1-DC, and the C8:1-OH/C6:1-DC aggregate, were removed due to having too many values either missing or below the limit of detection.

The 37 remaining acylcarnitine measurements were combined into three functional groups based on chain length: short chain (species with carbon chain length 8 or shorter), medium chain (10–14) and long chain (16 or longer). Similarly, certain amino acids were combined into two functional groups: aromatic amino acids and branched-chain amino acids. Additionally, we examined the Fischer ratio (ratio of the sum of isoleucine, leucine, and valine to the sum of tyrosine and phenylalanine) and the ratio of alanine to glycine, two established metabolite ratios associated with diabetes.16 17 Online supplemental table 2 provides details of the full metabolite panel, functional groupings, and ratios, including those metabolites excluded from statistical analysis.
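
The groupings and ratios defined above can be expressed as a short sketch. The function names and example concentrations are hypothetical. Because isoleucine and leucine were measured in aggregate, they enter the Fischer ratio as a single value, and chain lengths not covered by the stated ranges raise an error rather than being assigned to a group.

```python
def chain_group(carbons: int) -> str:
    """Assign an acylcarnitine to a chain-length group as defined in the text."""
    if carbons <= 8:
        return "short"
    if 10 <= carbons <= 14:
        return "medium"
    if carbons >= 16:
        return "long"
    raise ValueError(f"chain length {carbons} is not covered by the stated ranges")


def fischer_ratio(isoleucine_leucine: float, valine: float,
                  tyrosine: float, phenylalanine: float) -> float:
    """Branched-chain amino acids divided by tyrosine plus phenylalanine."""
    return (isoleucine_leucine + valine) / (tyrosine + phenylalanine)


def alanine_glycine_ratio(alanine: float, glycine: float) -> float:
    """Ratio of alanine to glycine concentrations."""
    return alanine / glycine


# Hypothetical concentrations in arbitrary but consistent units.
example_fischer = fischer_ratio(180.0, 220.0, 60.0, 40.0)
example_ala_gly = alanine_glycine_ratio(350.0, 250.0)
```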

Assessment of covariates

Covariates in this study included age (years), sex (male/female), ethnicity (Chinese/Malay/Indian), height (cm), waist circumference (cm), parental history of type 2 diabetes (yes/no), systolic blood pressure (SBP, mm Hg), FPG (mmol/L), serum triglycerides (TG, mmol/L), high-density lipoprotein (HDL)-cholesterol (mmol/L), and fasting insulin (mU/L). Ethnicity was based on participants’ national identity cards and recorded as one of three categories representing the three main ethnic groups in Singapore, namely Chinese, Malay, and Indian; participants of mixed ethnicity were categorized as the primary ethnicity listed on their identity card. Height was measured without shoes and with the head in the Frankfurt plane position using a portable stadiometer (SECA, Model 782-2321009; Vogel & Halke, Hamburg, Germany). Waist circumference at the midpoint between the last rib and the iliac crest was measured using stretch-resistant tape. SBP was measured twice using the Dinamap Carescape Monitor (Woodley Equipment, Horwich, Bolton, UK) and the average of the two values was used for analysis. Blood samples were analyzed on the day of collection for FPG, TG, HDL-cholesterol, and fasting insulin at the biochemistry laboratories of the National University Hospital and Singapore General Hospital. Measurements were calibrated between the two sites and the calibrated values used for analysis. See online supplemental table 3 for detailed methodology and QC metrics for biosample analysis.

Data on additional variables of potential interest in relation to diabetes incidence were also collected during the SP2 interview and health examination. These included BMI (kg/m2), serum creatinine (µmol/L), regular use of lipid-lowering or blood pressure-lowering medication (yes/no), and daily intake of the key nutrients carbohydrates, protein, total fat, saturated fat, monounsaturated fat, and polyunsaturated fat (% of total energy).

Statistical analysis

Metabolite data were first log2-transformed to test various metabolite ratios in our data set using the p-gain statistic, which determines whether a ratio of metabolites contains more information than the constituent metabolites alone.18 The significance level for the p-gain test was set at 0.05. Concentrations were then exponentiated back to their original value to be standardized and converted into Z-scores. The ratios were also standardized and converted to Z-scores for consistency of interpretation. The association between metabolites and time to diabetes development was modeled using Cox regression. Two models were used for the main analysis: an unadjusted model, and a multivariable model adjusting for the non-modifiable risk factors age, sex, ethnicity, height, and parental history of diabetes. β-values from the models were exponentiated to produce hazard ratios (HRs). Each metabolite, aggregate pair, or ratio was evaluated in both models, and multiple testing correction was applied to all results using the Benjamini-Hochberg procedure, controlling the false discovery rate at an α-value of 0.05.
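
Three of the steps described above (standardization to Z-scores, conversion of a β-value to an HR, and the Benjamini-Hochberg step-up procedure) can be sketched in a few lines. This is an illustration only: the analyses in the study were performed in R, the function names are hypothetical, and the use of the sample SD for standardization is an assumption.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def z_scores(values: Sequence[float]) -> List[float]:
    """Standardize values to mean 0 and SD 1 (sample SD is an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def hazard_ratio(beta: float) -> float:
    """Exponentiate a Cox regression coefficient to obtain the HR per SD."""
    return math.exp(beta)


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether it is rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value falls at or below rank / m * alpha.
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that rank (step-up rule).
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[i] = True
    return rejected


# Hypothetical p-values: 0.03 misses its own threshold (0.025) but is still
# rejected because the third-ranked p-value (0.035) meets its threshold (0.0375).
decisions = benjamini_hochberg([0.03, 0.035, 0.001, 0.9])
```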

Several additional Cox regression models were developed for secondary analyses in order to assess the impact of modifiable risk factors on the observed associations and to explore potential biological mechanisms underlying the development of diabetes. For these analyses, the multivariable Cox model used in the main analysis was further adjusted for SBP, waist circumference, levels of blood lipids (TG and HDL-cholesterol), and levels of glycemic markers (FPG and fasting insulin).

Metabolites that remained significantly associated with incident diabetes in the main multivariable model were identified as potential biomarkers and incorporated into the ARIC risk model to determine whether they could improve predictive ability. The ARIC model was selected for this study based on a previous investigation by Chin et al,19 in which the ARIC model was shown to better predict diabetes risk in the Singapore population than the San Antonio Health Study model and the Framingham model. In addition to sociodemographic factors, the ARIC model includes the risk factors waist circumference, SBP, FPG, TG, and HDL-cholesterol. In accordance with the binary way in which ethnicity is identified in the ARIC model, participants belonging to the minority Malay and Indian ethnic groups were combined into a single ‘non-Chinese’ category during risk prediction.

Predictive ability was assessed using area under the receiver operating characteristic curve (AUC), with 95% CI for model AUC estimated using the DeLong method for correlated receiver operating characteristic curves. Metabolites were individually added to the ARIC model, and a final, parsimonious model was identified using a backward stepwise approach. An ARIC model was constructed containing all potential biomarkers, with subsequent models removing metabolites according to the lowest β-value, until a final model yielding the most improved predictive value with the best model fit was identified. Models were trained on a randomly selected half of the data set and tested on the other, and model calibration and goodness of fit were assessed using the Akaike information criterion (AIC). As a secondary analysis, models were also assessed in terms of their net reclassification improvement indices, which illustrate the ability of successive predictive models to accurately reclassify cases and non-cases. Finally, sensitivity analyses were conducted using the participants of Chinese ethnicity to identify potential ethnic differences in diabetes prediction. All analyses were performed using R V.3.5.1 (R Core Team, Vienna, Austria).
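
For readers less familiar with these metrics, the sketch below computes the AUC as the probability that a randomly chosen case has a higher predicted risk than a randomly chosen non-case, and a category-free net reclassification improvement. Both are minimal illustrations with hypothetical names and data. The text does not state which form of the net reclassification improvement was used, so the category-free form is an assumption, and the DeLong CI is not reproduced here.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of case/non-case pairs ranked correctly (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


def continuous_nri(old: Sequence[float], new: Sequence[float],
                   labels: Sequence[int]) -> float:
    """Category-free NRI: net upward movement in cases minus that in non-cases."""
    def net_up(indices):
        up = sum(new[i] > old[i] for i in indices)
        down = sum(new[i] < old[i] for i in indices)
        return (up - down) / len(indices)

    cases = [i for i, y in enumerate(labels) if y == 1]
    non_cases = [i for i, y in enumerate(labels) if y == 0]
    return net_up(cases) - net_up(non_cases)


# Hypothetical predicted risks for two cases (label 1) and two non-cases (label 0).
labels = [1, 0, 1, 0]
example_auc = auc([0.9, 0.8, 0.3, 0.2], labels)  # 3 of 4 pairs ranked correctly
example_nri = continuous_nri([0.5, 0.5, 0.5, 0.5], [0.9, 0.8, 0.6, 0.2], labels)
```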


Results

Of the 3313 study participants in the revisit, 314 (9.5%) developed incident diabetes during follow-up. The mean (SD) duration of follow-up was 8.4 (2.1) years. Table 1 summarizes the sociodemographic and clinical characteristics of the study population according to diabetes development. Compared with non-cases, participants who developed diabetes were significantly older and shorter, had a greater waist circumference and BMI, were more likely to have a parental history of diabetes, and were more likely to be taking lipid-lowering and blood pressure-lowering medication at baseline. They also had higher SBP, FPG, TG, fasting insulin, serum creatinine, and HbA1c concentrations, and lower HDL-cholesterol concentrations. A higher proportion of cases were of Malay or Indian ethnicity as compared with non-cases, who were more likely to be of Chinese ethnicity.

Table 1

Baseline characteristics of participants who did and did not develop diabetes mellitus (DM)

Table 2 shows the association between metabolite concentrations and incident diabetes after adjustment for age, sex, ethnicity, height, and parental history of diabetes (see online supplemental table 4 for unadjusted associations). In the multivariable model, three acylcarnitines (C4, C8-DC, and C16-OH), 10 amino acid measurements (alanine, glutamate/glutamine, glycine, isoleucine/leucine, ornithine, phenylalanine, proline, serine, tyrosine, and valine), and the alanine to glycine ratio, as well as the molar sums of the aromatic and branched-chain groups, were significantly associated with incident diabetes. Of these metabolites, C8-DC, glycine, and serine were inversely associated with diabetes risk, whereas the other metabolites were directly associated with diabetes risk.

Table 2

Associations between metabolites and incident diabetes, adjusted for age, sex, ethnicity, height, and parental history of diabetes

To investigate the collinearity of the statistically significant biomarkers, which could potentially cause model overfitting, we generated a heatmap of the pairwise partial correlations between the biomarkers, controlling for other diabetes risk factors (figure 1; see online supplemental figure 1 for unadjusted correlations). The highest degree of collinearity occurred among the branched-chain amino acids isoleucine/leucine and valine (r=0.866). The branched-chain species were also moderately correlated with phenylalanine (isoleucine/leucine r=0.687, valine r=0.644). The pairwise correlations among the remaining biomarkers were not strong (r for all <0.600).

Figure 1

Heatmap of the pairwise partial correlations between the 13 potential biomarkers, adjusted for age, sex, ethnicity, height, waist circumference, parental history of diabetes, systolic blood pressure, fasting plasma glucose, serum triglycerides, high-density lipoprotein cholesterol, and fasting insulin.

We also evaluated whether the observed associations between metabolites and diabetes incidence could be explained by established diabetes risk factors that may act as mediators (table 3). Building on the multivariable Cox model used for the main analysis, four models were constructed, each additionally adjusting for SBP, waist circumference, blood lipids (TG and HDL-cholesterol), or glycemic markers (FPG and fasting insulin). Attenuation of associations was defined as a change in the β-value of at least 10% toward the null. Addition of SBP to the model did not substantially attenuate any of the associations. Adjustment for waist circumference attenuated associations for glutamate/glutamine, the branched-chain amino acids (isoleucine/leucine and valine individually and combined), and the alanine to glycine ratio. Adjustment for blood lipids attenuated associations for alanine, glutamate/glutamine, the branched-chain species, and the alanine to glycine ratio. Finally, adjusting for glycemic markers attenuated associations for alanine, tyrosine, and the branched-chain species. Some of the acylcarnitines (C4 and C8-DC) and amino acids (proline and ornithine) with weaker associations in the basic models lost statistical significance after further adjustments even though their β-values changed by less than 10%.
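
The attenuation rule above (a change in the β-value of at least 10% toward the null) can be written as a small check. The sketch below is illustrative; the function names are hypothetical and the HRs in the example are invented for demonstration and are not results from this study.

```python
import math


def percent_change_toward_null(beta_base: float, beta_adjusted: float) -> float:
    """Percentage by which the coefficient moves toward zero after adjustment.

    Positive values indicate attenuation; negative values indicate that the
    association became stronger. Works for coefficients of either sign.
    """
    return 100.0 * (beta_base - beta_adjusted) / beta_base


def is_attenuated(beta_base: float, beta_adjusted: float,
                  threshold_percent: float = 10.0) -> bool:
    """Apply the 10% rule stated in the text."""
    return percent_change_toward_null(beta_base, beta_adjusted) >= threshold_percent


# Hypothetical example: an HR of 1.50 that falls to 1.40 after adjustment.
beta_main = math.log(1.50)
beta_after_adjustment = math.log(1.40)
change = percent_change_toward_null(beta_main, beta_after_adjustment)  # about 17%
```

Note that the rule is applied on the β (log HR) scale, so a fall in HR from 1.50 to 1.40 corresponds to a larger relative change than the HRs themselves might suggest.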

Table 3

Associations between serum biomarkers and type 2 diabetes after adjustment for potential mediators

We next evaluated whether adding acylcarnitine and amino acid metabolites to the ARIC model could improve the prediction of incident diabetes based on AUC and AIC metrics (table 4). The ARIC model alone had an AUC of 0.836 and an AIC of 1559.4. Addition of the metabolites individually did not result in significant improvement in AUC. However, addition of all 14 metabolite measurements to the ARIC model resulted in a modest but statistically significant improvement in AUC to 0.847 (p=0.013) and a lower AIC (1541.9). Stepwise removal of the biomarkers produced an ARIC model supplemented with seven metabolite measurements (C8-DC, C16-OH, isoleucine/leucine, ornithine, proline, serine, and the alanine to glycine ratio), which had the best model fit (AIC: 1530.3). This model also represented a significant improvement in AUC (0.846, p=0.022) as compared with the ARIC model alone, and recorded a net improvement of 39.8% in reclassification of diabetes cases and non-cases.

Table 4

Comparisons of AUC, AIC, and NRI for the ARIC model* and models supplemented with the potential biomarkers

Finally, we conducted sensitivity analyses in the 3095 participants who were not pre-diabetic (pre-diabetes defined by the American Diabetes Association criteria of FPG >5.6 mmol/L),15 which produced similar results (data not shown). In an additional sensitivity analysis, we only included ethnic Chinese, who made up 77.8% of the study participants. All of the more significant associations in the original analysis (p<0.01 after multiple testing correction) remained significant in the ethnic Chinese subgroup (online supplemental table 5).


Discussion

In this prospective cohort study, we identified several serum acylcarnitine and amino acid metabolites with the potential to serve as biomarkers of type 2 diabetes in Asian populations. Higher concentrations of the acylcarnitines C4 and C16-OH, alanine, glutamate/glutamine, ornithine, proline, the branched-chain amino acids isoleucine/leucine and valine, and the aromatic amino acids tyrosine and phenylalanine were associated with higher diabetes risk. A higher alanine to glycine ratio was also associated with higher diabetes risk. In contrast, higher concentrations of serine, glycine, and C8-DC acylcarnitine were associated with lower diabetes risk. Adjustment for known metabolic risk factors (blood lipids and glycemic markers) partially explained the associations with diabetes risk for alanine, glutamate/glutamine, tyrosine, the branched-chain species, and the alanine to glycine ratio. Adding a panel of metabolites (C8-DC, C16-OH, isoleucine/leucine, ornithine, proline, serine, and the alanine to glycine ratio) to the ARIC model with established diabetes risk factors led to a modest but statistically significant improvement in the prediction of diabetes.

Previous studies have implicated several of the amino acids associated with diabetes in our study as potential biomarkers of insulin resistance and type 2 diabetes. Higher levels of branched-chain amino acids have previously been linked to higher diabetes risk in populations of European, Hispanic, African, and Asian ancestry.5 20–22 Furthermore, a large-scale Mendelian randomization analysis identified genetic instruments reflective of higher levels of circulating branched-chain species that were also associated with higher diabetes risk, suggesting a causal role of branched-chain amino acid metabolism in diabetes development.23 These findings are consistent with knowledge of biological mechanisms involved in diabetes development. Branched-chain species play a central role in the PI3K-AKT-mTOR signaling pathway by regulating expression of genes and phosphorylation of kinases involved in glucose and lipid metabolism.24 Metabolic imbalance and overexpression of branched-chain species lead to phosphorylation of insulin receptor substrate (IRS)-1, which interferes with insulin signaling and over time leads to insulin resistance.5 An amino acid group related to the branched-chain species is the aromatic amino acids, consisting of phenylalanine, tyrosine, and tryptophan (not measured in this study). The aromatic and branched-chain species share a transmembrane protein,25 and higher levels of the five amino acids have been observed to be associated with higher diabetes risk in multiple previous studies as well as our own.20 22 26 27 It has been proposed that tyrosine can inhibit glucose transport and phosphorylation,28 a hypothesis supported by our finding that additionally adjusting for FPG and fasting insulin attenuates the association between tyrosine and diabetes risk.

Branched-chain species serve as nitrogen donors for alanine, glutamate, and glutamine,24 which may partially explain why higher levels of these amino acids were also significantly associated with diabetes risk in our study. That being said, higher serum levels of alanine, but not the branched-chain species, were consistently associated with higher diabetes risk in two Chinese cohorts,6 which suggests the link between alanine and diabetes risk is not necessarily due to branched-chain species metabolism. This is also consistent with biological mechanisms, as alanine stimulates glucagon secretion,29 and alterations in alanine metabolism as a manifestation of non-alcoholic fatty liver disease have been linked to higher diabetes risk.30 The association for alanine was attenuated by further adjusting for blood lipids and glycemic markers in our study, which is consistent with its biological role in gluconeogenesis. Furthermore, higher concentrations of aggregate glutamate/glutamine were associated with insulin resistance and diabetes development in ethnic Chinese and Indian SP2 participants,8 and in the Insulin Resistance Atherosclerosis Study,21 while higher concentrations of glutamate by itself were associated with insulin resistance phenotypes in the Framingham Heart Study and Malmö Diet and Cancer Study cohorts.31 Glutamate has also been shown to stimulate glucagon secretion and gluconeogenesis,32 and serves as a metabolic precursor to α-ketoglutarate, a keto acid with anticatabolic effects on protein metabolism,33 which again suggests a biological link between glutamate and diabetes risk independent of branched-chain amino acids. In our study, the association for glutamate/glutamine was attenuated by further adjusting for waist circumference and blood lipids, which potentially implicates central adiposity as a mediator in the link between these species and diabetes.

A growing body of evidence supports our finding of an inverse association between circulating glycine levels and diabetes risk.34 Lower serum concentrations of glycine were associated with higher insulin resistance and diabetes risk in the Insulin Resistance Atherosclerosis Study, the Framingham Heart Study, the Malmö Diet and Cancer Study, the European Prospective Investigation into Cancer and Nutrition - Potsdam, and the Relationship of Insulin Sensitivity to Cardiovascular Risk study cohorts,21 31 35 36 while in a Japanese prospective cohort study, baseline concentrations of glycine were lower in participants who developed diabetes compared with those who did not.37 Furthermore, a Mendelian randomization analysis reported that genetic instruments reflecting higher levels of circulating glycine were associated with a lower diabetes risk, suggesting a causal protective effect of glycine on diabetes risk.38 This is also consistent with biological mechanisms, as glycine plays key metabolic roles as a neurotransmitter, in the synthesis of heme and the antioxidant glutathione, and in the regulation of one-carbon metabolism.34 Dysregulation of these pathways from insufficient glycine is proposed to contribute to insulin resistance by increasing oxidative stress in pancreatic cells, compromising mitochondrial function, and disrupting glucose homeostasis.34 We also observed a significant association between a higher alanine to glycine ratio and diabetes risk. The alanine to glycine ratio was strongly associated with insulin sensitivity measured using a hyperglycemic clamp and incident diabetes in the Cooperative Health Research in the Region of Augsburg S4_to_F4 cohort.17 Analysis of metabolite ratios is an emerging field that can provide additional information in association studies by reducing overall biological variability in a given study population and better representing biochemical pathways,18 and our results provide further evidence of their value.

Additionally, we observed significant associations between ornithine and proline concentrations and diabetes risk, a finding also reported in a Japanese study.37 The biological mechanisms underlying this putative relationship are not well understood. Both ornithine and proline are produced by arginase activity during the urea cycle, and upregulated arginase activity, resulting in higher ornithine and proline levels, can decrease nitric oxide bioavailability and lead to metabolic complications including diabetes.39 However, this pathway is mediated by arginine and also results in citrulline biosynthesis, and neither of those species was significantly associated with diabetes risk in our study. Conversely, arginine was associated with diabetes risk in a Japanese cohort,37 and ornithine levels were inversely associated with diabetes risk in a Chinese study.40 Likewise, the role of serine in diabetes development is understudied, although the Japanese study did report lower concentrations of serine in participants who developed diabetes compared with those who did not.37 Serine can be synthesized from glycine, and it is possible that depressed levels in those with higher diabetes risk are reflective of depressed glycine levels and the consequent metabolic imbalances.41 Enzymes involved in serine biosynthesis have been linked to insulin signaling and sensitivity in animal studies, while a lack of serine in cancer cells results in altered mitochondrial metabolism akin to metabolic disturbances resulting in insulin resistance.41 Further research into the roles of ornithine, proline, and serine in diabetes development is required to clarify these inconsistencies and whether these species contribute to or merely indicate higher diabetes risk.

In addition to amino acids, we observed three acylcarnitines, C4, C8-DC, and C16-OH, to be associated with diabetes risk. Acylcarnitines are primarily produced from mitochondrial fatty acid β-oxidation, and their accumulation may indicate incomplete fatty acid oxidation and downstream metabolic disturbances, including depletion of tricarboxylic acid cycle intermediates and activation of pathways that interfere with insulin action.42–44 Short-chain species, such as C4, are intermediate products of β-oxidation, and their accumulation in participants with type 2 diabetes may indicate generalized dysfunction at the interface of fatty acid oxidation and the electron transport chain.4 Dicarboxylic species, including C8-DC, are produced when long-chain fatty acids undergo ω-oxidation, a compensatory pathway activated when β-oxidation is disturbed. Reduced concentrations of these species in those with a high risk of diabetes could indicate a disturbance of β-oxidation if the ω-oxidation rescue pathway was also impaired. This would lead to accumulation of fatty acid fuels in the mitochondria and contribute to insulin resistance via the mismatching of fuel and ATP demand.45 While we did not observe significant associations between medium-chain species and diabetes risk following multiple testing correction, it has been suggested that the accumulation of medium-chain species results in activation of the proinflammatory NFκB pathway, which in turn promotes insulin resistance.44 The accumulation of long-chain species, such as C16-OH, is similarly thought to be reflective of impaired tricarboxylic acid cycle activity, as they are the initial products of β-oxidation.46

The link between acylcarnitines and diabetes is controversial, and there is a lack of consensus over whether elevated or depressed levels of specific short-chain, medium-chain, and long-chain species are associated with diabetes risk.4 7 42–46 In a Chinese cohort, fasting serum concentrations of C4 were higher in diabetes cases than in non-cases, but the investigators did not find an association with C8-DC or C16-OH.7 The authors also described a panel of long-chain acylcarnitines that were significantly associated with diabetes risk and increased the AUC of a predictive diabetes risk model, although C16-OH was not part of the panel. In a US study, fasting concentrations of both C4 and C16-OH were higher in participants with type 2 diabetes compared with lean participants without diabetes.4 A German study, however, reported higher concentrations of C16-OH but not C4 in participants with diabetes compared with those with normal glucose tolerance.42 A Mexican study reported elevated concentrations of C4 in obese participants without diabetes compared with their counterparts with diabetes,43 while a US study reported no difference in C4 levels between participants of these categories.44 In our study, the association between C4 and diabetes risk was not attenuated after additional adjustment for waist circumference, which suggests the mechanism may not be mediated by body fatness. To our knowledge, this is the first study to report an inverse association between serum C8-DC levels and diabetes risk, although an animal study reported higher concentrations of C8-DC in insulin-resistant mice.47 Further research is required to clarify the role of acylcarnitines in the development of diabetes in humans.

Strengths of our study included the prospective design and the Asian study population, a population that is more susceptible to diabetes than populations of European ancestry.3 Our study also had several potential limitations. First, we had substantial non-response during follow-up. This is a common issue in large cohort studies, and we addressed it by using a nationwide clinical registry to ascertain incident diabetes in addition to self-reported diagnosis and fasting glucose and HbA1c measurements during follow-up. However, there remains some potential for cases to have gone undetected, for instance if participants were diagnosed at a private clinic. Second, metabolite profiles were measured at only a single time point, resulting in potentially inaccurate measurements of long-term biomarker levels and potential attenuation of observed associations. Third, while the targeted metabolomic approach facilitated identification of potential biomarkers, the panel of metabolites was not exhaustive and concentrations of other clinically important species such as lysine and tryptophan were not recorded. Measuring certain metabolites, including glutamate and glutamine, in aggregate may also have weakened our findings, as the two species play separate biological roles and have displayed markedly different associations with diabetes risk when measured separately.31 32 Additionally, while we based our multivariable analyses on an established diabetes risk model, there remains a potential for residual confounding due to risk factors not included in the ARIC model. Finally, our findings apply to a multiethnic Asian population and may not necessarily generalize to other populations or ethnic groups.

Our results provide further evidence of the role of specific acylcarnitines, amino acids, and amino acid ratios in the development of type 2 diabetes in Asian populations. A predictive model containing a panel of acylcarnitines and amino acids improved classification of both diabetes cases and non-cases as compared with a model containing solely the established risk factors included in the ARIC model. The increasing availability and affordability of profiling technologies mean they could feasibly be applied in the clinical setting. However, it remains unclear whether measurement of novel metabolites leads to sufficient improvement in the identification of high-risk groups to warrant use in clinical practice. Further research is warranted to establish whether specific acylcarnitines and amino acids play a causal role in the etiology of diabetes and could be targets for preventive interventions.
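To make the idea of comparing discrimination between a base risk model and a metabolite-extended model concrete, the sketch below computes the AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. All predicted risks and outcome labels are hypothetical and purely illustrative; they are not results from this study, and the sketch does not reproduce the ARIC model or the analysis performed here.

```python
def auc(scores, labels):
    """AUC via pairwise comparison of cases and non-cases (ties count as 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    noncases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(noncases))


# Hypothetical outcomes (1 = incident diabetes) and predicted risks; not study data.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
base_scores = [0.30, 0.22, 0.10, 0.25, 0.12, 0.08, 0.05, 0.15]
extended_scores = [0.35, 0.28, 0.18, 0.20, 0.10, 0.07, 0.05, 0.16]

auc_base = auc(base_scores, labels)          # established risk factors only
auc_extended = auc(extended_scores, labels)  # risk factors plus metabolite panel
```

A higher AUC for the extended model indicates better ranking of cases above non-cases, but, as noted above, it does not by itself show that the improvement is large enough to justify clinical use.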


The authors thank the members of the SP2 cohort for their cooperation and participation. They also thank the SP2 data collection and management team, Hai Ning Wee and Kee Voon Chua of Duke-NUS, and Milly Ng and Yueheng Hong of SSHSPH.



  • Contributors SHG contributed to study design and conduct, data analysis, and wrote the manuscript. XS contributed to study design and data analysis and reviewed the manuscript. E-ST, CMK, J-PK, JC, JJL, and RMvD contributed to study design and conduct, data collection, and reviewed the manuscript. All authors approved the final version of the manuscript.

  • Funding This work was supported by grants from the Biomedical Research Council (grant 03/1/27/18/216), National Medical Research Council (grants 0838/2004 and 1111/2007), and National Research Foundation (through the Biomedical Research Council, grants 05/1/21/19/425 and 11/1/21/19/678) of the Republic of Singapore.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Ethics approval All participants provided written informed consent before taking part in SP2 and the follow-up examinations. SP2 and the follow-up examinations were approved by the National University of Singapore IRB (reference no. 12-282).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data availability statement Data collected and analyzed in this study are available upon request from the National University of Singapore. Inquiries can be directed to