Abstract
Introduction This study aimed to assess data relevancy and data quality of the Innovation in Medical Evidence Development and Surveillance System Distributed Database (IMEDS-DD) for diabetes research and to evaluate comparability of its type 2 diabetes cohort to the general type 2 diabetes population.
Research design and methods A retrospective study was conducted using the IMEDS-DD. Eligible members were adults with a medical encounter between April 1, 2018 and March 31, 2019 (index period). Type 2 diabetes and co-existing conditions were determined using all data available from April 1, 2016 to the most recent encounter within the index period. Type 2 diabetes patient characteristics, comorbidities and hemoglobin A1c (HbA1c) values were summarized and compared with those reported in national benchmarks and literature.
Results Type 2 diabetes prevalence was 12.6% in the IMEDS-DD. Of 414 672 patients with type 2 diabetes, 52.8% were male, and the mean age was 65.0 (SD 13.3) years. Common comorbidities included hypertension (84.5%), hyperlipidemia (82.8%), obesity (45.3%), and cardiovascular disease (44.7%). Moderate-to-severe chronic kidney disease was observed in 20.2% of patients. The most commonly used antihyperglycemic agents included metformin (35.7%), sulfonylureas (14.8%), and insulin (9.9%). Fewer than one-half (48.9%) had an HbA1c value recorded. These findings demonstrated notable similarity in patient characteristics between type 2 diabetes populations identified within the IMEDS-DD and other large databases.
Conclusions Despite the limitations related to HbA1c data, our findings indicate that the IMEDS-DD contains robust information on key data elements to conduct pharmacoepidemiological studies in diabetes, including member demographic and clinical characteristics and health services utilization.
- database
- diabetes mellitus, type 2
- pharmacoepidemiology
- safety
Data availability statement
Data may be obtained from a third party and are not publicly available.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
WHAT IS ALREADY KNOWN ON THIS TOPIC
Distributed database networks have become a new type of data source for observational diabetes research.
The Sentinel Distributed Database (SDD) of the Sentinel System, a national electronic active surveillance system for medical product safety established under the US Food and Drug Administration’s Sentinel Initiative, is one successful example.
WHAT DOES THIS STUDY ADD
As a subset of the SDD, the Innovation in Medical Evidence Development and Surveillance System Distributed Database (IMEDS-DD) reliably captures key data elements needed to conduct pharmacoepidemiological studies in patients with type 2 diabetes, including demographics, comorbidities, and health services utilization. Substantial overlap was found in patient characteristics between the type 2 diabetes population in the IMEDS-DD and those included in published benchmarks or within other large databases.
HOW MIGHT THIS STUDY AFFECT RESEARCH, PRACTICE OR POLICY
The IMEDS-DD, accessible through the IMEDS program, is a new data source for epidemiological studies related to type 2 diabetes and its management.
Introduction
The Innovation in Medical Evidence Development and Surveillance Distributed Database (IMEDS-DD) is a subset of the Food and Drug Administration (FDA) Sentinel Distributed Database. The IMEDS is a public-private partnership launched in 2017 by the Reagan-Udall Foundation for the FDA, an independent, non-profit organization created by the United States Congress, to advance the FDA’s mission by promoting regulatory science. The IMEDS provides a framework for private-sector entities (eg, regulated industry, academic institutes) to leverage the FDA Sentinel System, a national electronic system established under the Sentinel Initiative1 2 for active safety surveillance of medical products including drugs, biologics, vaccines, and medical devices. The policies and procedures for using the IMEDS-DD for observational research are available on the foundation’s website (https://reaganudall.org/programs/research/about-imeds).
The IMEDS-DD comprises selected network partners and uses routinely collected administrative claims and laboratory result data from the Sentinel Distributed Database, including data elements commonly available in database studies such as demographics, health plan enrollment, diagnoses, procedures, outpatient pharmacy dispensing records, and laboratory results. As of early 2021, the IMEDS-DD had access to data for over 111 million person-lives across 9 health plan partners and is expected to be largely representative of the commercially insured population in the USA, including employer-sponsored health plan and Medicare Advantage members. The IMEDS-DD shares the same data management, common data model, privacy protection methods, and quality assurance procedures with the Sentinel Distributed Database,3–7 and the same secure distributed querying approach with the Sentinel System. As such, the IMEDS-DD inherited the data curation process and infrastructure standards, as well as the privacy-preserving techniques and analytic tools, of the FDA Sentinel System.
Although the Sentinel Distributed Database and data from many of its network partners have been used widely in epidemiological studies of type 2 diabetes, the IMEDS-DD has not previously been used for the same purpose. This study aimed to serve as a feasibility assessment of using the IMEDS-DD for observational diabetes research. Specifically, the study evaluated data relevancy and data quality of the database and examined the availability of key data elements for epidemiological studies of type 2 diabetes, as well as their comparability to those provided by other healthcare databases.
Research design and methods
Study design and study population
This observational study adopted a non-interventional, retrospective design and examined demographic and clinical characteristics of patients with type 2 diabetes identified from the IMEDS-DD. This study included data from seven national and regional health insurers participating in the IMEDS-DD in the USA: CVS Health/Aetna, Harvard Pilgrim Health Care, HealthCore/Elevance Health, HealthPartners, Humana, Marshfield Clinic Health System, and TennCare. The ‘index date’ was defined as the date of the most recent medical encounter, regardless of care setting, between April 1, 2018 and March 31, 2019 (index period). Eligible members had to meet the following criteria as of the index date: 18 years of age or older; at least 6 months of continuous enrollment in medical and prescription drug insurance plans (maximum allowable enrollment gap of 45 days); and at least one International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) diagnosis of type 2 diabetes mellitus (E11.x), with no diagnosis of type 1 diabetes mellitus (E10.x) or gestational diabetes mellitus in pregnancy (O24.4x), recorded in claims at any time from April 1, 2016 through the end of the index period (see design diagram in figure 1).
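The eligibility criteria above can be illustrated with a minimal Python sketch. It is not the Sentinel Routine Querying System logic used in the study: the function names, the input layout (enrollment spans and diagnosis records as tuples), and the approximation of 6 months as 183 days are assumptions made only for illustration.

```python
from datetime import date

MAX_GAP_DAYS = 45                     # maximum allowable enrollment gap (from the text)
MIN_ENROLL_DAYS = 183                 # "at least 6 months", approximated as 183 days (assumption)
LOOKBACK_START = date(2016, 4, 1)     # start of the diagnosis look-back window
INDEX_START, INDEX_END = date(2018, 4, 1), date(2019, 3, 31)


def continuous_enrollment_start(spans, index_date):
    """Return the start of the bridged enrollment episode covering index_date, or None.

    spans: list of (start, end) date tuples. Gaps of up to MAX_GAP_DAYS
    between consecutive spans are bridged into a single episode.
    """
    episode_start = episode_end = None
    for start, end in sorted(spans):
        if episode_end is not None and (start - episode_end).days - 1 <= MAX_GAP_DAYS:
            episode_end = max(episode_end, end)      # bridge an allowable gap
        else:
            if episode_start is not None and episode_start <= index_date <= episode_end:
                return episode_start
            episode_start, episode_end = start, end  # begin a new episode
    if episode_start is not None and episode_start <= index_date <= episode_end:
        return episode_start
    return None


def is_eligible(birth_date, spans, dx_records, index_date):
    """dx_records: list of (service_date, icd10cm_code) tuples (hypothetical layout)."""
    if not (INDEX_START <= index_date <= INDEX_END):
        return False
    had_birthday = (index_date.month, index_date.day) >= (birth_date.month, birth_date.day)
    age = index_date.year - birth_date.year - (0 if had_birthday else 1)
    if age < 18:
        return False
    start = continuous_enrollment_start(spans, index_date)
    if start is None or (index_date - start).days < MIN_ENROLL_DAYS:
        return False
    # Normalize codes so "E11.9" and "E119" are treated alike.
    codes = [code.upper().replace(".", "")
             for service_date, code in dx_records
             if LOOKBACK_START <= service_date <= INDEX_END]
    has_t2d = any(c.startswith("E11") for c in codes)
    excluded = any(c.startswith("E10") or c.startswith("O244") for c in codes)
    return has_t2d and not excluded
```

In this sketch a member whose only coverage break is shorter than 45 days is treated as continuously enrolled, whereas a longer break restarts the enrollment clock at the later span.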
Patient characteristics assessment
The data availability was first assessed by characterizing the above cohort along key dimensions in diabetes research: demographics; antihyperglycemic treatment by drug class (metformin, insulin, sulfonylureas, thiazolidinediones, dipeptidyl peptidase-4 (DPP-4) inhibitors, glucagon-like peptide-1 (GLP-1) receptor agonists, sodium-glucose cotransporter-2 (SGLT-2) inhibitors, and others (alpha-glucosidase inhibitors, meglitinides)); comorbidities; general health services utilization; and hemoglobin A1c (HbA1c) values. The evaluation period was from April 1, 2016 to the index date, except for demographics and antihyperglycemic treatment by drug class, which were evaluated on the index date (figure 1). For antihyperglycemic treatments, drug class utilization was determined by any dispensing whose days supply overlapped the index date, including dispensings on the index date.
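The days-supply rule for drug class utilization can be sketched as follows. This is an illustrative reading of the rule, not the study's query code: the function name and the tuple layout of a dispensing record are hypothetical, and the sketch assumes that a dispensing covers the dispensing date plus the following days supply minus one day.

```python
from datetime import date, timedelta


def classes_on_index(dispensings, index_date):
    """Return the set of drug classes with supply overlapping the index date.

    dispensings: list of (dispense_date, days_supply, drug_class) tuples
    (hypothetical layout). A dispensing on the index date itself counts.
    """
    active = set()
    for dispense_date, days_supply, drug_class in dispensings:
        # Assumption: coverage runs from the dispensing date for days_supply days.
        last_covered = dispense_date + timedelta(days=days_supply - 1)
        if dispense_date <= index_date <= last_covered:
            active.add(drug_class)
    return active
```

A patient with an earlier dispensing whose supply ran out before the index date is not counted as a user of that class under this rule, which is one reason point-in-time utilization can be lower than period-based estimates.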
This study used outpatient pharmacy dispensings to define drug utilization and medical encounter claims to define existing conditions or medical history. Specifically, individual drugs were identified using the National Drug Codes; medical conditions were identified using algorithms based on diagnosis and procedure codes encoded in the following systems: ICD-9-CM/ICD-10-CM, ICD-10-Procedure Coding System, Healthcare Common Procedure Coding System, and Current Procedural Terminology codes.
Presence of comorbid conditions was assessed using all available data prior to and including the index date. The following conditions were evaluated: cardiovascular disease (CVD), moderate-to-severe chronic kidney disease (CKD, stages 3–5, assumed estimated glomerular filtration rate <60 mL/min/1.73 m²),8 retinopathy, nephropathy, neuropathy, amputation, hypertension, hypoglycemia, hyperlipidemia, obesity, and pancreatitis. CVD was categorized based on diagnoses for cerebrovascular disease, coronary heart disease, heart failure, myocardial infarction, peripheral artery disease, or stroke.
An exploratory analysis focused on a subset of the type 2 diabetes cohort members identified from the five (of the seven total) participating network partners who provided HbA1c results. To explore the quality of HbA1c data in the IMEDS-DD, the study summarized the total number of patients with at least one HbA1c value recorded during the study period, as well as average testing intervals among those who had at least two HbA1c test results. The study further examined ranges of the most recent test result per patient.
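One way to compute an average HbA1c testing interval per patient is sketched below. The study does not state its exact definition, so this sketch assumes the interval is the mean gap between consecutive distinct test dates; the function names and the band labels are hypothetical.

```python
from datetime import date, timedelta


def mean_testing_interval(test_dates):
    """Mean number of days between consecutive distinct HbA1c test dates.

    Returns None when fewer than two distinct dates are available.
    """
    dates = sorted(set(test_dates))
    if len(dates) < 2:
        return None
    # The mean of consecutive gaps equals (last - first) / (number of gaps).
    return (dates[-1] - dates[0]).days / (len(dates) - 1)


def interval_band(mean_days):
    """Assign a mean interval to a reporting band (band edges are an assumption)."""
    if mean_days is None:
        return "fewer than two results"
    if mean_days <= 90:
        return "within 90 days"
    if mean_days <= 183:
        return "91-183 days"
    return "more than 183 days"
```

For example, a patient tested on three dates spaced 60 days apart has a mean interval of 60 days and falls in the first band.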
Statistical analyses
All analyses were completed using the Sentinel Routine Querying System version 9.4.09 with additional custom programming. Patient characteristics, comorbid conditions, and HbA1c availability were summarized via descriptive analyses. Continuous variables were reported as means and SDs, and categorical variables were summarized as number and proportion of the total study population in each cohort.
The study then evaluated the comparability of measured characteristics of patients with type 2 diabetes in the IMEDS-DD with those identified in other populations or data sources in the USA and other countries.
Results
Characteristics of patients with type 2 diabetes in the IMEDS-DD
A total of 414 672 (12.6%) patients with type 2 diabetes were identified from the 3 280 646 active members in the IMEDS-DD who had a medical encounter between April 1, 2018 and March 31, 2019 at the time the data were accessed for this study. In these patients, 52.8% were male, and the mean age was 65.0 years (SD 13.3 years). Persons aged 75 years or older comprised 24.1% of this cohort (table 1). The race composition was 40.4% white, 9.9% black, and 46.9% unknown.
Common comorbid conditions were hypertension (84.5%) and hyperlipidemia (82.8%). Around two in every five patients had a history of obesity (45.3%) or existing CVD (44.7%), including coronary heart disease (31.2%), peripheral artery disease (18.1%), cerebrovascular disease (17.4%), heart failure (16.9%), and stroke (15.3%). Nearly one-half (47.7%) of patients experienced at least one diabetic complication, with nephropathy (31.8%) and neuropathy (25.4%) being the most prevalent. Prevalence of moderate-to-severe CKD was 20.2%.
The most commonly used antihyperglycemic treatments were metformin (35.7%), sulfonylureas (14.8%), insulin (9.9%), and DPP-4 inhibitors (6.0%). Between April 1, 2016 and their most recent medical encounter, patients with type 2 diabetes used a mean of 1.3 antihyperglycemic drugs and had frequent ambulatory visits (mean 36.6 visits).
Of the 414 672 patients with type 2 diabetes identified, 48.9% had at least one HbA1c value recorded. Among these, the mean of the most recent value was 7.0% (SD 3.2%), and 36.3% of these results were ≥7.0%. Of 146 418 patients with two or more HbA1c results recorded, 8.6% had an average testing interval within 90 days, and 48.3% within 91–183 days.
Comparability of patients with type 2 diabetes in the IMEDS-DD versus in the other data sources commonly used for diabetes research
Comparability of the IMEDS-DD patients with type 2 diabetes to those in the general population was summarized by population prevalence and patient demographic (table 2), by comorbidity and health services utilization (table 3), as well as by antihyperglycemic utilization and HbA1c value (table 4). The 12.6% type 2 diabetes prevalence in the IMEDS-DD population is generally consistent with estimates by the most recent Centers for Disease Control and Prevention (CDC) National Diabetes Statistics Report10 and the International Diabetes Federation (IDF) Diabetes Atlas11: the CDC estimated a 13.0% diabetes prevalence in the US population using data from the National Health and Nutrition Examination Survey (NHANES) and a survey respondent self-report approach (ie, ‘being told by a doctor or health professional that they had diabetes’), whereas the IDF estimated a 13.3% diabetes prevalence in the US population using data gathered for the IDF Diabetes Atlas.
Patients with type 2 diabetes in the IMEDS-DD had a mean age over 60 years with similar proportions of female and male patients. On average, these estimates align with findings from both nationwide surveys such as NHANES10 12 and large databases frequently used for diabetes research including the IBM MarketScan databases in the USA,13 the Canadian Network for Observational Drug Effect Studies system,14 the Clinical Practice Research Datalink in the UK,15 and the Scottish Care Information-Diabetes system (SCI-Diabetes).16
Over 40% of patients with type 2 diabetes in the IMEDS-DD had CVD, comparable to the 45.2% reported by Weng et al from the IBM MarketScan databases.13 The percentage of study patients with heart failure was within the range of estimates established by the American Heart Association for patients with type 2 diabetes17 (9%–22%). The prevalences of moderate-to-severe CKD,10 12 16 18 neuropathy,19 20 and retinopathy10 13 were also in line with prior characterization of the type 2 diabetes population in the literature. Similarly, consistency was observed in health services utilization,21 including the number of antihyperglycemic agents used.18 22
In general, the most commonly observed antihyperglycemic treatments in the IMEDS-DD patients with type 2 diabetes (metformin, sulfonylureas, insulin, and DPP-4 inhibitors) largely match published findings based on SCI-Diabetes16 and data from various electronic health record (EHR) systems, such as the National Patient-Centered Clinical Research Network (PCORnet)22 and the Cleveland Clinic EHR system.19 The utilization proportions of individual drug classes, however, vary by data source, evaluation period, or definition of patients with type 2 diabetes. Antihyperglycemic treatment utilization observed in the study was generally lower than observations restricted to treated patients with type 2 diabetes only.18
Both the availability and mean values of the most recent HbA1c test result in the IMEDS-DD for patients with type 2 diabetes were comparable to findings in published studies. Specifically, Bachmann et al also found that around one-half of patients with type 2 diabetes in the PCORnet had one or more documented HbA1c test results,22 and Pantalone et al, McGurnaghan et al, and Iglay et al observed an average HbA1c value of 7.0% in data from the Cleveland Clinic EHR system,19 SCI-Diabetes,16 and the Quintiles Electronic Medical Record research database,18 respectively.
Conclusions
This study assessed data quality of the IMEDS-DD and demonstrated feasibility of epidemiological studies for type 2 diabetes in this database. Availability of key data elements was generally high in the IMEDS-DD for patients with type 2 diabetes. A descriptive summary of the study cohort suggests that basic demographics, comorbidities, and diabetes treatment or other health services utilization were reliably captured.
Findings from this comparability evaluation indicate notable similarities between the patients with type 2 diabetes in the IMEDS-DD and those in the general US patient populations. Although the type 2 diabetes prevalence estimate in the IMEDS-DD was slightly lower than the CDC and IDF benchmarks, the difference was expected and may be attributed to the broader diabetes definitions used in the references (ie, mixed diabetes definition of types 1 and 2 diabetes). Given that patients with type 2 diabetes typically account for over 90% of all patients with diabetes,10 the 12.6% prevalence estimate in the IMEDS-DD population remains comparable to that of the US population. With regard to the overall age difference, the older age of patients with type 2 diabetes in the IMEDS-DD may be a database strength that can be leveraged to enhance the generalizability of future IMEDS studies. In particular, the high proportion of older patients could help address study questions focusing on patients treated with second-line antihyperglycemics, a subpopulation that is often characterized by advanced age.14
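The prevalence comparison above can be checked with simple arithmetic. The sketch below uses only counts and percentages reported in this article; the 0.90 share is the lower bound implied by "over 90%", and applying it to the CDC and IDF figures is a rough illustration rather than an analysis performed in the study, since the benchmark populations and denominators differ from the IMEDS-DD.

```python
# Counts reported in the Results section.
t2d_patients = 414_672
active_members = 3_280_646
imeds_prevalence = 100 * t2d_patients / active_members  # percent

# All-diabetes benchmark prevalences cited in the text (percent).
benchmarks = {"CDC": 13.0, "IDF": 13.3}

# Lower bound on the type 2 share of all diabetes ("over 90%").
T2D_SHARE_LOWER = 0.90

# Implied range of type 2 diabetes prevalence for each benchmark:
# from share * benchmark (lower bound) up to the benchmark itself.
implied_range = {
    name: (round(value * T2D_SHARE_LOWER, 1), value)
    for name, value in benchmarks.items()
}

print(f"IMEDS-DD type 2 diabetes prevalence: {imeds_prevalence:.1f}%")
for name, (low, high) in implied_range.items():
    print(f"{name}: implied type 2 prevalence between {low}% and {high}%")
```

Under these assumptions the IMEDS-DD estimate falls inside the implied range for both benchmarks, which is consistent with the interpretation given in the text.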
While administrative health insurance claims databases provide comprehensive and longitudinal records of encounters within the healthcare system and outpatient pharmacy dispensings, they may not contain sufficient clinical details. For example, while the performance of a laboratory test, such as HbA1c, would generate a claim, the test result is not always available within the claims database. The IMEDS-DD is no exception. This feasibility assessment shows that the most recent HbA1c result was available for only 48.9% of the IMEDS-DD study population. Although HbA1c was not available for about half of the study population, other detectable comorbidities/sequelae, such as history of CVD, diabetic complications, and utilization of antihyperglycemic drugs, may alternatively be considered as proxies for diabetes control or severity in epidemiological studies of type 2 diabetes.
As with any other health insurance administrative and claims database, the IMEDS-DD provides comprehensive and longitudinal records of encounters within the healthcare system and of pharmacy dispensing records. Yet, the IMEDS-DD also shares general limitations of these databases. First, misclassification is possible due to the use of diagnosis, procedure, and drug codes for identification of specific medical conditions or drug exposure. For example, rule-out diagnoses are possible and may partially explain the low antihyperglycemic agent use among IMEDS-DD patients with type 2 diabetes, as the study might include patients with prediabetes and lower HbA1c who did not yet need treatment. Second, limited clinical details were available to verify personal history and severity of the measured medical conditions. The sizable missingness of HbA1c data may be due to the patient not having an HbA1c measurement or the health plan not being provided the test result. Similarly, a high degree of missingness might exist in other laboratory data pertinent to diabetes research, such as kidney function and lipid profile. Third, most data were tailored to administrative or billing needs and thus reflected selected information and the use of covered health services only. Race/ethnicity information is sourced through health insurance demographic data and is not completely captured. Utilization of over-the-counter medications and free drug samples is unknown. Records may be incomplete for health services subject to bundled payment (eg, hospital admissions) due to lack of recording incentive. Substantial underestimation of lifestyle factors may exist. Dispensing records do not reflect actual drug ingestion. Fourth, since the data presented in table 2 come from disparate sources and from studies with different protocols, detailed statistical assessments were not possible.
As such, this study provides an overview of the IMEDS-DD data elements relevant to diabetes research and a broad comparison of the estimates of these elements in the IMEDS-DD with data from other sources. Insofar as the estimates for certain key parameters are broadly similar between the IMEDS-DD and other databases of different types, the IMEDS-DD serves as a potential data source for observational diabetes research. Lastly, study results are only generalizable to the commercial health insurance population from which the study population was derived as well as other populations sharing similar characteristics.
In conclusion, the IMEDS-DD contains key data elements to conduct pharmacoepidemiological studies in patients with type 2 diabetes, including demographics, clinical characteristics, and health services utilization. This study found substantial overlap in patient characteristics between the patients with type 2 diabetes identified from the IMEDS-DD population and those included in published benchmarks or within other large databases. Despite the general limitations related to claims data such as the lack of HbA1c data, this study supports the IMEDS-DD as an appropriate data source for epidemiological studies related to type 2 diabetes and its management.
Ethics statements
Patient consent for publication
Ethics approval
The Institutional Review Boards of the Reagan-Udall Foundation for the Food and Drug Administration, the IMEDS Analytic Center at the Harvard Pilgrim Health Care Institute, and the individual participating network partners each independently reviewed this study and determined it to be exempt from their respective review.
Acknowledgments
The authors would like to thank the network partners who participated in this project: Aetna, a CVS Health company, Blue Bell, Pennsylvania; Harvard Pilgrim Health Care Institute, Boston, Massachusetts; HealthCore, Inc, Safety and Epidemiology, Wilmington, Delaware; HealthPartners Institute, Bloomington, Minnesota; Humana Healthcare Research Inc, Louisville, Kentucky; Marshfield Clinic Research Institute, Marshfield, Wisconsin; Vanderbilt University Medical Center, Department of Health Policy, Nashville, Tennessee, through the TennCare Division of the Tennessee Department of Finance & Administration which provided data. The authors thank Nina DiNunzio, Juliane Reynolds, Sarah Malek, and Jenice Ko at the Harvard Pilgrim Health Care Institute for their project management and research assistance. The authors thank Vinit Nair and Yunping Zhou at the Humana Healthcare Research for their analytic support.
Footnotes
Contributors Conceptualization: T-YH, TW, ABM, ST, JSB. Data provision: CR-W, AJ-A, RTG, MS, PAP, CNMcMW. Methodology, validation, formal analysis, visualization: T-YH, JM, JSB, YHN, TW, SRC. Investigation: T-YH, YHN, TW. Manuscript drafting: T-YH, AR, ST. Manuscript review and editing: all authors. Guarantor: JSB.
Funding Funding for this research was provided by Merck Sharp & Dohme, a subsidiary of Merck & Co, Inc, Kenilworth, New Jersey, USA, in collaboration with Pfizer, New York, New York, USA. Overall project oversight and project management support were provided by the Reagan-Udall Foundation for the Food and Drug Administration.
Competing interests TW and SRC are employees of Merck Sharp & Dohme, a subsidiary of Merck & Co, Inc, Rahway, New Jersey, USA, who may own stock and/or stock options in Merck & Co, Inc, Rahway, New Jersey, USA.
Provenance and peer review Not commissioned; externally peer reviewed.