Research design and methods
Study design and data source
We conducted a retrospective cohort study using two population-based data sources: (1) administrative healthcare data from the province of Alberta (AB), Canada and (2) primary care clinical data from the UK’s Clinical Practice Research Datalink (CPRD) GOLD. AB’s administrative databases capture encounters with the universal healthcare system for all AB residents (over 4 million). The CPRD contains longitudinal data on about 5% of the UK population collected from over 950 primary care practices, providing a representative sample that is similar to the overall UK population in age, sex, and ethnicity.31–35 Both data sources are routinely checked for accuracy through computerized validation checks.
From both sources, de-identified individual-level longitudinal data were available for: (1) sociodemographic characteristics (age, sex, and index of multiple deprivation (CPRD only)); (2) hospital-based diagnoses using the 10th revision of the International Statistical Classification of Diseases (ICD-10) codes; (3) medical diagnoses using ICD-9 in AB and Read codes in CPRD; (4) outpatient prescription medications (dispensation records from AB and prescription records from primary care physicians in CPRD); (5) laboratory data (eg, renal function, lipids, blood glucose) and (6) mortality data (date and cause of death). Hospital episode and death certificate linkage is available only for a subset of CPRD data. Additionally, physiological information (body mass index (BMI)) and information on health behaviors (eg, smoking) were retrieved from CPRD.
Study cohort
First, we identified a base cohort of adult (≥18 years) new users of metformin as monotherapy, between January 1, 2012 (AB) or January 1, 2005 (CPRD) and the end of the study period (March 30, 2018, in AB and November 29, 2018, in CPRD). New metformin users were defined as those with no prescription records for any antidiabetic drug, including insulin, for 365 days prior to the initial metformin prescription. At least 12 months of continuous data prior to the first antidiabetic agent prescription recorded during the study period was required. We restricted the CPRD cohort to patients eligible for linkage to hospital records through Hospital Episode Statistics (HES) and to death certificate records through the Office for National Statistics (ONS) (herein referred to as HES/ONS linkage). From the base cohort, we identified all patients initiating either an SGLT-2 inhibitor or a DPP4 inhibitor between May 1, 2014 in AB or January 1, 2013 in CPRD (corresponding to the period after market entry in Canada and the UK), and the end of the study period. We included patients who had been exposed to other antidiabetic drugs (other than SGLT-2 inhibitors or DPP4 inhibitors) before the index date. Furthermore, we excluded patients who had a record of diagnostic codes indicating AKI or renal replacement (dialysis or transplant) in the 365 days before initiation of an SGLT-2 or DPP4 inhibitor.
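The new-user rule for the base cohort can be illustrated with a minimal sketch. The record layout (a list of date and drug-class pairs holding only antidiabetic prescriptions), the function name, and the handling of same-day co-prescriptions as a proxy for monotherapy are assumptions made for illustration; they are not the study's actual extraction code.

```python
from datetime import date, timedelta

WASHOUT_DAYS = 365  # look-back window with no antidiabetic prescriptions


def new_metformin_index(rx_records, observation_start, study_start, study_end):
    """Return the first metformin date if the patient qualifies as a new user.

    rx_records: list of (date, drug_class) tuples, antidiabetic drugs only
                (hypothetical layout).
    observation_start: first date of continuous data for the patient.
    Returns None when the patient does not qualify.
    """
    in_period = sorted(
        d for d, drug in rx_records
        if drug == "metformin" and study_start <= d <= study_end
    )
    if not in_period:
        return None
    first = in_period[0]
    window_start = first - timedelta(days=WASHOUT_DAYS)
    # Require a full look-back window of continuous data.
    if observation_start > window_start:
        return None
    # Any antidiabetic prescription in the window disqualifies the patient.
    prior_use = any(window_start <= d < first for d, _ in rx_records)
    # Assumption: monotherapy means no other antidiabetic drug on the same day.
    co_prescribed = any(d == first and drug != "metformin" for d, drug in rx_records)
    return None if (prior_use or co_prescribed) else first
```

A patient whose only prior antidiabetic record falls outside the 365-day window would still qualify under this sketch, which matches the stated definition.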
Exposure and outcome definitions
SGLT-2 inhibitor and DPP4 inhibitor exposure was operationalized using an as-treated exposure definition. The index date of exposure was defined as the date of initiation of an SGLT-2 inhibitor or DPP4 inhibitor. We calculated the duration of therapy for each prescription based on the quantity dispensed (or days’ supply if available) plus a 30-day grace period to account for non-adherence. If quantity was missing, we assumed a 30-day supply. For the primary analysis, gaps between prescriptions were allowed, although we conducted several sensitivity analyses in which alternative exposure definitions were used. Discontinuation of exposure was based on the estimated duration of the last SGLT-2 or DPP4 inhibitor prescription plus a 30-day grace period.
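As an illustration of the as-treated definition, the sketch below derives the end of exposure from a list of prescriptions. The tuple layout and function names are hypothetical; the `allow_gaps=False` branch corresponds to the sensitivity analysis that censors follow-up at the first gap.

```python
from datetime import date, timedelta

GRACE_DAYS = 30           # grace period added to each prescription
DEFAULT_SUPPLY_DAYS = 30  # assumed supply when quantity is missing


def covered_until(rx_date, days_supply):
    """Last covered day for one prescription, including the grace period."""
    supply = DEFAULT_SUPPLY_DAYS if days_supply is None else days_supply
    return rx_date + timedelta(days=supply + GRACE_DAYS)


def exposure_end(prescriptions, allow_gaps=True):
    """prescriptions: non-empty list of (date, days_supply or None) tuples."""
    rx = sorted(prescriptions, key=lambda p: p[0])
    if allow_gaps:
        # Primary analysis: discontinuation is based on the last prescription.
        return covered_until(*rx[-1])
    # Sensitivity analysis: stop at the first gap in coverage.
    end = covered_until(*rx[0])
    for rx_date, supply in rx[1:]:
        if rx_date > end:
            break
        end = max(end, covered_until(rx_date, supply))
    return end
```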
The primary efficacy outcome was a composite of new or worsening nephropathy, defined as any of the following: (1) an increase from baseline, defined as the latest laboratory value measured before the index date, in 24-hour urinary excretion of albumin to >300 mg, or an increase in timed collection to >200 μg/min, or an increase in albumin-creatinine ratio (ACR) to >20 mg/mmol; (2) a doubling of the serum creatinine level from baseline, accompanied by an estimated glomerular filtration rate (eGFR) of ≤45 mL/min/1.73 m²; (3) the initiation of renal replacement therapy, based on hospitalization records; (4) new hospitalization for renal failure or (5) death from renal disease. This definition was based on the (Empagliflozin) Cardiovascular Outcome Event Trial in Type 2 Diabetes Mellitus Patients (EMPA-REG OUTCOME) trial.20 For laboratory test-based end points, baseline values were compared with one or more measures during follow-up, and the first date on which any lab-based end point criterion (eg, albuminuria) was met was considered the outcome event date.
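The laboratory-based components can be sketched as follows for the ACR and serum creatinine criteria only. The dictionary keys and the function name are hypothetical, and the sketch assumes that an "increase from baseline" means the follow-up value exceeds both the baseline value and the stated threshold; the text does not spell out this detail.

```python
from datetime import date

ACR_THRESHOLD = 20.0   # mg/mmol
EGFR_THRESHOLD = 45.0  # mL/min/1.73 m^2


def first_lab_event_date(baseline, followup):
    """Return the first follow-up date meeting a lab-based criterion, or None.

    baseline: dict with optional 'acr' and 'creatinine' values.
    followup: list of dicts with 'date' and optional 'acr', 'creatinine', 'egfr'.
    """
    base_acr = baseline.get("acr")
    base_scr = baseline.get("creatinine")
    for m in sorted(followup, key=lambda rec: rec["date"]):
        acr = m.get("acr")
        # Criterion 1 (ACR part): rise from baseline to above the threshold.
        if acr is not None and base_acr is not None:
            if acr > ACR_THRESHOLD and acr > base_acr:
                return m["date"]
        scr, egfr = m.get("creatinine"), m.get("egfr")
        # Criterion 2: creatinine doubling accompanied by low eGFR.
        if scr is not None and egfr is not None and base_scr is not None:
            if scr >= 2 * base_scr and egfr <= EGFR_THRESHOLD:
                return m["date"]
    return None
```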
The primary safety outcome was AKI based on all hospitalization records for one of the following ICD-10 diagnostic codes: N17.0, N17.1, N17.2, N17.8 or N17.9. Previous studies have shown this case definition has a specificity of >95%.36–38
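A minimal sketch of this case definition is shown below. The hospitalization layout (admission date plus a list of diagnosis codes in any position) and the code normalization step are assumptions for illustration.

```python
from datetime import date

# ICD-10 codes listed in the case definition above.
AKI_CODES = {"N17.0", "N17.1", "N17.2", "N17.8", "N17.9"}


def normalize(code):
    """Strip whitespace and dots so 'n17.9' and 'N179' compare equal."""
    return code.strip().upper().replace(".", "")


AKI_NORMALIZED = {normalize(c) for c in AKI_CODES}


def first_aki_admission(hospitalizations):
    """hospitalizations: list of (admission_date, [diagnosis codes]) tuples.

    Returns the earliest admission date with any AKI code, or None.
    """
    dates = [
        admit for admit, codes in hospitalizations
        if any(normalize(c) in AKI_NORMALIZED for c in codes)
    ]
    return min(dates) if dates else None
```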
Propensity score matching
To minimize potential confounding, we used propensity score matching. We used the high-dimensional propensity score algorithm39 to identify relevant potential confounders based on five dimensions (hospitalizations, procedures, medical diagnoses, prescription medication, and laboratory records) during the year before the index date. A sixth dimension, emergency department visits, was also included for the AB analysis. We identified the 200 most prevalent variables in each dimension and ranked them according to their frequency as once, sporadic or frequent. Then, we selected 500 variables for inclusion in estimation of the propensity score, in addition to a list of 30 predefined variables (32 in CPRD), including sex, age, year of cohort entry, prescription drug use (ACE inhibitors, angiotensin receptor blockers, statins, loop diuretics, thiazide diuretics, other antihypertensive drugs, other antidiabetic agents, epoetin/darbepoetin), comorbidities (myocardial infarction, stroke, heart failure, hypertension, dyslipidemia, amputation, diabetic ketoacidosis, fracture, chronic kidney disease), laboratory values (hemoglobin A1c (HbA1c), eGFR, hemoglobin, high-density lipoprotein, low-density lipoprotein, triglycerides, ACR), and physiological and lifestyle indicators (smoking, BMI) from CPRD only. A multivariable logistic regression model was used to estimate propensity scores for initiation of an SGLT-2 inhibitor compared with a DPP4 inhibitor. SGLT-2 inhibitor users were then matched to DPP4 inhibitor users in a one-to-one greedy nearest-neighbor match based on the logit of the propensity score with a caliper of 0.2 times the SD.40 41 Balance of baseline covariates after matching was assessed using standardized differences (>10% considered unbalanced).42 We repeated the above propensity score matching process for each secondary and sensitivity analysis described below.
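The matching and balance steps can be sketched as below, starting from already-estimated propensity scores. Several details are assumptions not stated in the text: the caliper uses the SD of the logit pooled over both groups, treated patients are processed in the order given, and the standardized difference shown is the usual formula for a continuous covariate.

```python
import math
from statistics import mean, stdev, variance


def logit(p):
    return math.log(p / (1 - p))


def greedy_match(treated, controls, caliper_factor=0.2):
    """One-to-one greedy nearest-neighbor matching without replacement.

    treated, controls: dict mapping patient id -> propensity score in (0, 1).
    Returns a list of (treated_id, control_id) pairs.
    """
    lt = {i: logit(p) for i, p in treated.items()}
    lc = {i: logit(p) for i, p in controls.items()}
    # Assumption: SD of the logit taken across both groups combined.
    caliper = caliper_factor * stdev(list(lt.values()) + list(lc.values()))
    available = dict(lc)
    pairs = []
    for t_id, t_logit in lt.items():
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_logit))
        if abs(available[c_id] - t_logit) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


def standardized_difference(x_treated, x_control):
    """Standardized difference (%) for a continuous covariate."""
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2)
    if pooled_sd == 0:
        return 0.0
    return 100 * (mean(x_treated) - mean(x_control)) / pooled_sd
```

Under the stated threshold, a covariate with an absolute standardized difference above 10% after matching would be flagged as unbalanced.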
Primary analysis
Standard descriptive statistics were used to compare the characteristics of SGLT-2 inhibitor users with DPP4 inhibitor users. Patients were followed from the index date until the earliest of experiencing the outcome, disenrolment, switching from an SGLT-2 inhibitor to a DPP4 inhibitor, switching from a DPP4 inhibitor to an SGLT-2 inhibitor, death, or cohort end date. Incidence rates per 1000 person-years were calculated before and after propensity score matching. The association between SGLT-2 inhibitor use and the renal outcomes of interest was assessed using conditional Cox proportional hazards regression models, stratified by matched pair, within the matched cohort. We ran an additional multivariable conditional Cox model adjusted for age, sex, and the use of other antidiabetic agents in the year prior to the index date. Model assumptions, including the proportional hazards assumption for each variable, were tested.43 Furthermore, we assessed for effect modification by age, sex, diabetes duration, and A1c level using an interaction term between exposure status and these variables. We considered a p value <0.05 to be statistically significant. Last, aggregate data from each database were combined by random-effects meta-analysis using a profile likelihood estimator.44
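The follow-up and incidence rate calculations can be sketched as follows. The function names, the use of `None` for events that never occurred, and the conversion of 365.25 days per year are assumptions for illustration.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed conversion from days to years


def follow_up_end(*candidate_dates):
    """Earliest of the censoring or event dates; None marks 'did not occur'."""
    observed = [d for d in candidate_dates if d is not None]
    return min(observed)


def person_years(index_date, end_date):
    return (end_date - index_date).days / DAYS_PER_YEAR


def incidence_rate_per_1000(n_events, total_person_years):
    """Events per 1000 person-years of follow-up."""
    return 1000 * n_events / total_person_years


# Example: outcome never occurred, patient switched drug before cohort end.
end = follow_up_end(None, date(2017, 5, 1), date(2018, 3, 30))
```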
Secondary and sensitivity analyses
For the secondary analyses, we repeated our primary analysis using four alternative active comparator new-user cohorts with the following control groups: sulfonylureas (SU), glucagon-like peptide-1 receptor agonists (GLP1-RA), thiazolidinediones (TZD), and insulin. For each of these analyses, a new cohort was identified, and the propensity score matching process was repeated. We also stratified the primary cohort (SGLT-2 inhibitors matched to DPP4 inhibitors) based on individual SGLT-2 inhibitor agents (canagliflozin, dapagliflozin, and empagliflozin). We conducted a stratified analysis for the primary cohort based on baseline kidney function, wherein we calculated the eGFR based on an abbreviated Modification of Diet in Renal Disease (MDRD) equation using the most recent serum creatinine measurement before the index date. If there was no serum creatinine measurement before the index date, we used the first measurement within 365 days after the index date. The stratification cut-off point was eGFR <60 mL/min/1.73 m² as impaired kidney function and ≥60 mL/min/1.73 m² as non-impaired kidney function. Last, for our primary cohort, we replicated our primary analysis to assess the association for each of the five components of the composite outcome definition.
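For reference, the abbreviated (four-variable) MDRD equation and the stratification rule can be sketched as below. The text does not state which version of the equation was used; this sketch assumes the IDMS-traceable constant of 175 (the original equation uses 186), creatinine reported in µmol/L, and an optional race coefficient that may not apply to these data sources.

```python
UMOL_PER_MGDL = 88.4  # serum creatinine: µmol/L per mg/dL


def egfr_mdrd(creatinine_umol_l, age_years, female, black=False, constant=175.0):
    """Abbreviated MDRD eGFR in mL/min/1.73 m^2.

    constant=175 assumes IDMS-traceable creatinine; 186 is the original value.
    """
    scr_mg_dl = creatinine_umol_l / UMOL_PER_MGDL
    egfr = constant * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def kidney_function_stratum(egfr):
    """Stratify at the cut-off of 60 mL/min/1.73 m^2 described above."""
    return "impaired" if egfr < 60 else "non-impaired"
```

With these assumptions, a 50-year-old man with a creatinine of 88.4 µmol/L (1.0 mg/dL) has an eGFR of roughly 79 mL/min/1.73 m² and falls in the non-impaired stratum.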
To test the robustness of our results, we took two main approaches to conduct a series of sensitivity analyses. First, we varied the definition of our exposure, rerunning our primary analysis and secondary comparator analyses using the following exposure definitions: (i) an as-treated exposure definition without allowing any gaps in exposure, whereby we censored a person’s follow-up time at their first gap; (ii) intention-to-treat exposure definitions with a maximum follow-up of 180, 365, and 730 days; (iii) a time-varying exposure definition. Second, we reran our primary effectiveness analysis using the full CPRD GOLD cohort, irrespective of eligibility for HES/ONS linkage.
Data availability statement
Data may be obtained from a third party and are not publicly available. We are unable to make data available because of third party license restrictions.