1. Background

NSAIDs are widely used for the symptomatic treatment of acute pain and chronic inflammatory and degenerative joint diseases. However, their use is restricted by the occurrence of upper gastrointestinal (GI) complications (UGIC) such as peptic ulcer perforations, obstructions and bleeding. The use of NSAIDs has been associated with a 3- to 5-fold increase in the risk of UGIC.[1,2] Clinical trials and observational studies have shown that the use of selective cyclooxygenase (COX)-2 inhibitors is associated with a lower risk of UGIC;[3-5] however, they have also been associated with an increased risk of serious cardiovascular (CV) events.[6] Further data are necessary to quantify the risk of UGIC associated with many individual NSAIDs, including selective COX-2 inhibitors, and to evaluate the benefit-risk balance of the NSAIDs most often used in regular clinical practice, taking into account dose, duration and effect of other risk factors. These data can help clinicians select treatments for individual patients and help health policy regulators assess the public health impact of therapy.

Within the European Community’s Seventh Framework Programme, the Safety Of non-Steroidal anti-inflammatory drugs (SOS) collaborative project started in 2008 with the goal of developing statistical and decision models to facilitate regulatory and treatment decisions based on the GI and CV safety of individual NSAIDs. One of the initial tasks of the SOS project was to summarize the data available on the risk of GI and CV events from observational studies. In this context, we conducted a systematic review and meta-analysis of published observational studies to provide pooled relative risks (RR) for UGIC associated with the use of individual NSAIDs versus non-use of NSAIDs. We followed the MOOSE guidelines for reporting meta-analyses of observational studies (http://www.equator-network.org/resource-centre/).

2. Materials and Methods

We performed a literature search in PubMed using medical subject headings (MeSH) and free-text terms for individual NSAIDs and selective COX-2 inhibitors, GI disease, case-control studies and cohort studies. The search was restricted to observational studies published in the English language between 1 January 1980 and 31 May 2011. Details of the search strategy are available in the supplemental digital content (SDC; http://links.adisonline.com/DSZ/A78). To be included, studies had to (i) be cohort, case-control or nested case-control studies; (ii) provide odds ratios or RRs of UGIC comparing individual NSAIDs with non-use of NSAIDs; and (iii) provide effect estimates adjusted at least for age and sex. All titles and/or abstracts of the articles identified were reviewed to select those potentially meeting the inclusion criteria. Data from these articles were abstracted in a standardized database that included information on source population, inclusion and exclusion criteria, study design, case definition and validation, selection of controls, exposure definition, confounding factors and statistical analysis. The accuracy of the abstracted data was reviewed independently by two of the authors (NR-G, JC). References from relevant studies and prior meta-analyses were also reviewed. Study authors were contacted when additional information was needed.[7]

The methodological quality of each study was evaluated using the Newcastle-Ottawa Scale (NOS).[8] The NOS involves a score system in which the study design is evaluated on three broad categories: (i) selection of the study groups; (ii) comparability between the study groups; and (iii) exposure/outcome ascertainment. For each study, the NOS was evaluated independently by two of the authors (NR-G and JC), and any differences were resolved by consensus.

We estimated pooled RRs for those individual NSAIDs that had effect estimates reported in at least three different studies. Pooled RRs and 95% CIs were estimated using both the inverse-variance fixed-effects method and the DerSimonian and Laird random-effects method.[9] We generated forest plots from the random-effects models. Heterogeneity between studies was assessed by graphical inspection of the forest plots, by Cochran’s Chi-squared (χ²) test of homogeneity, and by subgroup analyses evaluating methodological and clinical heterogeneity between studies. Subgroup analyses included stratification by study design, prior history of UGIC, bleeding complications, study period and dose of NSAIDs. Pooled estimates for dose were calculated according to the dose categorization used in each study. In the subgroup analyses, pooled RRs were also calculated for those NSAIDs with only two effect estimates available. The Higgins inconsistency I² statistic was used to describe the percentage of the variability in effect estimates that is due to heterogeneity rather than chance.[10] The meta-analysis was conducted using Review Manager (RevMan), Version 5.0.22 (The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen, Denmark, 2009).
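The pooling steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RevMan implementation used in the study: the function names (`log_rr_and_se`, `pool`) are hypothetical, each study is assumed to report an RR with a symmetric 95% CI on the log scale, and the standard error is back-calculated from that CI.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_rr_and_se(rr, ci_low, ci_high):
    """Convert an RR and its 95% CI to a log RR and its standard error."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(rr), se


def pool(estimates):
    """Pool a list of (rr, ci_low, ci_high) tuples (hypothetical helper)."""
    pairs = [log_rr_and_se(*e) for e in estimates]
    y = [p[0] for p in pairs]        # log RRs
    v = [p[1] ** 2 for p in pairs]   # within-study variances
    k = len(y)

    # Inverse-variance fixed-effects estimate
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

    # Cochran's Q and the Higgins I^2 statistic (in percent)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian and Laird moment estimate of the between-study variance
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1 / (vi + tau2) for vi in v]
    random = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)

    def summary(log_est, weights):
        se = math.sqrt(1 / sum(weights))
        return (math.exp(log_est),
                math.exp(log_est - Z_95 * se),
                math.exp(log_est + Z_95 * se))

    return {"fixed": summary(fixed, w), "random": summary(random, w_re),
            "q": q, "df": df, "i2": i2, "tau2": tau2}
```

When the estimated between-study variance is zero, the two models coincide; when it is positive, the random-effects CI is wider than the fixed-effects CI because every study's variance is inflated by the same amount.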

3. Results

A total of 2984 articles on NSAIDs and GI complications were identified. Of these, 2974 articles were identified in the PubMed search and ten additional articles were identified through the references of relevant studies (figure 1). The review of titles and abstracts of these studies led to the selection of 59 articles for full data abstraction. After review of the abstracted information, 28 studies on the use of individual NSAIDs and the risk of UGIC met the inclusion criteria and were included in the meta-analysis.[7,11-35] The remaining 31 articles were excluded for the following reasons: the reference group was other than non-use of NSAIDs in nine studies;[36-44] the outcome was overall upper and lower GI complications in three studies;[45-47] the outcome was uncomplicated upper GI events in two studies;[48,49] the study population was restricted to users of specific drugs or to patients with specific diseases in three studies;[50-52] the study population and the study period overlapped in four studies;[53-56] and the study design did not meet the inclusion criteria in ten studies (i.e. different type of study or measures of association and exposure assessment).[57-66]

Fig. 1. Flow diagram for identification of studies on upper gastrointestinal complications and individual NSAIDs. GI = gastrointestinal.

Selected characteristics of the 28 studies included in the meta-analysis are summarized in table I; 3 studies were cohort studies,[20,24,26] 10 were nested case-control studies,[7,11,17-19,23,25,33,35,67] and 15 were case-control studies.[12-16,21,22,27-32,34,68] Twelve studies, all case-control studies, were field studies collecting individual information by standardized questionnaires. The 16 remaining studies used information recorded in healthcare databases.

Table I. Description of observational studies on the risk of upper gastrointestinal complications associated with the use of individual NSAIDs

Cases were defined as hospitalization or referral to a specialist for upper GI bleeding in 13 studies,[12,15,16,19-22,27-30,33,34] and for bleeding and/or perforation in 15 other studies; four studies also included cases of uncomplicated peptic ulcer.[14,17,33,68] The site of complication was defined as gastric and/or duodenal in all studies, and two included oesophageal complications.[12,32] Most studies, both field and database studies, required information from endoscopy or other diagnostic procedures to confirm UGIC. Six studies conducted in healthcare databases did not conduct any validation of the cases identified.[14,17-20,68] Sixteen studies reported aggregate results for patients with and without a history of UGIC,[7,12-14,16,20-23,28,31-34,67,68] and 12 provided results for patients without a history of UGIC;[11,15,17-19,24-27,29,30,35] the remaining study was a case-crossover study.[14] Most studies excluded subjects with a history of a known cause of UGIC, including the use of gastrotoxic medications and life-threatening diseases (table II). Four studies did not have any exclusion criteria.[20,22,28,34] Among the 15 case-control studies, seven included hospital controls;[12,13,16,21,29-31] five included both hospital and community controls;[22,27,28,32,34] one included community controls;[15] and two were case-crossover studies.[14,68] All case-control studies were matched on age, sex (except one study[16]), hospital or geographic area, and index date. Three of the case-control studies with hospital and community controls estimated separate results for each set of controls;[22,28,34] as results between the two sets were similar, we included in the meta-analysis the results reported using hospital controls.

In cohort studies, current use of NSAIDs was defined as the time covered by each prescription,[20,24] and one study extended the coverage by 15 days.[26] Most case-control studies defined current use of NSAIDs as any use ending at the index date or within 7 days before the index date. A few case-control studies considered current use as that ending up to 30 days[11,13,17,19,28,35] or 90 days[18] before the index date. Two cohort studies focused on new users of NSAIDs,[20,26] and one nested case-control study provided results for both new users and all users (incident and prevalent) of NSAIDs.[17] In addition to age and sex, the most frequent confounders considered were a history of peptic ulcer (21 studies),[7,11,15-31,35,67] smoking (13 studies),[7,11,15,18,20,21,23,27-32] alcohol use (9 studies),[7,15,19,21,27,28,30-32] use of proton-pump inhibitors and anti-ulcer medications (11 studies),[16-23,30,67,68] and concurrent use of medications increasing the risk of UGIC (16 studies).[7,11,15,16,19-21,23,27,29-31,33,35,67,68] The quality of the studies measured with the NOS was, in general, very good: for the selection component, 12 studies had the maximum score of 4[7,11,15,22-25,32,33,35,67,68] and 13 studies had the next highest score of 3;[13,14,16-19,21,26,28,29,30,32,34] for comparability, 24 studies had the maximum score of 2;[7,11,14-19,21-33,35,67,68] and for the exposure/outcome component, 14 studies had the maximum score of 3[7,11,13,17-19,23-26,33,35,67,68] and 10 studies had the next highest score of 2.[12,14,16,20,21,28-30,32,34]

Table II. Exclusions applied in observational studies on the risk of upper gastrointestinal complications

The studies included in the meta-analysis allowed us to estimate pooled RRs for the current use of 16 different NSAIDs (table III and figure 2). Forest plots for each individual NSAID are available in the SDC. Using random-effects models, pooled RRs ranged from 1.43 (95% CI 0.65, 3.15) for aceclofenac to 18.45 (95% CI 10.99, 30.97) for azapropazone. Pooled RR was less than 2 for aceclofenac, celecoxib and ibuprofen; between 2 and less than 4 for rofecoxib, sulindac, diclofenac, meloxicam, nimesulide and ketoprofen; between 4 and less than 5 for tenoxicam, naproxen, indometacin and diflunisal; and greater than 5 for piroxicam, ketorolac and azapropazone. Pooled RRs from studies providing results for patients without a history of UGIC were similar to those from the overall analysis except for naproxen (more than 10% change), 3.10 (95% CI 2.45, 3.91) and diclofenac, 3.76 (95% CI 2.71, 5.21). Data in patients with a history of peptic ulcer disease were available only for celecoxib (two studies) and rofecoxib (one study).[67,68] The pooled RR for celecoxib in this population was 1.50 (95% CI 1.16, 1.94).

Table III. Individual and pooled relative risks (95% CIs) of upper gastrointestinal complications associated with the use of individual NSAIDs

Fig. 2. Pooled relative risks and 95% CIs of upper gastrointestinal complications associated with the use of individual NSAIDs. Vertical bars denote 95% CIs.

Pooled RRs from case-control studies were higher than those from cohort studies for all NSAIDs except ibuprofen, ketorolac and sulindac. In general, pooled RRs from fixed-effects models were slightly lower than those from random-effects models. Overall, there was significant heterogeneity between studies, which decreased in the subgroup analyses exploring methodological and clinical diversity.

Pooled RRs for the effect of daily dose were estimated for eight different NSAIDs. Cut-off values used in each study to define the daily dose of each NSAID are presented in table IV. Variations in cut-off values were, in general, small except for those used for ibuprofen (200 mg) and naproxen (220 mg) in one study,[14] and for ibuprofen (≪2400 mg) in another study.[33] RRs for the use of high daily doses of NSAIDs were approximately 2- to 3-fold greater than RRs for low-medium doses (figure 3). The pooled RR for high daily dose of ibuprofen was similar to that for high daily dose of diclofenac. Exclusion of results from the studies with different cut-off values for ibuprofen[14,33] and naproxen[14] did not substantially change the pooled results for these individual NSAIDs. For ibuprofen, RRs were 2.15 (95% CI 1.66, 2.79) for low-medium dose and 4.22 (95% CI 1.76, 10.12) for high dose. For naproxen, the RR for low-medium daily dose was 3.62 (95% CI 2.62, 4.99) [the excluded study[14] did not provide data on high dose].
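As a rough illustration of how the high-dose and low-medium-dose estimates can be compared, the sketch below recovers log-scale standard errors from the reported 95% CIs and forms the ratio of two RRs. The helper names are hypothetical, and the CI for the ratio assumes the two estimates are independent, which is a simplification: dose strata within a study usually share the same non-user reference group, so the interval is only indicative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_se_from_ci(ci_low, ci_high):
    """Standard error of log(RR) implied by a 95% CI."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)


def ratio_of_rrs(high, low):
    """Ratio of two RRs given as (rr, ci_low, ci_high).

    Assumption: the two estimates are treated as independent, so their
    log-scale variances are simply added.
    """
    log_ratio = math.log(high[0]) - math.log(low[0])
    se = math.sqrt(log_se_from_ci(high[1], high[2]) ** 2
                   + log_se_from_ci(low[1], low[2]) ** 2)
    return (math.exp(log_ratio),
            math.exp(log_ratio - Z_95 * se),
            math.exp(log_ratio + Z_95 * se))


# Ibuprofen estimates quoted in the text above
ibuprofen_low = (2.15, 1.66, 2.79)
ibuprofen_high = (4.22, 1.76, 10.12)
ratio, ratio_low, ratio_high = ratio_of_rrs(ibuprofen_high, ibuprofen_low)
```

For ibuprofen the point ratio is 4.22/2.15, or about 1.96, consistent with the approximately 2-fold difference described in the text; the wide CI of the high-dose estimate carries over to the ratio.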

Table IV. Cut-off values (mg) used to define daily dose of NSAIDs in observational studies on the risk of upper gastrointestinal complications

Fig. 3. Pooled relative risks and 95% CIs of upper gastrointestinal complications associated with the daily dose of individual NSAIDs. See table IV for cut-off values used in each study to define high dose and low-medium dose. Vertical bars denote 95% CIs. HD = high dose; LD = low-medium dose.

A total of 12 studies provided results specifically for upper GI bleeding for eight different NSAIDs; ten of these studies were case-control field studies. Pooled RRs were higher than those from all UGIC (bleeding, perforation and/or obstruction). Pooled RRs were 1.09 (95% CI 0.77, 1.53) for celecoxib; 1.43 (95% CI 0.65, 3.15) for aceclofenac; 1.88 (95% CI 1.00, 3.51) for ibuprofen; 2.25 (95% CI 1.56, 3.25) for rofecoxib; 4.20 (95% CI 3.03, 5.83) for diclofenac; 5.64 (95% CI 3.60, 8.83) for indometacin; 5.72 (95% CI 3.83, 8.53) for naproxen; and 13.36 (95% CI 9.62, 18.54) for piroxicam.

Pooled RRs from studies conducted from the year 2000 onward[7,15-18,21,67,68] were slightly higher than those from studies conducted before the year 2000 for ibuprofen, 2.13 (95% CI 1.66, 2.73) versus 1.50 (95% CI 1.12, 2.01); ketoprofen, 4.28 (95% CI 2.36, 7.76) versus 3.70 (95% CI 2.27, 6.05); and nimesulide, 3.89 (95% CI 3.18, 4.74) versus 3.50 (95% CI 2.03, 6.03); but lower for diclofenac, 3.08 (95% CI 2.47, 3.84) versus 3.63 (95% CI 2.81, 4.70).

Only one study provided information on effect modification by gastroprotective agents.[18] In that study, RRs for all individual NSAIDs were lower among patients receiving ulcer-healing drugs than among those not receiving them.

4. Discussion

The results from this meta-analysis confirmed the variability of RRs among individual NSAIDs. The lowest RRs were observed for the use of aceclofenac, celecoxib and ibuprofen, and the highest for the use of piroxicam, ketorolac and azapropazone. Intermediate RRs, between approximately 2 and 4, were observed for the rest of the NSAIDs for which at least three estimates were available: rofecoxib, sulindac, diclofenac, meloxicam, nimesulide, ketoprofen, tenoxicam, naproxen, indometacin and diflunisal.

The use of high daily doses of individual NSAIDs was associated with approximately a 2- to 3-fold increase of RRs for UGIC compared with the use of low-medium doses, except for celecoxib, for which we did not observe a dose-response relationship. For most NSAIDs, there were no major differences between studies restricted to patients without a history of peptic ulcer and those that included all patients, and between studies conducted before and after the year 2000. Data from the studies included in the meta-analysis were insufficient to estimate pooled RRs for the duration of use of individual NSAIDs and for the concurrent use of gastroprotective agents.

In general, there was significant heterogeneity among studies, although it decreased or tended to disappear in the subgroup analyses conducted by study design, type of complication, history of peptic ulcer disease and dose of NSAIDs. However, the χ² test to detect heterogeneity is very sensitive to the number of studies (or sample size) included in the analysis. Meta-analysis of a small number of studies may be underpowered to detect heterogeneity. On the other hand, when the number of studies is large, the χ² test has a high power to detect a small amount of heterogeneity that may be clinically unimportant.
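The dependence on the number of studies can be read from the standard definitions of the heterogeneity statistics. The notation below is introduced here for illustration and does not appear in the original analysis: k is the number of studies, θ_i the log RR from study i, and s_i its standard error.

```latex
\begin{align*}
w_i &= \frac{1}{s_i^{2}}, \qquad
\hat{\theta} = \frac{\sum_{i=1}^{k} w_i \theta_i}{\sum_{i=1}^{k} w_i} \\
Q &= \sum_{i=1}^{k} w_i \left(\theta_i - \hat{\theta}\right)^{2},
\qquad Q \sim \chi^{2}_{k-1} \ \text{under homogeneity} \\
I^{2} &= \max\!\left(0,\ \frac{Q - (k-1)}{Q}\right) \times 100\%
\end{align*}
```

Because Q is referred to a χ² distribution with k-1 degrees of freedom and its terms grow with study precision, the test has little power when k is small and can flag trivial differences when k or the study sizes are large. I² rescales the excess of Q over its degrees of freedom, so it describes the proportion of variability attributable to heterogeneity rather than testing for its presence.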

Most studies evaluated the role of confounding factors other than age and sex. One of the concerns in observational studies is the effect of confounding by indication. This is particularly relevant for selective COX-2 inhibitors as there is evidence that they have been preferentially prescribed to patients at high risk of UGIC.[20,46] Many studies attempted to minimize the effect of confounding by indication by adjusting for relevant comorbidity, including GI, CV and other chronic disease; concurrent use of medications; and prior use of healthcare services. Some studies also evaluated confounding by indication by conducting stratified analysis by markers of disease severity or restricted the study population to patients at high risk of UGIC.[7,19]

Pooled RRs from case-control studies were higher than those obtained from cohort studies, although CIs were wider. Most case-control studies were field studies restricted to upper GI bleeding. Thus, a more specific and strict definition of UGIC probably increased the internal validity of these studies. On the other hand, recall bias, leading to overestimation of effects, could be present in these case-control studies in which exposure to NSAIDs was assessed retrospectively by self-reported questionnaires.

Most studies included in this meta-analysis applied broad restrictions to the source population by excluding patients with major risk factors for peptic ulcer disease. Restriction results in more homogeneous study populations and increases the internal validity of studies.[69] On the other hand, absolute rates for UGIC obtained from studies conducted in restricted populations underestimated the rate of the disease.[24,26] This can impact public health policy, particularly when patients with risk factors for UGIC are excluded from studies estimating incidence rates.

Results for new users of NSAIDs were available for only three studies. The rest of the studies provided results for new and prevalent users overall. Inclusion of prevalent users of NSAIDs may result in an overrepresentation of the group of patients who tolerate the treatment (survivors). This is a selection process that may introduce bias if the risk varies with time since the beginning of treatment. In the case of UGIC and the use of NSAIDs, some studies have shown that the risk is higher at the beginning of treatment because susceptible patients are affected early in therapy.[13] Restricting the study to new users of NSAIDs prevents bias from overrepresentation of long-term users who tolerate the treatment.[70,71]

Most studies confirmed cases of UGIC by requiring a positive endoscopy or other diagnostic or therapeutic procedure. However, some studies conducted in databases, which identified potential cases of UGIC by discharge codes, did not conduct any validation of the cases identified, thus introducing misclassification of the outcome.[72] Studies conducted in Canada and Italy show that the positive predictive values (PPVs) for specific discharge diagnoses (gastric ulcer and duodenal ulcer) range from 92% to 100%, whereas PPVs for other specific diagnoses (peptic ulcer and gastrojejunal ulcer) range from 81% to 84%, and those for the non-specific diagnoses (GI haemorrhage) range from 54% to 68%.[67,73,74] Overall, the results from the studies included in this meta-analysis that did not confirm the cases identified may be biased towards the null, assuming non-differential outcome misclassification. Cohort studies estimating incidence rates of UGIC based on the cases identified through hospital discharge codes without performing any validation may overestimate absolute rates of UGIC by including non-cases or cases with uncertain clinical validity.[17,20,75]
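The direction of the bias under non-differential outcome misclassification can be illustrated with a small numerical sketch. The model and all numbers are hypothetical and chosen only for illustration: true cases are assumed to be captured completely, and unvalidated discharge codes are assumed to add the same absolute risk of false-positive cases to NSAID users and non-users.

```python
def observed_rr(risk_exposed, risk_unexposed, false_positive_risk):
    """RR observed when the same absolute risk of false-positive cases
    is added to both exposure groups (non-differential misclassification)."""
    return ((risk_exposed + false_positive_risk)
            / (risk_unexposed + false_positive_risk))


def ppv(true_risk, false_positive_risk):
    """Positive predictive value of the case definition within one group."""
    return true_risk / (true_risk + false_positive_risk)


# Hypothetical risks: true RR = 0.004 / 0.001 = 4
true_rr = observed_rr(0.004, 0.001, 0.0)
# Adding 1 false-positive case per 1000 persons to both groups
biased_rr = observed_rr(0.004, 0.001, 0.001)
```

In this hypothetical example the observed RR falls from 4 to 2.5, and the PPV among non-users is 50%, which is below the range quoted above for non-specific diagnoses. The same mechanism inflates absolute rates, because the false-positive cases are counted in the numerator.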

The benefit-risk balance of individual NSAIDs is mainly driven by their GI and CV safety profile. Recent meta-analyses of observational studies on the CV risk associated with individual NSAIDs show that rofecoxib and diclofenac are associated with the highest RR and naproxen and ibuprofen with the lowest RR.[76,77] For cerebrovascular events, the increase in risk appears to be similar between naproxen, ibuprofen and diclofenac, and higher for rofecoxib.[78] In our study, ibuprofen was in the lowest range of pooled RRs for UGIC; rofecoxib and diclofenac were in the middle range; and naproxen was associated with a higher RR. However, estimates for individual NSAIDs varied by dose. The pooled RR of UGIC for high-dose ibuprofen was similar to the RR for high-dose rofecoxib and diclofenac, and the use of high-dose naproxen was associated with a higher RR.

5. Conclusions

We conclude that the results obtained in this meta-analysis are in line with those from other meta-analyses published in the last decade and confirm the variability of the risk of UGIC among individual NSAIDs as they are used in clinical practice.[1,2,5] Our meta-analysis provides pooled RRs for many individual NSAIDs, including, for the first time, nimesulide, tenoxicam and diflunisal. Aceclofenac, celecoxib and ibuprofen (analgesic and anti-inflammatory doses combined) were the NSAIDs with the lowest RRs, whereas piroxicam, ketorolac and azapropazone were those with the highest RRs. Intermediate RRs were observed for the rest of the NSAIDs: rofecoxib, sulindac, diclofenac, meloxicam, nimesulide, ketoprofen, tenoxicam, naproxen, indometacin and diflunisal. The impact of varying study design approaches on the findings across studies, including choices in the definition and validation of UGIC, exposure assessment and analysis of new or prevalent users, should be taken into account when designing new studies. For individual NSAIDs, data on the effect of dose, duration of use and concurrent use of other medications are still scarce. These gaps need to be addressed in future studies, including those to be conducted in the ongoing SOS project (http://www.sos-nsaids-project.org/).