Data Mining in Drug Safety

Review of Published Threshold Criteria for Defining Signals of Disproportionate Reporting

Deshpande, Gaurav
Gogolak, Victor
Weiss Smith, Sheila

¹ Center for Drug Safety, University of Maryland, School of Pharmacy, Baltimore, Maryland, USA
² DrugLogic Inc., Reston, Virginia, USA

Correspondence: Professor Sheila Weiss Smith, University of Maryland, School of Pharmacy, 220 Arch Street, 12th floor, Baltimore, MD 21201, USA.

Pharmaceutical Medicine 24(1):p 37-43, February 1, 2010. | DOI: 10.2165/11535300-000000000-00000

Data mining is used in pharmacovigilance as an adjunct to traditional pharmacovigilance practices. Debate continues over the impact that automated signal detection would have on pharmacovigilance resources. An important component of this debate is the value of each statistical alert, or signal of disproportionate reporting (SDR), and the resources needed to evaluate SDRs that are clinically unimportant. In the terminology of diagnostic testing, such SDRs are false positives: they are statistically positive but clinically negative. Under the clinical testing paradigm, a more stringent threshold increases the specificity of the test by lowering the number of false positives; however, the trade-off for increased specificity is reduced sensitivity, i.e. potentially missing clinically relevant problems.
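The diagnostic-testing vocabulary can be made concrete with a minimal Python sketch. The scores, the clinical labels and the function name `sensitivity_specificity` are hypothetical and chosen only for illustration; they are not data from the reviewed manuscripts.

```python
def sensitivity_specificity(scores, clinically_important, threshold):
    """Return (sensitivity, specificity) when pairs scoring >= threshold are flagged."""
    tp = fp = tn = fn = 0
    for score, important in zip(scores, clinically_important):
        flagged = score >= threshold
        if flagged and important:
            tp += 1  # true positive: flagged and clinically important
        elif flagged and not important:
            fp += 1  # false positive: flagged but clinically unimportant
        elif not flagged and important:
            fn += 1  # false negative: a missed clinically relevant problem
        else:
            tn += 1  # true negative: correctly not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical disproportionality scores for eight drug-event pairs.
scores = [1.2, 1.7, 2.4, 3.1, 0.8, 1.1, 1.6, 2.2]
clinically_important = [True, True, True, True, False, False, False, False]

lenient = sensitivity_specificity(scores, clinically_important, threshold=1.0)
strict = sensitivity_specificity(scores, clinically_important, threshold=2.0)
print(lenient)  # (1.0, 0.25)
print(strict)   # (0.5, 0.75)
```

In this toy data set, raising the threshold from 1.0 to 2.0 removes two of the three false positives but also misses two of the four clinically important pairs.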

In developing a protocol to assess the clinical validity of an SDR, a literature search was conducted to determine which threshold(s) were commonly used for data mining of adverse event databases. Of the more than 100 manuscripts identified, 41 published the results of data mining analyses with a clearly identified threshold for significance. The commonly used data mining algorithms were the proportional reporting ratio (PRR), reporting odds ratio (ROR), multi-item gamma Poisson shrinker (MGPS) and Bayesian confidence propagation neural network (BCPNN).

There was some variation in the threshold used for each algorithm. For the PRR, thresholds of 1.0, 1.5 and 2.0 were reported. Some authors also required a chi-squared test statistic of ≥4.0. Minimum drug-event pair counts were most often required for the frequentist measures of disproportionality, PRR and ROR. Among the Bayesian algorithms, MGPS and BCPNN, there was variation in the metric used and, within a metric, variation in thresholds and in the use of minimum case counts. Metrics based on the MGPS algorithm that have been used to determine statistical significance include the empirical Bayes geometric mean (EBGM), the lower 95% confidence limit of the EBGM, the lower 95% confidence limit of the empirical Bayes arithmetic mean, EBlog2 and the interaction signal score. Based on the published literature, there is considerable variation among practitioners of pharmacovigilance data mining in how a significant alert or SDR is defined. Research into the impact of such variations in practice on SDR volume and value is urgently needed.
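To illustrate how the choice of threshold changes what counts as an SDR, the following Python sketch computes the two frequentist measures from a 2x2 table of report counts and applies a combined rule. The PRR and chi-squared cut-offs are among those reported above; the minimum case count of 3, the cell layout, the counts and all function names are assumptions made for illustration, not criteria taken from any specific manuscript. The sketch assumes no empty cells and uses the Pearson chi-squared statistic without continuity correction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """2x2 table of spontaneous reports (hypothetical layout)."""
    a: int  # drug of interest, event of interest
    b: int  # drug of interest, all other events
    c: int  # all other drugs, event of interest
    d: int  # all other drugs, all other events


def prr(t: Counts) -> float:
    """Proportional reporting ratio: event share for the drug vs. all other drugs."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: Counts) -> float:
    """Reporting odds ratio: cross-product ratio of the 2x2 table."""
    return (t.a * t.d) / (t.b * t.c)


def chi_squared(t: Counts) -> float:
    """Pearson chi-squared statistic (1 degree of freedom, no continuity correction)."""
    n = t.a + t.b + t.c + t.d
    numerator = n * (t.a * t.d - t.b * t.c) ** 2
    denominator = (t.a + t.b) * (t.c + t.d) * (t.a + t.c) * (t.b + t.d)
    return numerator / denominator


def is_sdr(t: Counts, prr_min: float = 2.0, chi2_min: float = 4.0, n_min: int = 3) -> bool:
    """Flag a drug-event pair; n_min=3 is an assumed minimum case count."""
    return t.a >= n_min and prr(t) >= prr_min and chi_squared(t) >= chi2_min


# Hypothetical counts giving a PRR of about 1.6.
pair = Counts(a=32, b=1968, c=1000, d=99000)
print(round(prr(pair), 2))
print(is_sdr(pair, prr_min=1.5))  # flagged under the more lenient PRR threshold
print(is_sdr(pair, prr_min=2.0))  # not flagged under the stricter PRR threshold
```

The same drug-event pair is an SDR under a PRR threshold of 1.5 but not under 2.0, which is the kind of practice-dependent difference in SDR volume described above.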