Framework Validation: Three Drugs, Two ADR Types, One Method
Cross-drug comparison of semaglutide, liraglutide, and metformin. 5/18 dimensions identical where pharmacology demands identity, 13/18 correctly differentiated.
Cross-drug comparison of semaglutide, liraglutide, and metformin using the same multi-source signal detection method. The question: does a single assessment framework produce pharmacologically sensible results across fundamentally different drug-event pairs?
March 2026 | 3 drug investigations | 18 comparison dimensions | 12 min read
Bottom Line
In this three-drug comparison, the same multi-source assessment method produced pharmacologically sensible results across fundamentally different drug-event pairs. 5/18 dimensions are identical where pharmacology demands identity. 13/18 differentiate where pharmacology diverges. The method is not tied to one drug: it behaved consistently across two drug classes and two adverse reaction types.
| Metric | Value |
|---|---|
| Drugs Tested | 3 |
| Dimensions | 18 |
| Converge | 5 / 18 |
| Differentiate | 13 / 18 |
Purpose
A single case study proves a method works for one question. It cannot answer whether that method generalizes. To validate a pharmacovigilance signal detection framework, you need to run it against drugs that should produce similar results (same class, same mechanism) and drugs that should produce different results (different class, different mechanism). The pattern of agreement and disagreement tells you whether the framework is capturing real pharmacology or overfitting to a single drug.
We selected three drugs deliberately: two GLP-1 receptor agonists (semaglutide and liraglutide) that share a mechanism and target organ, and one biguanide (metformin) with an entirely different pharmacological profile. If the method is sound, GLP-1 agonists should converge on class-level features, and metformin should diverge on everything that reflects its distinct pharmacology.
The Three Investigations
Each investigation followed the same 6-step protocol: drug identity resolution (RxNav), FAERS adverse event retrieval, disproportionality analysis (PRR, ROR, IC, EBGM), labeling review (DailyMed), literature search (PubMed), and causality assessment (Naranjo, WHO-UMC). Identical tools, identical sequence, identical data sources.
| Investigation | PRR | Naranjo | Reports |
|---|---|---|---|
| Semaglutide + Pancreatitis | 6.93 | 4 / Possible | 2,068 |
| Liraglutide + Pancreatitis | 17.39 | 5 / Probable | 3,219 |
| Metformin + Lactic Acidosis | 71.42 | 6 / Probable | 18,419 |
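The category labels in the Naranjo column are consistent with the published cut-offs of the Naranjo ADR Probability Scale (9 or more: Definite, 5 to 8: Probable, 1 to 4: Possible, 0 or less: Doubtful). A minimal sketch of that mapping, applied to the three totals in the table; the function name is hypothetical and this is not NexVigilant's calculation engine:

```python
def naranjo_category(score: int) -> str:
    """Map a total Naranjo score to its probability category."""
    if score >= 9:
        return "Definite"
    if score >= 5:
        return "Probable"
    if score >= 1:
        return "Possible"
    return "Doubtful"


# Totals reported in the investigations table.
scores = {"semaglutide": 4, "liraglutide": 5, "metformin": 6}
categories = {drug: naranjo_category(total) for drug, total in scores.items()}

for drug, category in categories.items():
    print(f"{drug}: {scores[drug]} -> {category}")
```

Semaglutide sits at the top of the Possible band and liraglutide at the bottom of the Probable band, so a one-point difference in the total is enough to change the label.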
Head-to-Head Comparison: 18 Dimensions
The full comparison across all 18 assessment dimensions. Rows marked with * indicate dimensions where all three drugs produce identical results — reflecting shared pharmacological realities that transcend drug class.
| Dimension | Semaglutide | Liraglutide | Metformin |
|---|---|---|---|
| Drug Class | GLP-1 RA | GLP-1 RA | Biguanide |
| Target ADR | Pancreatitis | Pancreatitis | Lactic Acidosis |
| PRR | 6.93 | 17.39 | 71.42 |
| ROR | 7.09 | 18.54 | 74.58 |
| IC | 2.76 | 4.06 | 4.83 |
| EBGM | 6.76 | 16.68 | 28.44 |
| Naranjo | 4 POSSIBLE | 5 PROBABLE | 6 PROBABLE |
| WHO-UMC * | POSSIBLE | POSSIBLE | POSSIBLE |
| Harm Type | B (Idiosyncratic) | B (Idiosyncratic) | A (Dose-dependent) |
| Energy * | Critical (1.0) | Critical (1.0) | Critical (1.0) |
| Knowledge Voids | 4 | 3 | 2 |
| Highest Void * | Rechallenge | Rechallenge | Rechallenge |
| FAERS Reports | 2,068 | 3,219 | 18,419 |
| Market Years | 9 | 16 | 69 |
| Labeling | Section 5.2 | Section 5.2 | BOXED WARNING |
| Boxed Warning | No | No | Yes |
| Signal Detected * | Yes | Yes | Yes |
| Signal Persists * | Yes | Yes | Yes |
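The 5/18 and 13/18 split can be recomputed directly from the table. The sketch below transcribes the rows as strings and counts the dimensions where all three drugs agree; it also lists the rows where the two GLP-1 agonists agree with each other but not with metformin. Variable names are illustrative only.

```python
# Rows transcribed from the table: (dimension, semaglutide, liraglutide, metformin).
ROWS = [
    ("Drug Class", "GLP-1 RA", "GLP-1 RA", "Biguanide"),
    ("Target ADR", "Pancreatitis", "Pancreatitis", "Lactic Acidosis"),
    ("PRR", "6.93", "17.39", "71.42"),
    ("ROR", "7.09", "18.54", "74.58"),
    ("IC", "2.76", "4.06", "4.83"),
    ("EBGM", "6.76", "16.68", "28.44"),
    ("Naranjo", "4 POSSIBLE", "5 PROBABLE", "6 PROBABLE"),
    ("WHO-UMC", "POSSIBLE", "POSSIBLE", "POSSIBLE"),
    ("Harm Type", "B (Idiosyncratic)", "B (Idiosyncratic)", "A (Dose-dependent)"),
    ("Energy", "Critical (1.0)", "Critical (1.0)", "Critical (1.0)"),
    ("Knowledge Voids", "4", "3", "2"),
    ("Highest Void", "Rechallenge", "Rechallenge", "Rechallenge"),
    ("FAERS Reports", "2,068", "3,219", "18,419"),
    ("Market Years", "9", "16", "69"),
    ("Labeling", "Section 5.2", "Section 5.2", "BOXED WARNING"),
    ("Boxed Warning", "No", "No", "Yes"),
    ("Signal Detected", "Yes", "Yes", "Yes"),
    ("Signal Persists", "Yes", "Yes", "Yes"),
]

# Dimensions where all three drugs agree.
converge = [name for name, sema, lira, met in ROWS if sema == lira == met]
# Everything else counts as differentiated.
differentiate = [name for name, sema, lira, met in ROWS if not (sema == lira == met)]
# Dimensions where the GLP-1 agonists agree with each other but not with metformin.
class_effect = [name for name, sema, lira, met in ROWS if sema == lira and lira != met]

print(f"Converge: {len(converge)} / {len(ROWS)} -> {converge}")
print(f"Differentiate: {len(differentiate)} / {len(ROWS)}")
print(f"GLP-1 class match: {class_effect}")
```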
Where Results Converge
Five dimensions produce identical results across all three drugs. Each convergence point reflects a genuine pharmacological or methodological reality — not an artifact of the framework.
- WHO-UMC: All POSSIBLE. All three drugs lack rechallenge data: clinicians do not re-expose patients to drugs suspected of causing pancreatitis or lactic acidosis. Rechallenge is not formally required for a WHO-UMC rating of Probable, but without it none of the three assessments rose above POSSIBLE. This is a data reality, not a framework limitation.
- Energy: All Critical (1.0). Pancreatitis and lactic acidosis are both life-threatening adverse events. The framework correctly assigns maximum seriousness energy regardless of drug class or mechanism. A method that rated lactic acidosis as less serious than pancreatitis would be pharmacologically wrong.
- Highest knowledge void: All Rechallenge. The single most important missing piece of evidence is the same for all three: did the adverse event recur when the drug was restarted? The gap is ethically appropriate, because clinicians rarely perform intentional rechallenge for life-threatening events. The framework correctly identifies this universal evidence gap.
- Signal detected: All Yes. All four disproportionality measures (PRR, ROR, IC, EBGM) exceed their respective thresholds for all three drugs. These are clear disproportionality signals, not marginal findings.
- Signal persists: All Yes. None of these signals are transient reporting artifacts. Semaglutide has 9 years of post-market data, liraglutide 16 years, and metformin 69 years. A signal that persists across years to decades of surveillance is more likely to reflect real pharmacology than a reporting spike.
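The signal-detection check can be sketched as a threshold test over the four measures. The article does not state which cut-offs were used, so the ones below are assumptions based on commonly cited conventions (PRR of at least 2, ROR above 1, IC above 0, EBGM of at least 2). In practice these tests are usually applied to lower confidence or credibility bounds rather than point estimates; only point estimates are reported here, so the sketch uses those.

```python
# Assumed thresholds (not taken from the article), applied to point estimates.
THRESHOLDS = {"PRR": 2.0, "ROR": 1.0, "IC": 0.0, "EBGM": 2.0}

# Point estimates from the comparison table.
MEASURES = {
    "semaglutide": {"PRR": 6.93, "ROR": 7.09, "IC": 2.76, "EBGM": 6.76},
    "liraglutide": {"PRR": 17.39, "ROR": 18.54, "IC": 4.06, "EBGM": 16.68},
    "metformin": {"PRR": 71.42, "ROR": 74.58, "IC": 4.83, "EBGM": 28.44},
}


def signal_detected(values: dict) -> bool:
    """Return True only if every measure exceeds its assumed threshold."""
    return all(values[name] > cutoff for name, cutoff in THRESHOLDS.items())


detected = {drug: signal_detected(values) for drug, values in MEASURES.items()}
print(detected)
```

Requiring all four measures to agree is a conservative design choice: a single measure can cross its threshold on sparse data, while agreement across frequentist (PRR, ROR) and Bayesian (IC, EBGM) measures is harder to reach by chance.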
Where Results Correctly Diverge
The 13 dimensions where results differ reveal three distinct patterns: GLP-1 class effects, evidence maturity gradients, and fundamental pharmacological distinctions. Each pattern is independently verifiable against published pharmacology.
GLP-1 Agonists Match Each Other (Class Effect)
Semaglutide and liraglutide share identical values on dimensions that reflect their shared pharmacology: both are classified as Type B (idiosyncratic) harm, both carry pancreatitis warnings in the same labeling section (5.2), and neither has a boxed warning for its target ADR. These are not coincidences. They reflect a shared mechanism of action (GLP-1 receptor agonism) acting on the same target organ (pancreas).
Metformin diverges on all of these: Type A (dose-dependent) harm, BOXED WARNING placement, and a fundamentally different mechanism (lactate metabolism disruption). The framework correctly captures these class-level distinctions without being told which drugs are in the same class.
| Dimension | Semaglutide | Liraglutide | Metformin |
|---|---|---|---|
| Harm Type | B (Idiosyncratic) | B (Idiosyncratic) | A (Dose-dependent) |
| Labeling | Section 5.2 | Section 5.2 | BOXED WARNING |
| Boxed Warning | No | No | Yes |
Evidence Maturity Tracks Market Exposure
The Naranjo causality score rises with years on market: semaglutide scores 4 (POSSIBLE, 9 years), liraglutide scores 5 (PROBABLE, 16 years), and metformin scores 6 (PROBABLE, 69 years). The ordering is what evidence maturity predicts: more years on market means more published evidence, more case reports, more controlled studies, and more data points available for each Naranjo question.
The same gradient appears in knowledge voids (4 to 3 to 2) and FAERS report counts (2,068 to 3,219 to 18,419). A method that was insensitive to evidence quality would produce flat scores regardless of market exposure. The framework is correctly distinguishing between "weak signal with limited data" and "strong signal with extensive data."
| Dimension | Semaglutide (9 yr) | Liraglutide (16 yr) | Metformin (69 yr) |
|---|---|---|---|
| Naranjo | 4 POSSIBLE | 5 PROBABLE | 6 PROBABLE |
| Knowledge Voids | 4 | 3 | 2 |
| FAERS Reports | 2,068 | 3,219 | 18,419 |
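The gradient claim amounts to a monotonicity check across the three drugs ordered by market years. A small sketch with hypothetical helper names:

```python
# Values from the table, ordered semaglutide, liraglutide, metformin.
market_years = [9, 16, 69]
naranjo = [4, 5, 6]
knowledge_voids = [4, 3, 2]
faers_reports = [2068, 3219, 18419]


def strictly_increasing(values: list) -> bool:
    """True if each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def strictly_decreasing(values: list) -> bool:
    """True if each value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


gradient = {
    "market_years_increase": strictly_increasing(market_years),
    "naranjo_increases": strictly_increasing(naranjo),
    "voids_decrease": strictly_decreasing(knowledge_voids),
    "reports_increase": strictly_increasing(faers_reports),
}
print(gradient)
```

With only three drugs, any three distinct values fall in increasing order by chance one time in six, so the check confirms that the ordering is consistent with an evidence-maturity gradient rather than establishing a trend.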
Pharmacological Differentiation
The deepest divergences appear exactly where pharmacology demands them. Metformin's lactic acidosis is a Type A adverse reaction: dose-dependent, predictable from the mechanism of action, and largely preventable through renal function monitoring. GLP-1 agonist pancreatitis is Type B: idiosyncratic, not predictable from dose, and not preventable through routine monitoring.
This distinction cascades through the assessment: Type A reactions accumulate stronger epidemiological evidence faster (higher PRR, more reports), get labeled more prominently (boxed warning), and score higher on causality scales (Naranjo 6 vs 4). The framework captures this cascade correctly — each dimension reinforces the others in a pharmacologically coherent pattern.
Scale Invariance
PRR values range from 6.93 (semaglutide) to 71.42 (metformin), roughly a 10x spread. FAERS report counts range from 2,068 to 18,419, roughly a 9x spread. Market history ranges from 9 to 69 years, nearly 8x. Despite differences of up to an order of magnitude in the raw numbers, the framework produces structurally consistent verdicts: all three are confirmed safety signals at appropriate causality levels.
This is a critical property for a generalizable method. A framework that only works when PRR is between 5 and 10, or when FAERS report counts are in the low thousands, would be useless for the full range of drugs in clinical practice. The verdict-level conclusions (signal confirmed, causality level, harm type) remain stable across a 10x range of underlying measures.
Key observation: The disproportionality measures correctly scale with pharmacological reality (Type A reactions produce higher PRR than Type B, longer market exposure produces more reports), while the verdict-level conclusions remain structurally stable. The method is sensitive to magnitude differences without being distorted by them.
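The spreads quoted in this section are simple max-to-min ratios and can be checked in a few lines:

```python
def spread(values: list) -> float:
    """Ratio of the largest to the smallest value."""
    return max(values) / min(values)


prr_spread = spread([6.93, 17.39, 71.42])
reports_spread = spread([2068, 3219, 18419])
years_spread = spread([9, 16, 69])

print(f"PRR spread: {prr_spread:.1f}x")            # 10.3x
print(f"FAERS report spread: {reports_spread:.1f}x")  # 8.9x
print(f"Market years spread: {years_spread:.1f}x")    # 7.7x
```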
Verdict
A generalizable assessment method produces structurally consistent results where pharmacology demands identity, and correctly differentiated results where pharmacology diverges.
The five convergence points (WHO-UMC, energy, highest void, signal detection, signal persistence) reflect genuine pharmacological universals — not framework artifacts. The 13 divergence points separate into three interpretable patterns (class effects, evidence maturity, pharmacological differentiation), each independently verifiable against published pharmacology.
This cross-validation answers the generalizability question: the multi-source signal detection method is not overfit to semaglutide, GLP-1 agonists, or pancreatitis. It produces pharmacologically sensible results across different drug classes, different adverse reaction types, different harm mechanisms, and different levels of market maturity.
For safety scientists: the method can be applied to novel drug-event pairs with reasonable confidence that the results will reflect real pharmacology rather than methodological bias.
Run Your Own Comparison
Every finding in this article was computed from live data using NexVigilant Station. You can reproduce any individual investigation or run the same method against a drug-event pair of your choosing.
Individual Investigations
- Semaglutide + Pancreatitis — Full 8-source investigation
- Liraglutide + Pancreatitis — GLP-1 RA class reference signal
- Metformin + Lactic Acidosis — Anatomy of a Boxed Warning
Connect Your AI Agent
Add NexVigilant Station to Claude, and ask it to investigate any drug-event pair. No API key required.
MCP Server URL: mcp.nexvigilant.com/mcp
Then try: "Compare the safety profiles of atorvastatin and rosuvastatin for rhabdomyolysis"
Methodology
All three investigations used the same 6-step protocol executed through NexVigilant Station (135 public tools across 18 configurations). Step 1: Drug identity resolution via NLM RxNav. Step 2: Adverse event retrieval from FDA FAERS via the openFDA API. Step 3: Disproportionality analysis (PRR, ROR, IC, EBGM) computed from 2x2 contingency tables sourced from OpenVigil France against the full FAERS database (~20 million reports). Step 4: Labeling review via NLM DailyMed. Step 5: Literature search via NLM PubMed. Step 6: Causality assessment using the Naranjo ADR Probability Scale and WHO-UMC system, computed by NexVigilant's pharmacovigilance calculation engine. Harm classification follows the Type A/B adverse drug reaction framework (Rawlins-Thompson). All computations are deterministic and reproducible.
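For readers unfamiliar with Step 3, the sketch below shows the textbook formulas for PRR, ROR (with a 95% confidence interval on the log scale), and an unshrunk information component from a 2x2 contingency table. It is an illustration under stated assumptions, not NexVigilant's or OpenVigil's implementation: the cell counts are hypothetical, the IC omits the Bayesian shrinkage used in BCPNN, and EBGM is left out because it requires fitting a prior to the whole database. It will not reproduce the values reported in this article.

```python
import math


def disproportionality(a: int, b: int, c: int, d: int) -> dict:
    """Compute PRR, ROR (with 95% CI), and unshrunk IC from a 2x2 table.

    a: reports with the drug and the event
    b: reports with the drug and other events
    c: reports with other drugs and the event
    d: reports with other drugs and other events
    """
    n = a + b + c + d
    # Proportional reporting ratio: event share for the drug vs. all other drugs.
    prr = (a / (a + b)) / (c / (c + d))
    # Reporting odds ratio and its 95% confidence interval on the log scale.
    ror = (a * d) / (b * c)
    se_log_ror = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ror_low = math.exp(math.log(ror) - 1.96 * se_log_ror)
    ror_high = math.exp(math.log(ror) + 1.96 * se_log_ror)
    # Information component without shrinkage: log2(observed / expected).
    expected = (a + b) * (a + c) / n
    ic = math.log2(a / expected)
    return {"PRR": prr, "ROR": ror, "ROR_CI": (ror_low, ror_high), "IC": ic}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
result = disproportionality(a=10, b=90, c=100, d=9800)
print(result)
```

With these counts the drug's event share is 10/100 against a background of 100/9,900, giving a PRR of 9.9 and an ROR of about 10.9.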
Limitations
FAERS is a spontaneous reporting database subject to reporting bias, under-reporting (estimated 90-95%), stimulated reporting (media coverage of GLP-1 agonists may inflate pancreatitis reports relative to metformin), and confounding by indication. Market years is an imperfect proxy for evidence maturity — prescribing volume, indication breadth, and media attention all confound the relationship between time on market and reporting volume. The 18-dimension comparison framework was constructed post hoc to analyze the results of three investigations; it has not been independently validated. Disproportionality measures detect statistical associations, not causal relationships. This analysis should be interpreted alongside clinical evidence, not as a standalone causal claim.
Disclosure
NexVigilant is an independent pharmacovigilance technology company. This research was generated using our own tools to demonstrate their capabilities and validate their generalizability. We have no financial relationship with the manufacturers of semaglutide, liraglutide, or metformin.