Published on 15/11/2025
Engineering Safety Definitions and Causality Judgments That Withstand Inspection
Shared Language for Safety—What the Terms Mean and Why Precision Matters
Safety decisions rise or fall on vocabulary. Teams that define adverse events the same way move faster, avoid rework, and survive audits because case handling is repeatable and traceable. The anchor is a harmonized lexicon for medicines and devices. From a global perspective, proportionate safety controls and the idea that evidence must be fit for purpose are consistent with principles discussed by the International Council for Harmonisation (ICH).
Core medicine terms. An Adverse Event (AE) is any untoward medical occurrence temporally associated with a medicinal product, whether or not related. An Adverse Drug Reaction (ADR) is a noxious and unintended response for which a causal relationship is at least a reasonable possibility. A Serious Adverse Event (SAE) is one that results in death, is life-threatening, requires or prolongs hospitalization, results in persistent or significant disability/incapacity, is a congenital anomaly/birth defect, or is another medically important event requiring intervention to prevent one of the foregoing. Severity describes intensity (mild/moderate/severe) and is distinct from seriousness, which is outcome-based. Expectedness compares an event’s nature or severity to the Investigator’s Brochure (IB) or Reference Safety Information (RSI); “unexpected” informs expedited reporting when combined with causality.
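The outcome-based seriousness criteria above can be encoded so triage applies them identically on every case. This is a minimal sketch; the field names are hypothetical, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass
class EventOutcome:
    """Hypothetical outcome flags mirroring the SAE criteria in the text."""
    death: bool = False
    life_threatening: bool = False
    hospitalization: bool = False      # new or prolonged
    disability: bool = False           # persistent or significant
    congenital_anomaly: bool = False
    medically_important: bool = False  # intervention needed to prevent the above

def is_serious(o: EventOutcome) -> bool:
    """Seriousness is outcome-based; severity (intensity) is tracked separately."""
    return any([o.death, o.life_threatening, o.hospitalization,
                o.disability, o.congenital_anomaly, o.medically_important])
```

Note that severity never appears in this function: a severe-intensity event with none of these outcomes remains non-serious.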
Device/device–drug specific terms. For device investigations, an Adverse Device Effect (ADE) is an untoward response to a device, and a Serious Adverse Device Effect (SADE) meets seriousness criteria. An Unanticipated ADE (UADE) is a serious effect not previously identified in nature, severity, or incidence, or one that presents increased risk. Device malfunctions that could lead to serious injury if they recurred are reportable even without injury; handling them consistently requires a clear taxonomy (hardware, software, usability) and a reproducible assessment of foreseeable harm.
Case validity, minimum criteria, and ALCOA++. For any safety case to exist, a minimum set of data must be present: an identifiable patient, an identifiable reporter, a suspect product/device, and a reportable event/problem. Evidence must meet ALCOA++ attributes—attributable, legible, contemporaneous, original, accurate, complete, consistent, enduring, and available. That translates operationally into immutable timestamps, version-controlled narratives, and reconciled listings between source and the safety database.
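The four minimum criteria can be enforced mechanically at intake. A sketch, assuming hypothetical key names for the four elements:

```python
def case_is_valid(case: dict) -> bool:
    """A safety case exists only when all four minimum criteria are present:
    identifiable patient, identifiable reporter, suspect product, reportable event."""
    required = ("identifiable_patient", "identifiable_reporter",
                "suspect_product", "reportable_event")
    return all(case.get(k) for k in required)
```

An intake form that blocks submission until `case_is_valid` returns True guarantees the regulatory clock starts on a genuine case, not a fragment.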
Common boundary issues. Not every abnormal lab result is an AE; classify findings as AEs only when clinically significant or when the protocol requires explicit reporting. Procedures are not events; complications are. Disease progression may be expected in oncology yet still be serious; causality and expectedness must be assessed rather than assumed. Overdose, medication error, or non-compliance are exposures of interest that can generate AEs or safety signals; they deserve explicit fields to avoid narrative ambiguity.
Seriousness vs. severity—why the distinction saves time. A severe headache (high intensity) may be non-serious if it does not meet outcome criteria; a mild anaphylaxis treated in the emergency department is serious because it is life-threatening. Keeping these concepts separate stabilizes triage and prevents “false SAEs” or missed expedited reports.
Expectedness and the reference set. For investigational products, expectedness is judged against the RSI section of the IB; for marketed comparators, use the local label/package insert. The reference must be current at the time of the event. Teams should maintain a change-controlled list of listed terms (MedDRA PTs) that represent the current “expectedness universe.” Failure to align to the governing version is a frequent inspection finding.
Decision hygiene. The roles are complementary: Investigators render a site-level medical judgment based on participant context; the sponsor (or independent safety physician) ensures consistency across cases and applies policy where uncertainty persists. When opinions differ, the most conservative plausible classification governs expedited reporting, while both views remain on file.
Causality Determination—From Medical Thinking to Reproducible Rules
Causality is a medical judgment structured by evidence. To make it reproducible across investigators, indications, and countries, teams codify the questions physicians already ask and drive to a transparent, auditable conclusion. Two families of tools dominate: category systems (e.g., WHO-UMC) and point-score systems (e.g., Naranjo). Either can work if definitions are unambiguous and training is reinforced with calibrated examples.
WHO-UMC categorical approach. Classify each case as Certain, Probable/Likely, Possible, Unlikely, Conditional/Unclassified, or Unassessable/Unclassifiable by answering evidence-based prompts: (1) Time to onset compatible with pharmacology or device exposure? (2) Alternative causes reasonably excluded (comorbidity, concomitants, procedural factors)? (3) Clinically plausible dechallenge and, where ethical and observed, rechallenge? (4) Objective evidence supporting the diagnosis (labs, imaging, device logs)? Document each dimension explicitly so the category flows naturally from the record.
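The four prompts can be captured as yes/no/unknown answers and mapped to a category. This is a deliberately simplified sketch of the flow described above, not the authoritative WHO-UMC algorithm; the real conclusion is a medical judgment that this structure merely documents.

```python
from typing import Optional

def who_umc_category(temporal: Optional[bool],
                     alternatives_excluded: Optional[bool],
                     dechallenge_positive: Optional[bool],
                     objective_evidence: Optional[bool]) -> str:
    """Map yes/no/unknown (True/False/None) answers to a WHO-UMC-style category.
    Simplified illustration only; edge cases require physician adjudication."""
    if temporal is None:
        return "Unassessable/Unclassifiable"
    if not temporal:
        return "Unlikely"
    if alternatives_excluded and dechallenge_positive and objective_evidence:
        return "Certain"
    if alternatives_excluded and dechallenge_positive:
        return "Probable/Likely"
    if alternatives_excluded is None or dechallenge_positive is None:
        return "Conditional/Unclassified"
    return "Possible"
```

Because each answer is recorded explicitly, the category "flows naturally from the record," as the text requires, and a second reviewer can reproduce it.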
Naranjo score as a tie-breaker. For certain drug cases, a point score (temporal sequence, dechallenge/rechallenge, dose-response, previous patient exposure, alternative explanations, objective confirmation) can resolve borderline categorical disagreements. Use it as a supporting analysis, not a replacement for clinical reasoning; devices and biologics often defy strict point models.
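A Naranjo-style tally can be sketched as follows. The point values below reflect the commonly published scale (confirm against your SOP's reference before use), and the answer keys are hypothetical names for illustration.

```python
# (answer key, points if "yes", points if "no"); "unknown" scores 0 everywhere.
NARANJO_ITEMS = [
    ("previous_conclusive_reports", 1, 0),
    ("event_after_drug",            2, -1),
    ("improved_on_dechallenge",     1, 0),
    ("recurred_on_rechallenge",     2, -1),
    ("alternative_causes",         -1, 2),
    ("reaction_to_placebo",        -1, 1),
    ("toxic_drug_level",            1, 0),
    ("dose_response",               1, 0),
    ("similar_past_reaction",       1, 0),
    ("objective_confirmation",      1, 0),
]

def naranjo(answers: dict) -> tuple:
    """Sum the ten items and bucket the total into the conventional labels."""
    score = 0
    for key, yes_pts, no_pts in NARANJO_ITEMS:
        a = answers.get(key, "unknown")
        score += yes_pts if a == "yes" else no_pts if a == "no" else 0
    label = ("definite" if score >= 9 else "probable" if score >= 5
             else "possible" if score >= 1 else "doubtful")
    return score, label
```

Treat the output as a supporting analysis filed next to the categorical judgment, never as the judgment itself.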
Temporality done correctly. Anchor causality in the first exposure capable of producing the effect and the earliest plausible onset. For chronic dosing, the “risk window” may be weeks; for infusion reactions, minutes. For devices, consider use cycles and task sequences: was a firmware update installed; did a battery fault arise; did user interface changes trigger an error cascade?
Alternative causes—make the search explicit. Record differential diagnoses and tests pursued, even when negative. Common confounders include illness progression, intercurrent infection, drug–drug interactions, and protocol-mandated procedures (e.g., biopsies). For device problems, isolate user error vs. hardware vs. software vs. manufacturing vs. instructions-for-use. When alternative causes remain plausible, categories usually land on Possible or Unlikely; explain why.
Dechallenge and rechallenge—interpret with care. Improvement after stopping drug or removing a device supports causality, but placebo effects and disease fluctuations can mislead. Rechallenge, when it occurs ethically, is persuasive only if exposure and outcome are well-documented and other conditions are unchanged. For hypersensitivity, a positive rechallenge is powerful; for idiosyncratic DILI, absence of rechallenge does not negate causality.
Biological plausibility and class effects. Mechanism and class history matter. A new agent in a class linked to QT prolongation deserves a lower threshold for “related.” For devices, materials, energy delivery mode, and human-factors similarity guide plausibility. Reference the IB/RSI science succinctly in the narrative to show that the judgment reflects known or suspected pharmacology or device behavior.
Severity bias and anchoring—how to prevent drift. Severe outcomes can bias toward “related” even with poor temporality; conversely, mild events are sometimes dismissed. Use a two-step process: assess seriousness and clinical priority for triage, then re-center on temporality/alternatives/dechallenge when assigning causality. Require a short, structured statement: “Onset X after dose Y; confounders Z; dechallenge/result; rationale → possible.”
Blinding, independence, and safety. Causality must not jeopardize the study blind. Establish a firewall: a blinded team records clinical details; an independent safety physician (or unblinded unit) reviews when needed, using code-protected product identifiers. If unblinding is required to protect participants, follow a pre-approved path that limits disclosure to the minimum necessary and records who learned what and why.
Special cases and populations. Pregnancy exposures, congenital anomalies, and neonatal outcomes need structured causality separate from general AE flow. For oncology (disease progression, immune-related events), anchor judgments in objective criteria and immune-specific algorithms. For vaccines or biologics, immunogenicity-type logic (temporal clusters, dechallenge uncommon) replaces classic small-molecule heuristics. For device usability problems, include human-factors evidence (training, labeling, lighting, language) in the causal chain.
From Causality to Expectedness—Linking Judgments to Expedited Reporting
Expectedness translates a causal medical judgment into regulatory timelines. The path is simple but unforgiving: if an event is related (at least reasonably possible) and unexpected (not consistent with the RSI/label), it becomes a candidate for expedited reporting (e.g., SUSAR for drugs, SADE/UADE for devices). A crisp, repeatable hand-off between causality and expectedness safeguards both speed and precision.
Define the reference set and keep it current. For investigational products, the RSI in the current IB is the governing reference at the time of onset. For marketed comparators or background therapies, the local label applies. Maintain a controlled mapping between MedDRA Preferred Terms in the case and the listed terms in the reference; include synonyms to avoid brittle matches. Document the version used at the time of assessment; auditors often ask for it.
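A change-controlled mapping can be modeled as dated RSI versions, with lookup keyed to the event's onset date. The version list and terms below are hypothetical placeholders:

```python
from datetime import date

# Hypothetical change-controlled versions: effective date -> listed terms
# (MedDRA PTs plus synonyms, stored lowercase for robust matching).
RSI_VERSIONS = [
    (date(2024, 1, 1), {"headache", "nausea"}),
    (date(2024, 9, 1), {"headache", "nausea", "hepatotoxicity"}),
]

def governing_listed_terms(onset: date) -> set:
    """Return the listed terms from the RSI version in force at event onset."""
    terms = set()
    for effective, listed in sorted(RSI_VERSIONS):
        if effective <= onset:
            terms = listed
    return terms

def is_unexpected(preferred_term: str, onset: date) -> bool:
    return preferred_term.lower() not in governing_listed_terms(onset)
```

Storing the version date alongside the assessment answers the auditor's question ("which reference governed?") directly from the record.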
Decision tree for medicines. (1) Does the case meet SAE criteria? If no, standard reporting. If yes, (2) Is causality at least “possible” per site or sponsor? If no, non-expedited SAE. If yes, (3) Is the event unexpected relative to the RSI? If yes, expedited (country timelines then govern). If expected, report as serious expected. Record both site and sponsor causality; the more conservative plausible view governs the expedited path.
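The three-step tree above reduces to a small function. A sketch with hypothetical flag names; note how the conservative causality rule is applied before branching:

```python
def medicine_reporting_path(serious: bool, site_related: bool,
                            sponsor_related: bool, unexpected: bool) -> str:
    """Steps (1)-(3) of the medicines decision tree; the more conservative
    of site and sponsor causality governs the expedited path."""
    if not serious:
        return "standard reporting"
    related = site_related or sponsor_related  # conservative view
    if not related:
        return "non-expedited SAE"
    return "expedited (SUSAR)" if unexpected else "serious expected"
```

Both causality inputs stay on file even though only the conservative combination drives the outcome.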
Decision tree for devices. (1) Is it a SADE or a malfunction that could cause serious injury if it recurred? If yes, (2) Is the effect unanticipated in nature, severity, or incidence relative to the risk analysis/IFU? If yes, expedited (UADE pathways); if no, serious expected device effect reporting. Malfunctions demand systematic root-cause thinking—design, materials, software, manufacturing, labeling, or use—so recurrence risk is addressed, not just paperwork.
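The device branch follows the same pattern; this sketch (hypothetical names again) makes explicit that a no-injury malfunction with serious potential enters the tree on equal footing with a SADE:

```python
def device_reporting_path(is_sade: bool, malfunction_serious_potential: bool,
                          unanticipated: bool) -> str:
    """Device decision tree: SADE or serious-potential malfunction, then
    anticipated vs. unanticipated relative to the risk analysis/IFU."""
    if not (is_sade or malfunction_serious_potential):
        return "standard reporting"
    if unanticipated:
        return "expedited (UADE pathway)"
    return "serious expected device effect"
```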
Case narratives that prove your conclusion. Narratives are not prose; they are structured evidence: context and history; exposure timelines; onset and course; tests and results; alternative causes considered; dechallenge/rechallenge; outcome; causality rationale; expectedness reference/version; and any actions taken. Pair each narrative with medical coding and seriousness/expectedness fields that match the text exactly; mismatches signal weak control.
Minimum information to start the clock. The moment a valid case exists (patient, reporter, suspect product, AE), clocks begin. If seriousness or relatedness is unknown, start follow-up immediately and capture the attempt. Record why the case was or was not considered related and how expectedness was determined. For timelines measured in calendar days, weekends/holidays count; build buffers into workflows rather than relying on exception handling.
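Because the timelines run in calendar days, deadline arithmetic must not use business-day logic, and the workflow buffer belongs in the target date, not in exception handling. A sketch (the two-day buffer is an illustrative choice, not a regulatory value):

```python
from datetime import date, timedelta

def submission_deadline(day_zero: date, window_days: int) -> date:
    """Calendar days: weekends and holidays count, so plain date arithmetic."""
    return day_zero + timedelta(days=window_days)

def internal_target(day_zero: date, window_days: int,
                    buffer_days: int = 2) -> date:
    """Build the buffer into the workflow rather than relying on exceptions."""
    return submission_deadline(day_zero, window_days) - timedelta(days=buffer_days)
```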
Consistency controls across systems. Reconcile the safety database to EDC/source so that date of onset, seriousness, relatedness, and event term match. If EDC collects site causality and the safety database stores sponsor causality, make the duality explicit and explain it in your data transfer agreement. In device studies, store supporting evidence (device logs, lot numbers, returned units) in a way that retrieval takes minutes, not hours.
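A periodic reconciliation job can reduce to a field-by-field comparison over the agreed set. A sketch with hypothetical field names; a real job would also log the case identifier and version of each record:

```python
# Fields that must match between EDC/source and the safety database.
RECONCILED_FIELDS = ("onset_date", "serious", "related", "event_term")

def reconcile(edc_case: dict, safety_case: dict) -> list:
    """Return the fields where the two systems disagree (empty list = clean)."""
    return [f for f in RECONCILED_FIELDS
            if edc_case.get(f) != safety_case.get(f)]
```

Where site and sponsor causality legitimately differ, store them in separate fields rather than letting `related` fail reconciliation on a designed duality.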
Training and calibration. Provide side-by-side examples that differ by a single fact (e.g., onset 2 hours vs. 2 weeks; dechallenge present vs. absent). Calibrate investigators on “possible vs. unlikely” with feedback loops in the first dozen cases. Use periodic case rounds to prevent drift; when the RSI updates, run a focused refresher because expectedness can flip on the same event as knowledge evolves.
Documentation that inspectors trust. Each expedited case should include: the causality statement, the expectedness reference with version/date, the seriousness criterion, and proof that submission occurred within timeline. File “what changed and why” memos for any re-classification (e.g., after new lab data) so a reviewer can follow the audit trail without guesswork.
Operating Model—Roles, Quality Signals, and a Ready-to-Use Checklist
Definitions and judgments become reliable only when the operating model is small, named, and governed. Keep decision rights close to the data, build in second-reader checks for edge cases, and monitor a few predictive indicators so problems surface before timelines fail.
Roles and firewalls. The Investigator documents site causality and seriousness with clinical context. An Independent Safety Physician (or sponsor medical monitor) adjudicates hard cases, protects consistency, and owns sponsor causality. A Safety Operations Lead owns intake, follow-up, and submissions. Data Management reconciles EDC to safety; Quality verifies ALCOA++ attributes and traceability. Firewalls prevent unnecessary unblinding; when unblinding for safety is necessary, the process should expose the minimum information to the minimum roles and record who learned what and why.
Decision templates that cut noise. Require short, structured fields: “Temporality: [days from last dose/use]; Alternatives: [list]; Dechallenge/Rechallenge: [summary]; Objective evidence: [tests, device logs]; Causality: [category plus one-sentence rationale].” Tie these fields to the narrative generator so repetition is eliminated and reviewers see the same evidence in the same order.
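Tying the structured fields to a narrative generator can be as simple as rendering them in a fixed order, so every reviewer sees the same evidence in the same sequence. A minimal sketch of that idea:

```python
def causality_statement(temporality: str, alternatives: str,
                        dechallenge: str, evidence: str,
                        category: str, rationale: str) -> str:
    """Render the required structured fields into one reviewer-facing line."""
    return (f"Temporality: {temporality}; Alternatives: {alternatives}; "
            f"Dechallenge/Rechallenge: {dechallenge}; "
            f"Objective evidence: {evidence}; "
            f"Causality: {category} ({rationale}).")
```

Because the narrative is generated from the coded fields, narrative/field mismatches (a classic inspection finding) cannot arise.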
Key risk indicators (KRIs) and quality tolerance limits (QTLs). Track early warnings: high proportion of “unassessable” causality; repeated mismatches between narrative and coded fields; spikes in “unknown seriousness” at intake; frequent RSI mismatches; device malfunctions coded without root-cause fields. Convert the most consequential to QTLs, for example: “>5% expedited cases missing explicit expectedness reference/version over any rolling month,” or “>10% of device malfunction cases submitted without recurrence risk assessment.” Crossing a QTL triggers a documented review and corrective plan.
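A QTL check of the kind quoted above is a straightforward proportion-over-threshold test. A sketch, using the 5% missing-reference example from the text:

```python
def qtl_breached(missing_reference: int, total_expedited: int,
                 limit: float = 0.05) -> bool:
    """True if the rolling-month proportion of expedited cases missing an
    explicit expectedness reference/version exceeds the QTL (default 5%)."""
    if total_expedited == 0:
        return False  # no expedited cases, nothing to breach
    return missing_reference / total_expedited > limit
```

Wiring this into a monthly dashboard turns the "documented review and corrective plan" trigger into an automatic event rather than a manual discovery.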
Calibration library and edge-case council. Maintain a library of anonymized, adjudicated cases with the final rationale. Once per quarter, convene a small council (safety physician, device engineer if applicable, data manager, quality) to review disagreements and update examples. Add new patterns (e.g., immune-related events, human-factors induced device errors) so training reflects emerging reality.
Audit posture and five-minute retrieval. Inspectors commonly ask for a random case and expect to see: intake timestamp, minimum criteria, site and sponsor causality, seriousness, expectedness reference, narrative, coding, follow-up attempts, submission proof, and reconciliation to EDC/source. Practice a monthly retrieval drill from a dashboard case number to the artifact set; if retrieval exceeds five minutes, fix metadata, filing locations, or the one-record-of-record rule.
Ready-to-use checklist (paste into your SOP or study safety plan).
- Harmonized definitions issued (AE, ADR, SAE, seriousness vs. severity; ADE/SADE/UADE and malfunction for devices) with examples relevant to the protocol.
- Minimum case criteria enforced at intake; ALCOA++ attributes verified; timestamps and versions immutable.
- Causality tool selected (WHO-UMC categories ± Naranjo as tie-breaker) and embedded in forms with required rationale fields.
- Temporality windows defined by mechanism and route; device use cycles and logs captured where relevant.
- Alternative cause prompts active (comorbidity, concomitants, procedures; human-factors and manufacturing for devices).
- Dechallenge/rechallenge rules clarified; ethics constraints documented; statements standardized.
- Expectedness mapped to current RSI/label; MedDRA mapping table version-controlled; synonyms and cross-references maintained.
- Decision trees implemented for medicines (SAE + related + unexpected → expedited) and devices (SADE/UADE + unanticipated or malfunction with serious potential → expedited).
- Narrative template ensures context, exposure timeline, onset/course, tests, alternatives, dechallenge/rechallenge, rationale, expectedness reference/version, actions.
- Dual causality (investigator vs. sponsor) captured; conservative plausible assessment governs expedited path; both views archived.
- Reconciliation rules between EDC and safety database documented; periodic checks scheduled; discrepancies corrected with audit trail.
- Unblinding for safety pre-authorized path defined; minimum necessary disclosure; access logs retained.
- KRIs/QTLs monitored; red thresholds trigger formal review and documented corrective plans.
- Calibration library and quarterly edge-case council in place; training updates after RSI changes.
- Five-minute retrieval drill passed for a random expedited case; inspection pack generator available on demand.
Bottom line. When safety definitions are unambiguous and causality judgments follow explicit, trained rules, expedited reporting becomes fast and defensible. A small set of decision prompts, calibrated examples, clear expectedness mapping, and disciplined traceability will keep teams aligned, protect participants, and stand up to inspection across regions and study types.