Published on 15/11/2025
Estimands and Intercurrent Events: Framing the Question So the Answer Stands Up to Review
Estimands 101: Define the Decision Question Before You Touch a Dataset
Estimands are the backbone of modern clinical inference: they make explicit what effect a trial seeks to estimate, for which population, under what handling of intercurrent events, and using which summary measure. The ICH E9(R1) addendum anchors this thinking and is widely recognized by authorities such as the ICH assembly, the U.S. FDA, the EMA, Japan’s PMDA, and Australia’s TGA. A compliant estimand specification contains four parts: (1) treatment condition(s); (2) population (e.g., all randomized eligible participants); (3) variable/endpoint (definition and timing); and (4) intercurrent-event strategy together with the population-level summary (e.g., mean difference, risk ratio, hazard ratio). By nailing these choices down up front, you keep design, conduct, analysis, and interpretation synchronized.

Why sponsors stumble. Historically, protocols named endpoints and analysis models but left ambiguous what to do when real-world events occurred: discontinuation, rescue medication, switching therapies, death, or prohibited concomitants. The estimand framework removes that ambiguity, replacing “we will handle appropriately” with a transparent, auditable plan that statisticians can implement and regulators can inspect.

Tie to clinical objectives. Start from the decision the evidence must support. If prescribers and payers need to know the expected treatment effect in routine use, even when patients switch or use rescue, a treatment-policy strategy is coherent. If the scientific question is mechanistic (what would happen if no rescue were used), a hypothetical strategy is apt. If taking rescue is itself a clinically meaningful failure, a composite strategy suits. If benefit is only relevant while patients persist on assigned therapy, a while-on-treatment strategy fits. And if the question concerns a subgroup defined by potential intercurrent events, a principal-stratum strategy may be considered (with care).

Population and timing matter. Pre-specify the population (all randomized vs a biomarker-positive subset), the time horizon (e.g., change from baseline at Week 24, 52-week risk, or time to composite failure), and any windows or grace periods. In time-to-event frameworks, clarify the origin (randomization vs first dose), competing risks, and whether events after switching count.

Inspectability by design. Estimands are not stand-alone prose. They must map to design choices (sample size, randomization strata), data flows (capturing intercurrent-event metadata), and analysis code (SAP, programs, and outputs). File point-in-time configuration snapshots for forms, edit checks, and visit windows so reviewers can reconstruct what was in force when decisions and events occurred.

What qualifies as an intercurrent event? Any post-randomization occurrence that affects either the interpretation or the existence of the endpoint: treatment discontinuation, rescue/alternative treatments, dose changes beyond protocol, prohibited concomitants, surgical interventions, death, COVID-19 disruptions, or even device replacements in device trials. The key is pre-specification and consistent capture.

Strategy catalogue. The five strategies above (treatment policy, hypothetical, composite, while-on-treatment, principal stratum) form the catalogue from which each pre-specified intercurrent event is assigned a handling rule.

Choosing coherently. The same trial can host different estimands for different stakeholders. For example, a chronic-disease study might pair (1) a treatment-policy primary estimand for labeling with (2) a hypothetical supportive estimand to explore the pharmacologic effect absent rescue. Pre-plan multiplicity and the reporting hierarchy to avoid confusion.

Time-to-event specifics. In survival analyses, specify whether post-discontinuation events count, whether switching triggers censoring (and if so, whether to use methods such as inverse probability of censoring weighting, rank-preserving structural failure time models, or two-stage estimators), and how competing risks are handled.
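Stepping back from survival specifics, the four-part specification and five handling strategies above can be captured in one machine-checkable record, which helps keep protocol prose, data flows, and analysis code synchronized. The sketch below is purely illustrative: the `Estimand` class, its field names, and the example values are hypothetical, not drawn from any standard or library.

```python
from dataclasses import dataclass
from enum import Enum

class ICEStrategy(Enum):
    # The five intercurrent-event handling strategies named in ICH E9(R1)
    TREATMENT_POLICY = "treatment policy"
    HYPOTHETICAL = "hypothetical"
    COMPOSITE = "composite"
    WHILE_ON_TREATMENT = "while on treatment"
    PRINCIPAL_STRATUM = "principal stratum"

@dataclass(frozen=True)
class Estimand:
    # Hypothetical container mirroring the four-part specification
    treatment: str        # (1) treatment condition(s)
    population: str       # (2) e.g., "all randomized eligible participants"
    endpoint: str         # (3) variable definition and timing
    ice_strategies: dict  # (4a) intercurrent event -> ICEStrategy
    summary_measure: str  # (4b) population-level summary

primary = Estimand(
    treatment="Drug X 10 mg vs placebo",
    population="All randomized participants (ITT)",
    endpoint="Change from baseline in HbA1c at Week 24",
    ice_strategies={
        "rescue medication": ICEStrategy.TREATMENT_POLICY,
        "treatment discontinuation": ICEStrategy.TREATMENT_POLICY,
        "death": ICEStrategy.COMPOSITE,
    },
    summary_measure="Difference in least-squares means",
)

# Every pre-specified intercurrent event must map to exactly one strategy
assert all(isinstance(s, ICEStrategy) for s in primary.ice_strategies.values())
```

A record like this makes it easy to verify, before the SAP is finalized, that every event in the intercurrent-event taxonomy has exactly one pre-specified handling rule per estimand.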
Non-proportional hazards are common when switching or rescue occurs late; consider estimands based on restricted mean survival time (RMST) or milestone survival.

Responder and composite scales. For binary responder endpoints, define responder criteria that incorporate intercurrent events (e.g., responder only if the target biomarker threshold is met and no rescue is taken). For ordinal scales, specify how the event maps to categories (e.g., “worst category” upon rescue or surgery). Keep the mapping clinically motivated and symmetric across arms.

Device and procedure nuance. In device trials, intercurrent events include explant, revision, or crossover procedures. Clarify whether outcomes post-revision belong to the original strategy (treatment policy) or require a composite failure rule.

Document the taxonomy. Build a codebook of intercurrent events in the Data Management Plan: definitions, triggers, capture methods (EDC, IRT, eCOA, medical review), timestamp requirements (local time + UTC offset), and links to analysis flags. This turns real-world messiness into analysable, inspectable structure.

Protocol precision. Place formal estimand statements in the objectives/endpoints section, cross-referencing the intercurrent-event taxonomy. For each estimand, specify the population, endpoint definition/timing, strategy, and summary measure. Write in testable terms (“treatment-policy estimand for mean change at Week 24, ITT population, difference in least-squares means”).

Sample size implications. Estimands influence variance and event rates. A treatment-policy approach may increase variability due to heterogeneous post-event behavior and thus require a larger N. Composite strategies can increase event rates (improving power) but may dilute clinical meaning if the composite is dominated by soft components. Survival estimands that censor at switching usually change the effective information and mandate simulation to understand power under plausible switching patterns.

CRF and system design.
Build data capture around estimands: fields and edit checks for rescue, discontinuation, switching, prohibited concomitants, surgery, and reasons. In IRT/IAM, capture emergency unblinding with timestamps and rationale. Ensure audit trails record who/what/when/why with local time and UTC offset so an inspector can reconstruct the sequence of events relative to visit windows and endpoint timing.

Programming & derivations. In ADaM, create explicit flags/variables to represent intercurrent events (e.g., ASEVNT types, ONTRTFL, SWITCHDT, RESCUEFL). For hypothetical strategies, implement multiple imputation or model-based approaches aligned with the SAP, carrying through stratification and covariates. For while-on-treatment strategies, define rules for truncation and document how missingness before truncation is handled.

Missing data vs intercurrent events. Distinguish the two. An intercurrent event handled by the estimand strategy is not “missing”: it is part of the definition. Only data absent relative to the chosen strategy are “missing.” In treatment-policy estimands, post-event observations are observed data; in hypothetical estimands, data after the event are typically missing by design and require assumption-driven imputation or modeling. State the assumed mechanisms (MAR/MNAR) and conduct tipping-point analyses that vary plausible departures.

Switching and advanced methods. If censoring at switching is used, pre-specify methods to mitigate bias from informative censoring. Options include inverse probability of censoring weighting (with robust variance), rank-preserving structural failure time models, and two-stage estimators. Simulate operating characteristics under realistic switching distributions and covariate patterns; retain code, seeds, and versions for inspection.

Blinded Data Review (BDR). Schedule a BDR to verify definitional logic without breaking the blind: check rescue/switch flags, discontinuation codes, and time anchors; confirm consistency across EDC, IRT, safety, and adjudication data. Document what may be corrected (e.g., inconsistent dates) and what may not be altered (e.g., outcomes themselves), with approvals and audit trails.

Transparency in the SAP. For each estimand, the SAP should name the model (e.g., ANCOVA, MMRM, stratified Cox, RMST), the covariates, and the exact handling of the intercurrent events and their flags. For hypothetical approaches, specify imputation models (variables, visit structure, number of imputations, delta adjustments). For composite and while-on-treatment strategies, define censoring, failure, and windows precisely. Include a mapping table from the estimand prose → ADaM variables/flags → TFL shells.

Communications and labeling. Plan how estimands appear in the CSR and labeling language. Use phrasing that reflects the strategy (“among all randomized participants regardless of rescue,” “among participants while persisting on assigned treatment,” “in a hypothetical scenario without rescue therapy”). Clarity here reduces misinterpretation in HTA/payer submissions.

Regulatory Evidence Pack: Common Pitfalls, Sensitivity Arsenal, and a One-Page Checklist

What reviewers will ask for quickly.

Sensitivity analyses: organized, not opportunistic. Pre-define a reference (primary) analysis and a structured set of sensitivity analyses that probe departures from the assumptions relevant to the chosen strategy. Examples: (a) hypothetical imputation with alternative deltas; (b) treatment-policy MMRM vs MI; (c) survival analyses using RMST or milestone survival; (d) IPCW with different covariate sets; (e) per-protocol supportive estimation under while-on-treatment strategies. Present results with clear directionality (how assumptions shift estimates) rather than a laundry list of p-values.

Frequent failure modes and durable fixes.

Program-level KPIs to demonstrate control.

Study-ready checklist (single page).

Bottom line. Estimands convert an implicit research question into explicit, inspectable intent. When you choose coherent strategies for intercurrent events, capture them faithfully, and implement analyses that match the prose, with sensitivity analyses that reveal how assumptions matter, your results will be reproducible, intelligible, and persuasive to assessors at the FDA, the EMA, the PMDA, and the TGA, consistent across the ICH community, and aligned with the WHO public-health perspective.
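To ground the programming discussion, here is a simplified sketch of a per-observation intercurrent-event flag derivation. ONTRTFL, SWITCHDT, and RESCUEFL echo the ADaM-style names used above; the function itself, the HYPOMISS flag, and the date logic are hypothetical illustrations, not a standard derivation.

```python
from datetime import date
from typing import Optional

def derive_ice_flags(adt: date,
                     trtsdt: date,
                     trtedt: Optional[date],
                     rescue_start: Optional[date],
                     switch_date: Optional[date]) -> dict:
    """Derive intercurrent-event flags for one observation (simplified sketch).

    adt            -- analysis date of the observation
    trtsdt/trtedt  -- first/last dose dates of randomized treatment
    rescue_start   -- start date of rescue medication, if any
    switch_date    -- date of switch to alternative therapy, if any
    """
    on_trt = trtsdt <= adt and (trtedt is None or adt <= trtedt)
    rescued = rescue_start is not None and rescue_start <= adt
    switched = switch_date is not None and switch_date <= adt
    return {
        "ONTRTFL": "Y" if on_trt else "N",
        "RESCUEFL": "Y" if rescued else "N",
        "SWITCHFL": "Y" if switched else "N",
        "SWITCHDT": switch_date,
        # HYPOMISS is a hypothetical helper flag: under a "no rescue"
        # hypothetical strategy, observed values after rescue are set aside
        # and later multiply imputed; under treatment policy they remain
        # observed data.
        "HYPOMISS": "Y" if rescued else "N",
    }

flags = derive_ice_flags(adt=date(2025, 6, 1),
                         trtsdt=date(2025, 1, 15),
                         trtedt=None,
                         rescue_start=date(2025, 5, 10),
                         switch_date=None)
# e.g., flags["ONTRTFL"] == "Y" and flags["RESCUEFL"] == "Y"
```

Deriving all flags in one auditable function (with its inputs traceable to EDC/IRT fields and its outputs to SAP analysis rules) is one way to make the prose → ADaM → TFL mapping inspectable.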