
Real-World Evidence in Drug Development: What It Is and Why It Matters

Real-world evidence is reshaping drug development, from regulatory submissions to post-market surveillance. This guide explains the distinction between RWD and RWE, the regulatory frameworks that govern their use, and why emerging market data represents the next frontier.

Kapsule Research Team | 27 February 2026 | 16 min read

The pharmaceutical industry spent decades treating randomised controlled trials as the only credible source of clinical evidence. That position is shifting. Regulators, payers, and sponsors now recognise that data generated outside the controlled trial environment, from hospital records, insurance claims, patient registries, and wearable devices, can answer questions that traditional trials cannot.

This shift has real implications for how sponsors design development programmes, how regulators evaluate safety and efficacy, and which patient populations end up in the evidence base. It also raises a question that most RWE strategies have yet to address: what happens when the vast majority of your real-world data comes from a narrow slice of the global population?

This guide covers the fundamentals of real-world evidence, the regulatory frameworks governing its use, the practical requirements for generating regulatory-grade RWE, and why the next wave of evidence generation will depend on data from regions that the industry has historically overlooked.

Real-World Evidence vs Real-World Data: Definitions That Matter

The terms "real-world data" and "real-world evidence" are frequently used interchangeably. They should not be. The distinction is functional: it determines what you can and cannot do with your data in a regulatory context.

Real-world data (RWD) refers to the raw information collected during routine clinical care or from other non-interventional sources. Electronic health records (EHRs), medical claims databases, disease registries, patient-reported outcome surveys, and data from wearable health devices all qualify as RWD. The defining characteristic is that the data is not collected under the controlled conditions of a prospective clinical trial.

Real-world evidence (RWE) is the clinical evidence derived from the analysis of RWD. When a sponsor applies a validated study design and rigorous analytical methods to a real-world dataset, the output is RWE. The same underlying dataset can produce evidence of varying quality depending on the analytical approach, the completeness of the data, and the appropriateness of the study design for the clinical question being asked.

The FDA's 2018 Framework for FDA's Real-World Evidence Program formalised this distinction. Under the framework, RWD is a potential input; RWE is a potential output. The quality and reliability of the evidence depend on both the fitness of the underlying data and the rigour of the analytical methods applied to it.

This means that a health data platform with millions of patient records is not, by itself, producing real-world evidence. It is producing real-world data. The evidence emerges when researchers design studies against that data with pre-specified endpoints, appropriate comparators, and methods that account for the biases inherent in observational data: confounding, selection bias, information bias, and missing data.

Understanding this distinction is essential for anyone building an RWE strategy. Investing in data acquisition without investing equally in study design and analytical capability will produce datasets, not evidence.

How RWE Is Used in Drug Development Today

Real-world evidence now plays a role across the full drug development lifecycle. The most established applications include:

Post-market safety surveillance. This is the longest-standing use of RWE and remains the most widely accepted by regulators. Once a drug reaches the market, RWD from claims databases, EHRs, and spontaneous adverse event reports provides the scale needed to detect rare safety signals that clinical trials, typically powered for efficacy rather than safety, cannot identify. The FDA's Sentinel System, which monitors safety across a distributed network of claims and EHR databases covering hundreds of millions of patients, is the largest example of this approach.

Label expansion and new indications. Sponsors increasingly use RWE to support supplemental New Drug Applications (sNDAs) for new indications, new patient populations, or modified dosing regimens. The 21st Century Cures Act, signed into law in 2016, explicitly directed the FDA to evaluate the use of RWE for approving new indications of previously approved drugs. Several label expansions have since been supported, in part, by real-world evidence.

Clinical trial optimisation. RWD is used to identify optimal trial sites, refine eligibility criteria, estimate recruitment timelines, and construct external control arms. Site selection informed by claims data or EHR data can reduce screen failure rates and accelerate patient recruitment by matching trial requirements to the patient populations that sites actually serve, rather than relying on investigator estimates.

Health technology assessment (HTA) and market access. Payers and HTA bodies in Europe, the UK, and now in Asia and Latin America use RWE to assess comparative effectiveness: how a drug performs relative to existing standard of care in routine clinical practice, not just versus placebo in a controlled setting. For sponsors, generating RWE early in the development programme can strengthen the market access dossier and reduce time to formulary listing.

Regulatory decision-making for rare diseases and oncology. In therapeutic areas where traditional RCTs are difficult or impossible to conduct, such as rare diseases with small patient populations or oncology indications with rapid treatment evolution, RWE can serve as a primary or supplementary source of evidence. Single-arm trial designs supplemented by external comparator arms built from RWD have become increasingly common in oncology drug development.

Regulatory Frameworks: FDA, EMA, and Emerging Markets

Regulatory acceptance of RWE varies by jurisdiction, but the trend is toward greater openness.

United States (FDA). The FDA's engagement with RWE accelerated following the 21st Century Cures Act and the publication of the 2018 Framework for FDA's Real-World Evidence Program. The framework outlined the agency's approach to evaluating RWE for effectiveness decisions and committed to a multi-year programme of demonstration projects. Since then, the FDA has issued multiple guidance documents addressing specific aspects of RWE generation, including guidance on the use of EHRs and claims data, registries, and data standards. The FDA has also approved or supported regulatory decisions using RWE in several cases, particularly in oncology and rare diseases, though the agency continues to evaluate submissions on a case-by-case basis rather than applying blanket acceptance.

European Union (EMA). The EMA has pursued RWE integration through several initiatives, most notably the DARWIN EU (Data Analysis and Real World Interrogation Network) programme, whose coordination centre was established in 2022 and which became fully operational in 2024. DARWIN EU provides the EMA with a standing capability to run real-world evidence studies using European health data sources, supporting regulatory decision-making across the product lifecycle. The EMA has also published its own guidance on registry-based studies and is increasingly receptive to RWE in benefit-risk assessments, particularly for conditional marketing authorisations and post-authorisation safety studies.

Emerging market regulators. Regulatory agencies in Africa, Southeast Asia, and Latin America are at earlier stages of formalising their approaches to RWE. However, several are moving quickly. The South African Health Products Regulatory Authority (SAHPRA) has signalled openness to RWE in its evaluation of medicines for local registration. Nigeria's NAFDAC and Kenya's Pharmacy and Poisons Board are developing frameworks for post-market surveillance that will rely on real-world data infrastructure. For sponsors conducting multi-regional development programmes, understanding the RWE readiness of emerging market regulators is becoming a practical necessity.

RWE complements randomised controlled trials by filling gaps that trials alone cannot address. Sponsors who build RWE capability now will have an advantage as regulatory frameworks mature.

Sources of Real-World Data: EHRs, Claims, Registries, and More

The quality and utility of real-world evidence depend entirely on the quality and fitness of the underlying data. Not all RWD sources are equivalent, and each carries distinct strengths and limitations.

Electronic health records (EHRs). EHR data provides rich clinical detail (diagnoses, lab results, medications, procedures, clinician notes) and is available at growing scale. In the US, companies like Flatiron Health (oncology-focused EHR data) and health systems participating in research networks like PCORnet provide curated EHR datasets for research. The primary limitation of EHR data is that it reflects clinical documentation practice, not research protocol. Diagnoses may be coded for billing convenience rather than clinical precision. Lab values may be missing for patients who do not return for follow-up. Medication records may not capture adherence.

Claims databases. Administrative claims data from insurers and pharmacy benefit managers, such as the datasets maintained by Optum, IQVIA, and Merative (formerly IBM Watson Health), provides broad population coverage and longitudinal follow-up. Claims data excels at capturing healthcare utilisation patterns, treatment sequences, and costs. Its weakness is clinical depth: claims data tells you what was billed, not necessarily what happened clinically. Diagnoses are limited to ICD codes, and clinical outcomes like lab values, staging, or patient-reported symptoms are absent.

Disease registries. Purpose-built registries like the SEER cancer registry, cystic fibrosis registries, or transplant registries offer deep, structured data on specific conditions. They are often the gold standard for RWE in rare diseases and oncology. However, registries are expensive to build and maintain, may have selection bias (patients enrolled at academic centres may not be representative), and typically cover only the condition of interest, limiting the ability to study comorbidities or concomitant medications.

Patient-generated data. Wearable devices, mobile health applications, and patient-reported outcome platforms are emerging as supplementary RWD sources. These provide continuous, real-time data on activity levels, heart rate, sleep, glucose, and patient-reported symptoms. The challenge is standardisation: device-level variability, inconsistent wear time, and the lack of regulatory-grade validation for most consumer devices limit their utility for regulatory submissions today. However, the FDA's Digital Health Center of Excellence is actively working to establish frameworks for incorporating digital health data into regulatory evidence packages.

Emerging market health data. Health systems in low- and middle-income countries generate millions of patient encounters annually, but this data has historically been inaccessible for research: fragmented across paper records, unstandardised EMR systems, and facilities with no data-sharing infrastructure. This is changing, particularly in Africa, where digital health investment has accelerated the transition from paper to electronic records and created new opportunities for real-world data aggregation.

Data Quality Requirements for Regulatory-Grade RWE

Generating real-world evidence that regulators will accept requires meeting quality thresholds that go well beyond basic data availability. The FDA's guidance documents and the EMA's methodological standards outline several dimensions of data quality that sponsors must address.

Relevance. The data must be fit for the specific research question. A claims database may be ideal for studying treatment patterns but inadequate for evaluating clinical outcomes that require lab values or imaging data. Before committing to a data source, sponsors should map the available data elements against the study endpoints and confirm that the data can support the intended analysis.
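
The mapping exercise described above can be sketched in a few lines of Python. This is a minimal illustration, not a validated feasibility tool: the endpoint names, data element names, and the claims-style source are all hypothetical, and a real assessment would also examine how each element is defined and populated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """A study endpoint and the data elements needed to derive it (hypothetical)."""
    name: str
    required_elements: frozenset


def coverage_gaps(endpoints, available_elements):
    """Return {endpoint name: sorted missing elements} for unsupported endpoints."""
    available = set(available_elements)
    gaps = {}
    for ep in endpoints:
        missing = sorted(ep.required_elements - available)
        if missing:
            gaps[ep.name] = missing
    return gaps


# Hypothetical endpoints for a diabetes study
endpoints = [
    Endpoint("treatment_sequence", frozenset({"drug_code", "dispense_date"})),
    Endpoint("hba1c_response", frozenset({"drug_code", "lab_hba1c", "lab_date"})),
]

# Hypothetical element list for a claims-style source (no lab values)
claims_source = {"drug_code", "dispense_date", "icd_code", "paid_amount"}

gaps = coverage_gaps(endpoints, claims_source)
for endpoint_name, missing in gaps.items():
    print(f"{endpoint_name}: missing {missing}")
```

In this toy case the claims-style source supports the treatment pattern endpoint but not the lab-based outcome, which mirrors the limitation described above.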

Reliability. Data must be collected consistently and accurately over time. This means evaluating coding practices (are diagnoses assigned by clinicians or by coders working from incomplete documentation?), data capture completeness (what percentage of visits have complete medication, lab, and diagnosis records?), and temporal consistency (have coding practices or data capture methods changed during the study period in ways that could introduce bias?).
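
One simple way to look for the temporal inconsistency described above is to track record completeness by period. The sketch below assumes a hypothetical visit-level extract with `year`, `diagnosis`, `medication`, and `lab` fields; a sharp change between years would be a prompt to investigate coding or capture changes, not proof of one.

```python
from collections import defaultdict


def completeness_by_period(visits, required_fields, period_key="year"):
    """Share of visits per period in which every required field is populated."""
    totals = defaultdict(int)
    complete = defaultdict(int)
    for visit in visits:
        period = visit[period_key]
        totals[period] += 1
        if all(visit.get(field) is not None for field in required_fields):
            complete[period] += 1
    return {period: complete[period] / totals[period] for period in sorted(totals)}


# Hypothetical visit records; None marks a value that was never captured
visits = [
    {"year": 2019, "diagnosis": "E11", "medication": "metformin", "lab": None},
    {"year": 2019, "diagnosis": "E11", "medication": "metformin", "lab": 7.8},
    {"year": 2020, "diagnosis": "E11", "medication": "metformin", "lab": 8.1},
    {"year": 2020, "diagnosis": "I10", "medication": "amlodipine", "lab": 5.6},
]

trend = completeness_by_period(visits, ["diagnosis", "medication", "lab"])
for year, share in trend.items():
    print(year, f"{share:.0%} of visits complete")
```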

Completeness. Missing data is the most common threat to the validity of RWE studies. Regulators expect sponsors to characterise the pattern and extent of missing data, justify the analytical approach used to handle it (complete case analysis, multiple imputation, inverse probability weighting), and conduct sensitivity analyses to demonstrate that conclusions are robust to plausible assumptions about the missing data.
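
As a starting point for characterising the extent of missing data, the sketch below reports the missing fraction per field and shows how many records survive a complete case analysis. The records and field names are hypothetical. The sketch only describes missingness; it does not implement multiple imputation or inverse probability weighting, which would normally rely on an established statistical package.

```python
def missingness_report(records, fields):
    """Fraction of records with a missing (None) value, per field."""
    n = len(records)
    if n == 0:
        return {}
    return {f: sum(r.get(f) is None for r in records) / n for f in fields}


def complete_cases(records, fields):
    """Records in which every listed field is populated."""
    return [r for r in records if all(r.get(f) is not None for f in fields)]


# Hypothetical analytical dataset; None marks a missing value
records = [
    {"age": 54, "hba1c": 7.9, "egfr": 88},
    {"age": 61, "hba1c": None, "egfr": 75},
    {"age": 47, "hba1c": 8.4, "egfr": None},
    {"age": 70, "hba1c": None, "egfr": 60},
]
fields = ["age", "hba1c", "egfr"]

report = missingness_report(records, fields)
kept = complete_cases(records, fields)

for field, fraction in report.items():
    print(f"{field}: {fraction:.0%} missing")
print(f"{len(kept)} of {len(records)} records remain under complete case analysis")
```

Here a complete case analysis would discard three of four records, the kind of loss that would need to be justified and stress-tested in sensitivity analyses.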

Traceability and provenance. For regulatory submissions, sponsors must be able to document the chain of custody from raw data source to final analytical dataset. This includes data processing steps, de-identification methods, linkage algorithms (if multiple data sources are combined), and version control. The level of documentation expected is comparable to what sponsors provide for clinical trial databases, a significant step up from the ad hoc data handling common in academic RWE studies.
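
One possible way to make the chain of custody auditable is to fingerprint the dataset before and after each processing step. The sketch below is an assumption about how such a log could look, not a regulatory requirement or a standard format: the step names and records are hypothetical, and dropping a name field is a stand-in for, not an example of, adequate de-identification.

```python
import hashlib
import json


def fingerprint(rows):
    """SHA-256 over a canonical JSON serialisation of the rows."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_step(rows, name, transform, log):
    """Apply one processing step and record input/output fingerprints."""
    result = transform(rows)
    log.append({
        "step": name,
        "input_sha256": fingerprint(rows),
        "output_sha256": fingerprint(result),
        "rows_in": len(rows),
        "rows_out": len(result),
    })
    return result


# Hypothetical raw extract
raw = [
    {"patient_name": "A. Example", "age": 17, "dx": "E11"},
    {"patient_name": "B. Example", "age": 52, "dx": "E11"},
]

log = []
no_names = run_step(
    raw,
    "drop_direct_identifiers",
    lambda rows: [{k: v for k, v in r.items() if k != "patient_name"} for r in rows],
    log,
)
final = run_step(
    no_names,
    "restrict_to_adults",
    lambda rows: [r for r in rows if r["age"] >= 18],
    log,
)

for entry in log:
    print(entry["step"], entry["rows_in"], "->", entry["rows_out"])
```

Because each step's input fingerprint should equal the previous step's output fingerprint, a reviewer can check that no undocumented transformation happened between steps.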

Representativeness. A dataset that captures only patients at academic medical centres, or only patients with commercial insurance, or only patients in high-income countries may produce evidence that does not generalise to the broader population that will use the drug. This is where the question of population diversity becomes directly relevant to RWE strategy.

The Emerging Markets Opportunity: Africa's Untapped RWD

Real-world data today is overwhelmingly concentrated in North America and Western Europe. The major RWD vendors (Flatiron, Optum, IQVIA, Merative) draw primarily from US and European health systems. This creates a structural gap in the global evidence base.

Think about what this means in practice. A sponsor building an external comparator arm from US claims data is constructing an evidence package that reflects the demographics, comorbidity profiles, treatment patterns, and healthcare utilisation of a high-income, predominantly insured population. For therapies intended for global markets, or for conditions with different epidemiology in low- and middle-income countries, this evidence base is incomplete.

Africa is arguably the largest untapped source of real-world health data globally. The continent accounts for nearly one-fifth of the world's population and carries a disproportionate share of the global burden of infectious disease, maternal and child health conditions, and increasingly, non-communicable diseases like diabetes, cardiovascular disease, and cancer. Yet African patient data is almost entirely absent from the global RWE ecosystem.

This gap matters for several reasons:

Genetic and pharmacogenomic diversity. African populations carry the greatest genetic diversity of any continental group. Drug metabolism pathways, particularly CYP450 enzyme variants, differ significantly across African populations. RWE generated exclusively from European-ancestry populations may not accurately reflect the safety and efficacy profile of a drug in African patients. As precision medicine advances, pharmacogenomic data from diverse populations will become essential for label development and dosing guidance.

Disease presentation and comorbidity patterns. The co-occurrence of infectious diseases (HIV, TB, malaria, hepatitis B) with non-communicable conditions creates comorbidity profiles that are rarely observed in high-income country datasets. Understanding how drugs perform in patients managing these combinations requires RWD from settings where these combinations are common.

Treatment-naïve populations. In many African settings, patients present to care with limited prior pharmaceutical exposure. For sponsors studying first-line therapies or drugs where prior treatment history is a confounding factor, treatment-naïve populations offer cleaner observational datasets.

Regulatory momentum. As African regulatory agencies develop their own frameworks for RWE, sponsors who have already generated evidence from African patient populations will be better positioned to support local registration and market access applications.

The practical barriers to accessing African RWD (fragmented health systems, inconsistent data standards, complex multi-jurisdiction ethics requirements) are real but increasingly surmountable. Digital health infrastructure across East and West Africa has matured considerably. Countries like Kenya, with OpenMRS deployed across thousands of facilities, and Rwanda, with near-universal EMR coverage and national ID linkage, now produce structured clinical data at research-ready quality levels. Kapsule, for example, aggregates structured, de-identified health records from facility networks across multiple African countries, applying quality controls and standardisation that make the data usable for research and regulatory purposes. The question is no longer whether African RWD is accessible, but whether sponsors will integrate it into their evidence strategies before their competitors do.

Building an RWE Strategy: Practical Steps for Sponsors

For sponsors looking to build or strengthen their RWE capabilities, the following steps provide a practical framework.

1. Define the evidence gaps that RWE can address. Start with the clinical and regulatory questions that your existing trial programme cannot answer efficiently. Common starting points include: characterising the natural history of disease, understanding treatment patterns and unmet need, supporting label expansion into new populations, constructing external comparator arms for single-arm trials, and generating post-market safety evidence.

2. Assess data source fitness before committing. Match your evidence needs to the strengths of available data sources. For treatment pattern studies, claims data may suffice. For clinical outcome studies, EHR data with lab values and staging information is typically necessary. For global evidence packages, you will need data from multiple geographies, including emerging markets where traditional RWD vendors have limited or no coverage.

3. Invest in study design and analytical rigour. The most common failure mode in RWE is applying weak analytical methods to strong data (or strong methods to weak data). Target trial emulation (a framework for designing observational studies that mimic the structure of a hypothetical randomised trial) has emerged as a best-practice approach. Ensure your team or analytical partners have expertise in causal inference methods, including propensity score methods, instrumental variable analysis, and regression discontinuity designs, depending on the clinical question and data structure.

4. Engage regulators early. For any RWE study intended to support a regulatory submission, engage the relevant regulatory agency before the study begins. The FDA offers formal and informal mechanisms for discussing RWE study plans, including Type B and Type C meeting requests. The EMA's scientific advice process similarly allows sponsors to discuss RWE methodology before data is generated. Early engagement reduces the risk of investing in a study that regulators ultimately consider insufficient.

5. Build a diverse data portfolio. A credible global RWE strategy cannot rely exclusively on data from North America and Western Europe. Sponsors should actively seek data partnerships in regions that are currently underrepresented in the evidence base, particularly Africa, South Asia, and Southeast Asia. These partnerships take time to establish, so beginning the process before a specific programme demands the data is prudent.

6. Plan for data governance and compliance from the start. RWE studies involve patient data, which means privacy regulation, ethics review, and data use agreements are non-negotiable. In multi-country studies, compliance requirements become considerably more complex. Establish a data governance framework that addresses consent, de-identification, cross-border data transfer, and regulatory reporting obligations for each jurisdiction in your data portfolio.

7. Measure and communicate the value of RWE internally. For RWE to move from a pilot capability to a core function, internal stakeholders (clinical development, regulatory affairs, medical affairs, commercial) need to see concrete evidence of its impact. Track metrics like time savings in trial design, regulatory feedback on RWE submissions, and the incremental value of RWE in market access negotiations. Build the business case with data, not rhetoric.
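
To make the propensity score methods mentioned in step 3 more concrete, the sketch below shows inverse probability of treatment weighting on a toy dataset with a single binary confounder. Everything here is a simplifying assumption: the `severe` and `treated` fields are hypothetical, and the propensity score is estimated as the treated fraction within each stratum, whereas a real study would typically fit a model over many covariates and check positivity and balance more thoroughly.

```python
from collections import defaultdict


def stratum_propensity(rows):
    """Estimate P(treated | severe) as the treated fraction in each stratum."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [treated, total]
    for r in rows:
        counts[r["severe"]][0] += r["treated"]
        counts[r["severe"]][1] += 1
    return {stratum: t / n for stratum, (t, n) in counts.items()}


def iptw_weight(row, ps):
    """1/p for treated rows, 1/(1-p) for untreated rows.

    Fails with ZeroDivisionError if a stratum is all treated or all
    untreated, which is a positivity violation.
    """
    p = ps[row["severe"]]
    return 1 / p if row["treated"] else 1 / (1 - p)


def weighted_mean(rows, key, weights):
    return sum(w * r[key] for r, w in zip(rows, weights)) / sum(weights)


# Toy cohort: severe patients are more likely to be treated (confounding)
rows = (
    [{"severe": 0, "treated": 1}] * 2
    + [{"severe": 0, "treated": 0}] * 6
    + [{"severe": 1, "treated": 1}] * 6
    + [{"severe": 1, "treated": 0}] * 2
)

ps = stratum_propensity(rows)
treated = [r for r in rows if r["treated"]]
untreated = [r for r in rows if not r["treated"]]

# Difference in the share of severe patients between arms, before weighting
raw_gap = (
    weighted_mean(treated, "severe", [1] * len(treated))
    - weighted_mean(untreated, "severe", [1] * len(untreated))
)

# The same difference after weighting each row by its inverse probability
weighted_gap = (
    weighted_mean(treated, "severe", [iptw_weight(r, ps) for r in treated])
    - weighted_mean(untreated, "severe", [iptw_weight(r, ps) for r in untreated])
)

print(f"imbalance before weighting: {raw_gap:.2f}")
print(f"imbalance after weighting:  {weighted_gap:.2f}")
```

In this toy cohort the weighting removes the imbalance in disease severity between arms, which is the property that makes a subsequent weighted outcome comparison less confounded. It addresses measured confounders only.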

The organisations that will get the most value from real-world evidence over the next decade are those that treat it as an integrated capability, linking data strategy, analytical expertise, regulatory engagement, and global data access into a coherent programme, rather than as an occasional supplement to their clinical trial output.


Kapsule provides access to structured, de-identified health records covering over 75 million patients across 9 African countries. Contact our team to discuss how African real-world data can complement your evidence generation strategy.


This article is intended for informational purposes only and does not constitute legal, medical, or regulatory advice. Readers should obtain independent professional counsel for their specific circumstances.
