Health Technology Assessment 1998; Vol. 2: No. 13 (Executive summary)
Choosing between randomised and non-randomised studies: a systematic review
1 London School of Hygiene and Tropical Medicine, University of
2 University of Queensland, Australia
Studies that compare healthcare interventions can be divided into those that involve
randomisation of subjects between comparison groups, and those that do not. The former, in
its commonest form the randomised controlled trial (RCT), is seen by many as the 'gold
standard' as it should ensure that subjects being compared differ only in their exposure
to the intervention being considered. The RCT has been criticised, however, with some
arguing that design features tend to exclude many individuals to whom the results will
subsequently be applied. Furthermore, practitioner and patient preferences may influence
the outcome of treatment and cause the results to be misleading. These criticisms have led
some to advocate the use of non-randomised designs.
This review explored those issues related to the process of randomisation that may
affect the validity of conclusions drawn from the results of RCTs and non-randomised
studies.
The review was based on a series of systematic reviews involving structured searches of
databases. Details of the methods used are described in the main report. Four research
questions were addressed.
- Do non-randomised studies differ systematically from RCTs in terms of treatment effect?
- Are there systematic differences between included and excluded individuals and do these
influence the measured treatment effect?
- To what extent is it possible to adjust for baseline differences between study groups?
- How important is patient preference in terms of outcome?
Previous comparisons of RCTs and non-randomised studies
Eighteen papers that directly compared the results of RCTs and prospective
non-randomised studies were found and analysed. No obvious patterns emerged; neither the
RCTs nor the non-randomised studies consistently gave larger or smaller estimates of the
treatment effect. The type of intervention did not appear to be influential, though more
comparisons need to be conducted before definite conclusions can be drawn.
Several reasons emerged as to why RCTs might produce a greater or lesser estimate of
treatment effect than non-randomised studies. A greater effect may occur in RCTs if
patients receive higher quality care or are selected in a way that gives greater capacity
to benefit. A lower estimate of treatment effect may occur if:
- patient selection produces a study population with less capacity to benefit than would
be the case in non-randomised studies
- strong patient preference exists against a particular treatment in an unblinded RCT, thus
reducing the treatment effect
- non-randomised studies of preventive interventions include a disproportionate number of
people with greater capacity to benefit
- publication bias exists; negative results are less likely to be published from
non-randomised trials than from RCTs.
The proportion of eligible subjects included in the RCTs ranged from 1% to 100%. Reasons
for exclusions may be medical (e.g. high risk of adverse events in certain groups) or
scientific (selecting only small homogeneous groups in order to increase the precision of
estimated treatment effects). Blanket exclusions (e.g. the elderly, women of childbearing
potential) are also common in RCTs.
Large clinical databases containing detailed information on patient severity and
prognosis have been used instead of RCTs, and where database subjects are selected
according to the same inclusion criteria as RCTs, the treatment effects of the two methods
Most RCTs failed to document adequately the characteristics of eligible individuals who
did not participate in trials. However, RCTs were more likely than non-randomised trials
to include university and teaching centres and this may have exaggerated the treatment
effect measured in the RCTs.
Participation in RCTs differed between studies of treatment interventions (subjects
tended to be less affluent, less educated and more severely ill and therefore had greater
capacity to benefit from treatment) and those evaluating preventive interventions (more
affluent, better educated and generally healthier and therefore had less potential to
benefit than eligible subjects who declined to participate).
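The notion of 'capacity to benefit' can be illustrated with a small calculation. The sketch below is only an illustration, not part of the review: it assumes, for simplicity, that a treatment multiplies each patient's baseline risk by the same relative risk, and all figures and names (`absolute_risk_reduction`, the 40% and 4% baseline risks, the relative risk of 0.75) are hypothetical.

```python
def absolute_risk_reduction(baseline_risk: float, relative_risk: float) -> float:
    """Return the absolute risk reduction for a given baseline risk.

    Assumes the treatment multiplies baseline risk by `relative_risk`
    (an illustrative simplification, not a finding of the review).
    """
    if not 0.0 <= baseline_risk <= 1.0:
        raise ValueError("baseline_risk must be between 0 and 1")
    if relative_risk < 0.0:
        raise ValueError("relative_risk must be non-negative")
    return baseline_risk * (1.0 - relative_risk)


# Hypothetical figures: the same relative effect in two study populations.
RELATIVE_RISK = 0.75

# More severely ill participants (as in many treatment trials).
high_risk_arr = absolute_risk_reduction(0.40, RELATIVE_RISK)

# Generally healthier participants (as in many prevention trials).
low_risk_arr = absolute_risk_reduction(0.04, RELATIVE_RISK)

print(f"High baseline risk: ARR = {high_risk_arr:.2f}")
print(f"Low baseline risk:  ARR = {low_risk_arr:.2f}")
```

Under these assumed figures the absolute benefit is ten times larger in the high-risk group, even though the relative effect is identical. This is one way in which differences in who participates can change the measured treatment effect.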
Adjusting for baseline differences
Adjustment for differences in baseline prognostic factors in non-randomised studies
often changed the treatment effect size but not significantly; importantly, the direction
of change was inconsistent. Most of the case studies were too small to support firm
conclusions, but where conclusions could be drawn, the superiority of one treatment over
another was probably a function of the patients' clinical characteristics.
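As a rough illustration of what adjustment for a baseline prognostic factor involves, the sketch below compares a crude risk ratio with a Mantel-Haenszel risk ratio stratified by disease severity. The Mantel-Haenszel estimator is one standard approach chosen here for illustration; the review does not state that the studies it examined used it. All counts and names (`Stratum`, `crude_risk_ratio`, `mantel_haenszel_risk_ratio`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stratum:
    """Event counts for one level of a baseline prognostic factor."""
    treated_events: int
    treated_total: int
    control_events: int
    control_total: int


def crude_risk_ratio(strata: list[Stratum]) -> float:
    """Risk ratio ignoring the prognostic factor (all strata pooled)."""
    treated_risk = (sum(s.treated_events for s in strata)
                    / sum(s.treated_total for s in strata))
    control_risk = (sum(s.control_events for s in strata)
                    / sum(s.control_total for s in strata))
    return treated_risk / control_risk


def mantel_haenszel_risk_ratio(strata: list[Stratum]) -> float:
    """Mantel-Haenszel risk ratio, adjusted for the stratifying factor."""
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        n = s.treated_total + s.control_total
        numerator += s.treated_events * s.control_total / n
        denominator += s.control_events * s.treated_total / n
    return numerator / denominator


# Hypothetical non-randomised study in which treated patients are mostly
# mild cases and untreated patients are mostly severe cases.
strata = [
    Stratum(treated_events=40, treated_total=800,
            control_events=20, control_total=200),   # mild disease
    Stratum(treated_events=40, treated_total=200,
            control_events=320, control_total=800),  # severe disease
]

crude = crude_risk_ratio(strata)
adjusted = mantel_haenszel_risk_ratio(strata)
print(f"Crude risk ratio:    {crude:.2f}")
print(f"Adjusted risk ratio: {adjusted:.2f}")
```

With these invented counts the treatment halves the risk within each severity group, but the crude comparison suggests a much larger benefit because treated patients had a better prognosis at baseline. In other data sets the adjustment could move the estimate in the opposite direction, which is consistent with the inconsistent direction of change noted above.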
Only four papers directly addressed the role of patient preference on trial results.
However, preference could account for some of the observed differences between RCTs and
non-randomised studies.
Results of RCTs and non-randomised studies do not inevitably differ, and the available
evidence suffers from many limitations. It does, however, suggest that it may be possible
to minimise any differences by ensuring that subjects included in each type of study are
comparable. The effect of adjustment for baseline differences between groups in
non-randomised studies is inconsistent but, where it is done, it should involve rigorously
developed formulae. Existing studies have generally been too small to assess the impact of
Implications for policy
While a high level of exclusion may have some advantages for those conducting an RCT,
it also has important implications for policy. In particular, there is a risk of denial of
effective treatment to those who might benefit but who have been excluded from the RCTs,
and delay in obtaining definitive results because of low recruitment rates. In addition,
there is a danger of unjustified extrapolation of results to other populations, and it is
concluded that it should not be assumed that summary results apply equally to all
Recommendations for research
- A well-designed non-randomised study is preferable to a small, poorly designed and
- RCTs should be pragmatic by including as wide a range of practice settings as possible.
Study populations should be representative of all patients currently being treated for the
- Exclusions for administrative convenience should be rejected.
- Heterogeneity of populations and interventions should be addressed explicitly.
Practitioners should apply caution when extrapolating to populations that differ from
those included in RCTs.
- For both study designs, authors should define their reference population, state the
steps taken to ensure the study population is a representative sample or explain how it
differs. They should also give details of patient and centre participation and the
characteristics of eligible individuals who did not participate.
- Further research is required on patient characteristics, long-term follow-up,
participation of centres and practitioners and patient preference.
Britton A, McKee M, Black N, McPherson K, Sanderson C, Bain C. Choosing
between randomised and non-randomised studies: a systematic review. Health Technol
Assess 1998;2(13).
NHS R&D HTA Programme
The overall aim of the NHS R&D Health Technology Assessment (HTA)
programme is to ensure that high quality research information on the costs, effectiveness
and broader impact of health technologies is produced in the most efficient way for those
who use, manage and work in the NHS. Research is undertaken in those areas where the
evidence will lead to the greatest benefits to patients, either through improved patient
outcomes or the most efficient use of NHS resources.
The Standing Group on Health Technology advises on national priorities
for health technology assessment. Six advisory panels assist the Standing Group in
identifying and prioritising projects. These priorities are then considered by the HTA
Commissioning Board supported by the National Coordinating Centre for HTA.
This report is one of a series covering acute care, diagnostics and
imaging, methodology, pharmaceuticals, population screening, and primary and community
care. The views expressed in this publication are those of the authors and not necessarily
those of the Standing Group, the Commissioning Board or the Panel members.
Andrew Stevens, Ruairidh Milne, Ken Stein
Jane Robertson, Jane Royle
©1998 Crown Copyright