Health Technology Assessment 2003; Vol 7: number 27
Executive Summary
In collaboration with the International Stroke Trial and the European Carotid Surgery Trial Collaborative Groups
1 Centre for Statistics in Medicine, Institute of Health Sciences, Oxford, UK
2 Southampton Health Technology Assessments Centre, University of Southampton, UK
3 NHS Centre for Reviews and Dissemination, University of York, UK
4 Department of Public Health and Epidemiology, University of Birmingham, UK
5 MRC Social and Public Health Sciences Unit, University of Glasgow, UK
* Corresponding author
In the absence of randomised controlled trials (RCTs), healthcare practitioners and policy-makers rely on non-randomised studies to provide evidence of the effectiveness of healthcare interventions. However, there is controversy over the validity of non-randomised evidence, related to the existence and magnitude of selection bias.
To consider methods and related evidence for evaluating bias in non-randomised intervention studies.
1. Three reviews were conducted to consider:
2. New empirical investigations were conducted generating non-randomised studies from two large, multicentre RCTs by selectively resampling trial participants according to allocated treatment, centre and period. These were used to examine:
The resampling design overcame two problems that hinder the interpretation of previous reviews: meta-confounding, and variability in the direction and magnitude of bias.
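The resampling idea can be illustrated with a minimal sketch. Assuming a trial dataset recording each participant's centre, recruitment period and randomised arm (a hypothetical data layout, not the authors' exact algorithm), non-randomised comparisons can be constructed by drawing treated and control patients from different centres (concurrent controls) or different periods (historical controls):

```python
import random

random.seed(0)

# Hypothetical RCT data: each patient has a centre, a period (early/late),
# a randomised treatment arm, and an outcome measure.
patients = [
    {"centre": c, "period": p, "arm": random.choice(["A", "B"]),
     "outcome": random.random()}
    for c in range(10) for p in ("early", "late") for _ in range(50)
]

def resample_concurrent(patients, treated_centres):
    """Concurrent controls: arm-A patients from some centres vs arm-B
    patients from the remaining centres. Any case-mix difference between
    centres now confounds the treatment comparison."""
    treated = [p for p in patients
               if p["centre"] in treated_centres and p["arm"] == "A"]
    control = [p for p in patients
               if p["centre"] not in treated_centres and p["arm"] == "B"]
    return treated, control

def resample_historical(patients, centre):
    """Historical controls: arm-A patients from a centre's late period vs
    arm-B patients from the same centre's early period, so time trends in
    case-mix confound the comparison."""
    treated = [p for p in patients if p["centre"] == centre
               and p["period"] == "late" and p["arm"] == "A"]
    control = [p for p in patients if p["centre"] == centre
               and p["period"] == "early" and p["arm"] == "B"]
    return treated, control

treated, control = resample_concurrent(patients, treated_centres={0, 1, 2, 3, 4})
print(len(treated), len(control))
```

Because both groups originate inside the same RCTs, the randomised result provides an unbiased benchmark against which the bias of each resampled non-randomised comparison can be measured.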
Eight studies compared results of randomised and non-randomised studies across multiple interventions using meta-epidemiological techniques. The studies reached conflicting conclusions, explicable by differences in:
The only conclusions that could be drawn were that (a) results of randomised and non-randomised studies sometimes, but not always, differ, and (b) both similarities and differences may often be explicable by other confounding factors.
We identified 194 tools that could be or had been used to assess non-randomised studies. Around half were scales and half checklists, most were published within systematic reviews and most were poorly developed with scant attention paid to principles of scale development.
Sixty tools covered at least five of six pre-specified internal validity domains (creation of groups, blinding, soundness of information, follow-up, analysis of comparability, analysis of outcome), although the degree of coverage varied. Fourteen tools covered three of four core items of particular importance for non-randomised studies (how allocation occurred; whether the study was designed to generate comparable groups; whether prognostic factors were identified; whether case-mix adjustment was used). Six tools were considered suitable for use in systematic reviews.
Of 511 systematic reviews that included non-randomised studies, only 169 (33%) assessed study quality. Many used quality assessment tools designed for RCTs or developed by the authors themselves, and did not include key quality criteria relevant to non-randomised studies. Sixty-nine reviews investigated the impact of quality on study results in a quantitative manner.
The bias introduced by non-random allocation had two components. First, it could lead to consistent over- or underestimation of treatment effects. This occurred for historical controls, the direction of the bias depending on time trends in the case-mix of participants recruited to the study. Second, it increased the variation in results for both historical and concurrent controls, owing to haphazard differences in case-mix between groups. The biases were large enough to lead studies to false conclusions of significant benefit or harm.
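The first component, consistent bias from time trends with historical controls, can be demonstrated with a toy simulation (a hypothetical illustration, not the report's actual resampling analysis). Here the treatment does nothing, yet the comparison inherits whatever drift exists in the underlying event risk between the control period and the treatment period:

```python
import random

random.seed(1)

def simulate_historical(trend, n=2000, base_risk=0.30):
    """Risk difference for a null treatment compared against historical
    controls, when the underlying event risk drifts by `trend` between
    the control (early) and treatment (late) recruitment periods."""
    control_events = sum(random.random() < base_risk for _ in range(n))
    treated_events = sum(random.random() < base_risk + trend for _ in range(n))
    return treated_events / n - control_events / n

# A null treatment looks harmful when case-mix worsens over time...
print(round(simulate_historical(+0.10), 3))
# ...and beneficial when it improves, purely through selection.
print(round(simulate_historical(-0.10), 3))
```

The apparent risk difference tracks the time trend rather than any treatment effect, which is why the direction of bias with historical controls depends on how the recruited case-mix changes over time.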
Four strategies for case-mix adjustment were evaluated: none adequately adjusted for bias in either historically or concurrently controlled studies. Logistic regression on average increased bias. Propensity score methods performed better, but were not satisfactory in most situations. Detailed investigation revealed that adequate adjustment can be achieved only in the unrealistic situation where selection depends on a single factor. Omission of important confounding factors can explain under-adjustment. Correlated misclassifications and measurement error in confounding variables may explain the observed increase in bias with logistic regression, as may differences between conditional and unconditional odds ratio estimates of treatment effects.
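That "unrealistic situation" can be made concrete with a stylised example (a sketch under assumed numbers, not the report's evaluation). When allocation depends on a single, fully observed prognostic factor, stratifying the comparison on that factor recovers the true null effect; the difficulty in practice is that selection rarely depends on one known factor:

```python
import random

random.seed(2)

# Hypothetical cohort: allocation depends on one binary prognostic factor
# (severity), outcome risk depends on severity only, and the true
# treatment effect is zero.
def make_patient():
    severe = random.random() < 0.5
    treated = random.random() < (0.8 if severe else 0.2)   # selection
    event = random.random() < (0.40 if severe else 0.10)   # prognosis only
    return severe, treated, event

patients = [make_patient() for _ in range(20000)]
treated = [p for p in patients if p[1]]
control = [p for p in patients if not p[1]]

def risk(group):
    return sum(event for _, _, event in group) / len(group)

# Crude comparison is confounded: treated patients are sicker on average.
crude = risk(treated) - risk(control)

# Stratifying on the single selection factor removes the bias, the
# best-case scenario in which adjustment can fully succeed.
adjusted = 0.0
for severe in (True, False):
    t = [p for p in treated if p[0] == severe]
    c = [p for p in control if p[0] == severe]
    weight = (len(t) + len(c)) / len(patients)
    adjusted += weight * (risk(t) - risk(c))

print(round(crude, 3), round(adjusted, 3))
```

With selection driven by several imperfectly measured factors, the same stratification (or a regression or propensity score standing in for it) leaves residual confounding, consistent with the under-adjustment described above.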
Results of non-randomised studies sometimes, but not always, differ from results of randomised studies of the same intervention. Non-randomised studies may still give seriously misleading results when treated and control groups appear similar in key prognostic factors. Standard methods of case-mix adjustment do not guarantee removal of bias. Residual confounding may be high even when good prognostic data are available, and in some situations adjusted results may appear more biased than unadjusted results.
Although many quality assessment tools exist and have been used for appraising non-randomised studies, most omit key quality domains. Six tools were considered potentially suitable for use in systematic reviews, but each requires revision to cover all relevant quality domains.
Healthcare policies based on non-randomised studies, or on systematic reviews of non-randomised studies, may need re-evaluation if the uncertainty in the true evidence base was not fully appreciated when the policies were made.
The inability of case-mix adjustment methods to compensate for selection bias and our inability to identify non-randomised studies which are free of selection bias indicate that non-randomised studies should only be undertaken when RCTs are infeasible or unethical.
By Deeks JJ, Dinnes J, D'Amico R, Sowden AJ, Sakarovitch C, Song F, et al. Evaluating non-randomised intervention studies. Health Technol Assess 2003;7(27).
The NHS R&D Health Technology Assessment (HTA) Programme was set up in 1993 to ensure that high-quality research information on the costs, effectiveness and broader impact of health technologies is produced in the most efficient way for those who use, manage and provide care in the NHS.
Initially, six HTA panels (pharmaceuticals, acute sector, primary and community care, diagnostics and imaging, population screening, methodology) helped to set the research priorities for the HTA Programme. However, during the past few years there have been a number of changes in and around NHS R&D, such as the establishment of the National Institute for Clinical Excellence (NICE) and the creation of three new research programmes: Service Delivery and Organisation (SDO); New and Emerging Applications of Technology (NEAT); and the Methodology Programme.
The research reported in this monograph was identified as a priority by the HTA Programme's Methodology Panel and was funded as project number 96/26/99.
The views expressed in this publication are those of the authors and not necessarily those of the Methodology Programme, HTA Programme or the Department of Health. The editors wish to emphasise that funding and publication of this research by the NHS should not be taken as implicit support for any recommendations made by the authors.
Criteria for inclusion in the HTA monograph series
Reports are published in the HTA monograph series if (1) they have resulted from work commissioned for the HTA Programme, and (2) they are of a sufficiently high scientific quality as assessed by the referees and editors.
Reviews in Health Technology Assessment are termed systematic when the account of the search, appraisal and synthesis methods (to minimise biases and random errors) would, in theory, permit the replication of the review by others.
Methodology Programme Director: Professor Richard Lilford
HTA Programme Director: Professor Kent Woods
Series Editors: Professor Andrew Stevens, Dr Ken Stein, Professor John Gabbay, Dr Ruairidh Milne and Dr Rob Riemsma
Managing Editors: Sally Bailey and Sarah Llewellyn Lloyd
The editors and publisher have tried to ensure the accuracy of this report but do not accept liability for damages or losses arising from material published in this report. They would like to thank the referees for their constructive comments on the draft document.
© 2003 Crown Copyright