Bias. Part 1: Selection Bias

Background Information:

In epidemiology, one wishes to determine the relationship between factors of interest and health conditions. However, several errors may occur during an epidemiological study. These may be broadly classified as random errors and systematic errors. Random errors are unpredictable and cannot be eliminated, although a larger sample reduces their impact. Systematic errors arise when results deviate consistently in one direction, either towards the null (suggesting there is no difference) or away from it (exaggerating a difference).

Systematic errors are called bias. The term bias is sometimes also used for the mechanism that produces a lack of internal validity.
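
The difference between the two kinds of error can be seen in a small simulation. The sketch below is only an illustration: the function `estimation_error`, the true mean of 120, the standard deviation of 15 and the bias of 5 are all made-up assumptions. It shows that the random part of the error tends to shrink as the sample grows, while a systematic error stays the same size.

```python
import random
import statistics


def estimation_error(n, true_mean=120.0, sd=15.0, bias=0.0, seed=0):
    """Error of the sample mean (estimate minus truth) for a sample of size n.

    Each measurement = true_mean + bias + random noise, so `bias` plays the
    role of a systematic error and `sd` controls the random error.
    All default values are hypothetical.
    """
    rng = random.Random(seed)
    sample = [true_mean + bias + rng.gauss(0.0, sd) for _ in range(n)]
    return statistics.fmean(sample) - true_mean


for n in (10, 1_000, 100_000):
    random_only = estimation_error(n)
    with_bias = estimation_error(n, bias=5.0)
    print(f"n={n:>7}: random error only {random_only:+.2f}, "
          f"with systematic error {with_bias:+.2f}")
```

With a large sample the first column should approach zero, whereas the second should stay close to the assumed bias of 5. A larger study does not correct a biased design.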

There are several classifications of bias (for example, those suggested by Sackett and Choi, Kleinbaum et al., and Steineck and Ahlbom). I will discuss bias using the classification suggested by Kleinbaum et al.: selection bias, information bias, and confounding. In this first article, I will discuss selection bias.

Selection Bias

Selection bias is the error introduced when the study population does not represent the target population.

It can be introduced at any stage of a research study: during design (inappropriate definition of the eligible population, an inaccurate sampling frame, uneven diagnostic procedures in the target population) and during implementation.

1. Inappropriate definition of the eligible population
    • Competing risks: These arise when two or more mutually exclusive outcomes compete with each other in the same subject. When dealing with causes of death, the risk of death due to a specific cause may be affected by earlier death due to another cause. For instance, a person with severe immunodeficiency may die of it before a chronic non-communicable disease has time to cause death.
    • Healthcare access bias: This occurs when patients admitted to a healthcare institution do not represent the cases originating in the community. This may happen because clinicians admit patients out of interest in certain kinds of cases (popularity bias), or because patients are attracted to famous/popular clinicians (centripetal bias).
    • Length-bias sampling: When one conducts a survey, cases with a long disease duration are more likely to be included, so the sampled cases may not represent all cases arising in the target population.
    • Neyman (selective survival) bias: This may occur in cross-sectional and case-control studies. Those who survive a condition (like acute myocardial infarction) are more likely to be included as subjects than those who die of it. If the risk factor/exposure is associated with increased mortality, survivors will show a lower frequency of the risk factor than all cases would.
    • Survivor treatment selection bias: In observational studies, those who live longer are more likely to receive certain treatment(s). Therefore, a retrospective analysis may yield a spurious positive association between the treatment(s) and survival.
    • Spectrum bias: When assessing the validity of a diagnostic test, this bias occurs when investigators include only “definite” cases rather than the entire spectrum of disease manifestations.
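
The Neyman (selective survival) bias in the list above can be made concrete with a little arithmetic. The sketch below uses entirely hypothetical counts and case-fatality figures, and the helper `odds_ratio` is an illustrative name, not part of any library. It compares the odds ratio obtained from all incident cases with the one obtained when only surviving cases can be recruited.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Exposure odds ratio from the four cells of a case-control table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical numbers, chosen only for illustration.
incident_cases = {"exposed": 400, "unexposed": 600}   # all new cases
controls = {"exposed": 250, "unexposed": 750}         # sample of the source population
case_fatality = {"exposed": 0.50, "unexposed": 0.20}  # exposure raises early mortality

# Only survivors are available when cases are recruited some time after onset.
surviving_cases = {
    group: count * (1 - case_fatality[group])
    for group, count in incident_cases.items()
}

or_all_cases = odds_ratio(incident_cases["exposed"], incident_cases["unexposed"],
                          controls["exposed"], controls["unexposed"])
or_survivors = odds_ratio(surviving_cases["exposed"], surviving_cases["unexposed"],
                          controls["exposed"], controls["unexposed"])

exposed_share_all = incident_cases["exposed"] / sum(incident_cases.values())
exposed_share_survivors = surviving_cases["exposed"] / sum(surviving_cases.values())

print(f"Exposed among all cases: {exposed_share_all:.0%}, "
      f"among survivors: {exposed_share_survivors:.0%}")
print(f"Odds ratio, all incident cases: {or_all_cases:.2f}")
print(f"Odds ratio, surviving cases only: {or_survivors:.2f}")
```

With these made-up figures the share of exposed cases falls from 40% to about 29%, and the odds ratio falls from 2.00 to 1.25, so a study restricted to survivors would understate the association.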

In case-control studies, specific biases may occur regarding the selection of controls. The commonest are:

  • Berkson’s bias: This occurs in hospital-based studies when the probability of hospitalization of cases differs from that of controls, particularly when this probability also depends on the exposure.
  • Exclusion bias: Occurs when controls with conditions related to the exposure are excluded while cases with the same conditions are retained in the study.
  • Inclusion bias: In hospital-based case-control studies where the exposure is related to one or more conditions in controls, the frequency of exposure is higher than expected in the reference group, producing a bias towards the null.
  • Matching: Whether individual or frequency matching is used, it introduces a selection bias that must be controlled for during analysis (a matched analysis for individual matching, adjustment for the matching variables for frequency matching).
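
Berkson’s bias from the list above can also be shown with simple arithmetic. In the sketch below every count and admission probability is a hypothetical assumption, and `odds_ratio` and `table_or` are illustrative helper names. Exposure is unrelated to the disease in the community, but exposed cases are admitted more readily than anyone else.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio: a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


# Hypothetical community: exposure is equally common (20%) among people with the
# case disease and people with the control condition, so the true odds ratio is 1.
community = {
    ("case", "exposed"): 200, ("case", "unexposed"): 800,
    ("control", "exposed"): 200, ("control", "unexposed"): 800,
}

# Assumed admission probabilities: exposure doubles the chance of admission
# for cases but has no effect on admission for controls.
p_admit = {
    ("case", "exposed"): 0.6, ("case", "unexposed"): 0.3,
    ("control", "exposed"): 0.1, ("control", "unexposed"): 0.1,
}

# Expected number of hospitalized patients in each cell.
hospital = {cell: n * p_admit[cell] for cell, n in community.items()}


def table_or(table):
    """Odds ratio for a table keyed by (group, exposure)."""
    return odds_ratio(table[("case", "exposed")], table[("case", "unexposed")],
                      table[("control", "exposed")], table[("control", "unexposed")])


print(f"Odds ratio in the community: {table_or(community):.2f}")
print(f"Odds ratio among hospitalized patients: {table_or(hospital):.2f}")
```

Under these assumptions the odds ratio is 1.00 in the community but 2.00 among hospitalized patients, even though exposure has no effect on the disease. If admission depended on exposure equally in cases and controls, the two odds ratios would agree.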

2. Lack of accuracy of the sampling frame

  • Non-random sampling bias: This is the commonest bias in this group. It occurs when a non-representative sample is obtained, so that a parameter estimate differs from the actual value in the target population.
  • Citation bias: More frequently cited articles are found more easily and are therefore more likely to be included in systematic reviews and meta-analyses.
  • Dissemination bias: Includes biases associated with the entire publication process, from information retrieval to reporting of results.
  • Publication bias: Occurs when the published reports on an association do not represent all the studies carried out on that association.
  • Post-hoc analysis: This is a form of publication bias wherein data dredging generates post-hoc questions and subgroup analyses.
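
Publication bias, as described in the list above, lends itself to a short simulation. The sketch below is an illustration under stated assumptions: there is no real association, every study has the same standard error, and only statistically significant positive results are published. The publication rule and all constants are hypothetical.

```python
import random
import statistics

rng = random.Random(2024)

TRUE_EFFECT = 0.0      # assumed: no real association
STANDARD_ERROR = 0.2   # assumed: same precision in every study
N_STUDIES = 2000

# Each study's estimate is the true effect plus random sampling error.
estimates = [rng.gauss(TRUE_EFFECT, STANDARD_ERROR) for _ in range(N_STUDIES)]

# Hypothetical publication rule: only "significant" positive results get published.
published = [e for e in estimates if e / STANDARD_ERROR > 1.96]

mean_all = statistics.fmean(estimates)
mean_published = statistics.fmean(published)

print(f"Studies carried out: {len(estimates)}, published: {len(published)}")
print(f"Mean estimate, all studies: {mean_all:+.3f}")
print(f"Mean estimate, published studies only: {mean_published:+.3f}")
```

The mean over all studies should be close to zero, while the mean of the published studies must exceed 1.96 × 0.2 ≈ 0.39 by construction. A review that can only find the published reports would therefore see an association that does not exist.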

3. Uneven diagnostic procedures in the target population

In case-control studies, detection bias occurs if exposure influences the diagnosis of the disease.

  • Diagnostic suspicion bias: Exposure is taken as another diagnostic criterion; exposure triggers the search for disease.
  • Unmasking (detection signal) bias: Exposure may generate a symptom/sign that favours diagnosis.

In cohort studies, detection bias is classified as an information bias.

4. During study implementation

  • Losses/withdrawals to follow up: This occurs in longitudinal studies (cohort and experimental studies) when losses/withdrawals are uneven across both exposure and outcome categories.
  • Missing information in multivariable analysis: Multivariable analysis includes only records with complete information on the variables in the model. If these complete records do not represent the target population, a selection bias may be introduced.
  • Non-response bias: This occurs when study participants differ from non-participants. For instance, those who volunteer to participate in a study may be healthier than the general population. If this happens in a study of a screening test (for instance), the benefit of the intervention will be spuriously increased.
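
The effect of losses to follow-up, the first item in the list above, can be sketched numerically. The cohort counts and retention fractions below are hypothetical, and `risk_ratio` and `apply_retention` are illustrative helper names. The example contrasts losses that are even across all groups with losses that depend on both exposure and outcome.

```python
def risk_ratio(cohort):
    """Risk ratio (exposed vs unexposed) from counts keyed by (exposure, outcome)."""
    risk_exposed = cohort[("exposed", "disease")] / (
        cohort[("exposed", "disease")] + cohort[("exposed", "healthy")])
    risk_unexposed = cohort[("unexposed", "disease")] / (
        cohort[("unexposed", "disease")] + cohort[("unexposed", "healthy")])
    return risk_exposed / risk_unexposed


def apply_retention(cohort, retention):
    """Keep only the fraction of each cell that completes follow-up."""
    return {cell: n * retention[cell] for cell, n in cohort.items()}


# Hypothetical full cohort: risk is 20% in the exposed and 10% in the unexposed.
full_cohort = {
    ("exposed", "disease"): 200, ("exposed", "healthy"): 800,
    ("unexposed", "disease"): 100, ("unexposed", "healthy"): 900,
}

# Even losses: 20% of every cell drops out.
even = {cell: 0.8 for cell in full_cohort}

# Uneven losses: exposed people who develop the disease drop out far more often.
uneven = {cell: 0.9 for cell in full_cohort}
uneven[("exposed", "disease")] = 0.5

print(f"Risk ratio, full cohort: {risk_ratio(full_cohort):.2f}")
print(f"Risk ratio, even losses: {risk_ratio(apply_retention(full_cohort, even)):.2f}")
print(f"Risk ratio, uneven losses: {risk_ratio(apply_retention(full_cohort, uneven)):.2f}")
```

With these made-up numbers the risk ratio stays at 2.00 when losses are even, but drops to about 1.22 when exposed participants who develop the disease are lost more often. Even losses still cost precision, but they do not distort the estimate in this example.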

Reference and Further reading:

1. Delgado-Rodríguez M, Llorca J. Bias. J Epidemiol Community Health [Internet]. 2004;58:635–41. Available from:

