Diagnostic test studies: assessment and critical appraisal (2023)

There are many checklists available for the assessment and critical appraisal of diagnostic test studies, as reporting is frequently inadequate.[1][2][3] However, they all include some variation of three critical questions;[2][3] these are:

  • Is this study valid?
  • Does the diagnostic test under assessment accurately distinguish between people who do and do not have the specific disorder?
  • Can this valid, accurate diagnostic test be applied to a specific patient?


How to assess if a diagnostic test study is valid?

1. Was there an independent, blind comparison with a reference (gold) standard of diagnosis?

  • Participants in the study should have undergone both the index diagnostic test and the reference (gold) standard. This is done to confirm or refute the findings of the index test. The accuracy of the test can be overestimated if the index test is performed initially in people known to have the disease and then separately in healthy people (case-control studies do this) rather than performing both the index and reference tests in the same group of people without knowing whether or not they have the disease.[4]
  • People assessing the results of the index test should be blind to the results of the reference standard. This avoids biasing the results of the index test or the reference standard. Interpreting the results of the reference test while already knowing the results of the index test can lead to an overestimation of the index test accuracy, especially if the reference test is open to subjective interpretation.[4] Blinding is less important if the results of the test are objective (e.g., serodiagnostic tests for tuberculosis where sputum culture results are analysed) than if results require clinical interpretation (e.g., MRI images for diagnosing rotator cuff injury).

2. Was the diagnostic test evaluated in an appropriate spectrum of patients (like those a clinician would see in practice)?

  • Did the study include people with all the common presentations of the target disorder, including those with early manifestations as well as those with more severe symptoms, and people with other disorders that are commonly confused with the target disorder? If not, the results of the study may not reflect actual clinical practice.

3. Was the reference standard applied regardless of the index diagnostic test result?

  • If the index test result is negative, investigators sometimes do not carry out the reference standard test to confirm it, especially if the reference test is invasive or risky, because doing so may be unethical. To overcome this, investigators may use an alternative reference standard to show that the patient does not have the target disorder: long-term follow-up confirming that, without any treatment, no adverse effects associated with the target disorder develop.

4. Was the test validated in a second independent group of patients?

  • When a new diagnostic test is evaluated, there is a risk that the results in the initial assessment are caused by other factors: for example, something about that specific group of patients included in the study (e.g., they represent only patients with advanced symptoms of the disease). So, to prove the results are reliable and replicable, the new diagnostic test should be evaluated in a second independent (or test) group of patients.

In conclusion: If the study being evaluated fails any of these 4 criteria, we need to consider whether the flaws of the study make the results invalid.

How to assess the results of the test

There are two types of result commonly reported in diagnostic test studies. One concerns the accuracy of the test and is reflected in the sensitivity and specificity, often defined as the test's ability to find true positives for the disorder (sensitivity) or true negatives for the disorder (specificity). An ideal diagnostic test finds no false positives but at the same time misses no one with the disease (finds no false negatives).

The other concerns how the test performs in the population being tested and is reflected in predictive values (also called post-test probabilities) and likelihood ratios. To give brief definitions of these terms consider this example (based on reference[5]):

1000 elderly people with suspected dementia undergo an index test and a reference standard. The prevalence of dementia in this group is 25%. 240 people tested positive on both the index test and the reference standard and 600 people tested negative on both tests. The remaining 160 people had inaccurate test results.

The first step is to draw a 2x2 table as shown below. We are told that the prevalence of dementia is 25%; therefore, we can fill in the last row of totals — 25% of 1000 people is 250 — so 250 people will have dementia and 750 will be free of dementia. We also know the number of people testing positive and negative on both tests and so we can fill in two more cells of the table.

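2x2 table with the known values filled in (columns: reference standard result; rows: index test result)

Index test result     Dementia present   Dementia absent   Total
Positive              240                ?                 ?
Negative              ?                  600               ?
Total                 250                750               1000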

By subtraction the table can easily be completed:

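Completed 2x2 table (columns: reference standard result; rows: index test result)

Index test result     Dementia present   Dementia absent   Total
Positive              240                150               390
Negative              10                 600               610
Total                 250                750               1000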
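The subtraction step can be sketched in a few lines of Python. This is only an illustration under the assumptions of the dementia example: the class name `TwoByTwo` and the function `complete_table` are hypothetical, and the inputs are the total number of people, the prevalence, and the two cells where both tests agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cells of a 2x2 diagnostic table (hypothetical helper for this example)."""
    tp: int  # index test positive, reference standard positive
    fp: int  # index test positive, reference standard negative
    fn: int  # index test negative, reference standard positive
    tn: int  # index test negative, reference standard negative


def complete_table(total: int, prevalence: float, tp: int, tn: int) -> TwoByTwo:
    """Fill in the missing cells by subtraction from the column totals."""
    with_condition = round(total * prevalence)   # 25% of 1000 = 250
    without_condition = total - with_condition   # 1000 - 250 = 750
    fn = with_condition - tp                     # people with the condition who were missed
    fp = without_condition - tn                  # people without the condition flagged positive
    if fn < 0 or fp < 0:
        raise ValueError("the supplied cells are inconsistent with the totals")
    return TwoByTwo(tp=tp, fp=fp, fn=fn, tn=tn)


table = complete_table(total=1000, prevalence=0.25, tp=240, tn=600)
print(table)  # TwoByTwo(tp=240, fp=150, fn=10, tn=600)
```

The 10 false negatives and 150 false positives together account for the 160 people with inaccurate test results.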

From the 2x2 table the following measures can be calculated:

Pre-test probability = (true positive + false negative)/total number of people
This measure tells us the probability of having the target condition before a diagnostic test is carried out.
In this example: 250/1000 = 0.25
What does this mean? The probability of a patient in this study having dementia before the tests are run is 25%, which is the prevalence of dementia in the study group.

Sensitivity (Sn) = the proportion of people with the condition who have a positive test result
The sensitivity tells us how well the test identifies people with the condition. A highly sensitive test will not miss many people.
In our example, the Sn = 240/250 = 0.96
What does that mean? 10 (4%) people with dementia were falsely identified as not having it, as opposed to the 240 (96%) people who were correctly identified as having dementia. This means the test is fairly good at identifying people with the condition.

Specificity (Sp) = the proportion of people without the condition who have a negative test result
The specificity tells us how well the test identifies people without the condition. A highly specific test will not falsely identify many people as having the condition.
In our example, the Sp = 600/750 = 0.80
What does that mean? 150 (20%) people without dementia were falsely identified as having it. This means the test is only moderately good at identifying people without the condition.

Positive predictive value (PPV) = the proportion of people with a positive test who have the condition
This measure tells us how well the test performs in this population. It is dependent on the accuracy of the test (primarily specificity) and the prevalence of the condition.
In our example, the PPV = 240/390 = 0.62
What does that mean? Of the 390 people who had a positive test result, 62% actually have dementia.

Negative predictive value (NPV) = the proportion of people with a negative test who do not have the condition
This measure tells us how well the test performs in this population. It is dependent on the accuracy of the test and the prevalence of the condition.
In our example, the NPV = 600/610 = 0.98
What does that mean? Of the 610 people with a negative test, 98% do not have dementia.

Positive likelihood ratio (LR+) = sensitivity / (1 - specificity)
This measure tells us how much the odds of a specific diagnosis increase when a test is positive. The larger the LR+, the more likely it is that the person with a positive test result has the condition. An LR+ of 10 indicates a 10-fold increase in the odds of the patient having the condition (i.e., a large increase in probability), whereas an LR+ of 2 would indicate a modest increase in the odds of the patient having the condition. An LR+ of 1 would mean that the test provides no new information regarding the odds of the patient having the condition.
In this example, the LR+ = 0.96/0.20 = 4.8
What does that mean? There is a 4.8-fold increase in the odds of having dementia in a person with a positive test (i.e., a moderate increase in the probability that they have dementia).

Negative likelihood ratio (LR–) = (1 - sensitivity) / specificity
This measure tells us how much the odds of a specific diagnosis decrease when a test is negative. The smaller the LR–, the more likely it is that the person with a negative test result does not have the condition. An LR– of 0.5 indicates a 2-fold decrease in the odds of the patient having the condition (i.e., a modest decrease in probability), whereas an LR– of 0.1 indicates a 10-fold decrease in the odds of having the condition (i.e., a large decrease in probability).
In this example, the LR– = 0.04/0.80 = 0.05
What does that mean? There is a 20-fold decrease in the odds of having dementia in a person with a negative test result (i.e., a large decrease in the probability that they have dementia).
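As a rough check on the arithmetic, the measures can be computed directly from the four cells of the 2x2 table. The sketch below is illustrative only: the function name `accuracy_measures` and the dictionary keys are hypothetical, and the pre-test probability is taken to be the prevalence in the study group, (true positive + false negative)/total.

```python
def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute common diagnostic accuracy measures from a 2x2 table."""
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # proportion of people with the condition who test positive
    specificity = tn / (tn + fp)   # proportion of people without the condition who test negative
    return {
        "pre_test_probability": (tp + fn) / total,  # prevalence in the study group
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # positive predictive value
        "npv": tn / (tn + fn),     # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Cells from the dementia example: 240 TP, 150 FP, 10 FN, 600 TN.
measures = accuracy_measures(tp=240, fp=150, fn=10, tn=600)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
# pre_test_probability: 0.25
# sensitivity: 0.96
# specificity: 0.80
# ppv: 0.62
# npv: 0.98
# lr_positive: 4.80
# lr_negative: 0.05
```

Note that the function divides by zero if a row or column of the table is empty or if specificity is exactly 1 or 0; a real analysis would need to handle those cases explicitly.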

How to apply the diagnostic test to a specific patient:
Having found a valid diagnostic test study, and decided that its accuracy is sufficiently high to make it a useful tool, here are some useful points to consider when applying the test to a specific patient:

  • Is the test available, affordable, and accurate in your setting?
  • Can a clinically sensible estimate of the pre-test probabilities of the patient be made from personal experience, prevalence statistics, practice databases, or primary studies?
  • Are the study patients similar to the patient in question?
  • How current is the study we are analysing — has evidence moved on since the publication of the study?

Will the post-test probability affect the management of the specific patient?

  • Could the result move the clinician across a test-treatment threshold: for example, could the results of the test stop all further testing? That is, rule the target disorder out so the clinician would stop pursuing that possibility, or make a firm diagnosis of the target disorder and move onto choosing appropriate treatment options.
  • Will the patient be willing to have the test carried out?
  • Will the results of the test help the patient reach their goals?
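One common way to estimate a post-test probability for an individual patient is to convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back to a probability. The sketch below assumes that approach; the function names are hypothetical, and the inputs (a pre-test probability of 0.25, an LR+ of 4.8, and an LR– of 0.05) are simply the figures from the dementia example, not values for any real patient.

```python
def probability_to_odds(probability: float) -> float:
    """Convert a probability (0 <= p < 1) to odds."""
    if not 0 <= probability < 1:
        raise ValueError("probability must be in the range [0, 1)")
    return probability / (1 - probability)


def odds_to_probability(odds: float) -> float:
    """Convert odds back to a probability."""
    return odds / (1 + odds)


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Post-test odds = pre-test odds x likelihood ratio, returned as a probability."""
    post_test_odds = probability_to_odds(pre_test_probability) * likelihood_ratio
    return odds_to_probability(post_test_odds)


after_positive = post_test_probability(0.25, 4.8)   # positive index test, LR+ = 4.8
after_negative = post_test_probability(0.25, 0.05)  # negative index test, LR- = 0.05
print(f"{after_positive:.2f}")  # 0.62
print(f"{after_negative:.2f}")  # 0.02
```

With the study's own pre-test probability, these results match the predictive values in the example (0.62 is the PPV, and 0.02 is 1 minus the NPV). For a patient whose pre-test probability differs from the study prevalence, the same likelihood ratios give a different post-test probability, which is what should be compared against the clinician's test and treatment thresholds.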

Critical appraisal
Based on the information given in the Assessment section above, the table below gives some basic check points to look for when critically appraising a diagnostic test study. This list is by no means comprehensive, but should cover all the main issues. The main focus of the list is the first two questions based on validity and the importance of the results.

There are numerous checklists available. The SR toolbox is an online catalogue providing summaries and links to the available guidance and software for each stage of the systematic review process including critical appraisal. Examples for diagnostic test studies include:

The following checklist provides a framework for assessing the quality of a diagnostic test study

  • Framework for assessing a diagnostic test study


  1. Bossuyt PM, Reitsma JB, Bruns DE, et al. Towards a complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative. Clin Chem 2003;49:1–6. https://www.ncbi.nlm.nih.gov/pubmed/12507953
  2. CASP UK. Critical Appraisal Skills Programme (CASP) https://www.casp-uk.net
  3. QUADAS-2 for diagnostic accuracy studies. http://www.bristol.ac.uk/population-health-sciences/projects/quadas/quadas-2/
  4. Lijmer JG, Mol BW, Heisterkamp S, et al. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA 1999;282:1061–1066. https://www.ncbi.nlm.nih.gov/pubmed/10493205
  5. Centre for Evidence Based Medicine. https://www.cebm.net/likelihood-ratios/


How do you critically appraise a diagnostic study?

Three issues matter most when evaluating the validity of the study: how the study population was chosen, how the test was performed, and whether the test was compared with the gold standard test to confirm or refute the diagnosis.

What are 4 types of diagnostic testing?

Diagnostic tests
  • Biopsy. A biopsy helps a doctor diagnose a medical condition. ...
  • Colonoscopy. ...
  • CT scan. ...
  • CT scans and radiation exposure in children and young people. ...
  • Electrocardiogram (ECG) ...
  • Electroencephalogram (EEG) ...
  • Gastroscopy. ...
  • Eye tests.

What type of study is a diagnostic study?

Diagnostic accuracy studies are used to assess how well a test, or a series of tests, is able to correctly identify diseased patients or, more generally, patients with the target condition (the condition of interest).

What is an example of a diagnostic study?

For example, in order to diagnose a herniated disc, physicians may employ Magnetic Resonance Imaging (MRI), a Computerized Assisted Tomography (CAT) scan, and/or Electromyography (EMG) to determine whether the herniated disc is impinging on a nerve root.

What is the best study design for a diagnostic test?

The most valid study design for assessing the accuracy of diagnostic tests is a non-experimental cross-sectional study that compares a test's classification of a diagnosis with a reference standard's classification, in a relevant study population.

What are the 7 commonly performed diagnostic tests?

The 7 most common diagnostic tests are the following:
  • X-rays. ...
  • CT scan. ...
  • MRI. ...
  • Mammogram. ...
  • Ultrasound. ...
  • PET scans. ...
  • Pathology test:

What is the first step in the critical appraisal of a study?

The first step in critically appraising a research article, therefore, is to reflect on the quality of the journal in which it is published.

What is a critically appraised research study?

Critical appraisal is the process of carefully and systematically examining research evidence to judge its trustworthiness, its value and relevance in a particular context. It allows clinicians to use research evidence reliably and efficiently.

What is a critical appraisal of a case study?

Critical appraisal is integral to the process of Evidence Based Practice. Critical appraisal aims to identify potential threats to the validity of the research findings from the literature and provide consumers of research evidence the opportunity to make informed decisions about the quality of research evidence.


Article information

Author: Maia Crooks Jr

Last Updated: 03/03/2023
