
Table 3 Comparison of diagnostic performance between human and artificial intelligence-based judgments in 2-class comparisons

From: Liver fibrosis stage classification in stacked microvascular images based on deep learning

 

| Metric | Judge: AI (95% CI) | Judge: Sonographer (95% CI) |
| --- | --- | --- |
| Sensitivity ↑ | 0.842 (0.778–0.894) | 0.775 (0.697–0.840) |
| Specificity ↑ | 0.835 (0.792–0.872) | 0.832 (0.790–0.868) |
| PPV ↑ | 0.706 (0.637–0.768) | 0.636 (0.559–0.708) |
| NPV ↑ | 0.919 (0.883–0.946) | 0.907 (0.871–0.936) |
| Accuracy ↑ | 0.838 (0.803–0.868) | 0.816 (0.780–0.849) |
| LR+ ↑ | 5.113 (4.005–6.527) | 4.611 (3.620–5.874) |
| LR- ↓ | 0.189 (0.132–0.269) | 0.271 (0.199–0.369) |

  1. Values in Table 3 are point estimates with 95% CIs. An up arrow in the first column indicates that a higher value is better; a down arrow indicates that a lower value is better.
  2. Abbreviations: AI, artificial intelligence; CI, confidence interval; PPV, positive-predictive value; NPV, negative-predictive value; LR+, positive-likelihood ratio; LR-, negative-likelihood ratio.
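
The likelihood ratios in Table 3 can be related to the sensitivity and specificity rows through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The following is a minimal sketch, not the authors' code: the function name `likelihood_ratios` and the `reported` dictionary are illustrative assumptions, and the inputs are the rounded point estimates copied from the table, so the recomputed values should only match the table to within rounding error.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity (standard definitions)."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Point estimates copied from Table 3 (already rounded to three decimals there).
reported = {
    "AI": {"sens": 0.842, "spec": 0.835, "lr_pos": 5.113, "lr_neg": 0.189},
    "Sonographer": {"sens": 0.775, "spec": 0.832, "lr_pos": 4.611, "lr_neg": 0.271},
}

# Recompute the likelihood ratios and print them next to the tabulated values.
recomputed = {}
for judge, row in reported.items():
    lr_pos, lr_neg = likelihood_ratios(row["sens"], row["spec"])
    recomputed[judge] = (lr_pos, lr_neg)
    print(
        f"{judge}: LR+ = {lr_pos:.3f} (table {row['lr_pos']}), "
        f"LR- = {lr_neg:.3f} (table {row['lr_neg']})"
    )
```

Because the inputs are rounded, small discrepancies in the second or third decimal place are expected and do not indicate an inconsistency in the table.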