Small sample sizes are a fact of life for most usability
practitioners. Small samples can lead to serious measurement problems, especially
when making binary measurements such as successful task completion rates
(p). The computation of confidence intervals helps by establishing
the likely boundaries of measurement, but there is still a question of how
to compute the best point estimate, especially for extreme outcomes. In
this paper, we report the results of investigations of the accuracy of different
estimation methods for two hypothetical distributions and one empirical
distribution of p. If a practitioner has no expectation about the
value of p, then the Laplace method ((x+1)/(n+2))
is the best estimator. If practitioners are reasonably sure that p
will range between .5 and 1.0, then they should use the Wilson method if
the observed value of p is less than .5, Laplace when p
is greater than .9, and maximum likelihood (x/n) otherwise.
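As a rough illustration, the three point estimators named above can be sketched in Python. The Laplace and maximum likelihood formulas come from the text. The Wilson formula used here, (x + z²/2)/(n + z²) with z = 1.96 for 95% confidence, is the usual adjusted-Wald midpoint and is supplied as an assumption, since the abstract does not spell it out. The function names are hypothetical.

```python
def mle(x, n):
    """Maximum likelihood estimate: observed successes over trials."""
    return x / n


def laplace(x, n):
    """Laplace estimate: add one success and one failure."""
    return (x + 1) / (n + 2)


def wilson(x, n, z=1.96):
    """Wilson point estimate (assumed form): add z^2/2 successes
    and z^2/2 failures, i.e. the center of the adjusted-Wald interval."""
    return (x + z**2 / 2) / (n + z**2)


# Example: 5 of 5 participants complete the task.
x, n = 5, 5
print(mle(x, n), laplace(x, n), wilson(x, n))
```

For this extreme outcome the MLE reports 1.0, while Laplace gives 6/7 (about .86) and Wilson about .78, which shows how strongly each adjustment pulls a small-sample estimate toward .5.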
Practitioner’s Take Away
- Always compute a confidence interval, as it is more informative than
a point estimate. For most usability work, we recommend a 95% adjusted-Wald
interval (Sauro & Lewis, 2005).
- If you conduct usability tests in which your task completion rates typically
take a wide range of values, uniformly distributed between 0 and 1, then
you should use the Laplace method. The smaller your sample size and the
farther your initial estimate of p is from .5, the more you will
improve your estimate of p.
- If you conduct usability tests in which your task completion rates are
roughly restricted to the range of .5 to 1.0, then the best estimation
method depends on the value of x/n. (3a) If
x/n ≤ .5, use the Wilson method (which you get as part of the
process of computing an adjusted-Wald binomial confidence interval). (3b)
If x/n is between .5 and .9, use the MLE. Any attempt to improve
on it is as likely to decrease as to increase the estimate’s accuracy.
(3c) If x/n ≥ .9, but less than 1.0, apply either
the Laplace or Jeffreys method. DO NOT use Wilson in this range to estimate
p, even if you have computed a 95% adjusted-Wald confidence interval!
(3d) If x/n = 1.0, use the Laplace method.
- Always use an adjustment when sample sizes are small (n < 20).
(It does no harm to use an adjustment when sample sizes are larger.)
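
The recommendations above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard adjusted-Wald construction (add z²/2 successes and z²/2 failures, then apply the Wald formula, with z = 1.96 for 95% confidence), clips the interval to the range 0 to 1 as a convenience, and reads the thresholds for the .5 to 1.0 case as x/n ≤ .5 and x/n ≥ .9. The function names are hypothetical.

```python
import math


def adjusted_wald(x, n, z=1.96):
    """Adjusted-Wald interval (assumed standard form), clipped to [0, 1]."""
    n_adj = n + z**2
    p_adj = (x + z**2 / 2) / n_adj  # this midpoint is the Wilson point estimate
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def laplace(x, n):
    """Laplace estimate, (x + 1) / (n + 2)."""
    return (x + 1) / (n + 2)


def wilson(x, n, z=1.96):
    """Wilson point estimate (assumed form), (x + z^2/2) / (n + z^2)."""
    return (x + z**2 / 2) / (n + z**2)


def estimate_when_rates_are_high(x, n):
    """Point estimate when completion rates typically fall in .5 to 1.0.

    Integer comparisons avoid floating-point trouble at the thresholds.
    """
    if 2 * x <= n:        # x/n <= .5: Wilson
        return wilson(x, n)
    if 10 * x < 9 * n:    # .5 < x/n < .9: MLE
        return x / n
    return laplace(x, n)  # x/n >= .9, including 1.0: Laplace


def estimate_when_rates_are_uniform(x, n):
    """Point estimate when completion rates range widely over 0 to 1."""
    return laplace(x, n)


low, high = adjusted_wald(5, 5)
print(low, high, estimate_when_rates_are_high(5, 5))
```

With 5 successes in 5 attempts, this sketch gives an interval of roughly .51 to 1.0 and a Laplace point estimate of 6/7. For x/n between .9 and 1.0 the text also allows the Jeffreys method, which is omitted here for brevity.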