

As in classical hypothesis testing, the SPRT starts with a pair of hypotheses, say H0 and H1. Sampling should stop when the sum of the samples makes an excursion outside the continue-sampling region.

In manufacturing, the test is done on the proportion metric and tests whether a variable p is equal to one of two desired points, p1 or p2. The region between these two points is known as the indifference region (IR). For example, suppose you are performing a quality control study on a factory lot of widgets. Management would like the lot to have 3% or fewer defective widgets, but a lot with 1% or fewer is the ideal lot that would pass with flying colors. In this example, p1 = 0.01 and p2 = 0.03, and the region between them is the IR because management considers these lots to be marginal and is OK with them being classified either way. Widgets would be sampled one at a time from the lot (sequential analysis) until the test determines, within an acceptable error level, that the lot is ideal or should be rejected.

The SPRT is currently the predominant method of classifying examinees in a variable-length computerized classification test (CCT). The two parameters, p1 and p2, are specified by determining a cutscore (threshold) for examinees on the proportion-correct metric and selecting one point above and one point below that cutscore. For instance, suppose the cutscore is set at 70% for a test. We could select p1 = 0.65 and p2 = 0.75. The test then evaluates the likelihood that an examinee's true score on that metric is equal to one of those two points. If the examinee is determined to be at 75%, they pass; if they are determined to be at 65%, they fail.

These points are not specified completely arbitrarily. A cutscore should always be set with a legally defensible method, such as a modified Angoff procedure. Again, the indifference region represents the region of scores that the test designer is OK with going either way (pass or fail). The upper parameter p2 is conceptually the highest level that the test designer is willing to accept for a fail (because everyone below it has a good chance of failing), and the lower parameter p1 is the lowest level that the test designer is willing to accept for a pass (because everyone above it has a decent chance of passing). While this definition may seem to be a relatively small burden, consider the high-stakes case of a licensing test for medical doctors: at just what point should we consider somebody to be at one of these two levels?

While the SPRT was first applied to testing in the days of classical test theory, as in the preceding paragraphs, Reckase (1983) suggested that item response theory be used to determine the p1 and p2 parameters. The cutscore and indifference region are defined on the latent ability (theta) metric and translated onto the proportion metric for computation.
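The sequential decision rule described above, choosing between the two proportions p1 and p2 while sampling one observation at a time, can be illustrated with a minimal Python sketch. The function name sprt_bernoulli, the default error rates alpha = beta = 0.05, and the use of Wald's approximate stopping boundaries are assumptions made for illustration; the text does not prescribe them.

```python
import math


def sprt_bernoulli(observations, p1, p2, alpha=0.05, beta=0.05):
    """Sketch of a Wald SPRT deciding between p = p1 and p = p2 (p1 < p2).

    observations: iterable of 0/1 outcomes (1 = defective widget, or
                  1 = correct response, depending on the application).
    alpha, beta:  nominal error rates (assumed values, not from the text).
    Returns (decision, n) where decision is "p1", "p2", or "continue".
    """
    # Wald's approximate boundaries on the log-likelihood-ratio scale.
    upper = math.log((1 - beta) / alpha)  # crossing it favors p2
    lower = math.log(beta / (1 - alpha))  # crossing it favors p1

    llr = 0.0  # running sum of log-likelihood ratios
    n = 0
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p2 / p1)
        else:
            llr += math.log((1 - p2) / (1 - p1))
        # Stop as soon as the sum leaves the continue-sampling region.
        if llr >= upper:
            return "p2", n
        if llr <= lower:
            return "p1", n
    return "continue", n  # data ran out before a decision was reached


# Widget example: a long run of non-defective widgets supports p1 = 0.01.
print(sprt_bernoulli([0] * 500, p1=0.01, p2=0.03))  # ('p1', 145)

# Examinee example: a run of correct answers supports p2 = 0.75 (pass).
print(sprt_bernoulli([1] * 50, p1=0.65, p2=0.75))   # ('p2', 21)
```

In the widget setting a decision of "p2" corresponds to rejecting the lot, while in the examinee setting it corresponds to a pass. A narrower indifference region or smaller error rates would lengthen the expected number of observations before the sum crosses a boundary.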
