Test Assembly

Stochastic Programming for Individualized Test Assembly With Mixture Response Time Models (RR 15-01)

Many standardized tests are now administered via computer rather than paper and pencil. The computer-based delivery mode brings with it certain advantages, such as the ability to record not only the test taker’s response to each item (i.e., question), but also the amount of time the test taker spends considering and answering each item. The analysis of response times (RTs) is still a developing area of research.

Early RT research assumed that a test taker would show consistent RTs over the course of a test. Such models may be unrealistic for various reasons—some items require more time than others to answer, a warm-up effect may cause a test taker to respond more quickly after completing the early items, fatigue may cause a test taker to slow down toward the end of a test, or as time runs out the test taker may quickly guess the answers to the last items on a test. To take these variable RTs into account, mixture RT models have recently been investigated.

Until now, mixture RT models have been applied only in post hoc analyses. This research expands the use of these models by exploring their application in the context of test assembly. Various strategies were applied, and the strengths and weaknesses of each are described. In general, it was concluded that the application of mixture RT models should prove especially useful for tests with a heterogeneous testing population.

Additional reports in this collection

An Overview of Research on the Testlet Effect: Associated...

A mathematical model called item response theory is often applied to high-stakes tests to estimate test-taker ability level and to determine the characteristics of test questions (i.e., items). Often, these tests contain subsets of items (testlets) grouped around a common stimulus. This grouping often leads to items within one testlet being more strongly correlated with one another than with items from other testlets, which can result in moderate to strong testlet effects.

Robust Text Similarity and Its Applications for the LSAT...

Text similarity measurement provides a rich source of information and is increasingly being used in the development of new educational and psychological applications. However, due to the high-stakes nature of educational and psychological testing, it is imperative that a text similarity measure be stable (or robust) to avoid uncertainty in the data. The present research was sparked by this requirement. First, multiple sources of uncertainty that may affect the computation of semantic similarity between two texts are enumerated.