Measuring Test Measurement Error: A General Approach
Test-based accountability, including value-added assessments, and experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet we know little about important properties of these tests, a key example being the extent of test measurement error and its implications for educational policy and practice. While test vendors provide estimates of split-test reliability, these measures do not account for potentially important day-to-day differences in student performance.
We show that there is a credible, low-cost approach for estimating the total test measurement error that can be applied when one or more cohorts of students take three or more tests in the subject of interest (e.g., state assessments in three consecutive grades). Our method generalizes the test-retest framework, allowing for either growth or decay in knowledge and skills between tests as well as variation in the degree of measurement error across tests. The approach maintains relatively unrestrictive, testable assumptions regarding the structure of student achievement growth. Estimation requires only descriptive statistics (e.g., correlations) for the tests. When student-level test-score data are available, the extent and pattern of measurement error heteroskedasticity can also be estimated. Using math and ELA test data from New York City, we estimate that the overall extent of test measurement error is more than twice as large as that reported by the test vendor, and we demonstrate how using estimates of the total measurement error and the degree of heteroskedasticity along with observed scores can yield meaningful improvements in the precision of student achievement and achievement-gain estimates.
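To make concrete how reliability can be recovered from test correlations alone, the following is a minimal sketch and not the estimator developed in the paper. It assumes (i) exactly three tests, (ii) true achievement that follows a first-order Markov process, so that test 1 relates to test 3 only through test 2, and (iii) measurement errors that are uncorrelated across tests and with true achievement. Under those assumptions the reliability of the middle test equals r12 * r23 / r13, where r_jk is the correlation between observed scores on tests j and k. Because correlations are scale-free, the identity is unaffected by growth or decay in score variance between tests or by different error variances on each test. All function names and simulation parameters below are hypothetical and chosen only for illustration.

```python
import numpy as np


def middle_test_reliability(r12: float, r13: float, r23: float) -> float:
    """Reliability of test 2 implied by three pairwise correlations.

    Assumes a first-order Markov process for true achievement and
    measurement errors uncorrelated across tests (illustrative assumptions).
    """
    if r13 <= 0:
        raise ValueError("r13 must be positive for the ratio to be meaningful")
    return r12 * r23 / r13


def simulate_observed_scores(n, reliabilities, persistence, seed=0):
    """Simulate observed scores on three tests (hypothetical data generator).

    reliabilities: var(true) / var(observed) for each of the three tests.
    persistence: correlation of true achievement between tests 1-2 and 2-3.
    """
    rng = np.random.default_rng(seed)
    true = np.empty((n, 3))
    true[:, 0] = rng.standard_normal(n)
    for t, rho in enumerate(persistence, start=1):
        shock = rng.standard_normal(n)
        # Keeps var(true) = 1 on every test while allowing imperfect persistence.
        true[:, t] = rho * true[:, t - 1] + np.sqrt(1.0 - rho**2) * shock
    rel = np.asarray(reliabilities, dtype=float)
    # With var(true) = 1, reliability = 1 / (1 + var(error)).
    error_sd = np.sqrt((1.0 - rel) / rel)
    return true + rng.standard_normal((n, 3)) * error_sd


def estimate_from_scores(scores):
    """Estimate test-2 reliability from an (n, 3) array of observed scores."""
    corr = np.corrcoef(scores, rowvar=False)
    return middle_test_reliability(corr[0, 1], corr[0, 2], corr[1, 2])
```

As a usage example, calling `estimate_from_scores(simulate_observed_scores(200_000, (0.9, 0.8, 0.85), (0.9, 0.85)))` should return a value close to the simulated middle-test reliability of 0.8, even though the three tests differ in their error variances. With only three tests this identity pins down the reliability of the middle test alone; having more tests or student-level data, as the abstract describes, is what allows the assumptions to be tested and the error structure to be estimated more fully.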
Published Versions
D. Boyd & H. Lankford & S. Loeb & J. Wyckoff, 2013. "Measuring Test Measurement Error: A General Approach," Journal of Educational and Behavioral Statistics, vol. 38(6), pages 629-663.