Tuesday, May 22, 2018

Automated test scoring at the Board of Ed

Riley: we're at an interesting point in education, where it's time to look to the future
"interested in shrinking the amount of time we spend on testing"
"shrink the amount of time so we can increase the amount of instruction"
"that may mean computer adaptive testing," we can figure out where a child is at a point in time
"leave open the possibilities to explore everything"
"maybe it will not work out; maybe it will"
"figure out how to make it better"
might shrink the wait time to return scores

Wulfson: first of what we expect will be many many discussions with you and with the field
"going to do our best to keep an open mind"
"looked at in some depth when we were first engaging with the PARCC consortium"
many states used it; we did not
"paid extra...to do full human scoring"
"there have been huge advances over the past four or five years"
Wulfson asked Alexa if we'd ever be able to use computers to score these tests "and she said 'absolutely'"
"we know you'll have a lot of questions on this; we want to hear your questions"
may not be able to answer them today, but will look for answers going forward with research
Stapel: scoring done in 8 scoring centers; scorers must meet minimum requirements; receive standardized training
essays scored on idea development and conventions
scorers are trained on each individual item and must "qualify" to score
look for exact or adjacent agreement in scoring
scorers must achieve certain percentages of exact and adjacent scoring to continue
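
A minimal sketch of how exact and adjacent agreement might be computed for one scorer against reference scores. The function names, the sample scores, and the threshold values are all hypothetical; the presentation did not give the actual percentages or say how the vendor calculates them. "Adjacent" is assumed here to mean a difference of exactly one score point.

```python
def agreement_rates(scorer, reference):
    """Return (exact, adjacent) agreement as fractions of all responses.

    exact: scorer gave the same score as the reference.
    adjacent: scorer was off by exactly one point (an assumption about
    how "adjacent" is defined).
    """
    if len(scorer) != len(reference) or not scorer:
        raise ValueError("score lists must be non-empty and the same length")
    n = len(scorer)
    exact = sum(1 for s, r in zip(scorer, reference) if s == r)
    adjacent = sum(1 for s, r in zip(scorer, reference) if abs(s - r) == 1)
    return exact / n, adjacent / n


def may_continue(scorer, reference, min_exact=0.70, min_exact_or_adjacent=0.90):
    """Check a scorer against two thresholds.

    Both default thresholds are made-up placeholders, not the real
    requirements used for MCAS scoring.
    """
    exact, adjacent = agreement_rates(scorer, reference)
    return exact >= min_exact and (exact + adjacent) >= min_exact_or_adjacent


if __name__ == "__main__":
    # Invented sample data: five essays, scorer vs. reference scores.
    scorer_scores = [3, 2, 4, 1, 3]
    reference_scores = [3, 3, 4, 1, 1]
    exact, adjacent = agreement_rates(scorer_scores, reference_scores)
    print(f"exact: {exact:.0%}, adjacent: {adjacent:.0%}")
    print("may continue:", may_continue(scorer_scores, reference_scores))
```

In this invented sample, three of five scores match exactly and one is off by a single point, so the scorer would fall short of the placeholder exact-agreement threshold.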

now a couple of charts on piloting scoring of MCAS essays
scoring engine "worked as we expected, but this is just an initial analysis that we did"
whole continuum of how it can be used
"want to make sure that it works for our program"
"want to move pretty cautiously and deliberately moving forward"

McKenna: "this is why the MCAS scores are late, right?"
to a flurry of responses
Riley: "They arrive on time; it's just the time is late."
McKenna: "can we make this decision...if Massachusetts made this decision...there's nothing to prevent us from making this decision...we can do it ourselves"
Wulfson: This is our decision
Craven: how much do we spend on scoring?
Wulfson: all built into Measured Progress contract, "but we've estimated about $4M"
"that's not what's driving it...it's the timing"
Fernandez: do you have a timeframe in mind?
Wulfson: "we don't...still rolling out the next generation test...history and social science"
"a lot of things on our plate; I won't say this is the highest priority...and I would resist having any arbitrary decision"
Fernandez: broader question of where we might put a stamp on making decisions that are perceived as a risk because we're coming out front
Wulfson: "I've never been one to take pride in being an early adopter...I'd like to see others experiment and get the bugs out before I jump into the pool"
use of technology in schools; students need to be technologically proficient when they graduate
Sagan: "spend a lot of time in this space in this industry...we are not the first"
cost savings, sooner, "and we might deliver better, clearer, fairer results"
"My guess is if you keep on this path, you'll get better results sooner into teachers and families" and the results will be fairer
Riley: suspending disbelief that this is possible
"important not to limit ourselves" and really explore this "whether it works or not"

