Testing and Classroom Assessment

Two terms I’ve often heard people use incorrectly, and that are noticeably absent from any national conversation about standardized testing, are validity and reliability. No, they aren’t the same, and they aren’t interchangeable. Here are my (simplistic) definitions of the two terms:

Validity is the extent to which a test or other assessment measures what it purports to measure. If a teacher gives an assessment to gauge the students’ abilities to draw inferences, but the test consists of main idea questions, it’s not a valid measure. A test of inferential thinking must contain inference questions.

Reliability is the consistency of a test or assessment. An assessment given to a common population, covering common content, and during a common time frame should yield similar results. If I give all of my 10th graders of the same demographic the same exam on the same day, I would expect the results to be fairly similar.

“Why is this important?” you may ask. Well, it’s critical. Giving a reliable and valid assessment is fundamental when assessing students. We have to trust and depend on the assessment given. If I want to find out what my students can do or what they know, I have to use assessments that will provide the data I need in a reliable manner. I try to ensure that every assignment that goes into the grade book has both reliability and validity.

However, this really isn’t the point of my post. What I really wanted to say is that the Washington State test (the HSPE, the High School Proficiency Exam) is neither valid nor reliable. Yet, we base so much on this test: graduation, remedial classes, teacher effectiveness, program offerings, and more. It may help us see weaknesses across the state or in a single school system, but its use for individuals is inappropriate.

How can the HSPE be a valid exam with only one or two questions on a specific skill? For example, the students are asked only one or two questions about geometric sense. How many questions need to be asked to gauge the students’ skill in this category accurately? I would probably suggest 6-10, but that’s just me. Of course, this would mean a very, very long state test.

I’ve seen a student get 7 out of 8 questions correct on one of my exams on author’s purpose, which would tell me the student understands the skill. Let’s pretend my eight questions are the bank of questions on the state test for that one skill. If the student missed the only question asked in that category on the state exam, the state assumes the student does not know it. If I gave the same student the other 7 questions, he could get them all correct. We have now ended up with an inadequate and, in my opinion, an invalid assessment.
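To put rough numbers on that example, here is a minimal sketch in Python. It assumes the state draws its questions at random from my hypothetical bank of eight and that the student would miss exactly one of them; the function name and the random-draw rule are my own illustration, not how the HSPE is actually built.

```python
from fractions import Fraction
from math import comb

BANK_SIZE = 8   # questions in the hypothetical bank for one skill
KNOWN = 7       # questions this student answers correctly

def worst_case(sampled):
    """For a random draw of `sampled` questions from the bank, return
    (chance the draw includes at least one missed question,
     lowest score the student could then receive)."""
    missed = BANK_SIZE - KNOWN
    # Chance that every drawn question is one the student knows.
    p_all_known = Fraction(comb(KNOWN, sampled), comb(BANK_SIZE, sampled))
    p_hit = 1 - p_all_known
    # Lowest score: as many missed questions as possible land in the draw.
    low_score = Fraction(sampled - min(sampled, missed), sampled)
    return p_hit, low_score

for n in (1, 2, 4, 8):
    p_hit, low = worst_case(n)
    print(f"{n} question(s): {float(p_hit):.1%} chance of drawing a missed one; "
          f"lowest possible score {float(low):.1%}")
```

With a single question, the student has a 1-in-8 chance of scoring zero on a skill he answers correctly 7 times out of 8. As more questions are drawn, the lowest score he can receive climbs toward his real 7-out-of-8 performance.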

A couple of years ago we had an excellent example of the test’s lack of reliability. The state’s average dropped significantly across the board on the reading portion. It wasn’t only one or two schools or regions; it was the entire state’s average that dropped. With that much variance in the state scores from one year to the next, we have to assume either that all of the state’s students struggled that year more than their peers before or after them, or that the problem rested with the test. Which is more likely: that tens of thousands of students faltered at once, or that a single assessment was off?

Even if we want to say the test was reliable, its difficulty must then have increased, which threw off the results. Yet we still base AYP (adequate yearly progress) on the test, and schools fell into trouble, or deeper into trouble, with the state because of the test’s results. Students are placed into remedial classes because of the test, and programs may be cut or created because of the testing results.

The way the state uses the HSPE now is to look at the 10th graders’ scores from one year to the next; this means the results of a different group of students taking a different test are used to judge schools, systems, teachers, and students.

If we want to use the state tests in any capacity whatsoever, the only proper way would be to watch a cohort group over time. How did the same students do year after year? What were the students’ scores in 6th grade, 7th grade, 8th grade, etc.? We would be able to see the students’ progress over time in a school or in a district. The same students would be assessed together; we would be able to see how our schools are doing in moving students along a progression of skills.

My examples may be a bit rough (it’s Saturday and Spring Break), but my message remains: the state test isn’t a valid or reliable measure for individual students, and it is being misused.

If only the decision-makers ran the state like many of us run our classrooms…with validity and reliability.

3 thoughts on “Testing and Classroom Assessment”

  1. Jim Van Pelt

    The pain of all this is that no one, as far as I can tell, who matters in the least or can influence the decisions about how our tests are designed and used is listening. The thread that I’ve seen running since NCLB came into force is that classroom teachers, the administrators in individual buildings, superintendents of schools, and the school boards themselves have lost all power to make decisions and implement policy. Reasonable essays like the one you just wrote are not being read or considered by the people who are making the decisions.

    I’ve been teaching for thirty years, and I’ve never felt less influential or in control than I do now. Our district is writing a script for 10th and 11th grade English now that will mandate what units are taught, when they’re taught, how long they’ll be, and what assessment will be used to measure them. More of our reading in the various genres will have to be common in all classes.

    To whom can I register my objection? How can I make a difference? The short answers are “no one” and “you can’t.” It’s very discouraging.

