The Exam Statistics: The Point Biserial Correlation

I'm continuing my explanation of the reams of statistics I get about multiple choice exams. Last time, I explained exam item difficulty scores. (Fascinating, no?) This time: point biserial correlation coefficient, or "rpb". That is, "r" for the correlation coefficient (why, oh why is it the letter r?) and "pb" to specify that it's the point biserial and not some other kind of correlation. Like, um, some other kind.

If I've constructed a good exam item, it should be neither too hard nor too easy. It should also differentiate among students. But I can't tell how well it does that just by looking at the difficulty score. Instead, there's a more complex measure, the rpb. In general, I need a correlation index for a categorical variable with a continuous variable. More specifically, I want to correlate the categorical variable of a test item (i.e., whether a student answered the test item correctly or incorrectly), with the continuous variable of the student's percent score on the examination. Got that? I didn't think so.

Let me try again. Student A did well on the exam, getting 90% correct. Student B did not do so well, getting only 50%. If I look at any given exam question, in general, student A should be more likely to answer it correctly than student B. This is not the same as difficulty, because I'm not simply looking at what proportion of the class answered the question correctly. I'm correlating each student's score with their performance on each question. The key to all this is the word "should" in the sentence above.

If an exam item is poorly constructed for whatever reason, students who did well on the exam overall may do worse on it than students who did poorly overall. That is, the better you are overall, the less likely you are to answer it correctly. That is not supposed to happen. The rpb gives me this information for each question on the exam. Experts in exam construction recommend that the rpb should range from 0.30 to 1.00. If a question gets an rpb lower than 0.30, I take a look at it and try to figure out why that's happening.
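For the curious, here is a minimal sketch of how an rpb could be computed in Python. It uses the standard textbook formula (the difference between the mean exam score of students who got the item right and those who got it wrong, scaled by the spread of all scores and by the proportion answering correctly). The function name and the six-student data set are made up for illustration; this is not necessarily how my exam-scoring software does it.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item_correct, exam_scores):
    """Point biserial correlation between one item and the exam score.

    item_correct: list of 1 (answered correctly) or 0 (answered incorrectly)
    exam_scores:  list of each student's percent score on the exam
    """
    right = [s for c, s in zip(item_correct, exam_scores) if c == 1]
    wrong = [s for c, s in zip(item_correct, exam_scores) if c == 0]
    if not right or not wrong:
        raise ValueError("rpb is undefined if everyone got the item right (or wrong)")

    spread = pstdev(exam_scores)  # population standard deviation of all scores
    if spread == 0:
        raise ValueError("rpb is undefined if every student has the same score")

    p = len(right) / len(exam_scores)  # proportion answering correctly
    return (mean(right) - mean(wrong)) / spread * sqrt(p * (1 - p))


# Hypothetical class of six students (illustrative numbers only).
scores = [90, 85, 70, 60, 50, 40]
good_item = [1, 1, 1, 0, 0, 0]  # the stronger students got it right
bad_item = [0, 0, 0, 1, 1, 1]   # the stronger students got it wrong

print(round(point_biserial(good_item, scores), 2))
print(round(point_biserial(bad_item, scores), 2))
```

The first item comes out strongly positive and the second comes out equally strongly negative, which is the "that is not supposed to happen" case.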

And if the rpb is negative, that's a negative correlation. That's the worst case I described: better students are doing worse answering this question, and poorer students are doing well. I won't use any questions getting a negative rpb again unless I can figure out why it's happening. Maybe I can tweak the question, maybe I have to rewrite it to ask about the same knowledge in a different way. Or maybe I'll just give up entirely, go and get a coffee, and check out some LOLcats.
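The decision rule I've described could be sketched like this in Python. The 0.30 cutoff and the special treatment of negative values come from the description above; the function name, the labels, and the sample rpb values are hypothetical.

```python
def review_status(rpb):
    """Sort an exam item into one of three bins based on its rpb."""
    if rpb < 0:
        # Better students are doing worse on this item.
        return "do not reuse until fixed"
    if rpb < 0.30:
        # Below the recommended range: worth a closer look.
        return "take a look"
    return "fine"


# Hypothetical rpb values for four questions (illustrative only).
item_rpbs = {"Q1": 0.45, "Q2": 0.28, "Q3": -0.12, "Q4": 0.30}

for question, rpb in item_rpbs.items():
    print(question, rpb, review_status(rpb))
```

Note that an rpb of exactly 0.30 counts as fine here, since the recommended range runs from 0.30 to 1.00.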

Why aren't you studying?


Anastasia said...

Oh God, not the LOLcats...!

I understood all of that. I must be very smrt...or maybe it's because I just read about this for a psychometrics class.

How often does an rpb lower than 0.30 occur? What would some of the reasons for it be?

Karsten Loepelmann said...

@Anastasia: That's great--psychometrics rocks!

Looking over my three courses that use multiple-choice questions, about 10% of questions have rpb < 0.30. Out of those, however, most are really close to 0.30. Less than 1% of questions have a negative rpb. And I'm working on fixing all of them.

"Why?" is difficult to say. But in answering students' questions about test items, I have some ideas. Sometimes, students may overthink a question--especially better students who are expecting a trick question when there is none.

Sometimes, students have taken other courses and have read about a study that may contradict what I taught in class (or what's in the textbook I use). Again, this would be a greater problem for better students, who actually remember stuff they learned in other courses.

And sometimes, the question itself is poorly constructed, confusing, or plain wrong. (I remember that you were pretty good at identifying sucky questions!)

LOLcats is nothing. I can waste a whole day at LOL president.

Anastasia said...

I try.

I can almost feel my IQ drop whenever I read LOLanything.

Anonymous said...

More reasons why a better student might miss a question:

1. If an answer is too easy or obvious, a better student might subconsciously shy away from it thinking there MUST be more to the question than that.

2. A better student might remember a minor symptom of a disease (for example) and fall for that one, ignoring the more major one.

In a way though it's a "rich get richer" phenomenon, that the better students end up with yet higher scores because their performance is used as an index to measure everyone else's. Which could kinda be unfair.

Karsten A. Loepelmann said...

@Anonymous: It pains me to see it when students think themselves out of the right answer. Unfortunately, I don't have any advice about avoiding "over-thinking".
