4.09.2014

Thoughts on the ALIRA Exam

I just wanted to share my comments about the ALIRA Exam; we gave the exam to our Latin 1 and Latin 3 students today.

The exam is very short; the website says it has a 50-minute timer, but most of the Latin 1s were done in under 10 minutes and most of the Latin 3s in under 15. I'm a little concerned that the exam ends before students have seen enough tasks to give a true sense of where each student is. Most of my students who scored I2 and I3 saw 12-14 questions; my I4s and I5s saw about 18. Those in the novice range saw fewer than 10.

The Latin sources used were indeed varied in context and in time. I do question the design choice to put a weird papyrus background and a fake script font behind the bulk of the passages. My students said it was distracting, and I'd definitely agree. There's really no reason to create a faux-situated experience on an examination in that manner. You can see some examples of it in the ALIRA sample test.

There was a big show-stopping bug which impacted a handful of students and invalidated their scores (I did email Language Testing International about it; hopefully I'll hear back soon). I had a couple of solid Latin 3 students score N1 (and one even score BR - below the range). Interestingly enough, all four of them ended on the exact same question (about 13 questions in) and no one else saw that question in their rotation. I hope that the question can be fixed or removed as others take the exam.

In the end, many of my students walked away with a bolstered confidence level and affirmation that they were on the right track. My Latin 3s were, by and large, right where I thought they'd be, knowing the students. Most were I2 and I3, a few I4 and I5.

The Latin 1s were the surprise of the bunch; except for 3 N1s (and I suspect a test error here, too), everyone was I1 or above with a large cluster at I3. In essence, ALIRA is telling me that our Latin 1 students are at the same place our Latin 3 students are in terms of ability to comprehend a text. I need some time to digest this data and think about how a few things may be impacting those scores. My Latin 3s, though, are really the last vestiges of my former life of reliance on a more grammar-translation-reading approach, while the Latin 1s, in addition to having a lot more CI-type activities embedded in their daily instruction, also have the added benefit of 2 full years of refinement to Operation LAPIS.

Overall, though, I'm very pleased with the ALIRA exam, the time to administer it, and the information (if it is indeed accurate) that it provides. I'm not convinced it's a $10 test, however. That price needs to come down so that I (and others) can administer it program-wide each year.

To help those of us using ALIRA gauge where our students are as a whole, I want to start compiling data for exams given this year. I created this Google Form for anyone who wants to contribute:

https://docs.google.com/forms/d/1sx87t04SkDYByBWXKghbmXJHm5iAq3Xjc7Nuf_uvSVo/viewform

Names and schools won't be published (I included that space only to help ensure that we're not getting false data), only raw numbers for proficiency at each course level.

If you have had experience with ALIRA, it would be great to share your thoughts, so please do!

3 comments:

  1. I wanted to add a couple of follow-up comments to the above post.

    On the issue of the anomalous scores of N1 for about 10% of the students: I firmly believe that there is an issue with the scoring algorithm or one particular question on the exam. We had a couple of the stronger Latin 1 students also end up rated at an N1, also seeing (from what they could remember) the same question at the end.

    I questioned these results with Language Testing International and I was rebuffed. However, upon pressing the issue further, I've learned that only around 120-150 students have taken the ALIRA exam so far. If this is true, that's a very small sample size to claim that there's no way there could be a programming issue. LTI has agreed to look into the issue further but this is a serious problem.

    Outside of those anomalous N1 scores, the next LOWEST rating was N4, which two students received (out of nearly 40 Latin 1 students and 20 Latin 3 students -- and these two were among the weaker students in the program). Every other student in our program scored in the Intermediate Range, both Latin 1 and Latin 3. Again, this leads me to believe that those N1 scores (and the one Below Range score) were incorrect.

    I hope that LTI will discover what caused those presumably false N1 reports and either rescore those students' exams or, as I've requested, allow me to retest those students.

  2. Jason Blackburn, 4/29/14, 3:52 PM

    I really appreciate your comments on the ALIRA. 44 of my students took it at the end of the first semester, and I'll have 41 taking it in early June. I was shocked by a few of the N1s that my students received; many of those students were booted and forced to log back into the test. I also contacted LTI and was told that it had no impact on their scores. My experience was that students either scored really low or really high. It's interesting that a Latin 2 student who hasn't reached the subjunctive yet can score an I5 (I had one of those). The kids like it, but it needs some tweaks. I have added my results to your Google Form and will continue to do so.

  3. Jason, thanks for adding that data.

    I think that your N1s were indeed incorrectly assigned and that's the matter I had to press very hard on with LTI. Simple fact of the matter is that most students are out of N1 by the first few weeks of Latin 1; there's literally no way that any of my Latin 3 students who received N1 can be at that level of comprehension.

    It sounds like they are reviewing the exam in earnest to see if there is anything wrong with how certain questions are being evaluated. At the very least, I've secured the ability to have those students retested at no cost.

    With regard to the student reaching I5 without having "reached" the subjunctive, I don't think that's a problem at all. Remember, the ALIRA exam isn't measuring grammatical identification -- it's measuring the ability to comprehend a text for meaning. There's no reason why a student couldn't see amāret and recognize that the clause has something to do with love. It's a shift (and a big one, at that) that I think we, as educators, have to make in our understanding of what a demonstration of learning and understanding looks like in language learning.
