If you are looking for help with the MEE/MPT, I offer a self-directed UBE Essays subscription module ($200 as of J23), and if you are looking for more input, I also offer an interactive “automated” MEE/MPT grading system (ranging from $125-$650 as of J23). Over the years, I have seen a number of things that suggest automation may be involved in the grading of essays and MPTs. An NCBE Bar Examiner periodical discussed automated essay grading over 24 years ago (see “TESTING, TESTING” below). I can’t imagine they have done nothing since then to implement it. My guess is they don’t want to “announce it” because, as explained below, “an examinee who has information about the scoring algorithm would have an unfair advantage over others.”
Automated grading software essentially sets the “rules of the game” by defining grading parameters and then scoring accordingly. For example, E-rater is an automated grader developed by ETS, the largest educational testing company in the United States (and the writer of the MBE). According to a 2001 paper on E-rater, “ … e-rater also examines an essay’s content — its prompt specific vocabulary — both argument-by-argument and for the essay as a whole. The expectation is that words used in better essays will bear a greater resemblance to those used in other good essays than to those used in weak ones, and that the vocabulary of weak essays will more closely resemble words in other weak essays. Programmed on these assumptions, e-rater assigns a number to each essay on the basis of the similarity of its vocabulary to that of other previously scored essays.” see http://seperac.com/pdf/Stumping E-Rater Challenging the Validity of Automated Essay Scoring-Powers-2001.pdf
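To give a concrete (and heavily simplified) picture of what this kind of vocabulary comparison could look like, the following Python sketch scores an essay by how closely its word usage resembles a pool of previously high-scoring essays versus a pool of weak ones. This is only an illustration of the idea described in the quote; it is not E-rater's actual code, and the word-frequency/cosine-similarity approach is my assumption about the simplest way to implement it:

# Minimal sketch of vocabulary-similarity scoring (not E-rater's actual code).
# Idea: an essay whose word usage resembles previously scored "good" essays
# more than "weak" ones gets a higher score.
import math
import re
from collections import Counter

def word_counts(text):
    """Lowercased word-frequency vector for one essay."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-frequency vectors."""
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def similarity_score(essay, good_essays, weak_essays):
    """Average similarity to good essays minus average similarity to weak ones."""
    vec = word_counts(essay)
    good = sum(cosine(vec, word_counts(g)) for g in good_essays) / len(good_essays)
    weak = sum(cosine(vec, word_counts(w)) for w in weak_essays) / len(weak_essays)
    return good - weak  # positive = reads more like the high-scoring pool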
Logically, the most rudimentary form of AI grading would look for specific keywords (I do this myself in grading tutee essays). For example, following is an Essay Comparison I made for an examinee who received an undeserved score of 1 on an Ohio Wills essay for the J17 exam.
https://seperac.com/bar/examinees/J17-Ohio/
This essay, which should have received a score of 3 or 4, was replete with spelling errors. Some of the misspellings the examinee made included the terms: Vaild will, Invlaid will, docuemtn, the origianl 2005 will, Codicle, competenat witnesses, codicll, confrom with requriments, and Codicles instilments. I believe it is possible that an automated grading system concluded this was a poor essay because it didn’t possess the “key-words” the system was looking for. I do something similar with my MEE/MPT Analysis, and I have the same problem in trying to identify whether an examinee used the appropriate buzzword when it is misspelled. If this is the case, it is very important not only to discuss the relevant legal terminology keywords/buzzwords, but also to spell them correctly. Furthermore, when your answers are grammatically sound, it shows the grader that you are conscientious. Accurate spelling and grammar will make the grader feel like you are at least somewhat competent (especially if he is unsure of your competency based on your answer). If you have to rush an essay, let it be disorganized, but get the right words down.
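To illustrate the misspelling problem, here is a simplified Python sketch of a buzzword check that still catches near-misspellings. The buzzword list and essay text are made up for illustration and are not an actual grading checklist:

# Sketch of a fuzzy buzzword check: counts expected keywords even when the
# examinee misspells them (e.g., "codicle" still matches "codicil").
# The buzzword list below is illustrative, not an actual grading checklist.
import difflib
import re

BUZZWORDS = ["valid", "codicil", "testator", "witnesses", "revocation"]

def buzzwords_found(essay, buzzwords=BUZZWORDS, cutoff=0.8):
    """Return the buzzwords that appear in the essay, allowing near-misspellings."""
    words = set(re.findall(r"[a-z]+", essay.lower()))
    found = set()
    for bw in buzzwords:
        if bw in words or difflib.get_close_matches(bw, words, n=1, cutoff=cutoff):
            found.add(bw)
    return found

essay_text = "The codicle was not signed by two competenat witnesses, so it is invlaid."
print(buzzwords_found(essay_text))
# -> {'codicil', 'witnesses'} (set order may vary); "codicle" is matched despite the typo,
# while badly garbled terms can still fall below the similarity cutoff and be missed.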
As the years go on, it will become more and more likely that some form of AI or automated grading is at play in high-stakes exams. With my MEE/MPT automated grading, you answer released MEE/MPT questions for which I have a large bank of graded examinee answers to compare to, and I then provide you with a comprehensive analysis report that generally includes graded examinee answers ranging from 1-10 so examinees can understand what bad vs. good answers look like. It would take me a bit of time to fully explain everything contained in my automated grading report, so I strongly suggest you simply look at one to learn by example. To illustrate, following is my statistical analysis of an exactly passing answer to question #1 (Torts) from the F19 MEE and the F10 MPT of State vs McLain. Examinees who fully analyze these reports will better understand what a passing MEE/MPT score consists of. Please note I changed the examinee’s name to “Sample” to preserve the examinee’s anonymity:
https://seperac.com/pdf/F23-Automated_Grading-MEE_Question-Sample.docx
https://seperac.com/pdf/F23-Automated_Grading-MPT_Question-Sample.docx
Please look very closely at these samples. Some examinees love the statistics/info/layout while others are overwhelmed by it. I regard the automated grading as more helpful than the grading you receive from your bar review. For example, a Kaplan grader once told me he was required to give points of law and grade on a 0,1,2,3,4 scale. He needed to write a minimum of 3 substantive comments and give an overall comment at the end. To qualify as a grader, he was given a single essay to grade, had to fall within +/- 15 points of the actual grade, and then had to watch a 30-minute training video. He was paid $7-8 per essay, and graders who graded a certain number of essays got a bonus. He said most graders spent only a few minutes on each essay. Often, you hear stories of how an examinee submits a model answer to their bar review grader and receives a low score. This is not possible with my automated grading because it will identify the high essay-to-NCBE-Answer percentage as compared to other high-scoring examinees. Therefore, you get a much more realistic appraisal of where your essay stands as compared to others with my automated grading.
My “scoring” algorithm accuracy differs for each MEE/MPT that I do, but it is generally between 0.2 and 2 points off on a 0-10 scale (meaning if I ran it on a real graded examinee essay, it would come back with a 0-10 score that is 0.2 to 2 points off from the actual grade, depending on the essay question and the sample size I have). For example, if I have a large sample of graded examinee essays, the variance may be 1 point, meaning if you were to submit to me a previously graded examinee essay, my automated grading system would be incorrect by about 1 point (predicting a 6 instead of the received 5 or predicting a 3 instead of the received 2); a rough sketch of how this kind of error estimate can be computed appears after the testimonials below. The lower the variance, the more you can trust the automated score as a reliable predictor of the score you would receive on the actual exam. I do exactly what E-Rater claims to do: I try to identify how much an answer has in common with good essays. If I can do this with pools of just 20-30 essays to analyze, I can only imagine what ETS can do with hundreds of thousands of essays and a full staff of people smarter than me. Following are some testimonials from past examinees who were in my automated grading program:
A first-time examinee who enrolled in my MPT Automated Grading and passed with a written of 145 told me: “For the MPT, the most valuable thing for me is seeing alternative examples, in descending order, of what a graded exam might look like. … Real examples was a wake up as to what exactly I was missing. They weren’t perfect but they were doing a lot of things consistently right. Seeing some things the top 1/3 were doing consistently, that I was not, really drove home things I could improve on. The word count may not be the “best” indicator of quality but it can be a great red flag if you are coming in far under the average. Finally, the word suggestions to increase the diversity for transition sentences is subtly really useful. … looking at the theoretically scored 1-6 essay responses let me see a lot of what I was doing wrong. Or at least some of the stark differences as I moved to the “worse” essays. I was far too lax on citations and noticed the top essays were not. Also, some typos were acceptable and I needed to just keep moving forward. For the persuasive essays, it made me reexamine the case holding notes I was making and what a better writing of them might look like. … Most of my practice time with the MPT’s was inefficient and I had to create better habits (like citation and outlining) to improve. I had already done almost a dozen MPT’s at this point and was making the same mistakes without realizing it was important. (and therefore not improving) The “scores” I got back from the bar prep services would note these things but there really wasn’t an emphasis on why or how wrong it was. Their feedback was too vague and it helped me more to see it myself in real responses rather than “model” ones. I didn’t really have time to utilize the MPT function in the last week but it pointed out a lot of weaknesses very rapidly and I wish I used it a few times at the start of my course to reinforce some things (such as when I was citing incorrectly or better ways to work in subcase holdings). I screwed up my first MPT on the test day and made a last minute error on the second. But I think my overall quality would have been much higher if I started with better habits at the beginning to reinforce rather then try to work in new ones in the last 4 days.”
A foreign examinee who failed in J22 with a 259 (written of 116) who subscribed to my F23 Automated MEE/MPT Grading package and then passed F23 with a 297 (written of 146) told me: “Did I find the automated MEE/MPT grading reports helpful? -Of course!! However, the auto graded score may be lower than actual score. My auto graded score was always from 2-5. However my actual MEE&MPT is 297. It was good for me to push myself to improve my writing skill, however I cannot have confidence before the exam because of such a low auto graded score. Pointing out languages which I did not use in my essay!! Anyway, thank you so much for your support!!! … I do not think I pass bar exam without Seperac!”
Another examinee who enrolled in the automated MEE/MPT grading and passed the J22 MA exam with a written of 147.5 told me: “By far the most helpful part was the bank of real, authentic written answers. Seeing actual MPTs and MEEs and their scores helped me get a far better sense of how my own answers would potentially be scored. I also really benefited from the auto-grading feedback I received, as it helped me realize, “The examiner cannot read my mind — if it is not part of the answer, then they have no idea that I know it.” Buzzwords are especially important, in other words.”
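As promised above, here is a simplified Python sketch of how the kind of error estimate I described earlier can be computed from a bank of already-graded essays. The word-overlap/nearest-neighbor predictor below is a stand-in for illustration, not my actual algorithm:

# Sketch of a nearest-neighbor score predictor and a leave-one-out error check.
# This illustrates the general approach (compare an answer to already-graded
# essays); the Jaccard word-overlap measure is an assumption, not my actual model.
import re

def words(text):
    return set(re.findall(r"[a-z']+", text.lower()))

def overlap(a, b):
    """Jaccard word overlap between two essays (0..1)."""
    wa, wb = words(a), words(b)
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def predict_score(essay, graded_bank, k=3):
    """Average the 0-10 scores of the k most word-similar graded essays."""
    nearest = sorted(graded_bank, key=lambda item: overlap(essay, item[0]), reverse=True)[:k]
    return sum(score for _, score in nearest) / len(nearest)

def mean_absolute_error(graded_bank, k=3):
    """Leave-one-out check: predict each graded essay from the rest of the bank
    and average the absolute difference from its actual 0-10 score."""
    errors = []
    for i, (text, actual) in enumerate(graded_bank):
        rest = graded_bank[:i] + graded_bank[i + 1:]
        errors.append(abs(predict_score(text, rest, k) - actual))
    return sum(errors) / len(errors)

# An error of about 1.0 means predictions are typically one point off on the
# 0-10 scale (e.g., predicting a 6 for an essay that actually received a 5).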
Even if you work with a private tutor who has experience in grading MEE/MPT answers, the automated grading serves as an excellent supplement because you will receive useful insights/statistical information that you would never get from a private tutor. For example, if you missed a fact in the question or a legal concept that the majority of high-scoring examinees discussed, your private tutor has no way to know this, but the automated grading will always tell you. The report is designed to efficiently give you all this information along with the sample answers (low scoring, just passing, and high scoring) to make your review as productive as possible. Again, please closely review the sample to understand what this entails.
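For illustration, the “missed concept” flag described above boils down to something like the following Python sketch; the per-concept keyword idea and the 50% threshold are assumptions for this example, not my actual report settings:

# Sketch of a "missed concept" flag: report any keyword that a majority of
# high-scoring answers mention but your answer does not.
def mentions(text, keyword):
    return keyword.lower() in text.lower()

def missed_concepts(your_answer, high_scoring_answers, concept_keywords, threshold=0.5):
    """Return keywords used by more than `threshold` of high scorers but absent from your answer."""
    flagged = []
    for kw in concept_keywords:
        share = sum(mentions(a, kw) for a in high_scoring_answers) / len(high_scoring_answers)
        if share > threshold and not mentions(your_answer, kw):
            flagged.append((kw, round(share * 100)))
    return flagged  # e.g., [("proximate cause", 80)] -> 80% of high scorers discussed it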
The available options are as follows:
(2) MEE and (1) MPT for $125
(4) MEE and (2) MPT for $225
(7) MEE and (3) MPT for $350
(14) MEE and (6) MPT for $650
Unlike the UBE Essays module, which is always available, enrolling in this automated MEE/MPT grading program depends upon my availability. If you have any further questions about the automated MEE/MPT grading (or how to enroll), please contact me. If you are interested in enrolling, please advise which option, and if available, I will send you a link to make the payment online through my site. The J23 grading will start in mid- to late May 2023. If you are unsure whether you will find the automated grading helpful, note that if you subscribe to my self-directed UBE Essays module, I will provide you with one free “auto-graded” MEE or MPT response (you must answer an MEE/MPT from the bank of questions I have available).
—————————–
TESTING, TESTING February 1999
Computer scoring of essay examinations - having a score generated by a computer instead of human readers - is now being extensively researched. Some studies have shown even greater score consistency than can be obtained using human readers, perhaps because computers are not subject to fatigue or other human limitations. Some procedures that employ a combination of readers and computer scoring show a potential for improved score consistency and economy; however, none are yet sound enough for application to a high-stakes examination.
All computer essay-grading programs with which I am familiar utilize a regression model. This is based on having an appropriate number of qualified human readers score an appropriate number of essays after which a procedure called regression analysis is utilized to identify characteristics of essays that are correlated with high scores. The characteristics identified must be those that can be recognized and quantified by a computer.
Examples of such characteristics include: average length of sentences, words and paragraphs; number of semicolons; ratio of adjectives to nouns; and the presence (or absence) of certain key words or strings of words. Obviously, many of these are unrelated to legal reasoning or knowledge even though they might characterize examples of good legal writing. Further, an examinee who has information about the scoring algorithm would have an unfair advantage over others. For these reasons, it is unlikely that computer scoring will ever entirely replace bar exam graders.
—————————–
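For readers curious what the regression approach described in this excerpt looks like in practice, here is a simplified Python sketch that fits human scores to a few machine-countable features like the ones listed above (average sentence and word length, semicolon count, presence of key words). The feature set and keyword list are illustrative assumptions only; this is not the NCBE's (or anyone's) actual model, and the adjective-to-noun ratio mentioned in the excerpt is omitted because it would require a part-of-speech tagger:

# Sketch of the regression approach described above: quantify surface features
# of each human-scored essay, then fit a linear model that predicts the score.
import re
import numpy as np

def features(essay):
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    return [
        len(words) / max(len(sentences), 1),                # average sentence length
        sum(len(w) for w in words) / max(len(words), 1),    # average word length
        essay.count(";"),                                   # number of semicolons
        sum(kw in essay.lower() for kw in ("negligence", "duty", "breach")),  # illustrative key words
    ]

def fit(essays, scores):
    """Least-squares fit of human-assigned scores on the features (plus an intercept)."""
    X = np.array([features(e) + [1.0] for e in essays])
    y = np.array(scores, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict(coef, essay):
    """Predicted score for a new essay under the fitted linear model."""
    return float(np.dot(features(essay) + [1.0], coef))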