[deprecated] JustScoring Thesis


Machine learning has recently become powerful enough to enable accurate semantic analysis of English text. Classroom teachers from grade school through college spend a substantial amount of time grading their students’ work to gauge their grasp of the material being taught. Constructed responses, such as short and long text answers, give the educator a better feel for the student’s understanding, but they are time-consuming to score.

We want to apply modern ML models to a student’s text response and the expected response, scoring the student’s response by how similar the two are. A variety of techniques are currently available for performing this analysis. We will not depend on any particular technology, so as to avoid lock-in.
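One way to keep the scoring logic independent of any particular technology is to hide the similarity measure behind a plain function signature. The sketch below is illustrative only: the names `SimilarityFn`, `word_overlap_similarity`, and `provisional_score` are hypothetical, and the word-count cosine used here is a toy stand-in for a real semantic model, not a proposal for the production scorer.

```python
import math
import re
from collections import Counter
from typing import Callable

# Any backend is acceptable as long as it maps (response, expected)
# to a similarity between 0.0 and 1.0. This is the swap point.
SimilarityFn = Callable[[str, str], float]


def word_overlap_similarity(response: str, expected: str) -> float:
    """Toy stand-in: cosine similarity of lowercase word counts."""
    a = Counter(re.findall(r"[a-z']+", response.lower()))
    b = Counter(re.findall(r"[a-z']+", expected.lower()))
    dot = sum(count * b.get(word, 0) for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty response shares nothing with the expected one
    return dot / (norm_a * norm_b)


def provisional_score(response: str, expected: str, max_points: float,
                      similarity: SimilarityFn = word_overlap_similarity) -> float:
    """Scale similarity to the expected response into points (one decimal)."""
    value = min(1.0, max(0.0, similarity(response, expected)))
    return round(value * max_points, 1)
```

Swapping in a different model would then mean passing a different `similarity` function, leaving the rest of the scoring code untouched.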


Our protagonist, Mrs. Dyer, is a fifth-grade teacher at Benjamin Bubb Elementary School in Mountain View, California. She is a veteran teacher in her fifties who spends two to three hours a week grading student work.

Mrs. Dyer is one of 3.5 million K-12 teachers in the United States, who together serve 52 million students (2020, https://www.statista.com/). With the digital transformation caused by the pandemic, most of those teachers deliver some formative assessments online using Google Forms or another assessment tool.

Given the sheer number of students, automating even a few percent of a teacher’s grading workload could collectively save hundreds of thousands, if not millions, of teaching hours per week.
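The estimate can be checked with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that every teacher grades as much as Mrs. Dyer (two to three hours a week); the automated fractions are hypothetical, not measured.

```python
from typing import Tuple

TEACHERS = 3_500_000              # US K-12 teachers, figure cited in the text
GRADING_HOURS = (2.0, 3.0)        # assumption: Mrs. Dyer's weekly range applies to all


def hours_saved_per_week(automated_fraction: float) -> Tuple[float, float]:
    """Low and high estimate of weekly hours saved across all teachers."""
    low, high = GRADING_HOURS
    return (TEACHERS * low * automated_fraction,
            TEACHERS * high * automated_fraction)


if __name__ == "__main__":
    for fraction in (0.02, 0.05, 0.10):   # hypothetical automation levels
        low, high = hours_saved_per_week(fraction)
        print(f"{fraction:.0%} automated: {low:,.0f} to {high:,.0f} hours per week")
```

Under these assumptions, automating 2% of grading corresponds to about 140,000 to 210,000 hours per week, and 10% to about 700,000 to 1,050,000 hours per week.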


We want to reduce the amount of time Mrs. Dyer spends grading student work, while at the same time increasing the consistency and accuracy of the scores. This problem sits in the context of her daily teaching practice. To solve it, the solution needs to be integrated into the tools she already uses throughout her day. It must also be easy to use and accurate if it is to actually save Mrs. Dyer time.


The core process we are disrupting is how scoring gets done. Currently, Mrs. Dyer pulls up student responses in the learning management system (LMS) she is using, grades the work, and marks the score in the online grade book.
Our solution is to intervene in that process by showing her a provisional score, possibly with the student response marked up to show where it diverges from the expected response.


We’ll provide extensions to the LMS or assessment tool that will allow educators to review their students’ work and accept or update the provisional score.
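The accept-or-update step can be modeled as a small record that keeps the model’s suggestion separate from the teacher’s decision. This is a minimal sketch; the class and field names (`ScoredResponse`, `provisional`, `final`) are hypothetical and not taken from any LMS API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredResponse:
    student: str
    provisional: float             # score proposed by the model
    final: Optional[float] = None  # set only after the teacher has reviewed it

    def accept(self) -> None:
        """Teacher agrees with the provisional score."""
        self.final = self.provisional

    def update(self, teacher_score: float) -> None:
        """Teacher overrides the provisional score."""
        self.final = teacher_score

    @property
    def reviewed(self) -> bool:
        return self.final is not None
```

Keeping both values makes it possible to measure later how often teachers override the provisional score, which is one way to track accuracy.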

The scoring interface will have settings that allow her to turn various grading features on and off. These might include spelling, grammar, and various semantic checks.
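As an illustration, the per-teacher settings could be a handful of boolean toggles. The field names below are hypothetical, chosen only to mirror the features listed above.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class GradingSettings:
    spelling: bool = True
    grammar: bool = True
    semantic_similarity: bool = True

    def enabled(self) -> List[str]:
        """Names of the checks that are currently switched on."""
        return [f.name for f in fields(self) if getattr(self, f.name)]
```

For example, `GradingSettings(spelling=False).enabled()` lists only the grammar and semantic checks, so a science teacher could ignore spelling while still scoring content.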

For longer-form responses, we will mark up the student response so that the teacher can see where the student response diverges from the expected response.
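A rough idea of what such markup could look like, using Python’s standard `difflib` to compare the two texts word by word. Note the assumption: this is a purely textual diff, so a correct paraphrase would still be flagged; a real implementation would need a semantic comparison. The function name and the bracket notation are hypothetical.

```python
import difflib


def mark_divergence(response: str, expected: str) -> str:
    """Wrap differing student words in [...] and note missing expected words."""
    resp_words = response.split()
    exp_words = expected.split()
    matcher = difflib.SequenceMatcher(a=exp_words, b=resp_words, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(resp_words[j1:j2])
            continue
        if tag in ("replace", "insert"):   # words the student wrote that differ
            out.append("[" + " ".join(resp_words[j1:j2]) + "]")
        if tag in ("replace", "delete"):   # expected words the student left out
            out.append("{missing: " + " ".join(exp_words[i1:i2]) + "}")
    return " ".join(out)


if __name__ == "__main__":
    print(mark_divergence("Plants make food from water",
                          "Plants make food from sunlight"))
    # prints: Plants make food from [water] {missing: sunlight}
```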

Various ML providers offer technology that can be used to analyze the distance between a student’s response and an ideal response. One such offering is OpenAI’s Embeddings.

As a stretch feature, we want to produce a rationale for any deductions given.