AI Content Leveling Tool
Altering the reading level of reference content with an LLM
Domain: AI, Differentiation, Education, K-12, Databases
My Role: Research, requirement definition, design, testing
Project Time: 6 months
This AI leveling tool supports teachers' differentiation needs by allowing them to change the Lexile level of Gale's reference content. The Lexile score is an educator-standard measure of text complexity, and Gale's content skews toward the high end. Teachers need the ability to offer the same text at a variety of reading levels, covering the same topics with lower language complexity.
Problem Space
With text transformation falling squarely within the capabilities of an LLM, Gale contracted with an Amazon team to build a proof of concept for Social Studies and English Language Arts content. The Amazon team delivered a set of prompts able to both raise and lower the Lexile level of targeted Gale texts. Claude 3 Sonnet, Claude 3.5 Sonnet, and Claude 3 Opus were tested; Opus performed best, but Claude 3.5 Sonnet offered the best value. Once the concept was proven technically viable, I led an effort to better understand how teachers interacted with existing tools in the space while simultaneously designing a testing sandbox where subject matter experts could further validate the performance of our PoC models.
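For illustration only, here is a minimal sketch of what a leveling call might look like using the Anthropic Python SDK. The prompt wording, target level, and function name are placeholders, not the actual prompts the Amazon team delivered:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def level_text(source_text: str, target_lexile: int) -> str:
    """Rewrite source_text at roughly the requested Lexile level.

    Placeholder prompt; the production prompts were far more elaborate.
    """
    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": (
                f"Rewrite the following article at approximately a "
                f"{target_lexile}L Lexile level. Preserve every fact and "
                f"the full topic coverage; change only the language "
                f"complexity.\n\n{source_text}"
            ),
        }],
    )
    return response.content[0].text
```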
Problem #1
Gale’s encyclopedia content is too academic
Encyclopedic topic overviews are extremely useful, but only when their content matches a student's reading level
Problem #2
Teachers need differentiated reading levels
To accommodate a range of student reading abilities, teachers need content at several reading levels that covers the same topics
Performance Testing
I pushed for the creation of a testing sandbox where teachers could evaluate the prompt outputs on a curated set of documents. I designed an experience where teachers were paid to evaluate various attributes of documents transformed into several different Lexile levels. The insights gained from the process were critical for tweaking the prompt, identifying its shortcomings, and building a basic understanding of how teachers evaluate AI-leveled content.
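As a rough illustration of the kind of structured feedback such a sandbox might collect, here is a sketch in Python. The rubric fields are assumptions for illustration, not the actual instrument:

```python
from dataclasses import dataclass


@dataclass
class LevelingEvaluation:
    """One teacher's rating of one leveled document.

    Field names are illustrative; the real rubric covered attributes
    such as factual fidelity and fit to the target reading level.
    """
    document_id: str
    source_lexile: int        # Lexile score of the original article
    target_lexile: int        # level the prompt was asked to produce
    accuracy_rating: int      # 1-5: facts preserved from the source
    level_fit_rating: int     # 1-5: reads at the intended Lexile level
    comments: str = ""


# Hypothetical example: a teacher flags a 610L rewrite that drifted too simple
review = LevelingEvaluation(
    document_id="gale-ss-0042",
    source_lexile=1210,
    target_lexile=610,
    accuracy_rating=5,
    level_fit_rating=3,
    comments="Reads closer to 400L; key vocabulary was over-simplified.",
)
```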
Design and Usability Testing
I collaborated with another UX designer on my team to create and test several prototype workflows. We uncovered several critical usability issues the team hadn't anticipated, as well as some new user requirements. Working under significant time pressure, we adapted the designs to accommodate shortcomings of the model. Multiple rounds of usability testing with interactive Axure prototypes revealed a minimum viable feature set and important roadmap items for the future, including in-app editing of leveled documents.
Usability Testing Insight #1
Teachers want to edit leveled documents in-app
Teacher use cases frequently require making formatting changes or excerpting the leveled text. Because the future of the platform is assigning documents directly to students, our documents need this additional flexibility.
Usability Testing Insight #2
LLM performance caused a big design shift
A combination of long documents, prompt complexity, and the lack of a dedicated model pipeline created situations where users had to wait roughly five minutes for their document. Switching to progressively revealing the text as it was generated significantly reduced user friction.
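One plausible way to implement that progressive reveal, sketched with the Anthropic SDK's streaming interface; the UI plumbing is omitted and the function name is illustrative:

```python
import anthropic

client = anthropic.Anthropic()


def stream_leveled_text(source_text: str, target_lexile: int):
    """Yield text chunks as the model produces them, so the UI can
    render the leveled document progressively rather than blocking
    until the full generation finishes."""
    with client.messages.stream(
        model="claude-3-5-sonnet-20240620",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": (
                f"Rewrite the following article at approximately a "
                f"{target_lexile}L Lexile level.\n\n{source_text}"
            ),
        }],
    ) as stream:
        for chunk in stream.text_stream:
            yield chunk  # push each chunk to the browser, e.g. over SSE
```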