At the risk of sounding behind the times, I think back to how simple education was back in the day: attend lectures, take notes, read books (gasp), work through any relevant flashcards or question sets, and repeat. Indeed, this formula was a functional strategy for many prior generations of doctors. For my own medical journey, now fifteen years in as an otolaryngology resident, this simple strategy has worked well enough.
The educational landscape has changed significantly, however. To my surprise, even the base materials from which students study are now quite different. This became most apparent to me when my company, an AI study tools platform, worked with NAF (formerly the National Academy Foundation). For those not familiar, NAF is a non-profit organization that gives high school students early exposure to career-specific pathways such as pre-health. Through our collaboration, I was introduced to current educational approaches and was struck by how activity plans and supplemental reading have largely supplanted textbooks.
Now, granted, there were certainly shifts in technology back in my day. The internet became broadly available when I was in middle school, and smartphones started to take their current shape when I was in high school. However, these advancements largely made existing learning resources more accessible rather than shifting the educational paradigm.
This is not to say that I believe new educational techniques are not useful. In fact, they likely better address the multimodal ways in which students learn, as shown in educational trials. All I intend with this anecdote is to show that education is rapidly changing.
And at no time has this been more apparent than now, with the rapid adoption of artificial intelligence. In a recent preprint posted to arXiv, MIT researchers Kosmyna and colleagues investigated the effects of popular AI models like ChatGPT on thinking. Their experimental design included three arms: all participants were asked to write an essay, but they were split into groups with different access to resources. One group had no resources at all, the second had access to traditional search engines, and the third was given assistance from a large language model (LLM). Using electroencephalography (EEG; a method of mapping brain activity), they found that overall brain engagement was highest in the no-resource group, lower with traditional search engines, and lowest with AI. Additionally, and not surprisingly, the AI group was less able to quote wording from their own essays.
While there are risks of overreliance on AI, as many of us have likely noticed from our own experiences with the technology, I think the negative, visceral reaction to studies like the one cited above disregards the potential benefits of AI. At the very least, Pandora's box has been opened: AI will forever be part of our educational landscape, necessitating strategies to mitigate its negative effects.
In my view, there are at least two major issues with educational AI tools that need to be addressed: the chat-based format, which is primarily designed to provide answers, and the problem of unreliable information caused by hallucinations. Both of these problems, I believe, can be solved with one approach: developing AI tools that promote critical thinking through engagement with expert-validated text resources.
To understand this, we should first cover a little background. LLMs are built to predict the most likely next word (technically, the next token) in a sequence. Because of their enormous training data sets, an emergent property is that they can hold convincing conversations with us and even cite knowledge from the data they were trained on. However, this probabilistic system of predicting the next word can also produce false information, a phenomenon called hallucination. Moreover, interactions with these LLMs take place largely in a chat-based environment, where students primarily ask questions and receive answers.
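For readers who like to see the mechanics, here is a minimal, hypothetical sketch in Python of what "predicting the next token" means. The candidate words, their probabilities, and the predict_next helper are all invented for illustration; the real takeaway is that the model simply selects the continuation it scores as most plausible, with no built-in check that the result is factually correct.

```python
# Toy illustration of next-token prediction (not a real LLM).
# The probabilities below are invented for demonstration only: the model
# picks the most *plausible* continuation, not a *verified* one.

next_token_probs = {
    ("the", "inner", "ear", "contains", "the"): {
        "cochlea": 0.62,       # correct continuation
        "semicircular": 0.30,  # also plausible
        "stapes": 0.08,        # plausible-sounding but anatomically wrong here
    },
}

def predict_next(context):
    """Return the highest-probability next token for a given context."""
    candidates = next_token_probs[context]
    return max(candidates, key=candidates.get)

print(predict_next(("the", "inner", "ear", "contains", "the")))  # -> "cochlea"
```

If the training data had favored a different continuation, the model would have output it just as confidently, which is the root of hallucination.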
One technique to improve reliability in AI answers is called Retrieval Augmented Generation (RAG), which IBM software engineers describe as a method of providing LLMs with a reference when answering questions. This reduces the likelihood of hallucinations, as the AI is instructed to pull information from that trusted resource. It also gives the reader some reassurance that the answers come from a specific source of interest. An additional approach programmers can take is to build alternative ways of interacting with AI that promote critical thinking, rather than the Q&A format provided by chatbots. For example, one could build in active learning strategies like flashcards with spaced repetition, practice tests with thoughtful explanations, and conversational partners that use a Socratic learning style, all grounded in the expert-validated reference text.
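For the technically curious, below is a simplified Python sketch of how a RAG pipeline can be wired together. The reference passages, the naive keyword-overlap retriever, and the call_llm placeholder are all hypothetical simplifications rather than anyone's production system; the point is only to show the essential idea that the model is asked to answer from a trusted text rather than from memory alone.

```python
# Simplified sketch of Retrieval Augmented Generation (RAG).
# The passages and the call_llm stub are hypothetical placeholders; a real
# system would use a vetted textbook corpus and an actual model API.

REFERENCE_PASSAGES = [
    "The cochlea converts sound vibrations into neural signals.",
    "The semicircular canals detect rotational movement of the head.",
]

def retrieve(question, passages, top_k=1):
    """Rank reference passages by naive keyword overlap with the question."""
    def overlap(passage):
        return len(set(question.lower().split()) & set(passage.lower().split()))
    return sorted(passages, key=overlap, reverse=True)[:top_k]

def call_llm(prompt):
    """Placeholder for a real LLM call; here it simply echoes the prompt."""
    return f"[model response grounded in]:\n{prompt}"

def answer_with_rag(question):
    context = "\n".join(retrieve(question, REFERENCE_PASSAGES))
    prompt = (
        "Answer using ONLY the reference text below. "
        "If the answer is not in the text, say so.\n\n"
        f"Reference:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)

print(answer_with_rag("What does the cochlea do?"))
```

The instruction to answer only from the retrieved text, and to admit when the text does not contain the answer, is what makes the approach more trustworthy than open-ended chat.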
This is the approach that many new AI study platforms, like ours and Google's NotebookLM, are taking. At our company, for example, we host resources by trusted publishers and content experts that can be used with active-learning AI tools. One unexpected challenge our team faced is the hesitance of for-profit publishers, many of which have large monetary stakes in the current system, to adapt to this new technology. To address this, our early strategy focuses on open-access resources from non-profit organizations like OpenStax, which facilitates textbook development by expert groups for all major high school and college subjects. This has the added benefit of lowering the cost barrier for students, who face an increasingly complex and expensive educational environment.
We further pursued this open-access approach by incorporating our own free resources, such as OpenMCAT, a series of free MCAT textbooks fully integrated into an AI learning space. In the spirit of open access, we built upon work started by the Association of American Medical Colleges (AAMC) that identified OpenStax resources covering the core competencies of the MCAT. These selected resources were synthesized into the first iteration of OpenMCAT, covering the biological and biochemical sciences.
In sum, this is a rapidly changing environment for students and educators, with many potential pitfalls. While great caution is needed, there is also opportunity in this new technology to make education more reliable, personalized, and affordable. To accomplish this, close collaboration between students, educators, and technologists is essential.
Mark Lee is an otolaryngology resident.