Student Stories

Mapping surprise in the human mind, with help from AI

By comparing brain scans and AI predictions, a UChicago data science student is researching how we process unexpected events

We build AI systems to mimic the human brain: writing emails, answering questions and predicting what comes next. But new research aims to turn that relationship around—using large language models (LLMs) to explore how our brains anticipate and process stories.

"I think that the way that an LLM represents events is similar to how humans do. That's a really interesting part of our research," said Bella Summe, a fourth-year data science major currently involved in a research project in the Cognition, Attention and Brain Lab at the University of Chicago.

The project, directed by psychology Assoc. Prof. Monica Rosenberg, aims to determine whether large language models can predict a fundamental process in human cognition: surprise. The approach compares how humans and AI respond to the same narrative moments.

“By comparing human brain and behavioral responses with surprise signals from large language models, we can identify where AI mirrors human understanding of narratives—and where it diverges,” said Ziwei Zhang, a doctoral student supervising Summe’s work.

To gather data on human surprise, participants listened to stories while researchers recorded their responses in real time using brain scans.

To gather corresponding surprise signals from AI, the researchers fed the same stories to the language model Llama, prompting it to predict the text that would follow each story chunk.

When the AI’s prediction and the actual story diverged, that gap served as a measure of surprise. The premise is simple: surprise is what happens when our predictions fail. If the AI’s guess about the upcoming plot differs from what actually unfolds, that mismatch might mirror the discrepancy human readers feel.
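The article doesn’t describe the lab’s exact dissimilarity metric, but the idea can be sketched with a simple stand-in: score each story chunk by how far the model’s predicted continuation sits from the text that actually follows. The bag-of-words cosine measure below is an illustrative assumption, not the researchers’ method.

```python
import math
from collections import Counter

def chunk_surprise(predicted: str, actual: str) -> float:
    """Illustrative 'surprise' score for one story chunk: 1 minus the
    cosine similarity between bag-of-words vectors of the model's
    predicted continuation and the actual text. 0 means the prediction
    matched perfectly; 1 means no overlap at all."""
    p = Counter(predicted.lower().split())
    a = Counter(actual.lower().split())
    dot = sum(p[w] * a[w] for w in p)
    norm = math.sqrt(sum(v * v for v in p.values())) * \
           math.sqrt(sum(v * v for v in a.values()))
    return 1.0 if norm == 0 else 1.0 - dot / norm

# A prediction that matches the story yields low surprise;
# a completely off-base prediction yields high surprise.
print(chunk_surprise("the dog barked", "the dog barked"))   # ~0.0
print(chunk_surprise("a sunny morning", "the storm struck"))  # 1.0
```

In the actual study, the comparison would run over an LLM’s generated continuations; here the scoring function is isolated so the prediction-error idea stands on its own.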

Modeling the Mind

The results showed a striking alignment: The AI's prediction errors correlated with both what participants reported feeling and the activity patterns in their brain scans.

In other words, the AI’s predictions diverged from the story at the same moments that participants reported surprise and their brain scans indicated it.

That matters because past studies have mainly examined surprise or prediction errors during learning tasks, where the primary measure is a learning outcome rather than the subjective feeling of surprise.

Summe was also interested in whether the neural activity matched people’s self-reported experiences.

“That was a key question we were trying to get at,” Summe said. “How someone thinks about or feels surprise—is that exactly the same as the experience that's going on in their brain?”

In addition, this correlation emerged when researchers analyzed texts in chunks of around 10 to 20 words rather than word by word. This suggests that humans and AI systems encode surprise in narratives at a broader level than individual words—within meaningful segments where actions or ideas unfold.

“You won’t necessarily be surprised by the next word, especially something like ‘at’ or ‘the’,” said Summe. “The experience of surprise happens during the entirety of the event.”
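As a rough illustration of that windowing, a story can be split into event-sized segments before each one is scored; the fixed 15-word chunk size and simple word boundaries below are assumptions for the sketch, not the study’s parameters.

```python
def story_chunks(text: str, size: int = 15) -> list[str]:
    """Split a story into consecutive chunks of roughly `size` words,
    so surprise can be scored per event-sized segment rather than
    word by word."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# A 40-word story becomes chunks of 15, 15, and 10 words.
story = " ".join(f"word{i}" for i in range(40))
print([len(c.split()) for c in story_chunks(story)])  # [15, 15, 10]
```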

Pioneering the Playbook

In addition to insights about how surprise unfolds in language, the project offered Summe a rare chance to work in an emerging field. In fact, few if any studies have explored whether an LLM's prediction errors could serve as a measure of human surprise.

“It’s not an incredibly well-defined area of research. There is not a tried-and-true method people typically use to calculate an LLM’s surprise,” said Summe. “We were kind of on our own. So, that creative process was very interesting.”

In practice, that meant constant problem-solving on the ground: troubleshooting how to measure dissimilarity between predictions and actual text, experimenting with different context windows, fixing technical issues as they arose. Summe was involved throughout the process—coding the behavioral experiment, running participants through the study, and analyzing the data.

She also felt that working hands-on with participants taught her to understand their experience, anticipate what might go wrong, and think through experimental design from the ground up.

"You're often problem-solving as you're going along," Summe said. "If there's an issue during data collection, you have to be very quick and figure out how to resolve it while the participant's right there. Or, there may even be an issue with the way that the data is being saved."

Exploring Across Disciplines

That kind of adaptive thinking—applying computational skills to big questions across domains—is exactly what drew Summe to data science in the first place. 

She started at UChicago as a neuroscience major but hadn't committed to any particular path. Then, in her second year, she took a course called “Quantitative Modeling and Biology.” It covered basic data science techniques applied to biological questions, and something clicked.

"That was probably one of my favorite classes that I had taken," Summe said. “It felt like a very flexible field where you're learning a bunch of skills, but they don't have to apply specifically in a narrow career. The options felt unlimited."

Her route into psychology research was similarly serendipitous. While taking a Mind sequence course to fulfill a social science requirement, her professor learned that Summe wanted to get involved in research—which led to her opportunity at the intersection of AI and human emotion. Now, as a fourth-year, she's applying to graduate programs in health and biomedical informatics, hoping to continue research in academia or industry.

Beyond her interest in building lab experience, Summe was drawn to the research for its potential to illuminate fundamental questions about human cognition. In fact, the findings of the Cognition, Attention and Brain Lab suggest a new approach to studying the mind—one that uses artificial systems as research tools.

"I think this can help us better explore how humans represent events and how they comprehend any type of story," Summe said.