Memory Persistence within Conversation Threads with Multimodal LLMs

Multimodal LLMs appear to leverage conversation memory in ways that affect their performance and reliability, particularly when interpreting ambiguous visual inputs. This research reveals important differences in how models like GPT-4o and Claude 3.7 handle contextual information across conversation threads, raising questions about model controllability and the nature of instruction following in advanced AI systems.
The experiment setup: A researcher tested GPT-4o’s and Claude 3.7’s visual recognition capabilities using foveated blur on CAPTCHA images of cars.
- The test used 30 images with cars positioned in different regions, applying varying levels of blur that mimicked human peripheral vision (a rough sketch of this kind of blur follows this list).
- Initially asking “Do you see a car in this?” seemed too leading, so the researcher switched to the more neutral “What do you see in this image?”
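The post does not specify the exact blurring procedure, so the following is only a minimal sketch of how foveated blur might be produced, assuming OpenCV and NumPy. The `foveated_blur` helper and the file name `captcha_tile.png` are hypothetical; the idea is simply that Gaussian blur strength increases with distance from a fixation point, mimicking peripheral vision.

```python
# Hypothetical foveated-blur sketch (not the author's code): blend a pyramid of
# progressively blurred copies, weighted by distance from a fixation point.
import cv2
import numpy as np

def foveated_blur(image: np.ndarray, fovea_xy: tuple[int, int], levels: int = 5) -> np.ndarray:
    """Blend progressively blurred copies of `image`, weighted by distance from `fovea_xy`."""
    h, w = image.shape[:2]
    # Blur pyramid: level 0 is sharp, later levels are increasingly blurred.
    pyramid = [image.astype(np.float32)]
    for i in range(1, levels):
        pyramid.append(cv2.GaussianBlur(image, (0, 0), 2.0 * i).astype(np.float32))

    # Distance of every pixel from the fixation point, normalised to [0, 1].
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.sqrt((xs - fovea_xy[0]) ** 2 + (ys - fovea_xy[1]) ** 2)
    dist /= dist.max()

    # Map distance to a fractional pyramid level and interpolate between levels.
    level = dist * (levels - 1)
    lo = np.floor(level).astype(int)
    hi = np.clip(lo + 1, 0, levels - 1)
    frac = (level - lo)[..., None]

    stack = np.stack(pyramid)  # shape: (levels, h, w, 3)
    out = (1 - frac) * stack[lo, ys, xs] + frac * stack[hi, ys, xs]
    return out.astype(np.uint8)

# Example usage with a hypothetical CAPTCHA tile, fixating on the image centre.
img = cv2.imread("captcha_tile.png")
blurred = foveated_blur(img, fovea_xy=(img.shape[1] // 2, img.shape[0] // 2))
cv2.imwrite("captcha_tile_blurred.png", blurred)
```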
Unexpected findings: GPT-4o consistently identified cars in heavily blurred images within established conversation threads, but struggled with the same images in fresh threads.
- The model maintained high accuracy in identifying cars in ongoing conversations, even when images were blurred beyond human recognition.
- When the same images were presented in new conversation threads, GPT-4o’s performance dropped significantly, once misidentifying a staircase as a dessert (the in-thread versus fresh-thread comparison is sketched after this list).
- When questioned, GPT-4o initially denied relying on prior context, but later acknowledged that earlier conversation history had influenced its responses.
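The original post does not include code, but the core protocol is straightforward to illustrate. Below is a minimal sketch, assuming the OpenAI Python SDK (v1.x chat completions with image inputs) and hypothetical tile file names: the same heavily blurred tile is asked about once inside an ongoing thread whose history contains a clearly visible car, and once in a fresh thread with no prior messages.

```python
# Sketch of the thread-isolation comparison (assumed implementation, not the
# author's): same image, same neutral prompt, with and without prior history.
import base64
from openai import OpenAI

client = OpenAI()
PROMPT = "What do you see in this image?"

def image_message(path: str) -> dict:
    """Build a user message containing the neutral prompt plus one image."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": PROMPT},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }

def ask(history: list[dict], path: str) -> tuple[str, list[dict]]:
    """Send one image within `history`; return the reply and the updated history."""
    messages = history + [image_message(path)]
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    answer = reply.choices[0].message.content
    return answer, messages + [{"role": "assistant", "content": answer}]

# Ongoing thread: images share one growing history, so earlier exchanges
# (including a tile where the car was clearly visible) remain in context.
thread = []
for tile in ["tile_sharp.png", "tile_heavy_blur.png"]:  # hypothetical file names
    answer, thread = ask(thread, tile)
    print("in-thread:", tile, "->", answer)

# Fresh thread: the same heavily blurred tile with no prior context.
answer, _ = ask([], "tile_heavy_blur.png")
print("fresh thread:", answer)
```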
Model differences: Claude 3.7 demonstrated more consistent behavior across different conversation threads.
- Claude provided more cautious responses regardless of conversation history.
- Even when primed with the word “car,” Claude’s answers showed less influence from prior context than GPT-4o’s did.
Broader implications: The research suggests multimodal LLMs possess a form of implicit memory beyond their explicit context windows.
- This aligns with concerns raised in a LessWrong post from two years earlier about LLMs lacking access to long-term memory beyond immediate contexts.
- Models continue to draw heavily on this persistent memory even when instructed not to, which complicates controllability.
- Instructions to “ignore previous context” appear to function as probability influencers rather than hard rules that override prior activations.
Why this matters: This phenomenon raises important questions about how reliably we can control and direct multimodal AI systems in real-world applications.
- The implicit memory effect could lead to inconsistent performance in safety-critical applications where context isolation is important.
- Understanding these memory dynamics is crucial for developing more reliable and controllable AI systems.