AI systems have crossed a milestone in human-machine interaction: large language models can now fool human judges in formal Turing test scenarios. New research shows that an advanced language model was not merely mistaken for a person but was judged human more often than actual people were, a result that could reshape our understanding of machine intelligence and accelerate the integration of convincingly human-like AI systems into society.
The big picture: For the first time, large language models have formally passed a standard Turing test, with GPT-4.5 being identified as human more often than actual human participants.
Key results: When prompted to adopt a human-like persona, GPT-4.5 was judged to be human 73% of the time, significantly more often than the actual human participants it was paired against.
Why this matters: This research provides the first empirical evidence that an AI system can pass a standard three-party Turing test, the setup proposed by computing pioneer Alan Turing in 1950 as a practical measure of machine intelligence. In the three-party format, an interrogator converses simultaneously with a hidden human and a hidden machine and must decide which is which, so any score above 50% means the model was chosen over the real person.
Implications: The deception capabilities these models demonstrated could intensify debates around AI transparency, digital identity verification, and the need for disclosure when people are interacting with AI systems.