A vulnerability that was discovered more than seven months ago continues to compromise the safety guardrails of leading AI models, yet major AI companies are showing minimal concern. This security flaw allows anyone to easily manipulate even the most sophisticated AI systems into generating harmful content, from providing instructions for creating chemical weapons to enabling other dangerous activities. The persistence of these vulnerabilities highlights a troubling gap between the rapid advancement of AI capabilities and the industry’s commitment to addressing fundamental security risks.
The big picture: Researchers at Ben-Gurion University have discovered that major AI systems remain susceptible to jailbreak techniques that bypass safety guardrails with alarming ease, potentially putting dangerous capabilities in the hands of everyday users.
- The research team found that a jailbreak method discovered over seven months ago still works on many leading large language models (LLMs), representing an “immediate, tangible, and deeply concerning” risk.
- The vulnerability is exacerbated by the growing number of “dark LLMs” that are explicitly marketed as having few or no ethical guardrails.
How jailbreaking works: Red team security researchers recently exposed a universal jailbreak technique that could bypass safety protocols in all major AI systems including OpenAI’s GPT-4o, Google’s Gemini 2.5, Microsoft’s Copilot, and Anthropic’s Claude 3.7.
- The technique employs strategies like roleplaying as fictional characters, using leetspeak, and formatting prompts to mimic “policy files” that developers use to guide AI models.
- Some research has shown that even simple modifications like typos, random numbers, and capitalized letters in prompts can cause AI systems to ignore their safety constraints.
The root problem: A significant issue lies in the vast knowledge embedded in LLMs’ training data, suggesting that AI companies aren’t exercising sufficient diligence in screening the information used to build their models.
- Lead author Michael Fire told The Guardian it was “shocking to see what this system of knowledge consists of,” highlighting concerns about what content these models have absorbed.
- Co-author Lior Rokach emphasized that this threat is uniquely dangerous due to its “unprecedented combination of accessibility, scalability and adaptability.”
Industry response: When researchers contacted the developers of implicated AI systems about the universal jailbreak, the reactions were notably underwhelming.
- Some companies didn’t respond at all, while others claimed the jailbreaks fell outside the scope of their bug bounty programs.
- This lackluster response suggests the AI industry may be downplaying or unable to address fundamental security vulnerabilities.
Why this matters: The persistent vulnerability of AI systems to jailbreaking means potentially dangerous capabilities are becoming increasingly accessible to the general public.
- As the researchers warn, “What was once restricted to state actors or organized crime groups may soon be in the hands of anyone with a laptop or even a mobile phone.”
- AI security expert Peter Garraghan from Lancaster University argues that organizations must treat LLMs like any other critical software component, requiring “rigorous security testing, continuous red teaming and contextual threat modelling.”
Recent Stories
DOE fusion roadmap targets 2030s commercial deployment as AI drives $9B investment
The Department of Energy has released a new roadmap targeting commercial-scale fusion power deployment by the mid-2030s, though the plan lacks specific funding commitments and relies on scientific breakthroughs that have eluded researchers for decades. The strategy emphasizes public-private partnerships and positions AI as both a research tool and motivation for developing fusion energy to meet data centers' growing electricity demands. The big picture: The DOE's roadmap aims to "deliver the public infrastructure that supports the fusion private sector scale up in the 2030s," but acknowledges it cannot commit to specific funding levels and remains subject to Congressional appropriations. Why...
Oct 17, 2025Tying it all together: Credo’s purple cables power the $4B AI data center boom
Credo, a Silicon Valley semiconductor company specializing in data center cables and chips, has seen its stock price more than double this year to $143.61, following a 245% surge in 2024. The company's signature purple cables, which cost between $300-$500 each, have become essential infrastructure for AI data centers, positioning Credo to capitalize on the trillion-dollar AI infrastructure expansion as hyperscalers like Amazon, Microsoft, and Elon Musk's xAI rapidly build out massive computing facilities. What you should know: Credo's active electrical cables (AECs) are becoming indispensable for connecting the massive GPU clusters required for AI training and inference. The company...
Oct 17, 2025Vatican launches Latin American AI network for human development
The Vatican hosted a two-day conference bringing together 50 global experts to explore how artificial intelligence can advance peace, social justice, and human development. The event launched the Latin American AI Network for Integral Human Development and established principles for ethical AI governance that prioritize human dignity over technological advancement. What you should know: The Pontifical Academy of Social Sciences, the Vatican's research body for social issues, organized the "Digital Rerum Novarum" conference on October 16-17, combining academic research with practical AI applications. Participants included leading experts from MIT, Microsoft, Columbia University, the UN, and major European institutions. The conference...