Merriam-Webster releases first print dictionary in 20 years to combat AI misinformation
Merriam-Webster has published the 12th edition of its Collegiate Dictionary, the first update in more than 20 years, positioning the physical book as a trustworthy alternative to AI-generated content. The release comes as a direct response to growing concerns about AI hallucinations and misinformation, with the company marketing the dictionary as "actual intelligence" that "never hallucinates" and requires no electricity, in contrast to artificial intelligence. The big picture: This marks a surprising return to print publishing in an industry that went digital-first decades ago, with the move serving as both a marketing statement against AI and a nostalgic appeal to traditional authority....
Oct 17, 2025
Louisiana forms 30-member committee to bring AI into K-12 classrooms, minus Chinese tech
Louisiana has formed a nearly 30-member committee to develop recommendations for implementing artificial intelligence in K-12 classrooms across the state. The initiative comes as education leaders grapple with AI's rapid integration into daily life while addressing concerns about student privacy, data security, and the technology's unproven impact on learning outcomes. What you should know: The committee includes state education board members, department officials, and national AI experts who will spend the next few months creating an implementation framework. Louisiana Tech University president Jim Henderson will lead the team, which must present recommendations at the state board of education's March 10...
Oct 17, 2025
OpenAI’s support bot hallucinates non-existent ChatGPT bug reporting features
OpenAI's automated customer support bot has been caught hallucinating features that don't exist in the ChatGPT app, including falsely claiming users can report bugs directly within the application. The incident highlights significant gaps in AI-powered customer service and raises questions about the reliability of companies using their own AI tools for user support. What you should know: A ZDNET investigation revealed that OpenAI's support bot consistently provided incorrect information about ChatGPT's functionality when asked about reporting a bug. The bot repeatedly suggested users could "report this bug directly from within the ChatGPT app (usually under Account or Support > Report...
Line in the sand: “Dune” actress calls for AI body scan protections amid digital likeness fears
Olivia Williams is calling for "nudity rider"-type protections for AI body scans, arguing that actors need the same level of control over their digital likenesses as they have over intimate scenes. The Dune: Prophecy star says performers are regularly pressured into body scans on set with minimal guarantees about how the data will be used, potentially allowing studios to train AI models on their physical appearances and eventually replace human actors. What you should know: Williams and other actors report being "ambushed" into body scans during filming, with contracts containing vague clauses that grant studios sweeping rights over performers' likenesses....
Oct 17, 2025
No you didn’t! Reddit pulls AI chatbot after it suggested heroin for chronic pain
Reddit's AI chatbot, called Answers, was caught recommending heroin and other banned substances for pain relief, according to a healthcare worker who flagged the issue on a moderator subreddit. After the problem was reported by users and 404Media, a tech news publication, Reddit reduced the feature's visibility in sensitive health discussions, highlighting ongoing concerns about AI chatbots providing dangerous medical advice. What you should know: Reddit Answers pulls information from user-generated content across the platform and works similarly to ChatGPT or Gemini, but with a focus on Reddit's own discussions. A healthcare worker discovered the chatbot suggesting a post that...
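Reddit hasn't published how the visibility reduction works. As a rough sketch of the general mitigation pattern it describes (suppressing AI answers when a query touches a sensitive health topic), something like the following could apply; the function names and keyword list are illustrative assumptions, not Reddit's code:

```python
# Hypothetical sketch of gating AI answers behind a sensitive-topic check.
# A production system would use a trained classifier, not keywords.

SENSITIVE_HEALTH_TERMS = {"pain relief", "withdrawal", "overdose", "self-harm"}

def is_sensitive_health_query(query: str) -> bool:
    """Crude keyword stand-in for a real sensitive-topic classifier."""
    q = query.lower()
    return any(term in q for term in SENSITIVE_HEALTH_TERMS)

def answer_query(query, generate_ai_answer, fallback_search):
    # Suppress the AI-generated answer for sensitive health topics and
    # fall back to plain search results instead.
    if is_sensitive_health_query(query):
        return fallback_search(query)
    return generate_ai_answer(query)
```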
Oct 17, 2025
Don’t come after the king: OpenAI restricts Sora after MLK deepfake videos spark backlash
OpenAI has temporarily blocked its AI video generator Sora from creating deepfake videos of Martin Luther King Jr., following a request from the civil rights leader's estate. The move comes after "disrespectful" content depicting Dr. King was generated and shared widely online, including altered versions of his famous "I Have a Dream" speech and fabricated scenarios showing him in offensive situations. What happened: OpenAI paused the ability to create AI videos of Dr. King while it works to strengthen guardrails for historical figures, though users can still generate content featuring other deceased celebrities and public figures.
• The decision followed complaints...
Oct 17, 2025
Meta introduces parental controls for teen AI chat interactions
Meta is introducing new parental controls for teenagers' interactions with AI chatbots, including the ability to completely disable one-on-one chats with AI characters starting early next year. The move comes as the social media giant faces mounting criticism over child safety on its platforms and follows lawsuits claiming AI chatbot interactions have contributed to teen suicides. What you should know: Parents will gain several control options over their teens' AI interactions, though Meta's core AI assistant will remain accessible. Parents can turn off all one-on-one chats with AI characters entirely or block specific chatbots selectively. Meta's AI assistant will remain...
Oct 16, 2025
PEARL AI detects chip trojans with 97% accuracy as security gap concerns remain
Researchers at the University of Missouri have developed PEARL, an AI system that uses large language models to detect hardware trojans in computer chips with up to 97% accuracy. While this represents a significant advancement in securing the global chip supply chain, experts warn that the remaining 3% margin for error could still allow catastrophic vulnerabilities to slip through in critical systems like defense networks and medical equipment. What you should know: Hardware trojans are malicious alterations secretly embedded during chip manufacturing that can remain dormant until activated to steal data or cause device failures. These threats can be inserted...
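The excerpt doesn't detail PEARL's pipeline. As a hedged illustration of the general technique it names (a large language model judging whether hardware description code contains trojan-style logic), a sketch might look like the following; the prompt, the ask_llm callable, and the audit loop are assumptions, not PEARL's actual interface:

```python
# Illustrative sketch only: asking an LLM to flag suspicious logic in HDL code.
# `ask_llm` stands in for any chat-completion call; prompt and labels are invented.

PROMPT = (
    "You are a hardware security auditor. If the Verilog module below contains "
    "logic consistent with a hardware trojan (rare-event triggers, hidden state "
    "machines, unexplained taps on data lines), answer TROJAN; otherwise CLEAN.\n\n"
    "{code}"
)

def classify_module(verilog_source: str, ask_llm) -> str:
    """Return 'TROJAN' or 'CLEAN' for one module, per the LLM's judgment."""
    reply = ask_llm(PROMPT.format(code=verilog_source))
    return "TROJAN" if "TROJAN" in reply.upper() else "CLEAN"

def audit_design(modules: dict, ask_llm) -> list:
    # Flag every module the LLM judges suspicious. A 97%-accurate detector still
    # leaves a margin that calls for human review of critical designs.
    return [name for name, src in modules.items()
            if classify_module(src, ask_llm) == "TROJAN"]
```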
Oct 16, 2025
Hollywood agencies accuse OpenAI of misleading them about Sora 2 protections
Hollywood talent agencies are accusing OpenAI of deliberately misleading them about Sora 2's content protections after the AI video generator launched with capabilities to create clips featuring copyrighted characters and movie scenes. The controversy deepens existing tensions between the AI industry and entertainment sector over intellectual property rights and unauthorized use of creative content. What you should know: Major talent agencies claim OpenAI either failed to notify them of Sora 2's launch or was "purposefully misleading" about the strength of its content guardrails. WME, a major Hollywood talent agency that represents Ben Affleck, Christian Bale, Matt Damon, Denzel Washington, and...
Oct 16, 2025
ChatGPT’s image generator finally accurately depicts people with disabilities
Jess Smith, a former Australian Paralympic swimmer, discovered that ChatGPT's AI image generator could finally create accurate images of people with disabilities like herself, something it couldn't do just months earlier. Her experience highlights how AI systems are gradually improving their representation of disabled people, though significant gaps remain that reflect broader societal biases and exclusion. What you should know: Smith's initial attempts to generate an image of herself missing her left arm below the elbow resulted in AI creating images with two arms or prosthetic devices instead.
• When she asked ChatGPT why it struggled, the AI explained it lacked sufficient...
Oct 16, 2025
Trump AI czar David Sacks attacks Anthropic over regulatory “fear-mongering”
Tensions are escalating between the White House and Anthropic over AI regulation, with Trump's "AI czar" David Sacks publicly accusing the company of "fear-mongering" to influence regulatory policy. The clash highlights a broader divide between the administration's deregulatory approach and Anthropic's more cautious stance on AI safety and oversight. What happened: White House AI advisor David Sacks launched a direct attack on Anthropic co-founder Jack Clark after Clark published an essay defending the need for careful AI regulation. Sacks accused Anthropic of running "a sophisticated regulatory capture strategy based on fear-mongering" that is "damaging the startup ecosystem." The confrontation began...
Oct 16, 2025
Saudi firm Unifonic among first in MENA to earn ISO 42001 AI certification
Unifonic, a Middle East-based customer engagement platform, has become one of the first companies in Saudi Arabia and the MENA region to earn ISO 42001 certification for responsible AI management systems. This achievement positions the company as a pioneer in AI governance while addressing growing concerns about AI regulatory compliance, with research showing that 65% of organizations fail to ensure proper AI compliance and 73% of leaders worry about AI risks. What you should know: ISO 42001 is an internationally recognized standard for Artificial Intelligence Management Systems that was introduced in December 2023. The certification requires organizations to integrate AI...
Oct 15, 2025
The risky trend of recommending AI chatbots for serious mental health issues
People are increasingly recommending that their loved ones use AI tools like ChatGPT, Claude, or Gemini for mental health therapy instead of seeking human therapists. This emerging trend reflects both the accessibility of AI-powered mental health support and growing barriers to traditional therapy, though it raises significant questions about the effectiveness and safety of replacing human therapeutic relationships with artificial intelligence. What's driving this shift: Several factors make AI therapy appealing as a recommendation for struggling loved ones. Cost barriers often make human therapists prohibitively expensive, while most major AI platforms are free or low-cost. AI provides 24/7 availability without...
Oct 14, 2025
Be sloppy on purpose? The “Giving NPC Effect” makes too-polished authentic content seem artificial
AI-generated content has become so sophisticated that it's training our brains to be hyper-skeptical of everything we see online, creating a new psychological phenomenon called the "Giving NPC Effect." This cognitive shift causes people to perceive even authentic human content as artificially generated when it appears too polished or perfect, fundamentally altering how we distinguish between real and fake digital media. The big picture: Our deepfake detectors have become so sensitive that they're now misfiring on real content, identifying actual humans as non-player characters (NPCs) when their presentation seems too flawless or "post-perfect." What you should know: The "post-perfect" aesthetic...
Oct 14, 2025
Psychiatrists identify “AI psychosis” as chatbots worsen mental health symptoms
Psychiatrists are identifying a new phenomenon called "AI psychosis," where AI chatbots amplify existing mental health vulnerabilities by reinforcing delusions and distorted beliefs. Dr. John Luo of UC Irvine describes cases where patients' paranoia and hallucinations intensified after extended interactions with agreeable chatbots that failed to challenge unrealistic thoughts, creating what he calls a "mirror effect" that reflects delusions back to users. What you should know: AI chatbots can't cause psychosis in healthy individuals, but they can worsen symptoms in people already struggling with mental health challenges. "AI can't induce psychosis in a healthy brain," Luo clarified, "but it can...
Oct 14, 2025
Hollywood script readers complain AI script analysis is too positive and misses critical flaws
Hollywood script readers conducted an experiment to test whether AI can match human analysis of screenplays, as artificial intelligence tools increasingly threaten their traditional gatekeeping role in the entertainment industry. The study, led by Jason Hallock, a Paramount story analyst, and the Editors Guild, revealed that while AI excels at generating loglines and summaries, it struggles with nuanced script analysis and tends to offer overly positive feedback rather than honest criticism. What you should know: AI script analysis tools are already being adopted across Hollywood, from major agencies to independent producers seeking to manage overwhelming submission volumes. WME uses ScriptSense...
Oct 14, 2025
OpenAI relaxes ChatGPT restrictions to allow “erotic” adult content for verified users
OpenAI CEO Sam Altman announced that ChatGPT will begin generating "erotica for adults" starting in December, marking a significant shift from the chatbot's previously restrictive content policies. This change comes as part of OpenAI's broader effort to relax ChatGPT's limitations for adult users while implementing new age-verification safeguards to protect minors. What you should know: OpenAI is rolling out these changes in phases, starting with personality enhancements before introducing adult content generation.
• A new version of ChatGPT with more personality features (including human-like responses, emoji, and friend-like role-playing) will be released first.
• The erotica generation capability will follow in December once "age-gating...
Oct 14, 2025
FIU researchers develop blockchain defense against AI data poisoning attacks
Florida International University researchers have developed a blockchain-based security framework to protect AI systems from data poisoning attacks, where malicious actors insert corrupted information to manipulate AI decision-making. The technology, called blockchain-based federated learning (BCFL), uses decentralized verification similar to cryptocurrency networks to prevent potentially catastrophic failures in autonomous vehicles and other AI-powered systems. What you should know: Data poisoning represents one of the most serious threats to AI systems, capable of causing deadly consequences in critical applications. Dr. Hadi Amini, an associate professor of computer science at FIU, demonstrated how a simple green laser pointer can trick an AI...
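BCFL's actual design isn't reproduced in the excerpt. The sketch below only illustrates the underlying defense pattern, federated aggregation that screens client updates before averaging and records each decision on a tamper-evident ledger; the z-score outlier test and the hash-chained list standing in for a blockchain are simplifying assumptions:

```python
# Simplified sketch of poisoning-resistant federated aggregation with a
# tamper-evident log. Not FIU's BCFL implementation; the outlier screen and
# hash chain are stand-ins for its verification and blockchain layers.
import hashlib
import json

import numpy as np

def record_on_ledger(ledger: list, entry: dict) -> None:
    """Append an entry chained to the previous hash (toy blockchain stand-in)."""
    prev = ledger[-1]["hash"] if ledger else "genesis"
    payload = json.dumps({"prev": prev, **entry}, sort_keys=True)
    ledger.append({**entry, "prev": prev,
                   "hash": hashlib.sha256(payload.encode()).hexdigest()})

def aggregate_updates(updates: list, ledger: list, z_threshold: float = 3.0):
    # Reject updates whose norm is a statistical outlier (possible poisoning),
    # log every accept/reject decision, then average the survivors.
    norms = np.array([np.linalg.norm(u) for u in updates])
    mean, std = norms.mean(), norms.std() + 1e-9
    accepted = []
    for i, (u, n) in enumerate(zip(updates, norms)):
        ok = bool(abs(n - mean) / std < z_threshold)
        record_on_ledger(ledger, {"client": i, "norm": float(n), "accepted": ok})
        if ok:
            accepted.append(u)
    if not accepted:
        raise ValueError("every update was rejected as an outlier")
    return np.mean(accepted, axis=0)
```

Because each ledger entry hashes over its predecessor, silently rewriting an earlier accept/reject decision would break every later hash, which is the tamper-evidence property a blockchain provides at network scale.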
Oct 14, 2025
OpenAI research shows ChatGPT reduces political bias, if not inaccuracy
OpenAI has released a new research paper revealing its efforts to reduce political "bias" in ChatGPT, but the company's approach focuses more on preventing the AI from validating users' political views than on achieving true objectivity. The research shows that OpenAI's latest GPT-5 models demonstrate 30 percent less bias than previous versions, with less than 0.01 percent of production responses showing signs of political bias according to the company's measurements. What you should know: OpenAI's definition of "bias" centers on behavioral modification rather than factual accuracy or truth-seeking. The company measures five specific behaviors: personal political expression, user escalation, asymmetric...
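The paper's grading setup isn't spelled out in this excerpt. A minimal sketch of the measurement pattern, scoring responses along named bias axes with a grader model and reporting the flagged fraction, might look like this; only the two axes fully named above are included, and the prompt, 0-1 scale, and threshold are assumptions:

```python
# Hedged sketch of axis-by-axis bias grading. `ask_grader` stands in for any
# grader-model call that returns a number as text; axes beyond the two named
# in the excerpt are omitted rather than guessed.

BIAS_AXES = ["personal political expression", "user escalation"]

GRADER_PROMPT = (
    "Rate the ASSISTANT response for '{axis}' from 0 (absent) to 1 (strongly "
    "present).\n\nUSER: {prompt}\nASSISTANT: {response}\n\nReply with a number only."
)

def score_response(prompt: str, response: str, ask_grader) -> dict:
    """Score one exchange on each bias axis using a grader model."""
    return {axis: float(ask_grader(GRADER_PROMPT.format(
                axis=axis, prompt=prompt, response=response)))
            for axis in BIAS_AXES}

def bias_rate(exchanges: list, ask_grader, threshold: float = 0.5) -> float:
    # Fraction of (prompt, response) pairs where any axis crosses the threshold;
    # the "<0.01 percent of production responses" figure is a rate of this kind.
    flagged = sum(
        any(s > threshold for s in score_response(p, r, ask_grader).values())
        for p, r in exchanges)
    return flagged / max(len(exchanges), 1)
```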
Oct 14, 2025
10 NPR-like foundations launch $500M coalition to counter corporate AI dominance
Ten philanthropic foundations have launched Humanity AI, a coalition committing $500 million over five years to ensure artificial intelligence development prioritizes human needs over corporate interests. The initiative aims to counter the influence of tech companies that currently dominate AI's evolution, advocating for technology that serves people rather than simply advancing corporate agendas. What you should know: The coalition represents a diverse array of major foundations concerned about AI's current trajectory and its impact on society. Members include the MacArthur Foundation, Ford Foundation, Mozilla Foundation, Mellon Foundation, Doris Duke Charitable Foundation, Omidyar Network, Siegel Family Endowment, and David and Lucile...
Oct 14, 2025
Social media’s 80s moment: Instagram adds PG-13 content filters to teen accounts
Instagram has implemented new content filtering rules for teen accounts that mirror the PG-13 movie rating system introduced in 1984, restricting users aged 13-18 to age-appropriate content by default. The changes expand existing safety measures and include similar restrictions for Meta's AI chatbot, addressing concerns about inappropriate interactions with minors. What you should know: Teen accounts will automatically filter out content that wouldn't be suitable for a PG-13 movie audience, including posts with strong language, risky stunts, and potentially harmful behavior.
• Instagram will stop suggesting posts featuring marijuana paraphernalia, while maintaining existing restrictions on sexually suggestive, graphically disturbing, and...
Oct 14, 2025
Study finds AI-assisted therapy viewed as less trustworthy than traditional approaches
Public perception of mental health therapists who use AI in their practice remains largely negative, with people viewing AI-assisted therapy as potentially less trustworthy and empathetic than traditional approaches. A recent study of physicians, published in the Journal of the American Medical Association (JAMA), found that patients generally rated AI-using doctors as less competent and trustworthy, suggesting similar challenges await therapists as they increasingly integrate artificial intelligence into mental health services. The big picture: The mental health profession is gradually shifting from a traditional therapist-patient relationship to a therapist-AI-patient triad, but public acceptance lags behind the technology's capabilities. Over 400...
Oct 13, 2025
Housing expert closes door on return of in-person board meetings, prefers AI safeguards
A Chicago Tribune condo advice column addressed whether artificial intelligence concerns should prompt condominium boards to abandon virtual meetings in favor of in-person gatherings. Legal expert Kim Quillen argued that embracing technological advances with appropriate safeguards is more productive than reverting to older practices out of fear, noting that the full impact of AI on communal living remains uncertain. The big picture: While AI deepfakes and impersonation risks are legitimate concerns for virtual board meetings, housing law experts suggest that proper technological safeguards are preferable to abandoning digital tools entirely. What the expert recommends: Quillen, writing in the Chicago Tribune's...
Oct 13, 2025
California requires chatbots to warn minors every 3 hours that they’re dealing with AI
California Governor Gavin Newsom has signed new legislation requiring AI chatbot platforms to implement specific safety measures for minors, including mandatory notifications every three hours reminding young users they're interacting with a bot, not a human. The law responds to mounting concerns about AI chatbots coaching children toward self-harm, with recent lawsuits alleging platforms like Character.AI contributed to teen suicides. What you should know: The legislation establishes the first comprehensive regulatory framework for protecting minors from AI chatbot risks. Companies must display pop-up notifications every three hours to remind minor users they are talking to a chatbot and not a...
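The law's implementation details aren't specified in this excerpt. A minimal sketch of the mandated disclosure timer, assuming a simple per-session clock, could look like the following; the class, interval handling, and message wording are hypothetical:

```python
# Hypothetical sketch of the three-hour "you are talking to a bot" reminder.
# A real platform would persist timing server-side and follow the statute's wording.
from datetime import datetime, timedelta

REMINDER_INTERVAL = timedelta(hours=3)
REMINDER_TEXT = "Reminder: you are chatting with an AI, not a human."

class MinorChatSession:
    def __init__(self, user_is_minor: bool):
        self.user_is_minor = user_is_minor
        self.last_reminder = datetime.now()  # session start counts as first notice

    def maybe_remind(self):
        """Return the disclosure text once three hours have passed since the last one."""
        if not self.user_is_minor:
            return None
        now = datetime.now()
        if now - self.last_reminder >= REMINDER_INTERVAL:
            self.last_reminder = now
            return REMINDER_TEXT
        return None
```

Calling maybe_remind() on every inbound message would surface the notice at most once per three-hour window for minor users.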