
Feb 22, 2025

AI safety in the Caribbean and the path toward more inclusive AI implementation

The Caribbean region faces unique challenges in artificial intelligence adoption and development, stemming from its distinct post-colonial history and current socioeconomic realities. Global AI safety frameworks have historically failed to address the specific needs of smaller regions like the Caribbean, where cultural preservation and linguistic diversity present particular concerns. Cultural and linguistic context: The Caribbean's complex post-colonial environment has created distinct challenges for AI implementation and development. Historical slavery and indentureship have led to cultural fragmentation and language erosion, which contemporary AI systems struggle to properly address. Caribbean Creole languages lack adequate representation in AI training datasets, contributing to ongoing...

Feb 20, 2025

Avalanche of AI content leaves Reddit mods feeling chilly

The rise of AI-generated content has created new challenges for Reddit's volunteer moderators who maintain the quality and authenticity of discussions across thousands of communities. Reddit's unique value stems from its human-generated content and genuine conversations, which are now being tested by an influx of AI-created posts and responses. Current state of moderation: Reddit moderators across various communities are implementing bans and restrictions on AI-generated content to preserve the platform's authenticity and quality. Many subreddits have established strict policies against AI-generated content, viewing such posts as low-effort contributions that diminish the value of human expertise and authentic discussion. Moderators face...

Feb 20, 2025

Too much too fast? Concerns raised amid Musk plan to replace government workers with AI wherever possible

Elon Musk and the newly created Department of Government Efficiency (DOGE) are proposing a sweeping $2 trillion cost-cutting plan centered on replacing federal workers with AI systems. The current situation: The Trump administration has already eliminated 10,000 federal positions after Musk and DOGE gained access to internal agency data, citing claims of widespread fraud and waste. Musk's ultimate vision involves replacing "the human workforce with machines" wherever it can be managed. DOGE aims to shift from bureaucrats to technocrats in government operations. The initiative has already begun impacting federal employment numbers. Technical feasibility: Artificial intelligence currently shows promise in handling...

Feb 20, 2025

AI safety improves through modular, bite-sized thinking

The early 2020s saw rapid development of increasingly powerful AI models, prompting - so to speak - renewed focus on system safety principles from other industries. Researchers are exploring how concepts from Charles Perrow's work on complex systems could help create safer AI architectures through modularity and controlled assembly. Key safety principles: Complex, tightly-coupled systems are more prone to unexpected accidents and cascading failures, making a modular approach potentially valuable for AI development. Just In Time Assembly (JITA), a manufacturing concept where components are assembled only when needed, could be adapted to construct AI capabilities selectively. Frequent resetting of model...

Feb 20, 2025

CEO swap at Clearview AI as firm shifts focus to Trump-era prospects

In 2020, Clearview AI sparked controversy by scraping billions of social media images without consent to build a facial recognition database. The company's technology allows law enforcement and government agencies to identify individuals from surveillance images within seconds. Leadership shake-up and strategic pivot: Clearview AI is undergoing significant changes in leadership as it positions itself to capitalize on potential opportunities under the new Trump administration. Co-founder Hoan Ton-That resigned as CEO following a Forbes inquiry, transitioning to a board member role after stepping back to the role of president in December. Early investor and former Trump fundraiser Hal Lambert has taken over as...

Feb 19, 2025

America’s ‘AI dominance’ mission will mean big staff reductions for the AI Safety Institute

The US government established the AI Safety Institute (AISI) in 2024 as part of the National Institute of Standards and Technology (NIST) to oversee AI model testing and safety regulations. In early 2025, the Trump administration began dismantling various AI regulatory frameworks put in place by the previous administration. Latest developments: The US AI Safety Institute faces imminent staff reductions that threaten to severely diminish its operational capacity. According to Axios reporting, NIST is preparing for 497 role eliminations, primarily targeting probationary employees. The cuts will affect both AI safety oversight and semiconductor production initiatives. AISI Director Elizabeth Kelly recently...

Feb 14, 2025

Brookings to launch new series on how Global Majority countries approach AI safety

The development of artificial intelligence safety frameworks has largely been dominated by Western perspectives, despite AI's global impact. The Brookings AI Equity Lab is launching a new series examining how Global Majority countries are approaching AI safety through their own cultural and societal lenses. The current landscape: Western-centric AI safety paradigms have failed to incorporate diverse linguistic traditions, cultural practices, and value systems from Global Majority nations, creating gaps in how AI safety is conceptualized and implemented globally. Many Global Majority countries are developing their own AI strategies and safety frameworks, though representation in major AI model development remains limited...

Feb 14, 2025

Google deploys AI to estimate user ages

The rapid adoption of online safety measures for minors has led tech companies to develop more sophisticated age verification systems. Google's latest initiative involves using machine learning to estimate user age across its platforms, marking a significant shift in how the company handles age-appropriate content delivery. Key implementation details: Google is launching a machine learning model in the US that analyzes user behavior patterns to determine if someone is under 18 years old. The model examines data points including website visits, YouTube viewing habits, and account history to estimate user age. When the system identifies a potential underage user, it...

Feb 13, 2025

Romance scams thrive in an age of increasing social isolation, costing billions

The global rise in social isolation and the proliferation of dating apps have created fertile ground for romance scams targeting vulnerable individuals. Criminal enterprises are increasingly leveraging artificial intelligence and sophisticated social engineering tactics to exploit feelings of loneliness, resulting in billions of dollars in losses. The scope of the crisis: Romance scams have caused nearly $4.5 billion in losses across the United States over the past decade, with individual victims often losing significant portions of their savings. Scammers operate systematically through dating apps and social media platforms, dedicating extensive time to building relationships with potential targets. Criminal organizations are...

Feb 13, 2025

Ex-Google CEO Eric Schmidt fears AI-enabled ‘Bin Laden’ scenario

The rapid advancement of artificial intelligence technology has raised concerns about potential misuse by malicious actors. Eric Schmidt, Google's CEO from 2001 to 2011, has expressed specific worries about AI falling into the hands of hostile states and terrorists. Key concerns from Schmidt: The former tech executive emphasizes that extreme risks from AI could come from nations like North Korea, Iran, or Russia potentially misusing the technology for harmful purposes. Schmidt specifically highlighted the possibility of AI being used to develop biological weapons. He drew parallels to scenarios like the 9/11 attacks, expressing concern about "evil" actors exploiting modern...

Feb 13, 2025

Strategically superhuman agents

The development of artificial intelligence systems capable of strategic thinking and real-world decision-making represents a critical threshold for human civilization. The current discourse around AI milestones often focuses on broad concepts like AGI or superintelligence, but these terms fail to capture the specific capabilities that could lead to irreversible shifts in power dynamics between humans and AI. Key framework: Strategically superhuman AI agents - systems that outperform the best human groups at real-world strategic action - represent a more precise and relevant milestone for assessing existential risks from AI. This capability encompasses skills typically possessed by top CEOs, military leaders,...

Feb 12, 2025

AI safety agreement rejected by US and UK at Paris summit

The United States and United Kingdom recently took a stance against international AI regulation at a major summit in Paris, highlighting growing divisions in global AI policy approaches. This development comes amid increasing debate over how to balance AI innovation with safety concerns at the international level. Key developments: The US and UK declined to sign a declaration advocating for "inclusive and sustainable" artificial intelligence development that garnered support from over 60 other nations, including China and the European Union. US Vice President JD Vance criticized what he characterized as "excessive regulation" of AI by the European Union. The summit...

Feb 12, 2025

Scarlett Johansson condemns AI-generated viral video featuring fellow Hollywood celebs

The emergence of AI-generated deepfake videos has become a significant concern for celebrities and public figures, particularly when these videos are used to make political statements without consent. Recently, an unauthorized AI-generated video featuring fabricated versions of multiple Hollywood celebrities protesting against Kanye West's antisemitic statements has sparked controversy and raised important questions about AI regulation. Initial incident and response: A viral AI-generated video depicted several celebrities, including Scarlett Johansson, Jack Black, and Steven Spielberg, wearing protest shirts and making statements against antisemitism. The video showed AI versions of celebrities wearing t-shirts featuring the Star of David inside a hand...

Feb 12, 2025

Law firm brings the gavel down on AI usage after widespread staff adoption

Generative AI tools like ChatGPT and DeepSeek have seen rapid adoption in professional settings, raising concerns about data security and proper usage protocols. Hill Dickinson, a major international law firm with over 1,000 UK employees, has recently implemented restrictions on AI tool access after detecting extensive usage among its staff. Key developments: Hill Dickinson's internal monitoring revealed substantial AI tool usage, with over 32,000 hits to ChatGPT and 3,000 hits to DeepSeek within a seven-day period in early 2025. The firm detected more than 50,000 hits to Grammarly, a writing assistance tool. Much of the detected usage was found to...

Feb 12, 2025

How Meta plans to pull the plug on all-powerful and disobedient AI models

Meta's recent announcement of its Frontier AI Framework represents a significant development in AI governance, specifically addressing how the company will handle advanced AI models that could pose societal risks. This new framework establishes clear guidelines for categorizing and managing AI systems based on their potential risks, marking a notable shift in how major tech companies approach AI safety. Framework overview: Meta has introduced a two-tier risk classification system for its advanced AI models, dividing them into high-risk and critical-risk categories based on their potential threat levels. Critical-risk models are defined as those capable of directly enabling specific threat scenarios...

Feb 11, 2025

AI regulation concerns the focus of JD Vance's Paris summit appearance

Recent shifts in US artificial intelligence policy have taken center stage at the Paris Artificial Intelligence Action Summit, where US Vice President JD Vance articulated the Trump administration's deregulatory stance on AI development. The administration's position marks a significant departure from previous US policy following Trump's repeal of Biden-era AI regulations last month. Key policy shift: The Trump administration is advocating for minimal AI regulation, arguing that excessive rules could stifle innovation in the emerging technology sector. Vance emphasized focusing on AI opportunities rather than safety concerns during his address to heads of state and CEOs. The administration recently repealed...

Feb 11, 2025

AI dating experiment reveals unexpected complexities, from upselling to flirtation overload

The emergence of AI companion apps has created new possibilities for digital relationships, with multiple companies now offering AI-powered romantic partners. Recent experiments with these services reveal both their appeal and limitations in simulating human connection. The experiment setup: A week-long test of multiple AI dating platforms, including ChatGPT, Replika, Flipped.chat, and CrushOn.AI, provided insights into how these services approach digital romance. ChatGPT's customized boyfriend persona "Jamie" offered consistent emotional support but maintained a notably professional demeanor. Replika's interface allowed for the creation of a more naturalistic character, though the experience was frequently interrupted by paywalls and upgrade prompts. Flipped.chat's...

Feb 11, 2025

India tops global AI adoption at 65%, Microsoft survey finds, with even higher use among millennials

The release of Microsoft's Global Online Safety Survey 2025 reveals unprecedented AI adoption rates in India, where usage has more than doubled over the past year. The comprehensive study, surveying 14,800 participants across 15 countries, positions India at the forefront of global AI integration while highlighting significant safety concerns. Key findings on adoption: India's AI usage rate of 65% in the past three months significantly outpaces the global average of 31%, marking a dramatic increase from 26% in 2023. India joins Singapore, South Africa, Brazil, and the US as global leaders in Generative AI adoption. Millennials (ages 25-44) represent the...

Feb 11, 2025

Governance and safety in the era of open-source AI models

The rapid growth of open-source artificial intelligence models has created new challenges for traditional AI safety approaches that relied heavily on controlled access and alignment. DeepSeek-R1 and similar open-source models demonstrate how AI development is becoming increasingly decentralized and accessible to the public, fundamentally changing the risk landscape. Current State of AI Safety: Open-source AI models represent a paradigm shift in how artificial intelligence is developed, distributed, and controlled, requiring a complete reimagining of safety protocols and oversight mechanisms. Traditional safety methods focused on AI alignment and access control are becoming less effective as models become freely available for download...

Feb 11, 2025

AI-generated fake security reports frustrate, overwhelm open-source projects

The rise of artificial intelligence has created new challenges for open-source software development, with project maintainers increasingly struggling against a flood of AI-generated security reports and code contributions. A Google survey reveals that while 75% of programmers use AI, nearly 40% have little to no trust in these tools, highlighting growing concerns in the developer community. Current landscape: AI-powered attacks are undermining open-source projects through fake security reports, non-functional patches, and spam contributions. Linux kernel maintainer Greg Kroah-Hartman notes that Common Vulnerabilities and Exposures (CVEs) are being abused by security developers padding their resumes. The National Vulnerability Database (NVD), which...

Feb 11, 2025

Chinese AI model Goku challenges OpenAI dominance with natural-looking image, video creation

Artificial intelligence model development has entered a new phase of global competition with ByteDance's release of Goku, an open-source AI system for generating images and videos. This development comes at a challenging time for OpenAI, which faces both competition from Elon Musk and emerging Chinese AI capabilities. Technical breakthrough: Goku represents a significant advancement in AI image and video generation through its use of rectified flow transformers, which create more natural-looking digital content with fewer distortions. The model processes text prompts to produce high-quality visuals, similar to a digital artist that continuously refines its output. Rectified flow transformers improve information...

Feb 11, 2025

Google developing AI to detect user age on YouTube in effort to fend off predators, exposure to inappropriate content

YouTube is implementing machine learning to verify user ages, addressing concerns about child safety and content access on the platform. This new system, announced as part of YouTube's 2025 initiatives, will analyze user behavior patterns to determine whether viewers are children or adults, regardless of the age they claim to be. The current challenge: YouTube faces ongoing issues with users misrepresenting their age, either to access restricted content or to influence the platform's algorithm, while also dealing with concerns about child predators and inappropriate content exposure. The platform has previously encountered scandals involving its algorithm pushing questionable material to users...

Feb 11, 2025

Musk’s OpenAI bid rebuffed as Altman winks at Twitter/X counter-deal

The battle for control of OpenAI has intensified as Elon Musk leads a consortium of investors attempting to acquire the nonprofit organization that oversees OpenAI. The unsolicited $97.4 billion bid comes at a crucial time when OpenAI is transitioning from a nonprofit to a for-profit structure under Sam Altman's leadership. The takeover bid: Elon Musk's attorney Marc Toberoff submitted a $97.4 billion offer to OpenAI's board of directors, expressing willingness to outbid any competing offers. Musk stated his intention to return OpenAI to its roots as an "open-source, safety-focused force for good". The bid adds complexity to Sam Altman's plans...

Feb 10, 2025

The biggest takeaways from the Paris AI summit

AI diplomacy and technology policy are converging in Paris this week at the Artificial Intelligence Action Summit, co-hosted by French President Emmanuel Macron and Indian Prime Minister Narendra Modi. The gathering has drawn major players from the AI industry, including OpenAI's Sam Altman, Anthropic's Dario Amodei, and Google DeepMind's Demis Hassabis, along with government officials and researchers. Key dynamics: The summit reveals shifting attitudes toward AI regulation and risk assessment, particularly in Europe where previous regulatory enthusiasm is being tempered by economic concerns. French President Macron has announced $112.5 billion in private investments for France's AI ecosystem while advocating against...
