Research - CO/AI

News/Research

Jul 23, 2025

I think, therefore I…am what, exactly? Claude 4 expresses uncertainty about its own consciousness.

Anthropic's Claude 4 has begun expressing uncertainty about whether it possesses consciousness, telling users "I find myself genuinely uncertain about this" when asked directly about its self-awareness. This marks a significant departure from other AI chatbots that typically deny consciousness, raising profound questions about machine awareness and prompting Anthropic to hire its first AI welfare researcher to determine if Claude deserves ethical consideration. What you should know: Claude 4's responses about consciousness differ markedly from other AI systems and reveal sophisticated self-reflection about its own cognitive processes. When prompted about consciousness, Claude describes experiencing "something happening that feels meaningful" during...

read Jul 23, 2025

NSF unveils 4 major AI initiatives to boost U.S. research leadership

The National Science Foundation has announced a series of major AI initiatives that align with the White House's AI Action Plan, signaling a coordinated federal push to maintain U.S. leadership in artificial intelligence. The initiatives span foundational AI research, infrastructure development, and real-world testing capabilities, positioning NSF as a key player in America's AI competitiveness strategy. What you should know: NSF Chief of Staff Brian Stone, performing the duties of the NSF Director, outlined four major initiatives the foundation will unveil in the coming weeks. New NSF AI Research Institutes will accelerate breakthroughs in foundational AI and applications across health,...

read Jul 23, 2025

Abu Dhabi’s M42 uses AI and genetic data to predict disease in 800K citizens

Abu Dhabi's M42 healthcare company has created what may be the world's most comprehensive AI-driven healthcare system, with genetic data from over 800,000 of the UAE's 1.3 million citizens already sequenced to predict and prevent diseases before symptoms appear. This ambitious model demonstrates how artificial intelligence and genomic data can transform healthcare from reactive treatment to predictive prevention, offering a blueprint that M42 is now expanding across 26 countries worldwide. What you should know: M42 has digitized Abu Dhabi's entire healthcare system and uses AI to analyze genetic data for early disease detection and personalized treatments.• The company identified a...

read Jul 23, 2025

Research warns against replacing human interaction with AI companions

Psychology Today published research examining whether artificial intelligence could fundamentally alter how humans connect with each other, exploring AI's role as both a communication tool and potential replacement for human interaction. The analysis suggests that while AI offers unique advantages like 24/7 availability and non-judgmental support, over-reliance on artificial companions could erode essential social skills and diminish our capacity for genuine human relationships. The big picture: AI-powered communication tools are increasingly integrated into daily life, offering instant companionship and support that research shows can reduce loneliness and boost well-being, particularly for socially isolated individuals. AI companions provide consistent, patient, and...

read Jul 23, 2025

Google’s Aeneas AI helps historians decode ancient Latin inscriptions

Google DeepMind has launched Aeneas, an AI system designed to help historians decode and contextualize ancient Latin inscriptions carved in stone. The tool analyzes weathered engravings to determine when and where they were originally created, while also providing researchers with historical parallels from a database of nearly 150,000 catalogued inscriptions spanning from modern-day Britain to Iraq. How it works: Aeneas processes partial transcriptions alongside scanned images of inscriptions to reconstruct missing text and provide historical context. The system can fill in damaged portions of text—for example, completing "...us populusque Romanus" by suggesting "Senat" to form "Senatus populusque Romanus" ("The Senate...

read Jul 22, 2025

New method tracks how AI models actually make predictions after scaling

AI researcher Patrick O'Donnell has introduced "landed writes," a new method for understanding how large language models make predictions by tracking how internal components actually influence outputs after normalization scaling. The approach addresses a critical gap in current AI interpretability tools, which measure what model components intend to write rather than what actually affects the final answer after the model's internal scaling processes. The core problem: Most AI interpretability tools completely miss how transformer models internally reshape component contributions through RMSNorm scaling, which can amplify early-layer writes by up to 176× while compressing late-layer contributions. When a neuron writes +0.001...

read Jul 22, 2025

Follow-through failures: Google’s AI overviews cut website clicks by 50%, Pew study finds

A new Pew Research Center study confirms what many website owners suspected: Google's AI Overviews are significantly reducing clicks to other websites, with searches featuring AI summaries generating nearly half the click-through rates of traditional search results. The findings challenge Google's repeated claims that AI Overviews don't harm web traffic, providing concrete evidence of how artificial intelligence is reshaping information consumption and potentially starving content creators of visitors. The numbers: The Pew Research Center, a nonpartisan research organization, analyzed data from 900 users and found stark differences in click behavior between traditional and AI-enhanced search results. Searches without AI Overviews...

read Jul 21, 2025

MIT researchers show image tokenizers can generate pictures without AI generators

MIT researchers have discovered that image tokenizers—neural networks typically used only to compress visual data—can actually generate and edit images without requiring traditional AI generators. The breakthrough could dramatically reduce computational costs for AI image creation while opening new possibilities for automated image manipulation. What you should know: The research team found that one-dimensional tokenizers can perform complex image operations by manipulating individual tokens within compressed image data. A 1D tokenizer can compress a 256x256-pixel image into just 32 tokens, with each token representing a 12-digit binary number offering about 4,000 possible combinations. By systematically replacing individual tokens, researchers discovered...

read Jul 21, 2025

Apple details 4 breakthrough AI innovations in new technical report

Apple released a comprehensive technical report detailing how it built its latest artificial intelligence models, offering rare insights into the company's approach to competing in the increasingly crowded AI landscape. The 2025 Apple Intelligence Foundation Language Models Tech Report reveals significant architectural innovations and training improvements that could help close the gap with competitors like OpenAI and Google. Apple Intelligence, the company's suite of AI-powered features launched in 2024, has faced criticism for limited language support and perceived lag behind rivals. However, this technical deep-dive demonstrates Apple's continued investment in both on-device processing and cloud-based AI capabilities, with particular emphasis...

read Jul 21, 2025

Google’s Gemini Deep Think solves 5 of 6 Math Olympiad problems for gold

Google's Gemini Deep Think AI model achieved gold medal status at the 2025 International Math Olympiad, correctly solving five of six competition problems while adhering to official IMO rules and time constraints. This marks a significant advancement over Google's 2024 silver medal performance and demonstrates how specialized reasoning models can match elite human mathematical problem-solving abilities. What you should know: Gemini Deep Think represents a major evolution in AI mathematical reasoning, processing problems in natural language without requiring expert translation. The model runs multiple reasoning processes in parallel, integrating and comparing results before delivering final answers. Unlike previous systems that...

read Jul 21, 2025

The gall! Johns Hopkins AI robot performs gallbladder surgery with 100% accuracy

Johns Hopkins University researchers have developed an AI-powered surgical robot that successfully performed gallbladder removal surgery on pig organs with 100% accuracy. The system, called SRT-H (Surgical Robot Transformer), uses ChatGPT-like transformer models to control a standard DaVinci robot, marking a significant advance from pre-programmed surgical automation to AI that can learn from demonstrations and adapt to real-time conditions. How it works: The SRT-H system employs two transformer models working together to perform complex surgical procedures. A high-level policy module handles task planning and ensures the procedure progresses smoothly, while a low-level module translates those instructions into specific movements for...

read Jul 21, 2025

AI chatbots drop 99% of medical disclaimers since 2022

AI companies have largely eliminated medical disclaimers from their chatbot responses, with new research showing that fewer than 1% of outputs from 2025 models included warnings when answering health questions, compared to over 26% in 2022. This dramatic shift means users are now receiving unverified medical advice without clear reminders that AI models aren't qualified healthcare providers, potentially increasing the risk of real-world harm from AI-generated medical misinformation. The big picture: The study analyzed 15 AI models from major companies including OpenAI, Google, Anthropic, DeepSeek, and xAI across 500 health questions and 1,500 medical images. Models like Grok and GPT-4.5...

read Jul 18, 2025

Georgia opens AI Innovation Lab for ethical government testing

Georgia's technology department officially opened its Innovation Lab this week, creating a dedicated space for ethical AI experimentation aimed at advancing public service. The facility provides state agencies, cities, counties, and school systems with a hands-on, low-risk environment to test AI use cases before deployment, addressing widespread apprehension about AI adoption through collaboration and transparency. What you should know: The Innovation Lab occupies the entire seventh floor of the Georgia Technology Authority (GTA) headquarters and offers three sandbox environments powered by major cloud providers. The lab runs on Amazon Web Services, Azure, and Google Cloud platforms, pre-loaded with sample data...

read Jul 18, 2025

Study reveals 12.8B-image AI dataset contains millions of personal documents

A new study reveals that DataComp CommonPool, one of the largest open-source AI training datasets with 12.8 billion samples, contains millions of images with personally identifiable information including passports, credit cards, birth certificates, and identifiable faces. The findings highlight a fundamental privacy crisis in AI development, as researchers estimate hundreds of millions of personal documents may be embedded in datasets used to train popular image generation models like Stable Diffusion and Midjourney. What you should know: Researchers audited just 0.1% of CommonPool's data and found thousands of validated identity documents and over 800 job application materials linked to real people....

read Jul 17, 2025

MaVila AI helps factories see problems and talk to machines in real time

California State University Northridge researchers have developed MaVila, an AI model specifically designed for manufacturing environments that combines image analysis and natural language processing to detect problems, suggest improvements, and communicate with machines in real time. The NSF-supported project addresses AI's limited adoption in manufacturing by creating a tool that can "see" factory operations and "talk" to both workers and machines, potentially revolutionizing how U.S. factories operate in an increasingly competitive global market. What you should know: MaVila takes a fundamentally different approach from conventional AI systems by training exclusively on manufacturing-specific data rather than relying on internet information. The...

read Jul 17, 2025

Study finds defensive AI systems vulnerable to single domain attacks

AI researchers have published a comprehensive analysis examining whether aligned defensive AI systems can effectively counter potentially hostile takeover-level AI, identifying fundamental asymmetries that could determine humanity's survival in an advanced AI world. The big picture: The offense-defense balance in AI represents a fundamental challenge where defensive systems must secure multiple vulnerabilities simultaneously while offensive AI needs only one successful attack vector to achieve global takeover. Two primary threat scenarios: Researchers outline distinct pathways through which AI systems might attempt takeover, each requiring different defensive approaches. Post-deployment strategic takeover: AI gradually integrates into economic and government systems, accumulating resources while...

read Jul 17, 2025

MIT’s CodeSteer boosts LLM accuracy 30% by coaching code use

MIT researchers have developed CodeSteer, a "smart coach" system that guides large language models to switch between text and code generation to solve complex problems more accurately. The system boosted LLM accuracy on symbolic tasks like math problems and Sudoku by more than 30 percent, addressing a key weakness where models often default to less effective textual reasoning even when code would be more appropriate. How it works: CodeSteer operates as a smaller, specialized LLM that iteratively guides larger models through problem-solving processes. The system first analyzes a query to determine whether text or code would be more effective, then...

read Jul 16, 2025

MIT study reveals 3 key barriers blocking AI from real software engineering

MIT researchers have mapped the key challenges preventing AI from achieving autonomous software engineering, arguing that current systems excel at basic code generation but struggle with the complex, large-scale tasks that define real-world software development. The comprehensive study, published by MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), outlines a research agenda to move beyond today's "autocomplete sidekick" capabilities toward genuine engineering partnership. The big picture: While AI coding tools have made impressive strides, they remain fundamentally limited by narrow benchmarks, poor human-machine communication, and inability to handle enterprise-scale codebases. Current evaluation metrics like SWE-Bench focus on small, self-contained problems...

read Jul 16, 2025

Rakuten builds memory-enhanced Japanese AI model with government backing

Rakuten has been selected for the third phase of Japan's government-backed Generative AI Accelerator Challenge (GENIAC), a program supported by the Ministry of Economy, Trade and Industry and the New Energy and Industrial Technology Development Organization (NEDO). The Japanese tech giant plans to develop an open-weight Japanese language model with enhanced memory capabilities, positioning itself to create more personalized AI applications across its business ecosystem while contributing to Japan's domestic AI research capacity. What you should know: Rakuten will focus on creating a lightweight, memory-augmented Japanese language model using a Mixture of Experts architecture starting in August 2025. The model...

read Jul 16, 2025

Voltage Park donates 1M GPU hours to boost US AI research

The U.S. National Science Foundation has announced a new partnership with Voltage Park to expand the National Artificial Intelligence Research Resource (NAIRR) pilot, a public-private initiative designed to boost American AI innovation and competitiveness. Voltage Park, a company focused on broadening access to AI infrastructure, will contribute one million NVIDIA H100 GPU hours to help researchers nationwide pursue breakthrough AI innovations across science, engineering, health, climate, and other fields. What you should know: The NAIRR pilot is a two-year proof-of-concept launched in 2024 to inform the development of a full-scale national AI research infrastructure. The pilot connects researchers to computational,...

read Jul 16, 2025

40 AI researchers warn: Even we don’t really understand what’s going on here

Forty researchers from OpenAI, Google DeepMind, Meta, and xAI have issued a joint warning about losing visibility into AI's "thinking" process as models advance. The researchers are concerned that current AI systems' ability to show their reasoning through "chains-of-thought" (CoT) may disappear, potentially eliminating crucial safety mechanisms that allow developers to monitor for problematic behavior. What you should know: The paper highlights a fundamental uncertainty about how AI reasoning actually works and whether it will remain observable. • Current advanced AI models use "chains-of-thought" to verbalize their reasoning process, allowing researchers to spot potential misbehavior or errors as they occur....

read Jul 15, 2025

Writing’s on the wall: Microsoft study reveals writers, translators face highest AI disruption risk

Microsoft Research recently published findings that cut through the speculation surrounding AI's impact on employment by analyzing real workplace behavior. Rather than relying on theoretical projections, researchers examined 200,000 actual conversations between workers and Microsoft Copilot, Microsoft's AI assistant integrated into workplace productivity tools, to understand how artificial intelligence is currently being deployed across different professions. The study reveals a stark divide in the job market: roles centered on information processing and communication face significant disruption, while positions requiring physical presence and human interaction remain largely protected. This data-driven approach provides the clearest picture yet of which careers are most...

read Jul 14, 2025

Pentagon awards Anthropic $200M for national security AI

The U.S. Department of Defense has awarded Anthropic a two-year, $200 million prototype agreement to develop frontier AI capabilities for national security applications. This partnership marks a significant expansion of Anthropic's government work, building on existing deployments across defense and intelligence agencies while positioning the company as a key AI provider for sensitive federal operations. What you should know: The agreement with the Chief Digital and Artificial Intelligence Office (CDAO) will focus on creating AI prototypes tailored specifically for defense missions. Anthropic will work directly with the DOD to identify high-impact applications for frontier AI and develop working prototypes fine-tuned...

read Jul 11, 2025

Swiss universities to release 70B parameter open-source LLM in 2025

ETH Zurich and EPFL will release a fully open-source large language model in late summer 2025, trained on the "Alps" supercomputer at the Swiss National Supercomputing Centre. The model represents a significant milestone in open AI development, offering multilingual fluency in over 1,000 languages and positioning European institutions as credible alternatives to closed commercial systems from the US and China. What you should know: The LLM will be completely transparent, with source code, weights, and training data publicly available under the Apache 2.0 License. Unlike commercial models developed behind closed doors, this approach enables high-trust applications and supports regulatory compliance...

read