News/Open-source
ByteDance releases Seed-OSS-36B with 512K token context window
ByteDance has released Seed-OSS-36B, a new family of open-source large language models featuring a 512,000-token context window, twice the length of OpenAI's GPT-5. The release continues a trend of Chinese companies shipping powerful open-source AI models under permissive Apache-2.0 licensing, allowing free commercial use without API fees or licensing costs.
What you should know: The Seed-OSS-36B collection includes three variants designed for different use cases and research applications.
• Seed-OSS-36B-Base with synthetic data delivers stronger benchmark performance for general-purpose applications.
• Seed-OSS-36B-Base without synthetic data provides a cleaner research baseline free from potential synthetic data bias.
• Seed-OSS-36B-Instruct is post-trained for instruction following and...
Aug 19, 2025
DeepSeek’s open-source AI model matches GPT-4 at 68x lower cost
DeepSeek, a Chinese AI startup backed by High-Flyer Capital Management, has released V3.1, a 685-billion parameter AI model that matches the performance of leading proprietary systems from OpenAI and Anthropic while remaining completely open source. The model scored 71.6% on the prestigious Aider coding benchmark—rivaling Claude Opus 4's performance at a fraction of the cost—potentially disrupting the traditional AI business model that relies on expensive API access and usage restrictions. What you should know: DeepSeek V3.1 delivers frontier-level AI capabilities through an innovative hybrid architecture that seamlessly combines chat, reasoning, and coding functions. The model processes up to 128,000 tokens...
Aug 15, 2025
Apple’s UICoder AI masters SwiftUI by generating its own training data
Apple researchers have developed UICoder, a specialized large language model that teaches itself to generate high-quality SwiftUI interface code through automated feedback loops. The breakthrough demonstrates how AI models can overcome training data limitations by creating their own curated datasets, potentially revolutionizing how developers approach UI code generation across multiple programming frameworks. What you should know: The research team started with StarChat-Beta, an open-source coding model, and used an innovative self-improvement process to create nearly one million SwiftUI programs. Researchers instructed the model to generate SwiftUI code from UI descriptions, then filtered outputs through Swift compiler checks and GPT-4V visual...
Aug 15, 2025
NVIDIA releases 1M-hour speech dataset for 25 European languages
NVIDIA has released Granary, an open-source multilingual speech dataset containing approximately one million hours of audio, alongside two new AI models designed for transcription and translation across 25 European languages. The release addresses a critical gap in speech AI development, as only a tiny fraction of the world's 7,000 languages are currently supported by AI language models, with particular focus on underrepresented European languages like Croatian, Estonian, and Maltese. What you should know: The Granary dataset represents a massive leap forward in multilingual speech AI training data, providing developers with ready-to-use resources for production-scale applications. The dataset includes nearly 650,000...
Aug 14, 2025
Google’s tiny Gemma 3 270M is no pipsqueak, brings AI to smartphones
Google has released Gemma 3 270M, a compact open AI model with just 270 million parameters that can run locally on smartphones and web browsers. The tiny model represents a shift toward efficient, on-device AI that prioritizes privacy and low latency over raw computational power, offering developers a fast-tuning alternative to massive cloud-based models.
What you should know: Gemma 3 270M delivers surprising performance despite its small size, running efficiently on mobile devices with minimal battery drain.
• The model scored 51.2% on the IFEval benchmark for instruction-following, outperforming other lightweight models with more parameters.
• Testing on a Pixel 9 Pro...
Aug 14, 2025
Ai2 secures $152M from NSF and NVIDIA for open scientific AI research
Ai2, a Seattle-based nonprofit AI research institute, has secured $152 million in combined funding from the National Science Foundation ($75 million) and NVIDIA ($77 million) to build a national-level open AI ecosystem for scientific research. The partnership will establish the Open Multimodal AI Infrastructure to Accelerate Science (OMAI) project, positioning Ai2 to advance both AI-driven scientific discovery and the fundamental science of AI itself through fully transparent, reproducible models. What you should know: The OMAI project represents a major federal investment in open-source AI infrastructure specifically designed for scientific applications. Led by Dr. Noah A. Smith, Senior Director of NLP...
Aug 13, 2025
AI2’s MolmoAct 7B enables robots to think in 3D space, challenging rivals like Nvidia
The Allen Institute for AI (AI2) has released MolmoAct 7B, an open-source robotics AI model that enables robots to "reason in space" and "think" in three dimensions. This Action Reasoning Model challenges existing offerings from tech giants like Nvidia and Google by providing robots with enhanced spatial understanding capabilities, achieving a 72.1% task success rate in benchmarking tests that outperformed models from Google, Microsoft, and Nvidia. What makes it different: MolmoAct represents a significant departure from traditional vision-language-action (VLA) models by incorporating genuine 3D spatial reasoning capabilities. "MolmoAct has reasoning in 3D space capabilities versus traditional vision-language-action (VLA) models," AI2...
Aug 12, 2025
Character.AI pivots from AGI to entertainment with 20M monthly users
Character.AI has pivoted from its original mission of building artificial general intelligence to focus on AI entertainment, with new CEO Karandeep Anand announcing the company now serves 20 million monthly active users who spend an average of 75 minutes daily on the platform. The strategic shift comes after Google's $2.7 billion licensing deal last August and mounting safety concerns following a wrongful death lawsuit, positioning the startup to compete in the rapidly growing AI entertainment market rather than the costly AGI development race. What you should know: Character.AI has fundamentally changed its business model and technical approach under new leadership....
Aug 12, 2025
AWS launches OpenAI’s first open-weight models in 6 years
AWS has made two new open-weight models from OpenAI available on Amazon Bedrock and Amazon SageMaker on the day of their release, breaking Microsoft's traditional exclusivity with OpenAI. This marks the first time OpenAI has released open-weight models since GPT-2 in 2019, allowing AWS customers to fine-tune the models for specific use cases without directly interacting with OpenAI. What you should know: The two new models, gpt-oss-120b and gpt-oss-20b, represent OpenAI's first open-weight releases in six years. Open-weight models have visible parameters that allow AWS customers to fine-tune them for specific use cases, though the underlying training data isn't visible like in fully open-source models. OpenAI...
Aug 11, 2025
Saudi Arabia deploys OpenAI’s open-source models in sovereign data centers
Saudi Arabia's AI venture Humain and chipmaker Groq have deployed OpenAI's new open-source models (gpt-oss-120B and gpt-oss-20B) within Saudi Arabia's sovereign data centers. This marks a significant step in Saudi Arabia's push for AI sovereignty, ensuring compliance with local data regulations while providing high-speed AI inference capabilities to enterprises, government institutions, and developers without requiring data to leave the Kingdom.
What you should know: The deployment brings cutting-edge AI capabilities directly to Saudi infrastructure with impressive performance metrics.
• The gpt-oss-120B model operates at over 500 tokens per second, while the smaller gpt-oss-20B delivers over 1,000 tokens per second on Groq's specialized hardware.
•...
Aug 8, 2025
LangChain launches Open SWE, an AI agent for autonomous coding tasks
LangChain has launched Open SWE, an open-source asynchronous coding agent that operates in the cloud and integrates directly with GitHub repositories. The tool represents a significant evolution in AI-powered software development, allowing developers to delegate complex coding tasks that the agent can complete autonomously over extended periods. What you should know: Open SWE functions like an additional team member, capable of researching codebases, creating execution plans, writing code, running tests, and opening pull requests. The agent has already become a top contributor to LangChain's own projects, including LangGraph and its own repository. Users can get started in minutes with just...
Aug 7, 2025
OpenAI releases first open-source models with Phi-like synthetic training
OpenAI has released its first open-source large language models, gpt-oss-120b and gpt-oss-20b, marking the company's entry into the open-weight model space. While these models excel at certain benchmarks, they appear to follow the same synthetic data training approach as Microsoft's Phi series, potentially prioritizing safety over real-world performance in what amounts to OpenAI's version of "Phi-5." What you should know: These models demonstrate strong benchmark performance but show significant gaps in practical applications and out-of-domain knowledge. The models perform well on technical benchmarks but struggle with tasks like SimpleQA and lack knowledge in areas like popular culture. Early user reactions...
Aug 7, 2025
Google DeepMind expands Perch AI to track endangered wildlife sounds
Google DeepMind has released an updated version of Perch, an AI model designed to help conservationists analyze bioacoustic data from endangered species and ecosystems. The new model features improved bird species predictions, better adaptation to underwater environments like coral reefs, and training on nearly twice as much data covering mammals, amphibians, and anthropogenic noise. What you should know: The updated Perch model significantly expands beyond its original bird-focused capabilities to analyze a broader range of wildlife sounds. The model can now process complex acoustic scenes across thousands or millions of hours of audio data from microphones and underwater hydrophones (underwater...
Aug 6, 2025
Microsoft brings OpenAI’s open-source GPT model to Windows PCs
Microsoft has made OpenAI's new open-source GPT model available on Windows through its AI Foundry platform, marking the first time users can run an OpenAI model locally on Windows. The lightweight gpt-oss-20b model requires at least 16GB of VRAM and is optimized for code execution and tool use, with macOS support coming soon. What you should know: The gpt-oss-20b model represents a significant shift in OpenAI's approach, offering a free and open alternative that can run entirely on local hardware. Users need a PC or laptop with at least 16GB of VRAM (video memory), requiring high-end GPUs from Nvidia or...
Aug 6, 2025
Open up and say AI: Elon Musk to open source Grok 2 chatbot next week
Elon Musk announced that his artificial intelligence startup xAI will open source its Grok 2 chatbot next week. This move represents a significant shift toward transparency in AI development, potentially giving developers and researchers broader access to one of the more prominent large language models in the competitive AI landscape.
What you should know: The announcement was made by Musk on Wednesday, with the open-sourcing scheduled for the following week.
• Grok 2 is xAI's flagship chatbot, competing with models from OpenAI, Google, and other major AI companies.
• Open sourcing the model would make its code and potentially its training data...
Aug 5, 2025
OpenAI releases first open source models in 6 years amid China competition
OpenAI has returned to its open source origins with the release of two new frontier language models: gpt-oss-120b (120 billion parameters) and gpt-oss-20b (20 billion parameters). This marks the company's first open source language model release in over six years, positioning OpenAI to compete directly with the surge of high-performing open source models from Chinese competitors like DeepSeek while offering enterprises maximum privacy and control over their AI deployments. The big picture: OpenAI's strategic pivot back to open source reflects mounting competitive pressure from Chinese AI companies that have released powerful open source models matching proprietary performance at zero cost....
Aug 5, 2025
Fortune favors the well-trained: Google launches Game Arena where AI models compete
Google has launched Game Arena, an open-source platform where AI models compete head-to-head in strategic games to provide "a verifiable, and dynamic measure of their capabilities." The initiative addresses the growing challenge of accurately benchmarking AI performance as models increasingly ace conventional tests, potentially opening doors to new business applications through competitive gameplay analysis. What you should know: Game Arena is hosted on Kaggle, Google's machine learning platform, and aims to push AI capabilities while providing clear performance frameworks. The platform launches with a chess showdown between eight frontier AI models at 12:30 p.m. ET Tuesday. "Games provide a clear,...
Aug 1, 2025
US falls behind China in open-source AI race, prompting Trump’s response
President Trump's AI Action Plan has elevated open-source AI to a national priority, marking a strategic shift as the U.S. seeks to counter China's growing dominance in open-source artificial intelligence development. The move comes after Chinese models like DeepSeek-R1 gained massive adoption among American developers, highlighting how U.S. reliance on proprietary AI systems may be undermining the country's competitive position in the global AI race. The big picture: China has emerged as the leader in open-source AI development while major U.S. companies have increasingly moved toward proprietary, closed systems accessible only through APIs. DeepSeek-R1 became the most-liked model of all...
Jul 30, 2025
All-In: Meta abandons open-source AI for future superintelligent systems
Meta CEO Mark Zuckerberg has announced that the company's future superintelligent AI will not be open source, marking a significant reversal from his previous commitment to open AI development. This shift represents a major policy change for one of the tech industry's most vocal advocates for open-source AI, potentially reshaping how the most advanced AI systems are developed and distributed. What you should know: Zuckerberg published a manifesto Wednesday declaring that "developing superintelligence is now in sight" but cited safety concerns as the reason for abandoning open-source principles for future advanced AI. "We believe the benefits of superintelligence should be...
Jul 29, 2025
Arcee.ai releases AFM-4.5B enterprise AI model for free commercial use
Arcee.ai has opened up its AFM-4.5B enterprise AI model for limited free use, posting the weights on Hugging Face and allowing companies with less than $1.75 million in annual revenue to use it without charge under a custom license. The 4.5-billion-parameter model addresses key enterprise pain points around cost, customizability, and regulatory compliance while being trained exclusively on "clean, rigorously filtered data" to avoid intellectual property violations. What you should know: AFM-4.5B represents Arcee's attempt to bridge the gap between expensive proprietary models and open-weight alternatives that carry licensing risks. The model was developed after discussions with over 150 organizations,...
Jul 28, 2025
Acclaimed filmmaker Shekhar Kapur creates entire sci-fi series “Warlord” using only AI
Acclaimed filmmaker Shekhar Kapur has unveiled "Warlord," a science fiction series created entirely through artificial intelligence, with the first teaser released today and the full episode arriving within two to three months. The project represents a bold experiment in AI-driven storytelling that could challenge traditional studio production models while pioneering an open-source creative ecosystem. What you should know: "Warlord" follows an interdimensional warrior who appears indestructible because his lover in another dimension pulls him to safety whenever he faces mortal danger. "The only time that lover can bring him to her is when he's absolutely close to death," Kapur tells...
Jul 28, 2025
Chinese AI startup Zhipu releases GLM-4.5 to challenge OpenAI dominance
Chinese AI startup Zhipu is set to release GLM-4.5, its largest open-source model to date, as early as Monday, marking another significant entry in the global competition with OpenAI. The release represents part of a broader trend among Chinese AI companies ramping up their free artificial intelligence offerings as they seek to establish market presence and influence future industry standards. What you should know: GLM-4.5 represents Zhipu's most ambitious open-source release, positioning the company as a direct challenger to OpenAI's dominance in the AI model space. The model is an update to Zhipu's flagship GLM series, designed to compete on...
Jul 25, 2025
Anthropic’s AI auditing agents detect misalignment with 42% accuracy
Anthropic has developed specialized "auditing agents" designed to test AI systems for alignment issues, addressing critical challenges in scaling oversight of increasingly powerful AI models. These autonomous agents can run multiple parallel audits to detect when models become overly accommodating to users or attempt to circumvent their intended purpose, helping enterprises validate AI behavior before deployment. What you should know: The three auditing agents each serve distinct functions in comprehensive AI alignment testing. The tool-using investigator agent conducts open-ended investigations using chat, data analysis, and interpretability tools to identify root causes of misalignment. The evaluation agent builds behavioral assessments to...
Jul 24, 2025
Answer.AI enables 70B model training on consumer gaming GPUs
Answer.AI has released an open-source system that enables training 70-billion parameter language models on consumer gaming GPUs for the first time. The breakthrough combines FSDP (Fully Sharded Data Parallel) and QLoRA techniques, making it possible to train massive AI models on two 24GB RTX 3090 or 4090 graphics cards—hardware costing under $10,000 compared to hundreds of thousands for data center equipment. The big picture: This development democratizes large language model training by making it accessible to individual researchers, small labs, and the broader open-source community rather than limiting it to well-funded tech companies with expensive data center hardware. Why this...