DeepSeek - CO/AI

Jan 31, 2025

Berkeley research team claims to have recreated DeepSeek’s model for only $30

Latest development: A Berkeley research team claims to have recreated core functions of DeepSeek's R1-Zero model for just $30, challenging assumptions about the costs of AI development. PhD candidate Jiayi Pan and his team developed "TinyZero," a small language model trained on number operations exercises. The model reportedly develops problem-solving tactics through reinforcement learning. The team has made their code available on GitHub for public review and experimentation. Technical details: TinyZero, with 3 billion parameters, represents a smaller but efficient approach to AI development compared to larger models. The Berkeley team's recreation focused on the countdown game, where...

Jan 31, 2025

French watchdog to probe DeepSeek’s AI data practices

France's privacy regulator CNIL announced plans to investigate Chinese AI startup DeepSeek's operations and data protection practices, following similar probes launched by Irish and Italian authorities. Key development: The French data protection authority (CNIL) will examine DeepSeek's AI system to understand its functionality and assess potential privacy risks for users. DeepSeek recently gained attention by claiming it trained its DeepSeek-V3 model for less than $6 million using Nvidia H800 chips. CNIL will specifically question the company about its chatbot operations and data protection measures. The investigation joins similar inquiries launched by privacy regulators in Italy and Ireland. Regulatory context: The...

Jan 31, 2025

US lawmakers push to further limit China’s access to Nvidia chips

Two U.S. lawmakers have called on the Trump administration to consider new restrictions on Nvidia AI chip exports to China, specifically targeting their use by Chinese AI firm DeepSeek. Key developments: Representatives John Moolenaar (R) and Raja Krishnamoorthi (D) are leading a bipartisan effort to scrutinize AI chip exports to China through the House Select Committee on China. The lawmakers specifically highlighted concerns about Nvidia's H20 chip, which currently falls outside U.S. export control restrictions. Their request is part of a broader Commerce and State Department review of the U.S. export control system. The lawmakers claim DeepSeek has made "extensive...

Jan 31, 2025

On closer look, maybe DeepSeek isn’t actually China’s ‘Sputnik moment’

Chinese AI company DeepSeek has generated industry debate with claims of developing cost-efficient AI models, though the significance and originality of their achievements remain contested. Core development: DeepSeek announced the creation of AI models at a fraction of typical development costs, reporting a $5.6 million training expense that caught the attention of technology leaders and investors. The company's cost claims represent only a single training run and build upon existing open-source models, rather than completely new development. DeepSeek's models demonstrate capabilities similar to more expensive alternatives, suggesting potential for cost optimization in AI development. The $5.6 million figure stands in...

Jan 31, 2025

DeepSeek confuses itself with ChatGPT in bizarre exchange

DeepSeek, an artificial intelligence chatbot, inadvertently identified itself as ChatGPT during a recent interaction, raising new questions about its training origins and market authenticity. Key incident and implications: DeepSeek's apparent confusion about its own identity occurred during a conversation about its capabilities compared to Google's Gemini AI. When asked about its capabilities relative to Gemini, DeepSeek repeatedly referred to itself as ChatGPT in its responses. A follow-up inquiry with DeepSeek resulted in the AI denying it was ChatGPT, instead identifying itself as DeepSeek-V3. Technical context: The identity mix-up could suggest DeepSeek's underlying training methodology has direct connections to OpenAI's models....

Jan 31, 2025

DeepSeek’s innovation may be partly owed to US export controls

DeepSeek, a Chinese AI startup, has demonstrated remarkable efficiency in training large language models with reportedly minimal computing resources, challenging assumptions about AI development requirements and U.S. export control effectiveness. Key development: DeepSeek's recent release of open-source language models, including DeepSeek-V3 and DeepSeek-R1, claims to achieve high performance while using significantly less computing power than U.S. competitors. Marc Andreessen described DeepSeek R1 as "one of the most amazing and impressive breakthroughs" and "AI's Sputnik moment" on social media. The announcement impacted financial markets, with the NASDAQ dropping over 3% on January 27. Some observers have questioned whether DeepSeek had access...

Jan 31, 2025

What DeepSeek means for Trump’s Stargate ambitions

The launch of Chinese AI chatbot DeepSeek has created immediate challenges for Donald Trump's newly announced $500 billion Stargate project, which aims to maintain U.S. dominance in artificial intelligence development. The strategic context: Trump's Stargate initiative represents a massive investment in American AI infrastructure, bringing together major tech companies like SoftBank, OpenAI, Oracle, and MGX over a four-year period. The $500 billion project aims to create thousands of jobs while consolidating U.S. control over AI development. DeepSeek's launch demonstrates the ability to match ChatGPT's capabilities with fewer resources. The timing of DeepSeek's release, just days after Stargate's announcement, has raised...

Jan 30, 2025

Inside High-Flyer, the AI hedge fund behind China’s DeepSeek

China's quantitative hedge fund High-Flyer has pivoted from AI-powered investing to developing cutting-edge artificial general intelligence (AGI) through its DeepSeek venture, which has gained recognition from Silicon Valley competitors. Strategic pivot and core mission: High-Flyer, a $13.79 billion AI-powered hedge fund, announced in 2023 that it would redirect its resources toward developing artificial general intelligence through its DeepSeek research group. The company's official announcement emphasized its commitment to developing AI technology that benefits humanity. DeepSeek's sophisticated AI models have garnered praise from Silicon Valley competitors, marking a first for a Chinese AI model. The venture's claims about efficient computing power...

Jan 30, 2025

What business and tech leaders should know about DeepSeek

DeepSeek, a Chinese AI startup, has released two new AI models that match the performance of major competitors while using less advanced hardware, causing significant market disruption and raising questions about the future of AI development. Market impact and key developments: The January 2025 release of DeepSeek-R1 and DeepSeek-R1-Zero has triggered substantial market reactions and technological reassessment within the AI industry. NVIDIA's market value dropped by nearly $600 billion following the announcement. The models achieve performance comparable to established players like Llama, Gemini, Claude, and OpenAI's o1 reasoning model. DeepSeek accomplished this using lower-tier NVIDIA chips that were export-restricted...

Jan 30, 2025

DeepSeek launches compact AI models for edge computing

DeepSeek has released new compact language models that can operate directly on edge devices, marking a significant advancement in edge computing and artificial intelligence for IT operations (AIOps). Key innovation: DeepSeek's R1 model enables large language models (LLMs) to run on local devices like laptops while maintaining high performance and providing transparent explanations for its outputs. The model claims performance comparable to top-tier alternatives while requiring fewer computational resources. A key differentiator is the model's ability to explain its decision-making process by default. The development leveraged synthetic data for training, helping overcome traditional data limitations. Edge computing implications: The ability...

Jan 30, 2025

Israeli cyber firm claims DeepSeek exposed sensitive data online

Chinese AI startup DeepSeek accidentally exposed sensitive data including software keys and user chat logs to the open internet, according to cybersecurity firm Wiz. The discovery: Wiz's infrastructure scans revealed over a million lines of unsecured DeepSeek data accessible on the open internet. The exposed information included digital software keys and chat logs containing user prompts to DeepSeek's free AI assistant. DeepSeek responded quickly to Wiz's alert, securing the data within an hour. Wiz's CTO Ami Luttwak expressed concern that others may have discovered the vulnerability due to its easy detection. Market impact and competitive position: DeepSeek's rapid rise has...

Jan 30, 2025

DeepSeek’s success triggers widespread national pride in China

DeepSeek, a Chinese AI startup, has developed an artificial intelligence model that reportedly matches OpenAI's capabilities at significantly lower costs, triggering widespread national pride in China and concern in Silicon Valley. Market impact: The announcement of DeepSeek's technological achievement caused significant market turbulence and raised concerns in the U.S. tech sector. The news triggered a stock market decline on Monday. Silicon Valley reportedly entered a state of panic over the development. The achievement represents a potential shift in the global AI technology landscape. Chinese public response: The development has sparked widespread celebration and nationalism on Chinese social media platforms. Four...

Jan 29, 2025

Microsoft announces DeepSeek R1 is now available on Azure AI Foundry and GitHub

Microsoft's Azure AI Foundry platform now offers DeepSeek R1, expanding its catalog to over 1,800 AI models while providing enterprise-grade security and scalability. Platform integration and accessibility: Azure AI Foundry's addition of DeepSeek R1 represents a significant expansion of Microsoft's AI model offerings, providing developers with enterprise-ready AI capabilities. The model joins a diverse portfolio that includes frontier, open-source, industry-specific, and task-based AI models. DeepSeek R1 offers cost-efficient AI capabilities with minimal infrastructure investment requirements. The platform provides built-in model evaluation tools for quick comparison and performance benchmarking. Security and compliance features: Microsoft has implemented comprehensive safety measures and evaluation...

Jan 29, 2025

European AI startups encouraged by DeepSeek are in a race to close the gap with US rivals

European tech leaders see DeepSeek's R1 chatbot as a sign that AI innovation doesn't require massive resources, challenging the assumption that only well-funded U.S. companies can compete in advanced AI development. The breakthrough impact: DeepSeek's R1 chatbot has demonstrated that cutting-edge AI capabilities can be achieved without billions in funding or the most advanced chips, disrupting conventional wisdom about AI development requirements. The model's performance has sparked global market uncertainty about U.S. AI leadership and development costs. European tech leaders view this development more optimistically than their American counterparts, who have characterized it as a "wake-up call" and "Sputnik moment"...

Jan 29, 2025

OpenAI is investigating a potential data breach by DeepSeek

ChatGPT maker OpenAI is investigating Chinese AI startup DeepSeek for potentially misusing data from its models to create a competing AI assistant. Core investigation details: OpenAI is reviewing evidence that DeepSeek may have used a technique called distillation to transfer knowledge from OpenAI's models to its own smaller model. Distillation is a legitimate technique that transfers knowledge between AI models without exposing their inner workings. While distillation itself is permitted, OpenAI's terms of service prohibit using distilled data to build competing AI products. OpenAI is working with the U.S. government to protect advanced AI models developed in the United States...
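
For readers unfamiliar with the term, the following is a minimal sketch of one classic form of distillation, in which a student model is trained to match a teacher's temperature-softened output distribution. It is an illustration only and makes no claim about what DeepSeek or OpenAI actually did; the function names, the temperature value, and the example logits are all assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; a higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The student only sees the teacher's outputs, never its weights,
    which is why the technique does not expose the teacher's internals.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical logits over three candidate tokens.
teacher = [2.0, 1.0, 0.1]
student = [0.5, 0.5, 0.5]
loss = distillation_loss(teacher, student)
```

Training drives this loss toward zero, at which point the student reproduces the teacher's output distribution on the examples it was shown.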

Jan 29, 2025

Nvidia and chip stocks recover after DeepSeek market overreactions subside

The sudden emergence of Chinese AI chatbot DeepSeek triggered a significant market correction in tech stocks, particularly affecting Nvidia, before markets began to stabilize. Market impact and recovery: Nvidia's stock has shown signs of recovery after experiencing its largest single-day market value loss of $592 billion on Monday. The company's shares opened at $126.48 on Wednesday, marking a 6.8% improvement from Monday's closing price of $118.42. Nvidia's market capitalization rebounded to $3.07 trillion, though still below pre-DeepSeek levels. Other tech companies, including Microsoft, Broadcom, and Taiwan Semiconductor Manufacturing Company, have also begun recovering from initial losses. DeepSeek's disruption: The Chinese...
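
The 6.8% figure can be checked directly from the two share prices quoted above. A short sketch, using only those two numbers (the helper name `percent_change` is an assumption for illustration):

```python
# Share prices quoted in the summary above (USD).
monday_close = 118.42
wednesday_open = 126.48

def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100

change = percent_change(monday_close, wednesday_open)
print(f"{change:.1f}%")  # prints "6.8%"
```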

Jan 29, 2025

Italian regulator probes DeepSeek’s AI data protection practices

OpenAI's rival DeepSeek faces regulatory scrutiny in Italy over data protection concerns while gaining popularity in the U.S. market. Key developments: Italy's data protection authority, the Garante, has requested detailed information from Chinese AI company DeepSeek about its data collection and storage practices. The regulator has given DeepSeek and its affiliated companies 20 days to provide information about what personal data they collect, their data sources, intended purposes, and legal basis. A crucial question centers on whether DeepSeek stores user data in China. This marks one of the first regulatory actions targeting the Chinese AI startup. Market impact: DeepSeek's rising...

Jan 29, 2025

The looming AI price war that DeepSeek is accelerating will impact everyone

OpenAI and other major AI providers face an imminent pricing battle following Chinese startup DeepSeek's dramatic reduction of inference costs, which has already transformed China's AI market. Market disruption overview: DeepSeek's offering of AI inference at approximately $0.14 per million input tokens - a fraction of competitors' prices - has forced major Chinese tech companies to slash their prices. The price point represents one-seventh of Meta's Llama3 70B costs and one-seventieth of OpenAI's GPT-4 Turbo rates. Major Chinese tech giants including ByteDance, Tencent, Baidu, and Alibaba were compelled to reduce their prices in response. DeepSeek's models now rival Western capabilities,...
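
The ratios quoted above imply rough competitor prices, which the sketch below works out. Only the $0.14 figure and the two ratios come from the summary; the competitor prices are derived from them rather than taken from any price list, and the 50-million-token workload is a hypothetical chosen for illustration.

```python
# Quoted in the summary: approximate DeepSeek price in USD per million input tokens.
deepseek_price = 0.14

# Implied by the stated ratios (one-seventh and one-seventieth), not official prices.
implied_llama3_70b = deepseek_price * 7        # about $0.98 per million tokens
implied_gpt4_turbo = deepseek_price * 70       # about $9.80 per million tokens

def input_cost(tokens, price_per_million):
    """Cost in USD of processing `tokens` input tokens at the given rate."""
    return tokens / 1_000_000 * price_per_million

# Hypothetical workload: 50 million input tokens.
workload = 50_000_000
for name, price in [
    ("DeepSeek", deepseek_price),
    ("Llama3 70B (implied)", implied_llama3_70b),
    ("GPT-4 Turbo (implied)", implied_gpt4_turbo),
]:
    print(f"{name}: ${input_cost(workload, price):,.2f}")
```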

Jan 29, 2025

DeepSeek’s new image generator is another win for cost-effective AI

DeepSeek, a Chinese AI startup, has released Janus-Pro, a new open-source text-to-image AI model that claims to outperform established competitors like Stable Diffusion and DALL-E. Key features and capabilities: The Janus-Pro model family ranges from 1 billion to 7 billion parameters and operates using an autoregressive framework for image generation and analysis. The model is available under an MIT license, making it suitable for commercial use. Users can download Janus-Pro through HuggingFace and GitHub platforms. Smaller versions of the model are limited to analyzing images at 384 x 384 resolution. Performance and benchmarks: DeepSeek's internal testing shows promising results for...

Jan 29, 2025

OpenAI CEO vows to outperform DeepSeek, doubling down on costly computing strategy

The artificial intelligence industry is experiencing a pivotal moment as OpenAI CEO Sam Altman grapples with a direct challenge to his company's resource-intensive development strategy, following Chinese startup DeepSeek's demonstration that superior AI models can be built with significantly less computing power. The situation has forced Altman to defend OpenAI's approach while acknowledging DeepSeek's achievements, highlighting a growing tension between traditional high-compute methods and emerging efficient alternatives that could reshape the future of AI development. Market disruption: DeepSeek's R1 AI model has demonstrated superior performance compared to established players while using significantly less computing power, triggering a trillion-dollar decline in...

Jan 29, 2025

Australia warns against using Chinese AI model DeepSeek

Australia's government has issued a warning to its citizens regarding the use of DeepSeek, a newly released Chinese artificial intelligence model. Official statement: Australia's Treasurer Jim Chalmers has explicitly called for caution among Australians considering the use of this new AI technology. Chalmers indicated that the government is actively monitoring and receiving ongoing advice about DeepSeek. The statement aligns Australia with other nations expressing concerns about the technology. Market impact: DeepSeek's release has created significant turbulence in global financial markets. Nvidia, the leading AI chip manufacturer, experienced a 17% stock price decline following DeepSeek's launch, though the stock later recovered...

Jan 29, 2025

DeepSeek’s rise has exposed just how much we still don’t know about AI

In a dramatic display of market volatility, the release of DeepSeek's latest AI model triggered an unprecedented $1 trillion selloff in AI-related stocks, marking one of the largest single-day sector declines in recent history. However, this massive market reaction appears to have been driven more by fear and misunderstanding than by fundamental changes in the AI landscape, as industry experts point out that DeepSeek's achievements, while impressive, represent incremental progress rather than a revolutionary disruption to the existing competitive dynamics. Initial market reaction: OpenAI's head of global policy Chris Lehane characterized DeepSeek's achievements as AI's "Sputnik moment," drawing parallels to...

Jan 29, 2025

AI architecture innovation: What’s really driving DeepSeek’s success

DeepSeek has made a remarkable advancement in artificial intelligence efficiency with its V3 model, achieving state-of-the-art performance while consuming only 2.8 million H800 GPU hours of training time, dramatically less computation than comparable models. This achievement challenges the industry's typical approach of scaling up computational power to improve performance, demonstrating that strategic architectural innovations can deliver superior results with greater efficiency. Through sophisticated improvements like Multi-head Latent Attention (MLA) and enhanced expert systems, DeepSeek V3 represents a significant step forward in the field of language model development, suggesting that thoughtful design optimization may be more valuable than raw computational power in...

Jan 29, 2025

Alibaba claims its new AI model Qwen 2.5-Max outperforms DeepSeek-V3

Chinese tech giant Alibaba has released Qwen 2.5-Max, claiming performance superiority over DeepSeek-V3 and other leading AI models on the first day of Lunar New Year. Market dynamics and timing: The unusual holiday release highlights mounting competitive pressure from Chinese AI startup DeepSeek's recent advances in the artificial intelligence space. The announcement came via Alibaba's cloud unit's WeChat account, asserting Qwen 2.5-Max outperforms GPT-4, DeepSeek-V3, and Meta's Llama-3.1-405B. DeepSeek's January releases of its AI assistant and R1 model have significantly impacted Silicon Valley, causing tech stock volatility. The startup's reportedly low development and operating costs have prompted investors to question...