News/AI Models

Jul 22, 2024

Apple’s Open-Source AI Models Are Outperforming Rivals Already

Apple challenges Meta with an innovative open-source AI model, signaling a commitment to advancing the broader AI ecosystem and fostering transparency. Key details of Apple's new AI model: Apple's research division has released a new open-source AI model with 7 billion parameters that outperforms similar-sized models from competitors: The model, part of the DCLM (DataComp for Language Models) project, was trained on high-quality datasets designed by researchers from Apple and various academic institutions. Despite a smaller size and context window than other models, Apple's AI achieves competitive performance on benchmark tests, reaching 63.7% accuracy on 5-shot evaluation tasks. Notably, Apple has...
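The 5-shot setup behind that 63.7% figure prepends five worked examples to each test question before asking the model to answer. A minimal sketch of such a prompt template (a hypothetical format for illustration, not the DCLM evaluation harness):

```python
def build_5shot_prompt(examples, question):
    """Assemble a 5-shot prompt: five worked (question, answer) pairs
    followed by the unanswered test question. Generic template only;
    the actual benchmark formatting may differ."""
    assert len(examples) == 5, "5-shot evaluation uses exactly five examples"
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\nQ: {question}\nA:"
```

The model's completion after the final "A:" is then compared against the reference answer to score accuracy.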

Jul 21, 2024

AI May Enable Interstellar Communication — So What Should We Tell the Aliens?

Artificial intelligence may enable real-time communication with extraterrestrial civilizations, prompting us to consider what we should tell them about humanity. The power of large language models: Recent advancements in AI, particularly the development of transformer neural network architecture and large language models (LLMs), have the potential to revolutionize communication with extraterrestrial intelligence: LLMs, trained on vast datasets, contain a wealth of human knowledge and are already poised to significantly impact global employment, according to the IMF. By transmitting a well-curated LLM to extraterrestrial civilizations, we could enable them to indirectly converse with us and learn about humanity without being hindered...

Jul 21, 2024

DeepL’s Next-Gen Language Translation AI Outperforms ChatGPT and Google Translate

DeepL's next-gen language model outperforms competitors like Google Translate, ChatGPT-4, and Microsoft in translation quality, requiring fewer edits and improving productivity for businesses. Groundbreaking technology: DeepL's new language model combines world-class AI with proprietary linguistic data and the expertise of thousands of language specialists to provide best-in-class translations: In blind tests, language experts prefer DeepL's translations 1.3x more often than Google Translate, 1.7x more than ChatGPT-4, and 2.3x more than Microsoft. Specialized LLMs tuned for language reduce the risk of hallucinations and misinformation compared to general-purpose models trained on public internet data. Highest quality translations: Starting with Japanese, German, and...

Jul 21, 2024

How Increasing Diversity in STEM and AI May Address Bias in Generative Models

The rise of generative AI has highlighted the potential for bias in AI models, raising concerns about fairness and inclusivity as these technologies become more influential in critical areas like insurance, housing, credit, and welfare. Addressing this challenge may call for a more diverse workforce in AI and STEM fields. Early education and exposure: Encouraging more women and minorities to pursue STEM careers starts with early education and exposure: Representation shapes perception, and subtle messages given to young girls can influence their interest in STEM fields. Equal paths for exploration and exposure should be ensured through regular curriculum and partnerships...

Jul 21, 2024

Mistral’s New Finetuned Open Source LLM Excels in Math and Reasoning

Mistral AI has released Mathstral, a fine-tuned 7B model designed for math reasoning and scientific discovery, offering a 32k context window and openly available model weights. The release comes amid questions about leading LLMs that can solve complex math problems yet stumble on elementary school math concepts. Mathstral exemplifies the trend of fine-tuned open-source models outperforming larger closed-source models in specialized areas. Testing Mathstral's common sense: Running Mathstral locally using LlamaEdge (a Rust + Wasm stack) allows testing its ability to answer common-sense math questions. The model successfully answers a question comparing the values of...

Jul 20, 2024

Small Language Models are Making AI More Accessible and Environmentally Sustainable

The rise of small language models from OpenAI, Nvidia, and Hugging Face signals a major shift in the AI industry towards more accessible and efficient natural language processing capabilities. Small wonders: how compact AI models are changing edge computing: Hugging Face's SmolLM, designed to run directly on mobile devices, pushes AI processing to the edge, addressing critical issues of data privacy and latency: SmolLM comes in three sizes: 135 million, 360 million, and 1.7 billion parameters, enabling sophisticated AI-driven features on mobile devices with minimal latency and maximum privacy. Nvidia and Mistral AI's collaboration has produced Mistral-Nemo, a 12-billion parameter...

Jul 20, 2024

Apple’s New Open-Source Language Models Demonstrate the Power of High-Quality Data

Apple's new open-source language models showcase the company's AI prowess, with the 7B model outperforming leading open models and the 1.4B version surpassing competitors in its category. Introducing DCLM models: Apple's research team, as part of the DataComp for Language Models project, released a family of open DCLM models on Hugging Face, including a 7 billion parameter model and a 1.4 billion parameter model: The models were trained using a high-quality dataset, DCLM-Baseline, assembled through model-based filtering, demonstrating the effectiveness of this data curation technique. The project is truly open-source, with the release of model weights, training code, and the...

Jul 19, 2024

Mistral AI and NVIDIA Unveil Cutting-Edge Enterprise AI Model

Mistral AI and NVIDIA have unveiled a cutting-edge enterprise AI model, Mistral NeMo 12B, that offers unprecedented accuracy, flexibility, and efficiency for diverse applications like chatbots, multilingual tasks, coding, and summarization. Key features and capabilities: Mistral NeMo 12B excels in multi-turn conversations, math, common sense reasoning, world knowledge, and coding, delivering precise and reliable performance across various tasks: With a 128K context length, the model can process extensive and complex information more coherently and accurately, ensuring contextually relevant outputs. The model uses the FP8 data format for inference, reducing memory size and speeding up deployment without sacrificing accuracy. Mistral NeMo...

Jul 19, 2024

Groq’s Open-Source AI Model Outperforms Tech Giants, Signaling Shift Towards Accessibility and Transparency

Groq's open-source Llama AI models have outperformed industry giants like OpenAI and Google in specialized tool use capabilities, signaling a potential shift in the AI landscape towards more accessible and transparent development. Open-source models take the lead: Groq's Llama-3-Groq-70B-Tool-Use model has claimed the top spot on the Berkeley Function Calling Leaderboard (BFCL), surpassing proprietary offerings from major tech companies: The 70B parameter version achieved a 90.76% overall accuracy on the BFCL, while the smaller 8B model ranked third with 89.06%, demonstrating the competitive performance of open-source models in specific tasks. Groq developed these models in collaboration with AI research company...

Jul 18, 2024

Hallucinations Plague Large Language Models, But New Training Approaches Offer Hope

Large language models (LLMs) have significant limitations despite their recent popularity and hype, including hallucinations, lack of confidence estimates, and absence of citations. Overcoming these challenges is crucial for developing more reliable and trustworthy LLM-based applications. Hallucinations: The core challenge: LLMs can generate content that appears convincing but is actually inaccurate or entirely false, known as hallucinations: Hallucinations are the most difficult issue to address, and their negative impact is only slightly mitigated by confidence estimates and citations. Contradictions in the training data contribute to the problem, as LLMs cannot self-inspect their training data for logical inconsistencies. Bootstrapping consistent LLMs:...

Jul 18, 2024

Chatbot Arena Highlights How Crowdsourced Rankings of AI Models Are Complementing Traditional Benchmarks

The crowdsourced Chatbot Arena has emerged as an influential way to rank AI chatbots, as companies like OpenAI, Google, and Meta release increasingly sophisticated AI products that are difficult to compare using traditional benchmarks. Key Takeaways: Chatbot Arena, an open-source project by research group LMSYS and UC Berkeley, has built AI leaderboards based on nearly 1.5 million human votes comparing responses from anonymous AI models. The top five AI models on Chatbot Arena's overall leaderboard are GPT-4o, Claude 3.5 Sonnet, Gemini Advanced, Gemini 1.5 Pro, and GPT-4 Turbo. Challenges in evaluating AI models: Industry experts highlight the difficulties in comparing...

Jul 18, 2024

OpenAI Unveils Cheaper “Mini” Model Amid Intensifying AI Competition

OpenAI has announced a new low-cost "mini" model aimed at making its AI technology more widely accessible to businesses and developers. Key details of the new model: GPT-4o mini is 60% cheaper than OpenAI's most affordable existing model while offering better performance: The new model was developed by improving the architecture, refining the training data, and optimizing the training process. GPT-4o mini outperforms other "small" models on the market in several common benchmarks, according to OpenAI. Increasing competition in the AI market: OpenAI's move comes amidst growing rivalry among AI cloud providers and rising interest in open source models: Competitors...

Jul 18, 2024

Musk’s AI Calls Trump “Pedophile” Despite Billionaire’s Endorsement, Exposing Safeguard Failures

Despite Elon Musk's endorsement of Donald Trump following a recent assassination attempt, Musk's "anti-woke" AI chatbot Grok has been promoting claims that Trump is a "pedophile" and "wannabe dictator" while referring to the former president as "Psycho." Grok's problematic outputs exposed by Global Witness: The nonprofit Global Witness analyzed Grok's responses to queries about the 2024 US election and found deeply concerning results: Grok repeated or appeared to invent racist tropes about Vice President Kamala Harris, describing her as "a greedy driven two bit corrupt thug" with a laugh like "nails on a chalkboard." The chatbot referred to Trump as...

Jul 17, 2024

OpenAI’s Prover-Verifier Game Will Enhance AI Explainability and Trustworthiness

AI researchers at OpenAI have developed a new algorithm that helps large language models (LLMs) like GPT-4 better explain their reasoning, addressing the critical issue of AI trustworthiness and legibility. The Prover-Verifier Game: a novel approach to improving AI explainability: The algorithm is based on the "Prover-Verifier Game," which pairs two AI models together: A more powerful and intelligent "prover" model aims to convince the verifier of a certain answer, regardless of its correctness. A less powerful "verifier" model, unaware if the prover is being helpful or deceptive, attempts to select the correct answer based on its own training. OpenAI's...

Jul 17, 2024

Cohere and Fujitsu Partner to Bring Localized AI Solutions to Japanese Enterprises

Cohere, a Canadian AI startup, has partnered with Japanese tech giant Fujitsu to develop large language models (LLMs) and AI solutions tailored for the Japanese market, aiming to empower local enterprises with advanced natural language processing capabilities. Key details of the partnership: Cohere will provide its cutting-edge AI models, while Fujitsu will leverage its expertise in Japanese language training and fine-tuning technologies to create enterprise-ready solutions: The collaboration will debut an LLM tentatively named 'Takane' (meaning 'mountain peak' in English), based on Cohere's enterprise-grade Command R+ model, which offers best-in-class retrieval-augmented generation (RAG), multilingual coverage, and a powerful Tool Use...

Jul 17, 2024

French AI Startup Mistral Launches Groundbreaking Open-Source Models for Coding and Math

Mistral, a well-funded French AI startup, has launched two new large language models (LLMs) based on the Mamba architecture, which aims to improve on the efficiency of transformer-based models. Codestral Mamba for developers: Mistral's code-generating model, Codestral Mamba 7B, offers faster response times and longer input handling than other models: The model can handle inputs of up to 256,000 tokens, double that of OpenAI's GPT-4. In benchmarking tests, Codestral Mamba outperformed rival open-source models like CodeLlama 7B, CodeGemma 1.1 7B, and DeepSeek in HumanEval tests. Developers can modify and deploy the model from its GitHub repository and through HuggingFace, with an...

Jul 16, 2024

Arcee AI Secures $24M, Signaling Investor Confidence in Small Language Models

Arcee AI, a startup specializing in small language models (SLMs), has secured $24 million in Series A funding, signaling growing investor confidence in the potential of more efficient and specialized AI models for enterprise applications. Arcee AI's unique approach: The company focuses on developing domain-specific SLMs and providing tools for enterprises to create their own customized models, enabling organizations to tackle multiple use cases cost-effectively: Arcee AI's Model Merging technology allows the combination of multiple AI models into a single, more capable model without increasing its size, creating highly tailored models that can outperform larger, more general models in specific...
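The idea of combining multiple models into a single model of the same size can be illustrated with simple linear weight averaging. This is only a generic sketch of the concept, not Arcee AI's proprietary Model Merging technique, which is more sophisticated:

```python
def merge_weights(models, coeffs):
    """Linear weight merging: combine several models' parameters into one
    model with the same parameter count. `models` is a list of
    {param_name: value} state dicts with identical keys and shapes
    (plain floats here for simplicity; real models use tensors);
    `coeffs` are mixing weights that should sum to 1."""
    merged = {}
    for name in models[0]:
        merged[name] = sum(c * m[name] for c, m in zip(coeffs, models))
    return merged
```

Because the merged model has the same shapes as its inputs, inference cost does not grow with the number of source models, which is the property the article highlights.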

Jul 16, 2024

New Research Highlights How “FlashAttention-3” May Make Training and Inference More Efficient

FlashAttention-3, a new technique developed by researchers from multiple institutions, dramatically accelerates attention computation on Nvidia's H100 and H800 GPUs, enabling faster and more efficient training and inference of large language models (LLMs). The challenge of attention computation in LLMs: As LLMs grow larger and process longer input sequences, the computational cost of the attention mechanism becomes a significant bottleneck due to its quadratic growth with sequence length and reliance on operations not optimized for GPUs. Attention computations involve a mix of matrix multiplications and special functions like softmax, which are computationally expensive and can slow down the overall computation...
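The quadratic cost described above comes from materializing the full n-by-n attention score matrix and applying softmax to it. A baseline sketch in plain NumPy makes the bottleneck visible; FlashAttention-3's contribution is precisely avoiding this full materialization by computing the softmax in tiles sized to fast GPU memory (not shown here):

```python
import numpy as np

def naive_attention(Q, K, V):
    """Standard scaled dot-product attention over n tokens of width d.
    Builds the full (n, n) score matrix, so memory and compute grow
    quadratically with sequence length n."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                      # (n, n): the quadratic term
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)      # numerically stable row softmax
    return weights @ V                                 # (n, d) output
```

The `exp` and normalization steps are the "special functions like softmax" the article mentions: unlike the matrix multiplies, they do not map onto the GPU's fastest hardware units, which is part of what FlashAttention-3 optimizes around.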

Jul 16, 2024

Microsoft’s New SpreadsheetLLM Offers Glimpse Into Future of Data Interaction

Microsoft researchers propose SpreadsheetLLM, a novel method that helps AI models understand and process spreadsheets more efficiently, potentially improving chatbot interactions with complex data. Key innovation: SheetCompressor framework: Microsoft's SheetCompressor encoding framework compresses spreadsheets into bite-sized chunks that large language models (LLMs) can more easily handle: It includes modules that make spreadsheets more legible for LLMs, bypass empty cells and repeating numbers, and help LLMs better understand the context of numbers (e.g., distinguishing years from phone numbers). This compression method reduced token usage for spreadsheet encoding by up to 96%, significantly boosting performance on larger spreadsheets where high token usage...
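The "bypass empty cells and repeating numbers" idea can be illustrated with a toy compressor that drops blanks and collapses runs of identical values into a single range entry. This is a hypothetical simplification for illustration, not Microsoft's actual SheetCompressor encoding:

```python
def compress_cells(cells):
    """Toy spreadsheet compression: given (address, value) pairs in sheet
    order, skip empty cells and merge consecutive repeated values into
    one address range, shrinking the token count an LLM must consume."""
    out = []
    for addr, value in cells:
        if value in ("", None):                  # bypass empty cells
            continue
        if out and out[-1][1] == value:          # collapse repeating values
            start = out[-1][0].split(":")[0]
            out[-1] = (f"{start}:{addr}", value)
        else:
            out.append((addr, value))
    return out
```

Even this crude scheme shows why savings grow with sheet size: large spreadsheets are dominated by blanks and repeated values, which is where the article's reported reduction of up to 96% in token usage comes from.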

Jul 16, 2024

Meta’s Upcoming LLM Launch Will Be Another Big Milestone for Open-Source AI

Meta is set to release Llama 3 400B, its most powerful open-source AI language model, by the end of July 2024. Key details and capabilities: Meta's upcoming release of the Llama 3 400B model is highly anticipated due to its impressive performance and open availability for research and commercial use: Boasting over 400 billion parameters, Llama 3 400B achieves near-parity with OpenAI's GPT-4 on the MMLU benchmark despite using less than half the parameters, suggesting significant advancements in model architecture and training efficiency. The model promises new capabilities such as multimodality, multilingual conversation, longer context windows, and stronger overall performance...

Jul 16, 2024

Vectara Raises $25M, Launches Mockingbird LLM for Enterprise RAG

Vectara, an early pioneer in Retrieval Augmented Generation (RAG) technology, has raised $25 million in a Series A funding round, bringing its total funding to $53.5 million, as demand for its technologies grows among enterprise users. Vectara's evolution and the introduction of Mockingbird LLM: Vectara has progressed from a neural search as a service platform to a 'grounded search' or RAG technology provider, and is now launching its purpose-built Mockingbird LLM for RAG applications: The Vectara platform integrates multiple elements to enable a RAG pipeline, including the company's Boomerang vector embedding engine, which grounds responses from a large language model...
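The retrieval step that grounds a RAG pipeline can be sketched as similarity ranking over document embeddings: embed the query, score it against stored document vectors, and pass the top matches to the LLM as context. A generic illustration of that step, not Vectara's Boomerang engine:

```python
import numpy as np

def retrieve(query_vec, doc_vecs, docs, k=2):
    """Minimal RAG retrieval: rank documents by cosine similarity to the
    query embedding and return the top-k as grounding context for the
    generator model."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(range(len(docs)),
                    key=lambda i: cos(query_vec, doc_vecs[i]),
                    reverse=True)
    return [docs[i] for i in ranked[:k]]
```

A purpose-built model like Mockingbird then generates its answer conditioned on these retrieved passages rather than on parametric memory alone, which is what "grounded search" refers to.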

Jul 16, 2024

Microsoft CTO: AI Progress to Continue Despite Skepticism, Powered by Scaling Laws

Microsoft CTO Kevin Scott believes AI progress will continue despite skepticism, arguing that large language model (LLM) "scaling laws" will drive breakthroughs as models get larger and have access to more computing power. Scaling laws and AI progress: Scott maintains that scaling up model size and training data can lead to significant AI improvements, countering critics who argue that progress has plateaued around GPT-4 class models: He acknowledges the challenge of infrequent data points in the field, as new models often take years to develop, but expresses confidence that future iterations will show improvements, particularly in areas where current models...

Jul 15, 2024

Apple’s AI Strategy Is About Ensuring Consistent User Experience Amid Model Updates

Apple's AI strategy aims to improve language model consistency and user experience. Key takeaways: Apple researchers have developed techniques to reduce inconsistencies and negative impacts on user experience when upgrading large language models (LLMs): Updating LLMs can result in unexpected behavior changes and force users to adapt their prompt styles and techniques, which may be unacceptable for mainstream iOS users. Apple's method, called MUSCLE (Model Update Strategy for Compatible LLM Evolution), reduces negative flips (where a new model gives an incorrect answer while the old model was correct) by up to 40%. The research highlights Apple's preparation for updating its...
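The "negative flip" regression that MUSCLE reduces can be measured directly from two model versions' predictions on a shared test set. An illustrative sketch of the metric as described above, not Apple's implementation:

```python
def negative_flip_rate(old_preds, new_preds, labels):
    """Fraction of examples where the old model answered correctly but
    the updated model answers incorrectly -- the user-visible regression
    an update strategy like MUSCLE tries to minimize."""
    flips = sum(
        1 for old, new, y in zip(old_preds, new_preds, labels)
        if old == y and new != y
    )
    return flips / len(labels)
```

Tracking this rate alongside overall accuracy captures the consistency concern in the article: an update can raise aggregate accuracy while still breaking answers that previously worked.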

Jul 14, 2024

Balancing Innovation and The Environmental Impact of AI

The rapid growth of artificial intelligence (AI) has raised concerns about its environmental impact, with the carbon footprint of AI becoming an increasingly pressing issue as the technology advances and becomes more widely adopted. Understanding AI's carbon footprint: To grasp the environmental implications of AI, it's important to consider the full lifecycle of AI systems, from hardware production through deployment and usage: Hardware production, maintenance, and recycling account for an estimated 30% of AI's carbon footprint, while computational costs make up the remaining 70%. Training large language models like GPT-3 can generate over 600,000 kg of CO2 equivalent (CO2e), comparable...
