News/AI Models
AI-powered notebook rival built in 24 hours challenges Google
Open-source AI challenges Google's NotebookLM: A data scientist in Singapore has created an open-source alternative to Google's NotebookLM, highlighting the growing capabilities of individual developers in the AI space.

Rapid development and key features: Gabriel Chua, a data scientist at Singapore's GovTech agency, built "Open NotebookLM" in just one afternoon using publicly available AI models. The tool transforms PDF documents into personalized podcasts, mirroring a key feature of Google's NotebookLM. It utilizes Meta's Llama 3.1 405B language model and MeloTTS for voice synthesis. A user-friendly interface built with Gradio and hosted on Hugging Face Spaces makes the tool accessible to...
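The summary describes a three-stage pipeline: extract text from a PDF, have an LLM write a dialogue script, and render it with a TTS model. A minimal sketch of how such stages might be wired together, with the LLM and TTS calls stubbed out (the function names and stub logic below are illustrative assumptions, not Open NotebookLM's actual code):

```python
# Sketch of a PDF-to-podcast pipeline in the spirit of Open NotebookLM.
# The LLM and TTS stages are stubs; the real tool uses Llama 3.1 405B
# for script writing and MeloTTS for voice synthesis.

def extract_text(pdf_bytes: bytes) -> str:
    """Stand-in for real PDF parsing (e.g. via a PDF library)."""
    return pdf_bytes.decode("utf-8", errors="ignore")

def write_dialogue(source_text: str) -> list[tuple[str, str]]:
    """Stand-in for prompting an LLM to turn the text into a two-host script."""
    summary = source_text[:60]
    return [
        ("Host A", f"Today we're discussing: {summary}"),
        ("Host B", "Let's break down the key points."),
    ]

def synthesize(dialogue: list[tuple[str, str]]) -> list[bytes]:
    """Stand-in for a TTS model: one audio clip per dialogue turn."""
    return [line.encode("utf-8") for _, line in dialogue]

def pdf_to_podcast(pdf_bytes: bytes) -> list[bytes]:
    return synthesize(write_dialogue(extract_text(pdf_bytes)))

clips = pdf_to_podcast(b"An open-source alternative to NotebookLM.")
print(len(clips))  # one clip per dialogue turn
```

The point of the sketch is the composition: each stage is a pure function, so any of the three stand-ins can be swapped for a real model call without touching the others.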

Sep 30, 2024
MIT research investigates how AI models really perceive faces
Groundbreaking study explores AI pareidolia: MIT researchers have conducted an extensive study on pareidolia, the phenomenon of perceiving faces in inanimate objects, revealing significant insights into human and machine perception.

Key findings and implications: The study introduces a comprehensive dataset of 5,000 human-labeled pareidolic images, uncovering surprising differences between human and AI face detection capabilities. Researchers discovered that AI models struggle to recognize pareidolic faces in the same way humans do, highlighting a gap in machine perception. Training algorithms to recognize animal faces significantly improved their ability to detect pareidolic faces, suggesting a potential evolutionary link between animal face recognition...

Sep 30, 2024
Why some experts believe AGI is far from inevitable
AGI hype challenged: A new study by researchers from Radboud University and other institutes argues that the development of artificial general intelligence (AGI) with human-level cognition is far from inevitable, contrary to popular claims in the tech industry. Lead author Iris van Rooij, a professor at Radboud University, boldly asserts that creating AGI is "impossible" and pursuing this goal is a "fool's errand." The research team conducted a thought experiment allowing for AGI development under ideal circumstances, yet still concluded there is no conceivable path to achieving the capabilities promised by tech companies. Their findings suggest that replicating human-like cognition...

Sep 30, 2024
AI learns to diagnose like doctors in groundbreaking study
Breakthrough in AI medical reasoning: OpenAI's latest language model, o1, has demonstrated a significant leap in medical question-answering capabilities, outperforming its predecessor GPT-4 by 6.2% in a recent study. The key to this improvement lies in the model's ability to utilize Chain-of-Thought (CoT) reasoning, a process that closely mimics the complex clinical thinking patterns of human physicians. CoT reasoning allows the AI to break down intricate medical queries into a series of iterative steps, much like how doctors approach complex cases in real-world scenarios. This advancement enables o1 to engage in more dynamic and context-rich dialogues that closely resemble actual...
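Chain-of-Thought prompting, as described above, asks a model to produce intermediate reasoning steps before committing to a final answer. A minimal sketch of that prompt-and-parse shape (the delimiter conventions here are illustrative assumptions; o1's internal reasoning format is not exposed by OpenAI):

```python
# Illustrative Chain-of-Thought prompt construction and answer parsing.
# This shows the classic "think step by step" pattern, not o1's hidden
# reasoning process.

def build_cot_prompt(question: str) -> str:
    return (
        f"Question: {question}\n"
        "Let's think step by step, then state the final answer "
        "on a line beginning with 'Answer:'."
    )

def parse_final_answer(model_output: str) -> str:
    """Return the text after the last 'Answer:' line, ignoring the steps."""
    for line in reversed(model_output.splitlines()):
        if line.startswith("Answer:"):
            return line.removeprefix("Answer:").strip()
    raise ValueError("no final answer found")

# Example model output (hand-written here; a real run would come from an LLM):
output = (
    "Step 1: The patient's symptoms suggest two candidate diagnoses.\n"
    "Step 2: The lab results rule out the first candidate.\n"
    "Answer: the second candidate diagnosis"
)
print(parse_final_answer(output))
```

Separating the stepwise scratchpad from the final answer line is what lets an application surface the reasoning to a clinician while still extracting a single machine-readable conclusion.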

Sep 30, 2024
AI still can’t explain its own output — we need more humans who can
AI's knowledge conundrum: Large language models (LLMs) like ChatGPT and Gemini are increasingly relied upon by millions for information on various topics, but their outputs lack true justification and reasoning, raising concerns about their reliability as knowledge sources. More than 500 million people use AI systems like Gemini and ChatGPT each month for information on diverse subjects, from cooking to homework. OpenAI CEO Sam Altman has claimed that AI systems can explain their reasoning, allowing users to judge the validity of their outputs. However, experts argue that LLMs are not designed to reason or provide genuine...

Sep 29, 2024
AMD releases AMD-135M, its first open-source small language model
AMD's Foray into Small Language Models: AMD has unveiled its first small language model (SLM), AMD-135M, marking a significant step in the company's artificial intelligence initiatives. AMD-135M is part of the Llama family of models and was trained from scratch on AMD Instinct™ MI250 accelerators. The model comes in two variants: AMD-Llama-135M for general use and AMD-Llama-135M-code, which is fine-tuned for code-related tasks. This release aligns with AMD's commitment to an open approach to AI, aiming to foster inclusive, ethical, and innovative technological progress.

Training Process and Specifications: The development of AMD-135M involved substantial computational resources and time investment to...

Sep 29, 2024
A well-paid niche job market is emerging for human experts to train AI models
The evolution of AI training: The landscape of artificial intelligence development is undergoing a significant transformation, with a growing demand for highly specialized human trainers to enhance the capabilities of AI models. Companies like OpenAI, Cohere, AI21, and Microsoft are now relying on trainers with advanced degrees and specialized knowledge to improve their AI systems. This shift represents a departure from earlier AI training methods that primarily employed low-cost workers for basic tasks such as image labeling. The new approach aims to reduce errors and "hallucinations" in AI outputs by teaching concepts of fact versus fiction and incorporating expert knowledge...

Sep 29, 2024
Schrödinger’s LLM: Why deriving meaning from AI requires human observation
AI-generated language models and human knowledge creation: Large language models (LLMs) have revolutionized content generation, but their role in knowledge creation is more complex than it may initially appear. LLMs produce vast arrays of potential responses and connections based on statistical associations in their training data, existing in a state of informational superposition. The output generated by LLMs is not yet knowledge, but rather a scaffold of language containing words and phrases that form potential ideas. Human interpretation is necessary to transform LLM output into concrete knowledge by reading, contextualizing, and extracting meaning from the generated text.

The quantum analogy:...

Sep 27, 2024
Can AI save a slowing SaaS industry?
The SaaS industry at a crossroads: The Software as a Service (SaaS) business model, once a beacon of growth in the tech sector, is showing signs of deceleration, prompting companies to seek new avenues for expansion. SaaS companies have traditionally thrived by offering cloud-based software solutions for monthly or annual fees, effectively locking customers into their ecosystems. Recent industry trends indicate a slowdown, with declining revenue growth rates and challenges in customer retention. Major tech companies heavily rely on the SaaS model for revenue growth, making this slowdown a significant concern for the broader tech industry.

AI as the new...

Sep 27, 2024
New research shows bigger AI models not always better
Llama-3 models' performance in medical AI: A recent study comparing various Llama-3 models in medical and healthcare AI domains has revealed surprising findings, challenging assumptions about model size and performance. The Llama-3.1 70B model outperformed the larger Llama-3.2 90B model, particularly in specialized tasks like MMLU College Biology and Professional Medicine. Unexpectedly, the Meta-Llama-3.2-90B Vision Instruct and Base models showed identical performance across all datasets, an unusual occurrence for instruction-tuned models.

Detailed performance breakdown: The study evaluated models using datasets such as MMLU College Biology, Professional Medicine, and PubMedQA, providing insights into their capabilities in medical...

Sep 27, 2024
AI models on Hugging Face surge past 1 million milestone
AI model explosion on Hugging Face: Hugging Face, a leading AI hosting platform, has reached a significant milestone by surpassing 1 million AI model listings, showcasing the rapid expansion and diversification of the machine learning field. The platform, which began as a chatbot app in 2016, pivoted to become an open-source hub for AI models in 2020, now offering a wide array of tools for developers and researchers. Hugging Face hosts numerous high-profile AI models, including Llama, Gemma, Phi, Flux, Mistral, Starcoder, Qwen, Stable Diffusion, Grok, Whisper, Olmo, Command, Zephyr, OpenELM, Jamba, and Yi, along with 999,984 others.

Customization driving...

Sep 26, 2024
Meta’s new Llama AI model can now see and run on your device
Llama 3.2 Introduces Multimodal and On-Device Models: Meta's latest update to its Llama language model series brings significant advancements in AI capabilities, including vision processing and compact on-device models.

Key Features and Enhancements: The Llama 3.2 release incorporates new multimodal vision models and smaller language models optimized for on-device applications, expanding the versatility and accessibility of AI technologies. Two sizes of vision models (11B and 90B parameters) are now available, each with base and instruction-tuned variants, enabling the processing of both text and images in tandem. New 1B and 3B parameter text-only models have been introduced, designed specifically for on-device...

Sep 25, 2024
The best open-source AI model yet is purpose-built for AI agents
Breakthrough in open-source AI: The Allen Institute for AI (Ai2) has unveiled Multimodal Open Language Model (Molmo), a groundbreaking open-source AI model that combines image interpretation and conversational abilities, potentially revolutionizing AI agent development.

Key capabilities and features: Molmo represents a significant advancement in open-source AI technology, offering a range of functionalities that were previously limited to proprietary models. The model can interpret images and engage in chat-based conversations, making it suitable for a variety of AI agent applications. Molmo is designed to assist AI agents in performing complex tasks such as web browsing, file navigation, and document drafting. Unlike...

Sep 25, 2024
AI models stumble on basic queries as size grows, study finds
AI models struggle with simple tasks as they grow: Large language models (LLMs) are becoming less reliable at answering basic questions as they increase in size and complexity, despite improvements in handling more difficult queries.

Research findings: A study conducted by José Hernández-Orallo and colleagues at the Polytechnic University of Valencia, Spain, examined the performance of various LLMs as they scaled up in size and were fine-tuned through human feedback. The research analyzed OpenAI's GPT series, Meta's LLaMA AI models, and the BLOOM model developed by BigScience. Five types of tasks were used to test the AIs, including arithmetic problems,...

Sep 25, 2024
Rabbit Launches Beta Version of New Large Action Model
Rabbit R1's new LAM playground: Rabbit has launched a beta version of its next-generation Large Action Model (LAM) playground, aiming to deliver on its initial promises for the Rabbit R1 device. The company has released 16 over-the-air updates since the R1's launch, addressing bugs and adding new features. The new LAM playground, set to launch in beta on October 1, represents a significant step towards fulfilling the company's original vision.

How the new LAM playground works: The system operates as a Generic Website Agent, capable of performing tasks through text prompts or natural language requests to the Rabbit R1. Users...

Sep 25, 2024
AI2’s New Small Open-Source Model Performs as Well as Big Ones
Groundbreaking open-source AI model challenges industry giants: The Allen Institute for Artificial Intelligence (Ai2) has unveiled Molmo, a family of open-source multimodal language models that rival the performance of proprietary models from leading tech companies. Ai2 claims its largest Molmo model, with 72 billion parameters, outperforms OpenAI's GPT-4o in tests measuring image, chart, and document understanding. A smaller Molmo model with just 7 billion parameters reportedly approaches the performance of OpenAI's state-of-the-art model, highlighting Ai2's efficient data collection and training methods.

Key innovations in data curation and training: Molmo's impressive performance stems from a novel approach to data collection and...

Sep 25, 2024
Researchers Use Search Algorithms to Improve LLM Planning Capabilities
Breakthrough in LLM planning: Researchers from Cornell University and IBM Research have introduced AutoToS, a novel technique that combines the planning capabilities of large language models (LLMs) with the efficiency of rule-based search algorithms. AutoToS addresses key challenges in LLM-based planning, including computational expense and reliability issues. The new approach eliminates the need for human intervention and significantly reduces the computational cost of solving complex planning problems. This innovation makes AutoToS a promising solution for LLM applications that require reasoning over extensive solution spaces.

The evolution of LLM-based planning: AutoToS builds upon previous techniques, such as Tree of Thoughts, while...
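The core idea of pairing an LLM with rule-based search is that the LLM writes the domain functions (successors, goal test) while a classical algorithm does the exhaustive reasoning. A rough illustration of the search half: the hand-written water-jug functions below stand in for LLM-generated code and are not from the AutoToS paper.

```python
from collections import deque

# Breadth-first search driven by pluggable successor/goal functions -- the
# role AutoToS assigns to LLM-generated code. Both functions here are
# hand-written stand-ins for a classic 2-jug water-pouring puzzle.

def successors(state):
    """Stand-in for an LLM-generated successor function (capacities 4 and 3)."""
    a, b = state
    moves = {(4, b), (a, 3), (0, b), (a, 0)}               # fill / empty either jug
    pour = min(a, 3 - b); moves.add((a - pour, b + pour))  # pour jug A into jug B
    pour = min(b, 4 - a); moves.add((a + pour, b - pour))  # pour jug B into jug A
    moves.discard(state)                                   # drop no-op moves
    return moves

def is_goal(state):
    """Stand-in for an LLM-generated goal test: measure exactly 2 units."""
    return 2 in state

def bfs(start):
    """Classical search: complete and optimal, no LLM calls in the loop."""
    frontier, seen = deque([(start, [start])]), {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [nxt]))
    return None

plan = bfs((0, 0))
print(plan)
```

Because the search itself is deterministic and exhaustive, the LLM's reliability problem is confined to the two small generated functions, which is exactly what makes automated unit-test feedback (AutoToS's replacement for human intervention) tractable.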

Sep 24, 2024
Alibaba’s AI Video Generator Will Join a Growing Market of Sora Competitors
Alibaba unveils AI video generator: The Chinese tech giant has introduced a new text-to-video model as part of its Tongyi Wanxiang portfolio, joining the rapidly expanding field of AI-generated video tools.

Key features of Alibaba's AI video tool:
- Produces high-quality videos from text prompts in both Chinese and English
- Capable of generating videos from still images
- Utilizes advanced diffusion transformer (DiT) architecture
- Maintains video quality across various styles, including realistic live-action and animation

Broader context of Alibaba's AI push:
- The video generator is part of a larger AI rollout, including over 100 new large language models (LLMs)
- Tongyi Wanxiang, Alibaba's...

Sep 23, 2024
Together AI Launches Enterprise Platform for Secure AI Deployment
AI deployment in private environments: Together AI has unveiled its Enterprise Platform, enabling organizations to deploy AI models in virtual private cloud and on-premises environments, addressing key concerns of performance, cost-efficiency, and data privacy. The platform extends AI deployment capabilities to customer-controlled cloud and on-premises environments, building upon Together AI's existing full-stack platform for open-source LLMs. This new offering aims to meet the needs of businesses that have established privacy and compliance policies within their own cloud setups. Vipul Prakash, CEO of Together AI, emphasizes the importance of efficiency, cost, and data privacy as companies scale up their AI workloads....

Sep 23, 2024
Quantum Computing May Make AI Models More Interpretable
Quantum AI breakthrough for interpretable language models: Researchers at Quantinuum have successfully integrated quantum computing with artificial intelligence to enhance the interpretability of large language models used in text-based tasks like question answering.

Key innovation: The team developed QDisCoCirc, a new quantum natural language processing (QNLP) model that demonstrates the ability to train interpretable and scalable AI models for quantum computers. QDisCoCirc focuses on "compositional interpretability," allowing researchers to assign human-understandable meanings to model components and their interactions. This approach makes it possible to understand how AI models generate answers, which is crucial for applications in healthcare, finance, pharmaceuticals, and...

Sep 22, 2024
Research Breakthrough Enables AI Models to Learn from Their Own Mistakes
Advancing self-correction in language models: Researchers have developed a novel reinforcement learning approach called SCoRe that significantly improves the self-correction abilities of large language models (LLMs) using only self-generated data. The study, titled "Training Language Models to Self-Correct via Reinforcement Learning," was conducted by a team of researchers from various institutions. Self-correction, while highly desirable, has been largely ineffective in modern LLMs, with existing approaches requiring multiple models or relying on more capable models for supervision.

Key innovation - SCoRe approach: SCoRe utilizes a multi-turn online reinforcement learning method to enhance an LLM's ability to correct its own mistakes without...
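Multi-turn self-correction training operates on episodes in which the model answers, is asked to reconsider, and answers again, with the reward favoring an improved second attempt. A toy sketch of that episode structure (the stub model and reward below are illustrative assumptions, not the paper's actual training setup):

```python
# Toy two-turn self-correction episode of the kind SCoRe's RL method
# trains on. A stub "model" stands in for the LLM; the reward is 1.0 when
# the second attempt is correct, illustrating the signal that makes
# self-correction pay off.

def stub_model(prompt: str) -> str:
    """Stand-in LLM: wrong on the first turn, fixed after the critique turn."""
    if "previous answer may contain a mistake" in prompt:
        return "4"   # corrected second attempt
    return "5"       # flawed first attempt

def self_correction_episode(question: str, gold: str) -> dict:
    first = stub_model(question)
    retry_prompt = (
        f"{question}\nYour previous answer may contain a mistake: {first}\n"
        "Please re-examine it and answer again."
    )
    second = stub_model(retry_prompt)
    return {
        "first": first,
        "second": second,
        "reward": 1.0 if second == gold else 0.0,  # RL signal on turn two
    }

episode = self_correction_episode("What is 2 + 2?", gold="4")
print(episode)
```

Because the reward is attached to the second turn rather than the first, a policy trained on many such episodes is pushed to genuinely revise its answer instead of merely restating it, which is the failure mode of prompting-only self-correction.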

Sep 22, 2024
Google’s AI Search Projected to Reach 200 Million Devices by 2024
Google's AI search expands reach: Google's Circle to Search feature, currently exclusive to select Samsung and Google devices, is set to expand to premium Xiaomi and vivo smartphones later this year. Circle to Search allows users to search for on-screen content without exiting their current app by circling, highlighting, or scribbling on elements like text, images, or video content. The feature is powered by Google's AI technology and provides instant search results for the selected content. Google aims to have Circle to Search available on 200 million devices by the end of 2024, indicating a significant expansion of the feature's...

Sep 22, 2024
New Diffusion Model Solves Aspect Ratio Problem in AI Image Generation
Breakthrough in AI image generation: Rice University computer scientists have created a new approach called ElasticDiffusion that addresses a significant limitation in current generative AI models, potentially improving the consistency and quality of AI-generated images across various aspect ratios. ElasticDiffusion tackles the "aspect ratio problem" that plagues popular diffusion models like Stable Diffusion, Midjourney, and DALL-E, which struggle to generate non-square images without introducing visual artifacts or distortions. The new method separates local and global image information, allowing for more accurate generation of images in different sizes and resolutions without requiring additional training. Moayed Haji Ali, a Rice University computer...
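The key move described above is splitting an image signal into global structure and local detail so each can be handled at the right scale. A 1-D toy analogy of that decomposition (this is a didactic illustration of the general low-frequency/high-frequency split, not ElasticDiffusion's actual algorithm):

```python
# Toy 1-D illustration of separating "global" (coarse) structure from
# "local" (fine) detail, the decomposition idea behind ElasticDiffusion.

def moving_average(signal, window=3):
    """Coarse, low-frequency view of the signal (edge-aware window)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

signal = [1.0, 2.0, 6.0, 2.0, 1.0, 5.0, 1.0]
global_part = moving_average(signal)                        # coarse structure
local_part = [s - g for s, g in zip(signal, global_part)]   # fine detail

# The two parts recompose into the original signal (up to float rounding):
recon = [g + l for g, l in zip(global_part, local_part)]
print(recon)
```

The reason such a split helps with aspect ratios is that the global part can be computed at a fixed, trained-for resolution while the local part is filled in per region, so stretching the canvas no longer distorts the coarse layout.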

Sep 22, 2024
What to Expect at Meta Connect 2024
Event details and format: Meta's annual developer conference, Meta Connect 2024, is set to take place on Wednesday, September 25, with a focus on the company's mixed reality and AI initiatives. The event kicks off at 10 am Pacific time with a one-hour keynote presentation by Meta CEO Mark Zuckerberg. Following Zuckerberg's address, Meta CTO Andrew Bosworth will lead a developer-focused session at 11 am. Viewers can tune in through various platforms, including the Meta Connect website, Meta's YouTube channel, or in virtual reality via Meta Horizon.

Anticipated hardware announcements: While no groundbreaking high-end VR headset is expected, Meta may...