Open-source - CO/AI

Nov 26, 2024

Newcomer ‘SmolVLM’ is a small but mighty Vision Language Model

The emergence of SmolVLM represents a significant advancement in making vision-language models more accessible and efficient, while maintaining strong performance capabilities. Core Innovation: Hugging Face has introduced SmolVLM, a family of compact vision language models that prioritizes efficiency and accessibility without sacrificing functionality. The suite includes three variants: SmolVLM-Base, SmolVLM-Synthetic, and SmolVLM-Instruct, each optimized for different use cases. Built upon the SmolLM2 1.7B language model, these models demonstrate that smaller architectures can deliver impressive results. The design incorporates an innovative pixel shuffle strategy that aggressively compresses visual information while processing larger 384x384 image patches. Technical Specifications: SmolVLM achieves remarkable efficiency...

Nov 22, 2024

Chinese AI model LLaVA-o1 rivals OpenAI’s o1 in new study

The emergence of LLaVA-o1 represents a significant advancement in open-source vision language models (VLMs), bringing new capabilities in structured reasoning and image understanding to match commercial offerings from major AI companies. Key innovation: Chinese researchers have developed LLaVA-o1, a new vision language model that implements inference-time scaling and structured reasoning similar to OpenAI's o1 model, marking a breakthrough in open-source AI capabilities. The model introduces a four-stage reasoning process: summary, caption, reasoning, and conclusion. Only the conclusion stage is visible to users, while the other stages handle internal processing. The approach allows for more systematic problem-solving and reduces errors in...

Nov 22, 2024

Photo-editing startup Lightricks just released an open-source AI video generator

The race to democratize AI video generation has intensified as Lightricks, known for its popular photo-editing app Facetune, enters the arena with an open-source solution that challenges the industry's biggest players. The breakthrough announcement: Lightricks has unveiled LTX Video (LTXV), an open-source AI model that generates five seconds of high-quality video in just four seconds, marking a significant departure from the proprietary approaches of tech giants. The model features two billion parameters and runs efficiently on consumer-grade GPUs like the NVIDIA RTX 4090. LTXV will be released on GitHub and Hugging Face under an OpenRAIL license. The technology maintains high...

Nov 22, 2024

AMD is developing an open-source software platform for AI development

The artificial intelligence industry is moving towards more open and flexible development environments, with AMD leading efforts to create hardware-agnostic software solutions that could reshape how AI applications are built and deployed. Strategic vision and market positioning: AMD is developing an open-source, hardware-agnostic software ecosystem to accelerate AI development while competing with Nvidia in the generative AI chip market. CEO Lisa Su outlined this vision during an address at the Indian Institute of Science (IISc) in Bengaluru. The company is making substantial investments in tools, compilers, and abstraction layers to support this open ecosystem. The initiative aims to create a...

Nov 21, 2024

Half of Gen AI users want the technology open-sourced — Here’s why

Artificial intelligence development and deployment increasingly rely on open-source software, with organizations embracing open solutions for both cost efficiency and transparency, according to new research from the Linux Foundation. Key findings from the research: The Linux Foundation's comprehensive survey of 316 AI professionals reveals widespread adoption of generative AI, with significant emphasis on open-source infrastructure. 94% of organizations currently use generative AI, with 42% reporting high or very high adoption rates. On average, 41% of organizations' AI code infrastructure is open source, rising to 47% among high adopters. 71% of respondents indicate that open source positively influences their decision-making processes...

Nov 21, 2024

OpenScholar: The open-source AI tool that outperforms GPT-4 in scientific research

The emergence of OpenScholar marks a significant advancement in AI-assisted scientific research, offering researchers a powerful open-source tool to navigate and synthesize millions of academic papers efficiently. Core innovation: OpenScholar, developed by the Allen Institute for AI and the University of Washington, combines advanced retrieval systems with a specialized language model to provide evidence-based answers to complex research questions. The system processes over 45 million open-access academic papers, delivering citation-backed responses that outperform larger proprietary models. Unlike traditional AI models, OpenScholar actively retrieves and synthesizes information from real papers rather than relying solely on pre-trained knowledge. The platform uses a...

Nov 20, 2024

aiOla releases AI audio model that protects sensitive data

The growing need for secure AI transcription solutions has led to the development of innovative tools that can protect sensitive information while converting speech to text. Key Innovation: Israeli startup aiOla has released Whisper-NER, an open-source AI model that automatically masks sensitive information during audio transcription. Built on OpenAI's Whisper model, this new tool combines automatic speech recognition with named entity recognition to identify and obscure sensitive data in real time. The model can mask specific information like names, phone numbers, and addresses during the transcription process. A demo version is available on Hugging Face, allowing users to test the masking...

Nov 18, 2024

Mistral unveils Pixtral Large, an open-weights multimodal model

Mistral AI's latest release marks a significant advancement in multimodal AI technology with the introduction of Pixtral Large, a powerful model that combines image and text processing capabilities. Key specifications: Pixtral Large is built upon Mistral Large 2, featuring a 123B multimodal decoder and a 1B parameter vision encoder, with a 128K context window capable of processing at least 30 high-resolution images simultaneously. The model is available under both research and commercial licenses, catering to different use cases and applications. Built on top of Mistral Large 2, it maintains strong text processing capabilities while adding sophisticated image understanding. The extensive...

Nov 17, 2024

OpenCoder is a truly open language model for coding — here’s how to get it

The rise of open-source code language models continues to reshape the AI development landscape, with OpenCoder emerging as a significant new entrant in the field of code-focused large language models (LLMs). Core technology and capabilities: OpenCoder represents a family of open-source code language models available in both 1.5B and 8B parameter versions, supporting English and Chinese languages. The model was trained on an extensive dataset of 2.5 trillion tokens, consisting of 90% raw code and 10% code-related web content. Both base and chat models are available, making it versatile for different use cases. The model family achieves performance metrics comparable...

Nov 15, 2024

SUSE rebrands, launches AI platform to safeguard enterprise data

SUSE is significantly expanding and rebranding its enterprise software portfolio while making a strategic push into AI infrastructure, marking a pivotal evolution for the long-standing Linux and open-source solutions provider. Major rebranding initiative: SUSE is streamlining its product naming conventions to create a more cohesive brand identity across its enterprise software offerings. The company's flagship container platform Rancher has been renamed to SUSE Rancher, while Liberty Linux becomes SUSE Multi Linux Support. Infrastructure products Harvester and Longhorn are now rebranded as SUSE Virtualization and SUSE Storage respectively. These name changes reflect a broader effort to unify SUSE's diverse product portfolio...

Nov 15, 2024

This grassroots initiative aims to bring more diversity to AI-generated voices

The increasing dominance of English and American accents in AI voice technology has sparked efforts to create more linguistically diverse and inclusive voice systems, with Mozilla's Common Voice project emerging as a leading grassroots initiative. Project overview and scope: Mozilla's Common Voice initiative represents a significant effort to democratize voice technology by building an open-source database of diverse speech patterns and languages. Since 2017, the project has amassed over 31,000 hours of voice recordings spanning approximately 180 languages. Volunteers contribute by recording voice samples and verifying recordings submitted by others. The dataset is freely available and open source, marking a...

Nov 15, 2024

How open-source LLMs empower all developers to become AI engineers

The democratization of AI engineering is accelerating rapidly, with new tools and frameworks making it increasingly accessible to developers who possess basic coding and deployment skills. The paradigm shift in AI development: The evolution from DevOps to MLOps to GenAI has followed a consistent pattern of simplification and standardization, making previously complex technologies more approachable. The transition mirrors earlier developments in software engineering, where complex processes became streamlined and standardized. Traditional software development skills like IDE usage and YAML configuration are now sufficient for AI engineering. The barrier to entry has been significantly lowered, enabling a broader range of developers to...

Nov 13, 2024

Thanks to Exo Labs you can run powerful open-source LLMs on a Mac

The emergence of local AI model computing has taken a significant leap forward with Exo Labs enabling powerful open-source AI models to run on Apple's new M4-powered Mac computers. Breaking new ground: Exo Labs has successfully demonstrated running advanced large language models (LLMs) locally on Apple M4 devices, marking a significant shift away from cloud-based AI computing. A cluster of four Mac Mini M4 devices and one MacBook Pro M4 Max, totaling around $5,000, can now run sophisticated AI models like Qwen 2.5 Coder-32B. This setup provides a cost-effective alternative to traditional GPU solutions, as a single Nvidia H100...

Nov 13, 2024

Alibaba’s AI coding assistant Qwen2.5-Coder-32B also runs locally on Macs

The rise of locally-run AI coding assistants marks a significant shift in how developers can access powerful language models for programming tasks, with Alibaba's new Qwen2.5-Coder series emerging as a notable player in this space. Key capabilities and specifications: Qwen2.5-Coder-32B-Instruct represents a breakthrough in open-source code models, claiming performance comparable to GPT-4o while maintaining a relatively modest size of 32B parameters. The model is Apache 2.0 licensed, making it freely available for both personal and commercial use. With a 32B parameter size, it can run on high-end consumer hardware like a 64GB MacBook Pro M2. The quantized version requires approximately...

Nov 12, 2024

Alibaba just released a free, open-source AI coding assistant and it’s very good

The release of Alibaba Cloud's Qwen2.5-Coder represents a significant advancement in AI-powered coding assistance, offering enterprise-grade capabilities through an open-source model that could reshape software development practices globally. Key innovation details: Qwen2.5-Coder has quickly become the second most popular demo on Hugging Face Spaces, with performance metrics rivaling GPT-4. The system includes six model variants ranging from 0.5 billion to 32 billion parameters, accommodating different computational resources. The flagship model achieved impressive scores: 92.7% on HumanEval, 90.2% on MBPP, and 31.4% accuracy on LiveCodeBench. Technical improvements stem from refined data processing, synthetic data generation, and balanced training datasets. Technical capabilities:...

Nov 11, 2024

DeepMind open sources its groundbreaking AlphaFold3 AI protein predictor

The release of AlphaFold3's source code marks a significant shift in how artificial intelligence tools are being shared within the scientific community, particularly for protein structure prediction and drug discovery research. Major development: Google DeepMind has made its AlphaFold3 protein structure prediction model available as open-source software for non-commercial applications, reversing its earlier restrictive approach. The announcement comes six months after DeepMind initially withheld the code from their scientific paper. John Jumper, AlphaFold team leader and recent Chemistry Nobel Prize winner, expressed enthusiasm about potential applications of the tool. The software allows scientists to model protein interactions with other molecules,...

Nov 10, 2024

OpenCoder is a new code-focused LLM that is truly open

The growing importance of code-focused Large Language Models (LLMs) has created a need for open-source alternatives that can match proprietary solutions while providing transparency for scientific research and development. Key Innovation: OpenCoder represents a significant advancement in open-source code LLMs by offering complete transparency in its development process and achieving performance levels comparable to leading proprietary models. The project makes available not just the model weights and inference code, but also the complete training data and processing pipelines. The release includes detailed experimental results and training protocols to enable reproducible research. This level of openness is unusual in the field,...

Nov 9, 2024

How Roboflow saved 74 years of developer time with Meta’s SAM model

Meta's Segment Anything Model (SAM) has transformed the landscape of image segmentation, dramatically reducing the time and effort required to create training data for AI models. This innovation has far-reaching implications across various industries and applications. Key developments in SAM technology: Meta released the first SAM model in 2023, enabling flexible interactive and automatic image segmentation. SAM 2, launched in July 2024, expanded capabilities to include real-time, promptable object segmentation for both images and videos. The open-source nature of SAM has fostered collaboration and continuous improvement, leading to significant advancements between versions. Quantifying the impact: Roboflow, a company leveraging SAM...

Nov 7, 2024

OpenHands is a new developer tool with zero set up and instant access to AI

Introducing OpenHands: A new frontier in AI-assisted coding: OpenHands, launched on Daytona's platform, offers developers instant access to autonomous AI coding capabilities without the need for complex setups or waitlists. The new tool combines cloud-based Visual Studio Code with advanced AI coding assistance, aiming to streamline the development process and increase productivity. OpenHands is positioned as a zero-setup solution, eliminating barriers to entry for developers interested in leveraging AI for coding tasks. The platform was launched in the categories of Open Source, Software Engineering, and Artificial Intelligence, indicating its broad applicability across various development domains. Key features and benefits: OpenHands...

Nov 6, 2024

The best open-source AI models you can use for free

The rise of open-source AI: The artificial intelligence landscape is experiencing a significant shift with the growing prominence of open-source and free-to-use AI models across various domains, including text, image, and audio processing. The Open Source Initiative (OSI) has introduced the Open Source AI Definition (OSAID) to establish clear criteria for truly open-source AI models, emphasizing full transparency in design and training data. Many popular AI models, such as Meta's LLaMA and Stability AI's Stable Diffusion, fall short of fully complying with OSAID standards due to licensing restrictions or lack of transparency in their development process. Diverse landscape of AI...

Nov 6, 2024

Closed AI models are gaining ground on open ones, prompting debate over future of innovation

The AI model landscape: A significant debate is emerging in the field of artificial intelligence regarding the merits and drawbacks of open versus closed AI systems, particularly as these technologies advance in their reasoning capabilities. Open AI models are defined as those with downloadable model weights, allowing insight into their inner workings, while closed systems are either unreleased or accessible only through APIs or hosted services. The debate is particularly relevant as AI technologies are developing the ability to engage in step-by-step reasoning processes with error correction, mimicking human thought patterns more closely than ever before. Key findings on model...

Nov 5, 2024

Microsoft’s ‘Magentic-One’ framework directs multiple AI agents to complete your tasks

A new multi-agent AI infrastructure: Microsoft researchers have introduced Magentic-One, an open-source framework designed to manage multiple AI agents working together to complete complex, multi-step tasks. Magentic-One is described as a generalist agentic system that aims to enhance productivity and transform daily life by enabling AI agents to solve tasks requiring multiple steps. The framework is available to researchers and developers for both research and commercial purposes under a custom Microsoft License. Alongside Magentic-One, Microsoft released AutoGenBench, an open-source agent evaluation tool built on their previously released AutoGen framework. System architecture and functionality: Magentic-One operates with an Orchestrator agent that...

Nov 5, 2024

The performance gap between open and closed AI models is shrinking

The AI landscape evolves: Open AI models are catching up to closed models in performance, with only a one-year lag, according to a new report by Epoch AI. Meta's Llama 3.1 405B, released in July, took about 16 months to match the capabilities of GPT-4's first version. The gap between open and closed models could shrink further if Meta releases its next-generation AI, Llama 4, as an open model. Researchers analyzed hundreds of notable models released since 2018, measuring performance on technical benchmarks and computing power used for training. Implications for policymakers: The narrowing gap between open and closed AI...

Nov 5, 2024

New research explores how to train AI agents with an ‘evolving online curriculum’

Breakthrough in training open-source AI web agents: WebRL, a novel self-evolving online curriculum reinforcement learning framework, has been developed to train high-performance web agents using open large language models (LLMs). Researchers from multiple institutions have created WebRL to address key challenges in building LLM web agents, including the scarcity of training tasks, sparse feedback signals, and policy distribution drift in online learning. The framework aims to bridge the gap between open-source and proprietary LLM-based web agents, potentially democratizing access to powerful autonomous web interaction systems. Key components of WebRL: The framework incorporates three main elements to enhance the capabilities of...
