Why bigger isn't always better when it comes to AI models

The efficiency revolution in AI: Researchers are challenging the “bigger is better” paradigm in artificial intelligence, demonstrating that smaller models can achieve impressive results with far fewer resources and a reduced carbon footprint.

The Allen Institute for Artificial Intelligence (Ai2) has developed Molmo, a family of open-source multimodal large language models that reportedly outperform competitors with many times more parameters.
Ai2’s largest Molmo model, with 72 billion parameters, reportedly outperforms OpenAI’s GPT-4o (estimated to have over a trillion parameters) in tests measuring image, chart, and document understanding.
A smaller Molmo model, with just 7 billion parameters, is said to approach the performance of state-of-the-art models, thanks to more efficient data collection and training methods.

Challenging the status quo: The “scale is all you need” mindset, which has dominated AI research since 2017, is being questioned as researchers explore more efficient approaches to model development.

This prevailing mindset has led to a race among tech companies to create ever-larger models, often at the expense of accessibility and environmental considerations.
Sasha Luccioni, AI and climate lead at Hugging Face, criticizes the current trend, stating that today’s AI models are “way too big” for practical use by average individuals.
The focus on scale has produced models that are too large for most people to download or tinker with, even when they are open source.
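
To make the size problem concrete, here is a rough back-of-the-envelope sketch of how parameter count translates into download size. It assumes each parameter is stored as a 16-bit number (2 bytes), a common but not universal choice; the function name `weight_size_gb` is hypothetical, and the real file sizes of any particular model depend on the precision and format its publisher uses.

```python
def weight_size_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of a model's weights in gigabytes (1 GB = 1e9 bytes).

    Assumption: every parameter takes `bytes_per_param` bytes on disk
    (2 bytes corresponds to 16-bit storage). Real checkpoints may differ.
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts taken from the article's Molmo figures.
small_model_gb = weight_size_gb(7e9)    # 7 billion parameters
large_model_gb = weight_size_gb(72e9)   # 72 billion parameters

print(f"7B parameters:  about {small_model_gb:.0f} GB of weights")
print(f"72B parameters: about {large_model_gb:.0f} GB of weights")
```

Under that assumption, the 7-billion-parameter model comes to roughly 14 GB of weights and the 72-billion-parameter model to roughly 144 GB, which is more storage and memory than most personal computers can spare for a single model.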

Environmental and ethical concerns: The push for larger AI models has led to significant environmental and ethical challenges that researchers are now trying to address.

Larger models have a substantially bigger carbon footprint because they need more energy to run.
Scaling up often involves invasive data-gathering practices and can lead to the inclusion of problematic content, such as child sexual abuse material, in training datasets.
The extreme concentration of power in the hands of a few tech giants is another consequence of the focus on scale, as only elite researchers with substantial resources can build and operate such large models.

Rethinking AI development: Researchers are advocating for a shift in focus towards more efficient and targeted AI models that prioritize specific tasks and important metrics.

Ani Kembhavi, senior director of research at Ai2, emphasizes the need to “think completely out of the box” to find better ways to train models.
The Molmo project aims to prove that open models can be as powerful as closed, proprietary ones, while remaining accessible and cost-effective to train.
Luccioni suggests that instead of using large, general-purpose models, developers should prioritize models designed for specific tasks, focusing on factors such as accuracy, privacy, and data quality.

Transparency and accountability: The AI community is calling for greater transparency in model development and evaluation to promote more responsible and efficient practices.

Current AI research often lacks a clear understanding of how models achieve their results or what data they are trained on.
There is a growing need for tech companies to be more mindful and transparent about their model development processes.
Shifting incentives in the research community could encourage a focus on doing more with less, rather than simply scaling up existing approaches.

Broader implications: The push for smaller, more efficient AI models could lead to significant changes in the AI landscape, potentially democratizing access to advanced AI capabilities.

Reducing the resources required to develop powerful AI models could allow smaller research teams and organizations to contribute more significantly to the field.
More efficient models could also reduce the environmental impact of AI development and deployment, aligning with broader sustainability goals.
As the AI community reconsiders its approach to model development, we may see a shift towards more specialized, task-specific AI solutions that better address real-world needs while minimizing resource consumption.
