The Technology Innovation Institute (TII) in Abu Dhabi has unveiled Falcon3, a new family of open-source large language models designed to advance AI capabilities while maintaining accessibility and efficiency.
The big picture: The Falcon3 release introduces five base models ranging from 1 billion to 10 billion parameters, with a particular focus on enhancing performance in science, mathematics, and coding applications.
- The family includes Falcon3-1B-Base, Falcon3-3B-Base, Falcon3-Mamba-7B-Base, Falcon3-7B-Base, and Falcon3-10B-Base
- All models are released under the Falcon LLM license to promote AI accessibility and collaboration
- The models support context lengths up to 32,000 tokens (except for the 1B model, which supports 8,000 tokens)
Technical innovations: TII applied several techniques to optimize model performance and training efficiency.
- The 7B model was pretrained in a single large-scale run on 1,024 H100 GPUs, processing 14 trillion tokens of diverse data
- The 10B model was created through depth up-scaling of the 7B model, achieving state-of-the-art performance in its category
- Smaller models (1B and 3B) were developed using pruning and knowledge distillation techniques with fewer than 100 billion tokens (100GT) of curated data
- The Falcon3-Mamba-7B-Base model received additional training on 1.5 trillion tokens of high-quality data
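Knowledge distillation, mentioned above for the 1B and 3B models, trains a small student model to match a larger teacher's temperature-softened output distribution. The sketch below is a minimal NumPy illustration of the standard distillation loss (the classic Hinton-style KL formulation); it is an assumption for illustration, not TII's published training recipe.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    so gradients keep a comparable magnitude across temperatures."""
    p_teacher = softmax(teacher_logits, temperature)
    log_p_student = np.log(softmax(student_logits, temperature))
    kl = (p_teacher * (np.log(p_teacher) - log_p_student)).sum(axis=-1)
    return (temperature ** 2) * kl.mean()

# Toy check: a student that matches the teacher incurs zero loss,
# while a diverging student incurs a positive loss.
teacher = np.array([[2.0, 1.0, 0.1]])
student_far = np.array([[0.1, 1.0, 2.0]])
loss_match = distillation_loss(teacher, teacher)
loss_far = distillation_loss(student_far, teacher)
```

In practice this KL term is typically mixed with the ordinary cross-entropy loss on the hard labels, with the mixing weight tuned per run.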
Performance benchmarks: The models post strong results across a range of evaluation suites.
- Falcon3-10B-Base achieves 22.9 on MATH-Lvl5 and 83.0 on GSM8K for mathematical reasoning
- The models show strong coding abilities, with Falcon3-10B-Base scoring 73.8 on MBPP
- Scientific knowledge capabilities are evident in MMLU benchmarks, with Falcon3-10B-Base scoring 73.1 on MMLU and 42.5 on MMLU-PRO
- The Instruct versions of the models show particularly strong performance in reasoning and common-sense understanding
Ecosystem integration: The models are designed for broad compatibility and practical implementation.
- All transformer-based Falcon3 models are compatible with the Llama architecture
- Multiple variants are available including Instruct, GGUF, GPTQ-Int4, GPTQ-Int8, AWQ, and 1.58-bit versions
- The models integrate with popular frameworks and tools in the AI ecosystem
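The 1.58-bit variants listed above refer to ternary weight quantization, where each weight is stored as one of {-1, 0, +1} (log2(3) ≈ 1.58 bits) plus a shared scale. The sketch below shows the "absmean" ternary scheme popularized by BitNet b1.58 as an illustration; TII's exact quantization recipe is not detailed in this announcement.

```python
import numpy as np

def ternary_quantize(w):
    """Quantize a weight tensor to {-1, 0, +1} with a per-tensor scale.
    Illustrative absmean scheme: scale by the mean absolute weight,
    then round and clip into the ternary set."""
    scale = np.abs(w).mean()
    q = np.clip(np.round(w / (scale + 1e-8)), -1, 1)
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate full-precision weights."""
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))
q, s = ternary_quantize(w)
w_hat = dequantize(q, s)
```

Each weight then needs only ~1.58 bits of storage instead of 16 or 32, which is what makes these variants attractive for memory-constrained inference.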
Future developments: TII’s roadmap signals continued innovation in the AI space.
- Multi-modal capabilities including image, video, and audio support are planned for release in January 2025
- A comprehensive technical report detailing methodologies will accompany future releases
- The team welcomes community feedback and collaboration for ongoing refinement
Looking ahead: While Falcon3 represents a significant advancement in open-source language models, its true impact may be measured by how effectively it democratizes access to advanced AI capabilities while maintaining competitive performance against larger, more resource-intensive models.