Silicon Valley company Diffbot has released a new AI model that combines Meta’s Llama 3.3 with its trillion-fact Knowledge Graph to improve factual accuracy in AI responses.
The innovation: Diffbot’s new AI model uses Graph Retrieval-Augmented Generation (GraphRAG), querying a constantly updated knowledge database instead of relying solely on pre-trained data.
- The system leverages Diffbot’s Knowledge Graph, an automated database that has been crawling the web since 2016
- The Knowledge Graph refreshes every 4-5 days with millions of new facts
- The model can search for real-time information and cite original sources when responding to queries
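To make the retrieval step concrete, here is a minimal sketch of the general GraphRAG idea: look up facts for the entities a question mentions, then hand those facts and their sources to the model alongside the question. It is not Diffbot's implementation. The `Fact` class, the `retrieve` and `build_prompt` helpers, the predicate names, and the example.com URLs are all hypothetical, and a small in-memory list stands in for the Knowledge Graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    source: str  # where the fact was found (placeholder URLs below)


# Toy stand-in for a knowledge graph (assumption: the real one is a remote service).
GRAPH = [
    Fact("Diffbot", "based_in", "Silicon Valley", "https://example.com/a"),
    Fact("Diffbot", "ceo", "Mike Tung", "https://example.com/b"),
    Fact("Llama 3.3", "developed_by", "Meta", "https://example.com/c"),
]


def retrieve(question: str, graph: list[Fact]) -> list[Fact]:
    """Return facts whose subject is mentioned in the question."""
    q = question.lower()
    return [f for f in graph if f.subject.lower() in q]


def build_prompt(question: str, facts: list[Fact]) -> str:
    """Place retrieved facts, numbered with their sources, ahead of the question."""
    lines = [
        f"[{i}] {f.subject} {f.predicate} {f.obj} (source: {f.source})"
        for i, f in enumerate(facts, start=1)
    ]
    return "Facts:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


question = "Who is the CEO of Diffbot?"
facts = retrieve(question, GRAPH)
prompt = build_prompt(question, facts)
print(prompt)
```

Because each fact carries its source, a model answering from this prompt can cite the numbered entry it relied on, which is what makes the answer auditable.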
Technical implementation: The model departs from traditional large language models by separating reasoning capabilities from knowledge storage.
- CEO Mike Tung believes general purpose reasoning can be distilled to about 1 billion parameters
- The system prioritizes tool usage and external knowledge querying over storing information within the model
- The approach allows for real-time fact verification and updates, unlike static training data used in conventional AI models
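The separation described above can be illustrated with a small tool-calling loop. This is a sketch under stated assumptions, not the model's actual interface: `stub_model` stands in for the language model, `lookup` is a hypothetical tool, and the "daily" value written to the store is a made-up update used only to show the effect.

```python
import json

# Mutable external store (assumption: contents taken from this article).
KNOWLEDGE = {("Knowledge Graph", "refresh_interval"): "4-5 days"}


def lookup(entity: str, attribute: str) -> str:
    """Tool: read one value from the external store."""
    return KNOWLEDGE.get((entity, attribute), "unknown")


TOOLS = {"lookup": lookup}


def stub_model(question: str) -> str:
    """Stand-in for the LLM: always requests the same lookup.

    A real model would choose the tool and its arguments from the question.
    """
    return json.dumps({
        "tool": "lookup",
        "args": {"entity": "Knowledge Graph", "attribute": "refresh_interval"},
    })


def answer(question: str) -> str:
    """Run the requested tool call and format its result."""
    call = json.loads(stub_model(question))
    result = TOOLS[call["tool"]](**call["args"])
    return f"{call['args']['entity']} {call['args']['attribute']}: {result}"


question = "How often does the Knowledge Graph refresh?"
before = answer(question)
# Made-up update for illustration only; nothing about the "model" changes.
KNOWLEDGE[("Knowledge Graph", "refresh_interval")] = "daily"
after = answer(question)
print(before)
print(after)
```

The answer changes because the store changed, while `stub_model` stayed the same. That is the contrast with a model whose knowledge is fixed at training time.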
Performance metrics: Diffbot’s solution has posted strong results on standard industry benchmarks.
- Achieved 81% accuracy on Google’s FreshQA benchmark for real-time factual knowledge, outperforming ChatGPT and Gemini
- Scored 70.36% on MMLU-Pro, a challenging test of academic knowledge
- Beyond benchmarks, Diffbot already provides data services to major companies such as Cisco, DuckDuckGo, and Snapchat
Technical specifications and accessibility: The model is being released as open-source software with flexible deployment options.
- Available immediately through GitHub with a public demo at diffy.chat
- The 8-billion-parameter version runs on a single Nvidia A100 GPU
- The full 70-billion-parameter version requires two H100 GPUs
- Organizations can run the model locally, addressing data privacy concerns
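A rough back-of-envelope check shows why those GPU counts are plausible. The sketch below assumes 16-bit weights (2 bytes per parameter) and 80 GB H100 cards. Neither figure comes from Diffbot, and the estimate covers weights only, ignoring activations and the attention cache. The `weight_memory_gb` helper is hypothetical.

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_billion * bytes_per_param


H100_GB = 80  # assumption: 80 GB H100 variant

small = weight_memory_gb(8)   # 8B parameters at 16-bit precision
large = weight_memory_gb(70)  # 70B parameters at 16-bit precision

print(f"8B weights:  {small:.0f} GB")
print(f"70B weights: {large:.0f} GB")
print("70B fits on one H100: ", large <= H100_GB)
print("70B fits on two H100s:", large <= 2 * H100_GB)
```

Under these assumptions the 8B weights take about 16 GB, well within a single A100 (sold in 40 GB and 80 GB variants), while the 70B weights take about 140 GB, more than one 80 GB card holds but less than two.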
Market implications: Diffbot’s approach challenges the industry’s focus on increasingly larger AI models.
- Addresses growing concerns about AI hallucinations and false information generation
- Offers enterprises a more accurate and auditable option for handling sensitive data
- Provides an alternative to the “bigger is better” paradigm in AI development
Looking beyond size: The sustainable development of AI technology may depend more on efficient knowledge organization and access than on expanding model parameters.
- The approach emphasizes data provenance and knowledge modification capabilities
- Facts can be updated and verified in real-time through the Knowledge Graph
- This methodology could influence how future AI systems are designed and deployed
Future implications: While Diffbot’s innovation presents a promising direction for AI development, its ability to reshape industry practices will depend on widespread adoption and real-world performance at scale.
Diffbot’s AI model doesn’t guess—it knows, thanks to a trillion-fact knowledge graph