Diffbot’s new AI model aims to improve AI accuracy with its trillion-fact knowledge graph

Silicon Valley company Diffbot has released a new AI model that combines Meta’s Llama 3.3 with its trillion-fact Knowledge Graph to improve factual accuracy in AI responses.

The innovation: Diffbot’s new AI model introduces Graph Retrieval-Augmented Generation (GraphRAG), which queries a constantly updated knowledge database instead of relying solely on pre-trained data.

  • The system leverages Diffbot’s Knowledge Graph, an automatically built database populated by crawling the web since 2016
  • The Knowledge Graph refreshes every 4-5 days with millions of new facts
  • The model can search for real-time information and cite original sources when responding to queries
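
The retrieve-then-cite flow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Diffbot's implementation: the `Fact` type, the `TOY_GRAPH` list, the naive subject matching, and the example.com source URLs are all hypothetical stand-ins for a real knowledge graph query.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    source_url: str  # provenance, so an answer can cite where the fact came from


# Hypothetical toy graph with placeholder URLs; a real graph would be queried remotely.
TOY_GRAPH = [
    Fact("Diffbot", "headquartered_in", "Silicon Valley", "https://example.com/a"),
    Fact("Diffbot", "ceo", "Mike Tung", "https://example.com/b"),
    Fact("Llama 3.3", "developed_by", "Meta", "https://example.com/c"),
]


def retrieve(question: str, graph: List[Fact]) -> List[Fact]:
    """Return facts whose subject is mentioned in the question (naive entity match)."""
    q = question.lower()
    return [f for f in graph if f.subject.lower() in q]


def build_prompt(question: str, facts: List[Fact]) -> str:
    """Place the retrieved facts, numbered for citation, ahead of the question."""
    lines = [
        f"[{i}] {f.subject} {f.predicate} {f.obj} (source: {f.source_url})"
        for i, f in enumerate(facts, start=1)
    ]
    context = "\n".join(lines) if lines else "(no facts found)"
    return (
        f"Facts:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the facts above and cite them by number."
    )


question = "Who is the CEO of Diffbot?"
prompt = build_prompt(question, retrieve(question, TOY_GRAPH))
```

The resulting `prompt` would then be passed to a language model, which answers from the numbered facts rather than from whatever it memorized during training.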

Technical implementation: The model represents a significant departure from traditional large language models by separating reasoning capabilities from knowledge storage.

  • CEO Mike Tung believes general purpose reasoning can be distilled to about 1 billion parameters
  • The system prioritizes tool usage and external knowledge querying over storing information within the model
  • The approach allows for real-time fact verification and updates, unlike static training data used in conventional AI models
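
To illustrate why separating reasoning from knowledge storage makes updates cheap, here is a minimal sketch. Everything in it is hypothetical: `knowledge_store`, `lookup_tool`, `answer`, and the "ExampleCorp" record are invented for the example, and the `answer` function is only a stand-in for a model that chooses to call a tool.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical external store: the knowledge lives here, not in model weights.
knowledge_store: Dict[Tuple[str, str], str] = {
    ("ExampleCorp", "ceo"): "Alice",
}


def lookup_tool(entity: str, attribute: str) -> Optional[str]:
    """Tool the reasoning component calls instead of recalling a memorized fact."""
    return knowledge_store.get((entity, attribute))


def answer(entity: str, attribute: str,
           tool: Callable[[str, str], Optional[str]]) -> str:
    """Stand-in for the reasoning model: it delegates the factual lookup to a tool."""
    value = tool(entity, attribute)
    if value is None:
        return f"I could not find the {attribute} of {entity}."
    return f"The {attribute} of {entity} is {value}."


before = answer("ExampleCorp", "ceo", lookup_tool)

# Updating the store changes the answer immediately; nothing is retrained.
knowledge_store[("ExampleCorp", "ceo")] = "Bob"
after = answer("ExampleCorp", "ceo", lookup_tool)
```

Here `before` and `after` differ even though `answer` itself never changed, which is the contrast the article draws with static training data.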

Performance metrics: Diffbot’s model has posted strong results on public benchmarks.

  • Achieved 81% accuracy on Google’s FreshQA benchmark for real-time factual knowledge, outperforming ChatGPT and Gemini
  • Scored 70.36% on MMLU-Pro, a challenging test of academic knowledge
  • Real-world applications include data services for major companies like Cisco, DuckDuckGo, and Snapchat

Technical specifications and accessibility: The model is being released as open-source software with flexible deployment options.

  • Available immediately through GitHub with a public demo at diffy.chat
  • 8 billion parameter version runs on a single Nvidia A100 GPU
  • Full 70 billion parameter version requires two H100 GPUs
  • Organizations can run the model locally, addressing data privacy concerns
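
A rough back-of-the-envelope check on the GPU requirements above. The article does not state the numeric precision or the GPU memory variants, so this sketch assumes 16-bit weights (2 bytes per parameter) and 80 GB of memory per GPU, and it counts weights only, ignoring activations and the attention cache.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the weights alone, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


GPU_MEMORY_GB = 80  # assumption: 80 GB per GPU

small = weight_memory_gb(8e9)   # 8 billion parameters at 2 bytes each
large = weight_memory_gb(70e9)  # 70 billion parameters at 2 bytes each

small_fits_one_gpu = small <= GPU_MEMORY_GB
large_fits_one_gpu = large <= GPU_MEMORY_GB
large_fits_two_gpus = large <= 2 * GPU_MEMORY_GB
```

Under these assumptions the 8 billion parameter weights take about 16 GB and fit on one GPU, while the 70 billion parameter weights take about 140 GB, too much for a single 80 GB card but within the 160 GB of two, which is consistent with the stated requirements.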

Market implications: Diffbot’s approach challenges the industry’s focus on increasingly larger AI models.

  • Addresses growing concerns about AI hallucinations and false information generation
  • Offers enterprises a more accurate and auditable solution for sensitive data handling
  • Provides an alternative to the “bigger is better” paradigm in AI development

Looking beyond size: The sustainable development of AI technology may depend more on efficient knowledge organization and access than on expanding model parameters.

  • The approach emphasizes data provenance and the ability to modify stored knowledge
  • Facts can be updated and verified in real-time through the Knowledge Graph
  • This methodology could influence how future AI systems are designed and deployed

Future implications: While Diffbot’s innovation presents a promising direction for AI development, its ability to reshape industry practices will depend on widespread adoption and real-world performance at scale.

Diffbot’s AI model doesn’t guess—it knows, thanks to a trillion-fact knowledge graph
