Gradient and Crusoe collaborate to create open-source LLM with 1 million token context window, potentially reshuffling the AI landscape and unlocking new applications.
Key takeaways: Gradient and Crusoe have extended the context window of Llama-3 models to 1 million tokens, a significant milestone in the race to create open-source models with long context windows:
- Most LLMs with very long context windows, such as Anthropic Claude, OpenAI GPT-4, and Google Gemini, are private models.
- Open-source models with long context windows could reshuffle the LLM market and enable applications not possible with private models.
Enterprise need for open models: Gradient works with enterprise customers who require LLMs integrated into their workflows but face context limitations and data privacy restrictions:
- Extending coding copilots from line completion to generating entire code modules requires models that can reference a full codebase, which is impractical with limited context windows.
- Many companies have restrictions on sending data to third parties, making private models like Gemini or Claude unsuitable.
Leveraging open research: Gradient relied heavily on open research from universities and institutes worldwide to develop their long-context models:
- They used Meta’s open model Llama 3 as the base, along with techniques from Berkeley AI Research, code from a Singapore research institute, and mathematical formulas from a Shanghai AI lab.
- Evaluation benchmarks from Nvidia helped track the performance of their models compared to other long-context LLMs.
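One family of techniques behind this kind of context extension is scaling the base frequency ("theta") of rotary position embeddings (RoPE) so that far longer sequences map onto the rotation range the model was trained on. The sketch below shows the NTK-aware scaling formula from the open literature; the scale factor and the claim that Gradient used exactly this formula are assumptions for illustration, though the Llama-3 defaults (`rope_theta=500000`, `head_dim=128`) are the published ones.

```python
# Sketch: NTK-aware scaling of the RoPE base frequency, one published
# technique for extending an LLM's context window. Illustrative only;
# not a statement of Gradient's exact recipe.

def scaled_rope_theta(base_theta: float, scale: float, head_dim: int) -> float:
    """Enlarge the RoPE base so positions up to `scale` times the original
    context length stay within the rotation range seen during pretraining."""
    return base_theta * scale ** (head_dim / (head_dim - 2))

# Llama-3 ships with base_theta=500000 and head_dim=128. Taking an
# 8k-token window to 1M tokens corresponds to a scale factor of 125.
theta_1m = scaled_rope_theta(500_000.0, 125.0, 128)
print(f"{theta_1m:,.0f}")
```

The enlarged theta slows the rotation of each position embedding, so a million positions span roughly the same angular range that a few thousand did originally, which is why the model can be fine-tuned on long sequences rather than retrained from scratch.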
Addressing compute challenges: Compute resources are a major bottleneck in LLM research, but Crusoe’s purpose-built AI cloud helped Gradient build and explore models cost-efficiently:
- Crusoe provided a customized Nvidia L40S cluster, optimized for Gradient’s specific needs, considerably reducing the cost of training the models.
- Close collaboration and open communication between Crusoe and Gradient enabled tailored compute offerings that would be difficult to arrange with other cloud providers.
Evaluating the models: Gradient used various benchmarks to assess the performance of their long-context models:
- The “needle in a haystack” test showed near-perfect retrieval up to roughly 2 million tokens of context, comparable to Google’s Gemini 1.5 Pro.
- More advanced measures, such as multiple needles or adversarial needles, were also considered.
- The models were evaluated on Nvidia’s RULER benchmark, which includes 13 tasks for evaluating long-context LLMs.
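The "needle in a haystack" test mentioned above is simple to reproduce: bury a fact at a chosen depth inside filler text and ask the model to retrieve it. Below is a minimal sketch of the harness; `query_llm` is a hypothetical placeholder for whatever chat-completion call is being evaluated, not a real API.

```python
# Sketch: a minimal "needle in a haystack" probe for long-context LLMs.
# A secret fact (the needle) is buried at a fractional depth inside
# filler text (the haystack), and the model is asked to find it.

def build_haystack(context_words: int, depth: float, needle: str) -> str:
    """Place `needle` at fractional `depth` (0.0 = start, 1.0 = end)
    inside roughly `context_words` words of repetitive filler."""
    filler = ["the quick brown fox jumps over the lazy dog"] * (context_words // 9)
    words = " ".join(filler).split()
    pos = int(len(words) * depth)
    return " ".join(words[:pos] + [needle] + words[pos:])

def score_retrieval(answer: str, secret: str) -> bool:
    """A pass is simply the secret appearing verbatim in the answer."""
    return secret in answer

prompt = build_haystack(900, depth=0.5,
                        needle="The secret passphrase is MOONLIGHT.")
# answer = query_llm(prompt + "\nWhat is the secret passphrase?")  # hypothetical
# passed = score_retrieval(answer, "MOONLIGHT")
```

The multi-needle and adversarial variants mentioned above follow the same pattern: several needles are inserted at different depths, or distractor sentences that resemble the needle are added to the filler.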
Potential enterprise applications: Long-context open models could make it easier for companies and developers to build LLM-based applications:
- Agentic systems can do more with fewer calls by processing more information with each request.
- Complex data processing pipelines for tasks like style transfer could be simplified.
- The need for retrieval-augmented generation (RAG) could be reduced.
- Prototyping and demonstrating the possibilities of LLMs to enterprises becomes more accessible.
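The reduced need for RAG can be sketched concretely: with a 1-million-token window, a pipeline can skip retrieval entirely whenever the full corpus fits in one prompt. The token estimate below uses a crude four-characters-per-token heuristic rather than a real tokenizer, and `retrieve_top_k` is a hypothetical stand-in for any RAG retriever.

```python
# Sketch: choosing between stuffing the whole corpus into a long-context
# prompt and falling back to retrieval. Assumes a 1M-token window, per
# the extended Llama-3 models described above.

CONTEXT_LIMIT = 1_000_000  # tokens

def rough_token_count(text: str) -> int:
    return len(text) // 4  # crude heuristic, not a real tokenizer

def build_prompt(question: str, documents: list[str]) -> str:
    corpus = "\n\n".join(documents)
    if rough_token_count(corpus) + rough_token_count(question) <= CONTEXT_LIMIT:
        # Everything fits: no chunking, no vector store, no retrieval step.
        return corpus + "\n\nQuestion: " + question
    # Oversized corpora would still need retrieval-augmented generation:
    # top_docs = retrieve_top_k(question, documents, k=5)  # hypothetical
    raise NotImplementedError("corpus exceeds the context window")

prompt = build_prompt("What does module A export?",
                      ["# module A source...", "# module B source..."])
```

Collapsing the retrieval step this way is what simplifies the data-processing pipelines and agentic systems mentioned above: each model call sees the complete context instead of a retriever's guess at the relevant slice.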
Analyzing deeper: While the creation of open-source, long-context LLMs is a significant milestone, it remains to be seen how they will compare to private models in terms of performance, safety, and scalability. Additionally, the compute resources required to train and deploy these models at scale may still be a barrier for many organizations. Nonetheless, the collaboration between Gradient and Crusoe showcases the potential for open research and purpose-built AI infrastructure to drive innovation in the rapidly evolving field of large language models.