Hugging Face overhauls file transfer system for faster AI model sharing

The Hugging Face Hub team is undertaking a significant redesign of their upload and download infrastructure to better handle the growing demands of machine learning model and dataset storage.

Current infrastructure overview: Hugging Face’s existing system uses Amazon S3 for storage in us-east-1 and AWS CloudFront as a content delivery network, but it struggles with large file transfers and offers limited room for optimization.

  • CloudFront’s 50GB file size limit forces large models like Meta-Llama-3-70B (131GB) to be split into smaller chunks (a splitting sketch follows this list)
  • The current setup lacks advanced deduplication and compression capabilities
  • Recent analysis revealed 8.2 million upload requests and 130.8 TB of data transferred from 88 countries in a single day
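
The practical consequence of that file size cap is that clients must shard large checkpoints before upload. A minimal sketch, assuming a simple fixed-size byte split (in practice the Hub shards models into multiple safetensors files rather than raw byte ranges):

```python
import os

# Hypothetical cap mirroring the article's 50GB CDN limit.
PART_SIZE = 50 * 1024**3

def plan_parts(path, part_size=PART_SIZE):
    """Yield (index, start, end) byte ranges that keep each part under the cap."""
    total = os.path.getsize(path)
    for i, start in enumerate(range(0, total, part_size)):
        yield i, start, min(start + part_size, total)

# A 131GB checkpoint plans out to three parts: 50GB + 50GB + a 31GB tail.
```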

Proposed architectural changes: A new content-addressed store (CAS) will serve as the primary point for content distribution, implementing a custom protocol focused on “dumb reads and smart writes.”

  • The read path emphasizes simplicity and speed, with requests routed through CAS for reconstruction information
  • The write path operates at the chunk level, speeding up uploads by transferring only chunks the store does not already hold (see the sketch after this list)
  • The system maintains S3 as backing storage while adding enhanced security and validation capabilities
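
To make “dumb reads and smart writes” concrete, here is a self-contained sketch of a chunk-level content-addressed store. `InMemoryCAS`, `smart_write`, and `dumb_read` are illustrative names, not Hugging Face APIs; the production system backs chunks with S3 and uses more sophisticated chunking:

```python
import hashlib

CHUNK_BYTES = 64 * 1024  # illustrative fixed chunk size; real chunking differs

class InMemoryCAS:
    """Toy stand-in for the content-addressed store (S3-backed in production)."""
    def __init__(self):
        self.blobs = {}  # chunk hash -> chunk bytes

    def missing(self, hashes):
        return [h for h in hashes if h not in self.blobs]

    def put(self, h, data):
        self.blobs[h] = data

    def get(self, h):
        return self.blobs[h]

def smart_write(path, cas, chunk_bytes=CHUNK_BYTES):
    """Write path: hash every chunk, then upload only chunks the CAS lacks."""
    order, chunks = [], {}
    with open(path, "rb") as f:
        while block := f.read(chunk_bytes):
            h = hashlib.sha256(block).hexdigest()
            order.append(h)
            chunks[h] = block
    for h in cas.missing(set(chunks)):
        cas.put(h, chunks[h])  # only new bytes cross the wire
    return order  # the reconstruction info the read path requests from CAS

def dumb_read(order, cas):
    """Read path: fetch chunks by hash and concatenate; no client-side logic."""
    return b"".join(cas.get(h) for h in order)
```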

Technical optimizations: The new architecture enables format-specific optimizations and improved efficiency.

  • Byte-level file management allows for format-specific compression techniques
  • Parquet file deduplication and Safetensors compression could cut upload times by 10-25% (a toy deduplication demonstration follows this list)
  • Enhanced telemetry provides detailed logging and audit trails for enterprise customers
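
The deduplication claim is easy to demonstrate with the sketch above: re-uploading a lightly edited file transfers only the chunks that changed. A toy run (the counts depend on the hypothetical 64 KiB chunk size; `random.randbytes` requires Python 3.9+):

```python
import os, random, tempfile

random.seed(0)
payload = bytearray(random.randbytes(1024 * 1024))  # 1 MiB -> 16 chunks

def write_temp(data):
    f = tempfile.NamedTemporaryFile(delete=False)
    f.write(data)
    f.close()
    return f.name

cas = InMemoryCAS()
v1 = write_temp(payload)
smart_write(v1, cas)
print(len(cas.blobs))                  # 16 chunks stored for revision 1

payload[0:4] = b"edit"                 # small in-place edit
v2 = write_temp(payload)
stored_before = len(cas.blobs)
smart_write(v2, cas)
print(len(cas.blobs) - stored_before)  # 1: only the modified chunk is new

os.remove(v1)
os.remove(v2)
```

Note that fixed-size chunking only deduplicates in-place edits; an insertion shifts every later chunk boundary, which is why production systems typically use content-defined chunking instead.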

Global deployment strategy: After careful analysis of traffic patterns, the team has designed a three-region deployment plan.

  • Primary regions: us-east-1 (Americas), eu-west-3 (Europe/Middle East/Africa), and ap-southeast-1 (Asia/Oceania); an illustrative routing table follows this list
  • Resource allocation: 4 nodes each in US and Europe, 2 nodes in Asia
  • The top 7 countries account for 80% of uploaded bytes, while the top 20 contribute 95%
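
The routing sketch below encodes the announced three-region plan; the continent mapping is an assumption added for clarity, and a real deployment would steer clients with GeoDNS or anycast rather than an explicit lookup:

```python
# Announced regions and node counts, per the deployment plan above.
REGIONS = {
    "us-east-1":      {"nodes": 4, "serves": "Americas"},
    "eu-west-3":      {"nodes": 4, "serves": "Europe/Middle East/Africa"},
    "ap-southeast-1": {"nodes": 2, "serves": "Asia/Oceania"},
}

# Hypothetical continent-code mapping; not part of the announcement.
CONTINENT_TO_REGION = {
    "NA": "us-east-1", "SA": "us-east-1",
    "EU": "eu-west-3", "AF": "eu-west-3",
    "AS": "ap-southeast-1", "OC": "ap-southeast-1",
}

def route(continent_code: str) -> str:
    """Pick a CAS region for a client, falling back to the primary region."""
    return CONTINENT_TO_REGION.get(continent_code, "us-east-1")

print(route("EU"))  # eu-west-3
```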

Implementation timeline: The rollout will proceed gradually throughout 2024.

  • Initial deployment begins with a single CAS in us-east-1
  • Internal repository migration will serve as a benchmark for transfer performance
  • Additional points of presence will be added based on performance testing results

Future implications: This infrastructure overhaul positions Hugging Face to gain unique insights into global AI development trends and patterns.

  • The platform hosts one of the largest collections of open-source machine learning data
  • Future analysis could reveal geographic trends in different AI modalities
  • An expected 12% reduction in raw transfer bandwidth for some users should be offset by deduplication and compression gains

Strategic considerations: While the new architecture introduces some initial latency for certain users, the benefits of enhanced security, optimization capabilities, and scalability make this a calculated trade-off that positions Hugging Face for future growth in the rapidly evolving AI infrastructure landscape.

Source: Rearchitecting Hugging Face Uploads and Downloads (Hugging Face blog)
