Hugging Face overhauls file transfer system for faster AI model sharing

The Hugging Face Hub team is undertaking a significant redesign of their upload and download infrastructure to better handle the growing demands of machine learning model and dataset storage.

Current infrastructure overview: Hugging Face’s existing system utilizes Amazon S3 for storage in us-east-1 and AWS CloudFront as a content delivery network, but faces limitations with large file transfers and optimization capabilities.

  • CloudFront’s 50GB file size limit forces large models like Meta-Llama-3-70B (131GB) to be split into smaller chunks
  • The current setup lacks advanced deduplication and compression capabilities
  • Recent analysis revealed 8.2 million upload requests and 130.8 TB of data transferred from 88 countries in a single day
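The 50GB ceiling explains why large checkpoints arrive as multiple shards. A minimal sketch of the arithmetic (the shard-count helper is illustrative, not Hugging Face's actual splitting logic):

```python
import math

CLOUDFRONT_LIMIT = 50 * 10**9  # CloudFront's per-object size ceiling, in bytes

def shard_count(total_bytes: int, limit: int = CLOUDFRONT_LIMIT) -> int:
    """Minimum number of pieces a file must be split into to stay under the limit."""
    return math.ceil(total_bytes / limit)

# Meta-Llama-3-70B weighs in around 131 GB, so it cannot ship as one object:
print(shard_count(131 * 10**9))  # -> 3
```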

Proposed architectural changes: A new content-addressed store (CAS) will serve as the primary point for content distribution, implementing a custom protocol focused on “dumb reads and smart writes.”

  • The read path emphasizes simplicity and speed, with requests routed through CAS for reconstruction information
  • The write path operates at the chunk level, optimizing upload speeds by transferring only necessary new data
  • The system maintains S3 as backing storage while adding enhanced security and validation capabilities
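The "dumb reads and smart writes" idea can be sketched as a toy content-addressed store: writes hash each chunk and transfer only chunks the store has never seen, while reads simply fetch chunks by address and concatenate them. This is a simplified illustration (fixed-size chunks, an in-memory dict standing in for S3-backed storage), not the actual CAS protocol:

```python
import hashlib

def chunk(data: bytes, size: int = 64 * 1024) -> list[bytes]:
    # Fixed-size chunking for illustration; a production CAS would likely use
    # content-defined boundaries so a small edit doesn't shift every chunk.
    return [data[i:i + size] for i in range(0, len(data), size)]

def upload(data: bytes, store: dict[str, bytes]) -> list[str]:
    """Write path ("smart writes"): send only chunks the store lacks."""
    manifest = []
    for c in chunk(data):
        digest = hashlib.sha256(c).hexdigest()  # the chunk's content address
        if digest not in store:                 # dedup: skip already-stored chunks
            store[digest] = c
        manifest.append(digest)
    return manifest  # reconstruction information for the read path

def download(manifest: list[str], store: dict[str, bytes]) -> bytes:
    """Read path ("dumb reads"): fetch each chunk by address and join."""
    return b"".join(store[d] for d in manifest)
```

Re-uploading identical data transfers nothing new: the second `upload` call finds every chunk already present and only returns the manifest.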

Technical optimizations: The new architecture enables format-specific optimizations and improved efficiency.

  • Byte-level file management allows for format-specific compression techniques
  • Parquet file deduplication and Safetensors compression could improve upload speeds by 10-25%
  • Enhanced telemetry provides detailed logging and audit trails for enterprise customers

Global deployment strategy: After careful analysis of traffic patterns, the team has designed a three-region deployment plan.

  • Primary regions: us-east-1 (Americas), eu-west-3 (Europe/Middle East/Africa), and ap-southeast-1 (Asia/Oceania)
  • Resource allocation: 4 nodes each in US and Europe, 2 nodes in Asia
  • The top 7 countries account for 80% of uploaded bytes, while the top 20 contribute 95%
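A three-region layout like this typically pairs with a routing step that maps each client to its nearest point of presence. A hypothetical sketch (the country-to-region table and default are illustrative assumptions, not Hugging Face's actual routing policy):

```python
# Illustrative routing table for the three planned regions.
REGION_ENDPOINTS = {
    "americas": "us-east-1",
    "emea": "eu-west-3",        # Europe / Middle East / Africa
    "apac": "ap-southeast-1",   # Asia / Oceania
}

COUNTRY_TO_REGION = {  # a few illustrative entries only
    "US": "americas", "BR": "americas",
    "FR": "emea", "DE": "emea", "ZA": "emea",
    "SG": "apac", "JP": "apac", "AU": "apac",
}

def pick_endpoint(country_code: str) -> str:
    """Route a client to its region's CAS, falling back to us-east-1."""
    region = COUNTRY_TO_REGION.get(country_code, "americas")
    return REGION_ENDPOINTS[region]
```

With 80% of uploaded bytes coming from just 7 countries, even a small static table like this would cover most traffic.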

Implementation timeline: The rollout will proceed gradually throughout 2024.

  • Initial deployment begins with a single CAS in us-east-1
  • Internal repository migration will serve as a benchmark for transfer performance
  • Additional points of presence will be added based on performance testing results

Future implications: This infrastructure overhaul positions Hugging Face to gain unique insights into global AI development trends and patterns.

  • The platform hosts one of the largest collections of open-source machine learning data
  • Future analysis could reveal geographic trends in different AI modalities
  • An estimated 12% reduction in raw transfer bandwidth from the added CAS hop is expected to be offset by system optimizations

Strategic considerations: While the new architecture introduces some initial latency for certain users, the benefits of enhanced security, optimization capabilities, and scalability make this a calculated trade-off that positions Hugging Face for future growth in the rapidly evolving AI infrastructure landscape.

Source: Rearchitecting Hugging Face Uploads and Downloads (Hugging Face blog)
