Hugging Face overhauls file transfer system for faster AI model sharing

The Hugging Face Hub team is undertaking a significant redesign of their upload and download infrastructure to better handle the growing demands of machine learning model and dataset storage.

Current infrastructure overview: Hugging Face’s existing system uses Amazon S3 for storage in us-east-1 and AWS CloudFront as a content delivery network, but it struggles with large file transfers and offers limited room for optimization.

  • CloudFront’s 50GB file size limit forces large models like Meta-Llama-3-70B (131GB) to be split into smaller chunks (a splitting sketch follows this list)
  • The current setup lacks advanced deduplication and compression capabilities
  • Recent analysis revealed 8.2 million upload requests and 130.8 TB of data transferred from 88 countries in a single day
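
The practical consequence of that file size cap is that clients must shard large checkpoints before upload. A minimal sketch, assuming a simple fixed-size byte split (in practice the Hub shards models into multiple safetensors files rather than raw byte ranges):

```python
import os

# Hypothetical cap mirroring the article's 50GB CDN limit.
PART_SIZE = 50 * 1024**3

def plan_parts(path, part_size=PART_SIZE):
    """Yield (index, start, end) byte ranges that keep each part under the cap."""
    total = os.path.getsize(path)
    for i, start in enumerate(range(0, total, part_size)):
        yield i, start, min(start + part_size, total)

# A 131GB checkpoint plans out to three parts: 50GB + 50GB + a 31GB tail.
```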

Proposed architectural changes: A new content-addressed store (CAS) will serve as the primary point for content distribution, implementing a custom protocol focused on “dumb reads and smart writes.”

  • The read path emphasizes simplicity and speed, with requests routed through CAS for reconstruction information
  • The write path operates at the chunk level, speeding up uploads by transferring only chunks the store does not already hold (see the sketch after this list)
  • The system maintains S3 as backing storage while adding enhanced security and validation capabilities
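
To make “dumb reads and smart writes” concrete, here is a self-contained sketch of a chunk-level content-addressed store. `InMemoryCAS`, `smart_write`, and `dumb_read` are illustrative names, not Hugging Face APIs; the production system backs chunks with S3 and uses more sophisticated chunking:

```python
import hashlib

CHUNK_BYTES = 64 * 1024  # illustrative fixed chunk size; real chunking differs

class InMemoryCAS:
    """Toy stand-in for the content-addressed store (S3-backed in production)."""
    def __init__(self):
        self.blobs = {}  # chunk hash -> chunk bytes

    def missing(self, hashes):
        return [h for h in hashes if h not in self.blobs]

    def put(self, h, data):
        self.blobs[h] = data

    def get(self, h):
        return self.blobs[h]

def smart_write(path, cas, chunk_bytes=CHUNK_BYTES):
    """Write path: hash every chunk, then upload only chunks the CAS lacks."""
    order, chunks = [], {}
    with open(path, "rb") as f:
        while block := f.read(chunk_bytes):
            h = hashlib.sha256(block).hexdigest()
            order.append(h)
            chunks[h] = block
    for h in cas.missing(set(chunks)):
        cas.put(h, chunks[h])  # only new bytes cross the wire
    return order  # the reconstruction info the read path requests from CAS

def dumb_read(order, cas):
    """Read path: fetch chunks by hash and concatenate; no client-side logic."""
    return b"".join(cas.get(h) for h in order)
```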

Technical optimizations: The new architecture enables format-specific optimizations and improved efficiency.

  • Byte-level file management allows for format-specific compression techniques
  • Parquet file deduplication and Safetensors compression could cut upload times by 10-25% (a toy deduplication demonstration follows this list)
  • Enhanced telemetry provides detailed logging and audit trails for enterprise customers
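
The deduplication claim is easy to demonstrate with the sketch above: re-uploading a lightly edited file transfers only the chunks that changed. A toy run (the counts depend on the hypothetical 64 KiB chunk size; `random.randbytes` requires Python 3.9+):

```python
import os, random, tempfile

random.seed(0)
payload = bytearray(random.randbytes(1024 * 1024))  # 1 MiB -> 16 chunks

def write_temp(data):
    f = tempfile.NamedTemporaryFile(delete=False)
    f.write(data)
    f.close()
    return f.name

cas = InMemoryCAS()
v1 = write_temp(payload)
smart_write(v1, cas)
print(len(cas.blobs))                  # 16 chunks stored for revision 1

payload[0:4] = b"edit"                 # small in-place edit
v2 = write_temp(payload)
stored_before = len(cas.blobs)
smart_write(v2, cas)
print(len(cas.blobs) - stored_before)  # 1: only the modified chunk is new

os.remove(v1)
os.remove(v2)
```

Note that fixed-size chunking only deduplicates in-place edits; an insertion shifts every later chunk boundary, which is why production systems typically use content-defined chunking instead.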

Global deployment strategy: After careful analysis of traffic patterns, the team has designed a three-region deployment plan.

  • Primary regions: us-east-1 (Americas), eu-west-3 (Europe/Middle East/Africa), and ap-southeast-1 (Asia/Oceania); an illustrative routing table follows this list
  • Resource allocation: 4 nodes each in US and Europe, 2 nodes in Asia
  • The top 7 countries account for 80% of uploaded bytes, while the top 20 contribute 95%
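
The routing sketch below encodes the announced three-region plan; the continent mapping is an assumption added for clarity, and a real deployment would steer clients with GeoDNS or anycast rather than an explicit lookup:

```python
# Announced regions and node counts, per the deployment plan above.
REGIONS = {
    "us-east-1":      {"nodes": 4, "serves": "Americas"},
    "eu-west-3":      {"nodes": 4, "serves": "Europe/Middle East/Africa"},
    "ap-southeast-1": {"nodes": 2, "serves": "Asia/Oceania"},
}

# Hypothetical continent-code mapping; not part of the announcement.
CONTINENT_TO_REGION = {
    "NA": "us-east-1", "SA": "us-east-1",
    "EU": "eu-west-3", "AF": "eu-west-3",
    "AS": "ap-southeast-1", "OC": "ap-southeast-1",
}

def route(continent_code: str) -> str:
    """Pick a CAS region for a client, falling back to the primary region."""
    return CONTINENT_TO_REGION.get(continent_code, "us-east-1")

print(route("EU"))  # eu-west-3
```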

Implementation timeline: The rollout will proceed gradually throughout 2024.

  • Initial deployment begins with a single CAS in us-east-1
  • Internal repository migration will serve as a benchmark for transfer performance
  • Additional points of presence will be added based on performance testing results

Future implications: This infrastructure overhaul positions Hugging Face to gain unique insights into global AI development trends and patterns.

  • The platform hosts one of the largest collections of open-source machine learning data
  • Future analysis could reveal geographic trends in different AI modalities
  • An expected 12% reduction in raw transfer bandwidth for some users should be offset by deduplication and compression gains

Strategic considerations: While the new architecture introduces some initial latency for certain users, the benefits of enhanced security, optimization capabilities, and scalability make this a calculated trade-off that positions Hugging Face for future growth in the rapidly evolving AI infrastructure landscape.

Source: Rearchitecting Hugging Face Uploads and Downloads (Hugging Face blog)
