AI PCs fall short of performance expectations: A recent benchmark suggests that AI PCs are not delivering their advertised compute performance, particularly from their neural processing units (NPUs).
Qualcomm’s NPU technology under scrutiny: Pete Warden, a long-time advocate of Qualcomm’s NPU technology, has expressed disappointment with the performance of these chips in Windows tablets, specifically the Microsoft Surface Pro running on Arm.
- Warden’s history with Qualcomm includes collaborating on experimental support for their HVX DSP in TensorFlow back in 2017.
- The promise of up to 45 trillion operations per second on Windows tablets equipped with Qualcomm’s NPUs initially generated significant excitement.
Benchmarking reveals significant performance gaps: Warden’s open-source benchmark, focusing on matrix multiplication as a fundamental AI operation, exposed a substantial discrepancy between advertised and actual performance.
- The NPU achieved only 573 billion operations per second, less than 1.3% of the advertised 45 trillion operations per second.
- That result was lower than the CPU's on the same device, and roughly 3.8 times lower than the 2.16 teraops an Nvidia RTX 4080 in a gaming laptop achieved on the same benchmark.
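The arithmetic behind these figures can be checked with a minimal Python sketch. It is not Warden's benchmark code: the function names are hypothetical, the pure-Python matrix multiply is only there to show how a throughput figure is derived from a timed run, and it assumes the common convention of counting one multiply and one add per inner-loop step (2 x M x N x K operations). The original benchmark may count operations differently.

```python
import time


def matmul_ops(m, n, k):
    # Assumed convention: each of the m*n outputs needs k multiplies and k adds.
    return 2 * m * n * k


def naive_matmul(a, b):
    # Plain triple loop over lists of lists; far slower than any tuned kernel.
    n_rows, inner, n_cols = len(a), len(b), len(b[0])
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for p in range(inner):
            a_ip = a[i][p]
            for j in range(n_cols):
                result[i][j] += a_ip * b[p][j]
    return result


def measure_ops_per_second(m, n, k, repeats=3):
    # Time the multiply several times and keep the fastest run.
    a = [[1.0] * k for _ in range(m)]
    b = [[1.0] * n for _ in range(k)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        naive_matmul(a, b)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer.
    return matmul_ops(m, n, k) / max(best, 1e-9)


def fraction_of_advertised(measured, advertised):
    # Share of the advertised throughput that was actually observed.
    return measured / advertised


# Figures reported in the article, in operations per second.
NPU_MEASURED = 573e9
NPU_ADVERTISED = 45e12
GPU_MEASURED = 2.16e12

if __name__ == "__main__":
    share = fraction_of_advertised(NPU_MEASURED, NPU_ADVERTISED)
    print(f"NPU share of advertised throughput: {share:.2%}")
    print(f"GPU vs NPU: {GPU_MEASURED / NPU_MEASURED:.2f}x")
    print(f"This interpreter: {measure_ops_per_second(32, 32, 32):,.0f} ops/s")
```

With the article's numbers, the NPU's share of its advertised throughput comes out at about 1.27%, and the RTX 4080 result is about 3.77 times the NPU result. The last line only reports what an unoptimized Python loop manages on the machine running it and is not comparable to the hardware figures.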
Potential causes for underperformance: While the exact reason for the poor performance remains unclear, several factors could be contributing to the issue.
- Software stack limitations, including ONNX Runtime, the drivers, and the on-chip code, may not be fully optimized yet.
- The inability to compile and run custom operations on the DSP in Windows further limits potential workarounds.
- It’s possible that the method of calling the code could be a factor, though Warden claims to have followed the documentation closely.
Industry implications: The underperformance of AI PCs could have broader implications for the tech industry and consumer expectations.
- This situation highlights the challenges in translating theoretical hardware capabilities into real-world performance gains.
- It raises questions about the readiness of AI acceleration hardware for mainstream consumer devices.
- The discrepancy between advertised and actual performance could erode consumer trust in AI PC marketing claims.
Looking ahead: Despite the current disappointment, there’s still hope for improvement in AI PC performance.
- Many of the potential issues identified could be addressed through software updates, suggesting that performance could improve over time.
- The open-sourcing of the benchmark invites collaboration and further investigation from the tech community.
- Warden remains optimistic about the hardware’s potential, pending resolution of the current performance bottlenecks.
Broader context: This situation underscores the complexities involved in bringing cutting-edge AI technologies to consumer-grade devices.
- In this case the gap is concrete: a chip advertised at 45 trillion operations per second measured 573 billion on a basic matrix multiplication benchmark.
- The challenges faced by Qualcomm and Microsoft in this instance may offer valuable lessons for other companies working on AI acceleration in consumer electronics.
- This experience emphasizes the importance of robust testing and optimization before bringing AI-powered devices to market.
AI PCs aren’t very good at AI