mirror of
https://github.com/deepseek-ai/DeepSeek-V3.git
synced 2025-07-05 16:01:35 -04:00
- Replace mocked performance estimates with actual measured results - Add `BenchmarkResults` struct to collect live performance data during execution - Implement honest dynamic summary showing real GFLOPS, timing, and bandwidth - Add transparent performance assessment based on measured values only - Display peak performance identification (1160 GFLOPS measured at 512×512) - Include real memory bandwidth (20.3 GB/s) and latency (1.8 ns) measurements - Remove misleading static efficiency percentages with live measurement system - Show clear distinction between measured performance and theoretical estimates - Provide actionable insights from Apple Accelerate backend performance Results: 1160 GFLOPS peak measured performance with honest assessment, eliminating misleading hardcoded comparisons in favor of real benchmark data. |
||
---|---|---|
.. | ||
blas_bench.zig | ||
main.zig |