Performance-Focused Memory Subsystem Verification in Modern GPUs

Mohit Gupta

doi:10.32996/jcsts.2025.4.1.79

Authors

Mohit Gupta, Intel Inc., USA

DOI:

https://doi.org/10.32996/jcsts.2025.4.1.79

Keywords:

GPU memory subsystem, performance verification, bottleneck detection, simulation-based verification, pre-silicon optimization, memory hierarchy

Abstract

Performance bottlenecks in modern GPUs have shifted from compute to memory, particularly for AI and high-performance computing workloads. Traditional functional verification checks correctness rather than performance, so it cannot detect performance-critical issues that emerge only under real workload conditions, especially as memory hierarchies grow more complex with multiple cache levels and advanced interconnects. This article presents an end-to-end verification framework that combines cycle-accurate simulation with detailed memory hierarchy instrumentation to capture stall events, memory latencies, and cache behavior. The bottleneck detection methods, which use both rule-based and machine learning approaches, achieve high detection accuracy while maintaining low false positive rates across diverse GPU workloads. The framework uses trace-driven simulation to replay real workload behavior rather than synthetic benchmarks, enabling verification scenarios that closely match production memory access patterns. Integrated performance regression testing tracks key metrics across design iterations, preventing unintended performance degradation during optimization. Pre-silicon optimization capabilities allow architecture teams to explore design alternatives (including cache organizations, memory hierarchy configurations, and interconnect topologies) with confidence before costly silicon implementation, significantly reducing the risk of discovering fundamental performance bottlenecks after tape-out.
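As a rough illustration of the rule-based detection and regression tracking summarized in the abstract, the sketch below flags a sampling window whose counters cross fixed thresholds and compares a metric against a baseline. It is a minimal sketch, not the article's implementation: the counter names, the `WindowCounters` type, the threshold values, and the 5% tolerance are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WindowCounters:
    """Hypothetical per-window counters from an instrumented simulation run."""
    cycles: int
    mem_stall_cycles: int
    l2_accesses: int
    l2_misses: int
    avg_mem_latency: float  # average memory latency, in cycles


# Illustrative thresholds (assumed); real values would be tuned per design.
STALL_FRACTION_LIMIT = 0.40
L2_MISS_RATE_LIMIT = 0.30
LATENCY_LIMIT_CYCLES = 600.0


def detect_bottlenecks(w: WindowCounters) -> List[str]:
    """Return the names of the rules that fire for one sampling window."""
    flags = []
    if w.cycles > 0 and w.mem_stall_cycles / w.cycles > STALL_FRACTION_LIMIT:
        flags.append("memory_stall")
    if w.l2_accesses > 0 and w.l2_misses / w.l2_accesses > L2_MISS_RATE_LIMIT:
        flags.append("l2_miss_rate")
    if w.avg_mem_latency > LATENCY_LIMIT_CYCLES:
        flags.append("memory_latency")
    return flags


def regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True if a lower-is-better metric worsened by more than `tolerance`."""
    return current > baseline * (1.0 + tolerance)
```

A machine learning detector would replace the fixed thresholds with a model trained on labeled windows; the fixed rules here only show the general shape of the check.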
