NVIDIA Research

"Increasing Memory Miss Tolerance for SIMD Cores"
David Tarjan, Jiayuan Meng, Kevin Skadron, in "Proc. Supercomputing '09", August 2009

Author(s): David Tarjan, Jiayuan Meng and Kevin Skadron
Date: August 2009



Abstract: Manycore processors with wide SIMD cores are becoming a popular choice for the next generation of throughput-oriented architectures. We introduce a hardware technique called “diverge on miss” that allows SIMD cores to better tolerate memory latency for workloads with non-contiguous memory access patterns. Individual threads within a SIMD “warp” are allowed to slip behind other threads in the same warp, letting the warp continue execution even if a subset of its threads is waiting on memory. Diverge on miss can either increase the performance of a given design by up to a factor of 3.14 for a single warp per core, or reduce the number of warps per core needed to sustain a given level of performance from 16 to 2, reducing the area per core by 35%.
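The intuition behind letting threads slip can be illustrated with a toy cycle count. The sketch below is not the paper's simulator or hardware design. It is a deliberately idealized model under stated assumptions: every instruction takes one cycle, every miss costs a fixed `latency`, and a slipping thread waits only for its own misses. The function names, the `miss` table, and the latency value are all hypothetical, and the model ignores shared issue bandwidth, any limit on how far threads may slip, and reconvergence costs, so the second figure is at best an optimistic bound.

```python
# Toy model (assumption-laden, for intuition only): compare a warp that
# stalls as a unit on any miss with one whose threads wait only on their
# own misses.

def lockstep_cycles(miss, latency):
    """Cycles when the whole warp stalls whenever any thread misses.

    miss[t][i] is True if thread t misses on instruction i.
    """
    n_instr = len(miss[0])
    cycles = 0
    for i in range(n_instr):
        cycles += 1  # issue the instruction for the warp
        if any(thread[i] for thread in miss):
            cycles += latency  # every thread waits for the slowest access
    return cycles


def slip_cycles(miss, latency):
    """Idealized cycles when each thread waits only for its own misses.

    The warp finishes when its slowest thread finishes. Issue-slot sharing
    and slip limits are ignored (assumption), so this is optimistic.
    """
    return max(len(thread) + latency * sum(thread) for thread in miss)


# Hypothetical access pattern: 4 threads, 4 instructions, and each thread
# misses on a different instruction (a non-contiguous access pattern).
miss = [
    [True, False, False, False],
    [False, True, False, False],
    [False, False, True, False],
    [False, False, False, True],
]
latency = 10  # assumed miss penalty in cycles

print(lockstep_cycles(miss, latency))  # 44
print(slip_cycles(miss, latency))      # 14
```

In this contrived pattern the lockstep warp pays the miss penalty four times, while in the idealized slip model each thread pays it once. The numbers are illustrative only and are unrelated to the speedups reported in the abstract.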