Almost Fully Utilized Caches for Iterative Methods
Craig C. Douglas ({\tt douglas@ccs.uky.edu}) \\ Departments of Computer Science, Mechanical Engineering, \\ and Center for Computational Sciences \\ 325 McVey Hall - CCS, University of Kentucky \\ Lexington, KY 40506-0045, USA \\ \\ Jonathan Hu ({\tt jhu@ms.uky.edu}) \\ University of Kentucky, Department of Mathematics \\ 715 Patterson Hall, Lexington, KY 40506-0027, USA \\ \\ Wolfgang Karl ({\tt karlw@in.tum.de}) \\ Lehrstuhl für Rechnertechnik und Rechnerorganisation (LRR-TUM)
Institut für Informatik, Technische Universität München, D-80290 München, Germany \\ \\ Markus Kowarschik ({\tt kowarschik@informatik.uni-erlangen.de}) \\ Lehrstuhl für Systemsimulation (IMMD 10), Institut für Informatik
Universität Erlangen-Nürnberg, Martensstraße 3, D-91058 Erlangen, Germany \\ \\ Ulrich Rüde ({\tt ruede@informatik.uni-erlangen.de}) \\ Lehrstuhl für Systemsimulation (IMMD 10), Institut für Informatik
Universität Erlangen-Nürnberg, Martensstraße 3, D-91058 Erlangen, Germany \\ \\ Christian Weiß ({\tt weissc@in.tum.de}) \\ Lehrstuhl für Rechnertechnik und Rechnerorganisation (LRR-TUM)
Institut für Informatik, Technische Universität München, D-80290 München, Germany
Conventional implementations of iterative numerical algorithms, especially multigrid methods, merely reach a disappointing small percentage of the theoretically available CPU performance when applied to representative large problems. One of the most important reasons for this phenomenon is that the current DRAM technology is not able to provide the data fast enougth to keep the CPU busy. Although the fundamentals of cache optimizations are quite simple, current compilers are not able to optimize even simple iterative methods.

We analyse the memory and cache behavior of iterative methods with extensive profiling and describe program transformation techniques to improve the cache performance of two and three dimensional multigrid algorithms. Examples are provided for both structured and unstructured grid problems.