IMDEA Software Institute researchers Pepe Vila, Pierre Ganty, and Marco Guarnieri, together with Boris Koepf of Microsoft Research, are the authors of the recent paper “CacheQuery: Learning Replacement Policies from Hardware Caches”, accepted at the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2020).
Caches are small and fast memories that sit between the CPU and the main memory (DRAM) of computers. Their main goal is to speed up computations by reducing the time it takes to load and store data, and since their capacity is limited, they must anticipate which data is going to be used in the near future. The better this prediction, the better the performance.
The cache replacement policy is the logic that decides which data is kept in the cache and which is evicted to make room for more useful data. It is a critical component for the performance of modern computers.
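To make the idea concrete, here is a minimal Python sketch of a single cache set that uses least-recently-used (LRU) replacement. LRU is chosen only because it is a familiar textbook policy; it is not claimed to be the policy of any processor studied in the paper, and the names `LRUCacheSet` and `trace_hits` are hypothetical, introduced here for illustration.

```python
from collections import OrderedDict


class LRUCacheSet:
    """One cache set with a fixed number of ways and LRU replacement (illustrative only)."""

    def __init__(self, ways):
        self.ways = ways
        # Keys are memory blocks, ordered from least to most recently used.
        self.lines = OrderedDict()

    def access(self, block):
        """Access a block; return True on a hit and False on a miss."""
        if block in self.lines:
            # Hit: mark the block as most recently used.
            self.lines.move_to_end(block)
            return True
        if len(self.lines) >= self.ways:
            # Set is full: evict the least recently used block.
            self.lines.popitem(last=False)
        self.lines[block] = None
        return False


def trace_hits(ways, trace):
    """Replay a sequence of block accesses and record hit (True) or miss (False) for each."""
    cache = LRUCacheSet(ways)
    return [cache.access(block) for block in trace]


if __name__ == "__main__":
    # With two ways, loading C evicts B, so the final access to B misses.
    print(trace_hits(2, ["A", "B", "A", "C", "B"]))
```

Under a different policy, the same access sequence could produce a different pattern of hits and misses, which is why knowing the policy matters for predicting performance.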
In most modern processors, these policies are not documented. Because they have a huge impact on performance, the absence of precise models makes it very difficult to predict and analyze the behavior and security of programs.
“We lack precise models of our hardware. With our approach, we close this gap, and we allow other people to understand how the policy optimizes certain workloads, so that they can predict the timing behavior of critical systems (e.g., cars or planes), calculate limits on the information leakage of cryptographic programs, or write more accurate hardware simulators,” comments Pepe Vila, predoctoral researcher at the IMDEA Software Institute.
Transparency is the first step towards improving the security and safety of computer systems. In this vein, Marco Guarnieri, a researcher at the Institute, says: “Microprocessors and memories are central components of our computing infrastructure. Security vulnerabilities in these components may result in attacks affecting any program running on top of them. To assess and study the security of microprocessors and memories, researchers need high-level models documenting and describing their behaviors. Unfortunately, many crucial details of how these components work are undocumented. We see our research as a first step towards automatically generating such high-level models from hardware measurements and, ultimately, towards more secure systems.”