Memory Management for Large Datasets

In recent years, the discrepancy between processor and memory performance has been increasing rapidly, making memory access a potential bottleneck for applications that have to access large amounts of data. Raycasting, in particular, is prone to cause problems, since it generally leads to irregular memory access patterns. This section discusses strategies to improve memory access patterns by taking advantage of the memory hierarchy.
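
To illustrate why raycasting tends to produce irregular access patterns, consider a volume stored linearly in memory in x-then-y-then-z order. The following C sketch (the dimensions, function names, and ray parameters are hypothetical, chosen only for illustration) computes the linear offset of each sample along a ray: any ray that is not aligned with the x-axis advances by large, varying strides in memory, so successive samples rarely fall into the same cache line.

    #include <stddef.h>

    /* Hypothetical volume dimensions (e.g., a 512^3 dataset). */
    enum { DIM_X = 512, DIM_Y = 512, DIM_Z = 512 };

    /* Linear offset of voxel (x, y, z) for x-then-y-then-z storage order. */
    static size_t voxel_offset(int x, int y, int z)
    {
        return (size_t)z * DIM_X * DIM_Y + (size_t)y * DIM_X + (size_t)x;
    }

    /* Sampling along a ray: each unit step in z jumps DIM_X * DIM_Y
       voxels in memory (256 K voxels for a 512^3 volume), so nearly
       every sample touches a different cache line. */
    void sample_ray(const unsigned char *volume,
                    float ox, float oy, float oz,
                    float dx, float dy, float dz, int num_steps)
    {
        for (int i = 0; i < num_steps; i++) {
            int x = (int)(ox + (float)i * dx);
            int y = (int)(oy + (float)i * dy);
            int z = (int)(oz + (float)i * dz);
            unsigned char sample = volume[voxel_offset(x, y, z)];
            (void)sample; /* the sample would be classified and composited here */
        }
    }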

The memory of contemporary computers is structured in a hierarchy of successively larger, slower, and cheaper memory levels. Each level contains a working copy or cache of the data in the next larger, slower level. Recent developments in processor and memory technology imply an increasing penalty if programs do not take optimal advantage of the memory hierarchy. The memory hierarchy of an x86-based system is shown in Table 3.1. The L1 cache temporarily stores instructions and data, ensuring that the processor has a steady supply of data to process while the slower main memory catches up delivering new data. The L2 cache is high-speed memory located between the processor and main memory.


Table 3.1: Memory hierarchy of modern computer architectures. Memory is structured as a hierarchy of successively larger but slower storage technologies.
Level            Latency          Size
Register         1 ns - 3 ns      1 KB
Level 1 Cache    2 ns - 8 ns      8 KB - 128 KB
Level 2 Cache    5 ns - 12 ns     0.5 MB - 8 MB
Main Memory      10 ns - 60 ns    256 MB - 2 GB
Hard Disk        8 ms - 12 ms     100 GB - 200 GB


Going up the cache hierarchy towards the CPU, caches get smaller and faster. In general, when the CPU issues an operation on a data item, the request is propagated down the cache hierarchy until the requested data is found. If the data is only found in one of the slower levels, the access is very time-consuming, due to the request propagation itself as well as the propagation of the data back up through all the caches. For good performance, frequent accesses to the slower levels, such as hard disk and main memory, have to be avoided; accessing them only once would be optimal. We assume that there is enough main memory available to hold the volume data and all other necessary data structures, so the hard disk only has to be accessed when a volume is loaded. Thus, the main focus lies on optimizing main memory access.
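
A simple way to observe the cost of main memory access is to traverse the same volume in two different loop orders. The following self-contained C sketch (the volume size is an assumption, and the relative slowdown is not a measurement from this text but depends on the cache parameters listed in Table 3.1) touches every voxel once in storage order and once against it:

    #include <stdio.h>
    #include <stdlib.h>

    enum { DIM = 256 }; /* hypothetical 256^3 volume, 16 MB of voxels */

    int main(void)
    {
        unsigned char *v = calloc((size_t)DIM * DIM * DIM, 1);
        if (!v)
            return 1;
        long sum = 0;

        /* Storage-order traversal: x varies fastest, matching the
           linear layout. Successive accesses are adjacent in memory,
           so almost all of them hit in the L1 or L2 cache. */
        for (int z = 0; z < DIM; z++)
            for (int y = 0; y < DIM; y++)
                for (int x = 0; x < DIM; x++)
                    sum += v[(size_t)z * DIM * DIM + (size_t)y * DIM + x];

        /* Against-storage-order traversal: z varies fastest, so each
           access jumps DIM * DIM bytes and typically misses the cache.
           On common hardware this loop nest runs several times slower
           than the one above, although it performs the same work. */
        for (int x = 0; x < DIM; x++)
            for (int y = 0; y < DIM; y++)
                for (int z = 0; z < DIM; z++)
                    sum += v[(size_t)z * DIM * DIM + (size_t)y * DIM + x];

        printf("%ld\n", sum); /* keeps the loops from being optimized away */
        free(v);
        return 0;
    }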
