<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
<title></title>
</head>
<body text="#000000" bgcolor="#ffffff">
<big><big>Attribute Access Methods</big></big><br>
<br>
List trees of order can easily reduce the access path to a keyed member
of a 1000 member set stored in memory to about twenty edge traversals.
A modern computer can traverse edges at a rate of about 200 ns / edge,
assuming one L2 cache miss per edge. That leads to a known attribute
access time of about 4 us. That is not bad, but it would be nice to
improve if you need to process ~500 million attribute (5 million record
x 100 attribute) data sets.<br>
<br>
The best way to improve lookup performance is to cache the results. A
direct mapped cache of adequate size can reduce average access times by
an order of magnitude. In this case, checking a direct mapped cache
requires one integer division and two likely L2 cache misses, about 440
ns, or roughly nine times faster. That compares to perhaps 50 ns
average per traditional static attribute access (five times faster
yet), due to the advantage of high cache locality and a considerably
smaller working set.<br>
<br>
Of course, if attribute records are 64 bytes each, one can only
reasonably fit perhaps 8 million in typical 512 MB area. If the system
starts to page, you might as well give up unless you can either pay a
1000x performance penalty or redesign all of your data structures for
page based locality. If the overflow isn't too serious you can either
buy more RAM and move up to a 64 bit machine, as appropriate, of
course. <br>
<br>
One of the problems, of course, is that all these per-item indexes and
caches have a considerable overhead, typically double the size of the
data being stored. That is a major penalty to pay when every attribute
is an item in and of itself - e.g. 128 bytes/attribute instead of 64.
My solution is to allocate indexes and index caches only for items that
have more than a threshold number of references. <br>
<br>
I expect the optimal threshold to be about 8 references per item. Each
meta-relationship to another item consumes one reference. Most
attributes will have one incoming reference (for an evaluation) and two
outgoing ones, one for the attributed object and one for the domain.
On the other hand, some objects (person records, for example) will
often have hundreds of internal and external references - for
attributes, events, family relationships, and so on - where a reference
index and a cache would be a great help.<br>
<br>
Final note: I learned to program on architectures that did not have CPU
caches. The latter have some really interesting side effects due to
cache line size, replacement latency, and so on. In-memory index
traversal is much faster if you can locate multiple keys in the same
cache line (typically 32-64 bytes), because each L2 cache miss fetches
a full cache line. For my purposes, that means I should really rewrite
my current N-way list tree structure to resemble something much more
like a traditional B-tree. The irony, of course, is that CPUs are so
fast relative to memory that the same techniques for optimizing
mechanical disk access work for RAM access as well, just with a much
smaller 'sector' size and a time scale reduction of 1000 or so.<br>
<br>
- Mark<br>
</body>
</html>