Friday, August 8, 2008

The dangers of micro-benchmarking

This week I started on a project where we needed to optimize the memory usage of a Java queue with a very, very large number of objects. Right away I noticed that the objects we are putting in the queue use "wrapper objects" for native types like long, int, etc.


public class Node {
public java.lang.Long oneValue;
public java.lang.Long anotherValue;
// etc.
}


Changing these to use regular primitives seemed like a great way to save memory, so I wrote up a little micro-benchmark:


public class OldNode {
public java.lang.Long oneValue;
// etc.
}

public class NewNode {
public long oneValue;
// etc.
}

public static void main(String[] args) {
List nodes = new ArrayList();
try {
OldNode node = new OldNode();
node.oneValue = 1L; // take advantage of Java 5 auto-boxing
nodes.add(node);
} catch (OutOfMemoryError e) {
System.out.println("Out of memory. Size: " + nodes.size());
}

// And then repeat the same thing with NewNode
}

When I ran this test, the old node class came out ahead of the new node class! In other words, I was able to add around 300,000 of the old objects to the list before running out of memory, but I could only add around 280,000 of the new objects. Obviously, this didn't make any sense.

The reason is easy to spot if you know how auto-boxing works. The following line:


node.oneValue = 1L; // take advantage of Java 5 auto-boxing


gets converted by the Java compiler to:


node.oneValue = Long.valueOf(1L);


and the implementation of Long.valueOf() (at least in the Sun JVM) uses cached values for -128 to 127. So the old node class was re-using a single Long object for every instance! Obviously this uses a lot less memory.

Once I changed the benchmark to use Rand.nextLong() to generate the values, the new node class came out ahead, using about 40% of the memory that the old class used.

The larger picture is that if you write micro-benchmarks, you have to approximate real-world data. The part that made my original test fail was taking the easy way out and using all 0's.

No comments: