Benchmarking Ideas

I'm starting to put together a benchmarking suite for JNode. The idea came from the fact that, while working with the memory manager, it was simple to test things like allocation speed and GC performance, but difficult to test how our changes affected JNode's overall runtime performance.

For now, I have about a dozen sorting algorithms that can run inside JNode, but I need some more ideas on what people think would be good tests to run. The benchmark suite is designed with a Runner that contains a collection of Tasks to run. The sorting algorithms should prove to be a good gauge of compiler optimizations (L2) as they are implemented. I would also like to have some tests that target specific areas of JNode, and specific tests for certain pieces of hardware.
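To make the Runner/Task design concrete, here is a minimal sketch of how it might look. The class and method names (`Task`, `Runner`, `runAll`) are assumptions for illustration, not JNode's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// A single benchmark unit; sorting algorithms etc. would implement this.
interface Task {
    void run();
    default String name() { return "task"; }
}

// Holds a collection of Tasks and times each one.
class Runner {
    private final List<Task> tasks = new ArrayList<>();

    void add(Task task) {
        tasks.add(task);
    }

    /** Runs every task once and returns each task's wall-clock time in nanoseconds. */
    List<Long> runAll() {
        List<Long> times = new ArrayList<>();
        for (Task task : tasks) {
            long start = System.nanoTime();
            task.run();
            times.add(System.nanoTime() - start);
        }
        return times;
    }
}
```

A sorting Task would then just wrap one of the dozen algorithms and be registered with `add`.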

An example of a more JNode-specific test is the effect of the memory manager's allocation pattern on locality of reference. Our current allocation pattern is first-come/first-fit: the first thread to obtain the mm lock gets to allocate at the first available spot, and the next thread gets the next spot. The problem this creates is that multiple threads allocating at the same time will have their data interleaved with all the other threads' data. Modern operating systems use process-specific heaps to get around this issue. Not only does this help concurrent access of threads to memory, it also helps locality of reference by keeping a process's data as sequential as the process allocates it, up to the size of the allocated block (4KB pages).
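The interleaving effect described above can be shown with a toy simulation: a single bump pointer shared by all threads, handing out "addresses" in arrival order. This is only an illustration of the first-come/first-fit behaviour, not JNode's actual allocator:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy shared heap: hands out addresses strictly in arrival order (first come, first fit).
class BumpAllocator {
    private final AtomicInteger next = new AtomicInteger(0);

    int allocate(int size) {
        return next.getAndAdd(size);
    }
}

class InterleavingDemo {
    /** Each thread allocates 'count' 16-byte blocks; returns each thread's addresses. */
    static List<List<Integer>> run(int threads, int count) {
        BumpAllocator heap = new BumpAllocator();
        List<List<Integer>> perThread = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            List<Integer> addrs = Collections.synchronizedList(new ArrayList<>());
            perThread.add(addrs);
            workers.add(new Thread(() -> {
                for (int i = 0; i < count; i++) {
                    addrs.add(heap.allocate(16));
                }
            }));
        }
        for (Thread w : workers) w.start();
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return perThread;
    }
}
```

When the threads actually run concurrently, any one thread's addresses are typically not contiguous, which is exactly the interleaved heap layout that hurts locality of reference.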

My test for this will involve having multiple threads allocate large data structures that are not by definition sequential (unlike primitive arrays), such as linked lists, maps and reference arrays. Once the data has been allocated it will be used in various ways (e.g. iterating, swapping, sorting) that depend on locality of reference to perform optimally.
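A minimal sketch of one such task, assuming the shape described above (the names `LocalityTask`, `Node` and `run` are mine, not the suite's): several threads each build a linked list concurrently, so node allocations interleave in the shared heap, and then each list is traversed and summed, which is sensitive to cache locality:

```java
class LocalityTask {
    static final class Node {
        final int value;
        final Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    /** Builds one linked list per thread concurrently, then sums each by traversal. */
    static long run(int threads, int nodesPerList) {
        final Node[] heads = new Node[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                Node head = null;
                for (int i = 0; i < nodesPerList; i++) {
                    head = new Node(1, head); // allocations interleave across threads
                }
                heads[id] = head;
            });
        }
        for (Thread w : workers) w.start();
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        // Traversal cost depends on how close each list's nodes ended up in memory.
        long sum = 0;
        for (Node head : heads) {
            for (Node n = head; n != null; n = n.next) {
                sum += n.value;
            }
        }
        return sum;
    }
}
```

The timing of the traversal phase (not the allocation phase) would be the interesting number: with per-thread heaps each list should be nearly sequential in memory, with a shared first-fit heap it won't be.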

We need "real workload" benchmarks

Synthetic benchmarks (like the ones you are suggesting) are useful if you know that they correspond to a common pattern in real programs. But if you only suspect this, then they can be very misleading. You can spend lots of time optimizing for a pattern that isn't realistic.

What we really need are large-scale benchmarks involving real applications doing real (and representative) tasks: for example, a Java / Ant build of a large application, a significant simulation task, a significant ray-tracing task, and so on. For benchmarking changes to schedulers, GC or the VM, you also need "workload" type benchmarks which simulate typical usage patterns for JNode; e.g. web browsing while running a compilation in the background, etc.

My real point is that these large-scale benchmarks (and especially the workload ones) are hard to do now; we don't have many real applications or typical workloads. So optimizations that depend on these benchmarks are premature. Focus on the simple stuff ... pick the "low hanging fruit", as the Americans say.

Benchmarks should be independent

If a benchmark is to be comparable, it should be independent of other applications. If you compare two different JNode versions by testing applications, you do not know whether a difference in running time comes from the application or from JNode. Another problem is that you then have no starting point when you go looking for the cause.

I also think the benchmark should be runnable on a standard JVM, so that we can compare against it.

There are many Java benchmarks on the net. Is there no free one? I think it would be simpler to use an existing benchmark.

No working ones

For a start I'd be happy with any benchmark out there that does not crash JNode. All the ones I tried either didn't work or crashed JNode, and this is the reason Cluster started this thread in the first place.
The most interesting one (DaCapo) does not work (see the other thread), and even with a local fix for the stack overflow I ran into several other issues :/

If you find a benchmark that works, I'd be happy to use it for a start, even if the licence wouldn't permit redistribution.

DaCapo problems

The most interesting one (DaCapo) does not work (see the other thread), and even with a local fix for the stack overflow I ran into several other issues :/

Peter, could you please update the 'bug' for the DaCapo related stack overflow problem to say what you did to work around it? I'm planning to address it once I've dealt with the console issue(s) I'm currently working on.

Also, if there is any chance that the other issues are "JNode's fault", it would be a good idea to create bug issues for them as well. Or maybe we should create a "Get DaCapo Working" task and use it to track all of the problems that need to be addressed.

Updates on DaCapo issue

Sorry for the delay. I updated the original DaCapo issue with some more or less useful information.

You should add tests with

  • You should add tests with a lot of free memory, and tests with little free memory (by allocating most of the available memory beforehand). This exercises the GC in different situations.
  • Then you can add tests with synchronized, frequent access from multiple threads.
  • System.arraycopy with different array sizes.
  • Image and graphics operations can also be very interesting.
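The System.arraycopy idea from the list above could look something like this. The sizes and repetition counts are arbitrary choices for illustration, and the class name is mine:

```java
class ArrayCopyBench {
    /** Copies an int[] of the given size 'reps' times; returns nanoseconds taken. */
    static long time(int size, int reps) {
        int[] src = new int[size];
        int[] dst = new int[size];
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            System.arraycopy(src, 0, dst, 0, size);
        }
        return System.nanoTime() - start;
    }
}
```

Sweeping `time` over a range of sizes (say 16, 1024, 65536 elements) would show whether JNode's arraycopy intrinsic behaves well for both small and large blocks.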

How do you want to compare the results? Every run will produce different results, and different hardware of course will too. Looking only at the total time will not give a good comparable result.
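One common way to reduce the run-to-run variance raised above is to repeat each task several times and compare medians rather than single totals. A small sketch (the `Stats` helper is my own, not part of the suite):

```java
import java.util.Arrays;

class Stats {
    /** Median of the sample times; robust against one-off outliers. */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2;
    }
}
```

Comparing across hardware is harder; reporting ratios against a fixed baseline task on the same machine is one option.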

I have attached some tests that I have written over many years. I wanted to see which programming techniques give the fastest results. Have a look and see if you find a good idea in them.

Attachment: JavaBench.zip (4.98 KB)

Can I talk to you please?

Hello Mr. Horcrux7,
I want to discuss some topics about your benchmarking code.
My email id is: tango_java_06@jnode.org
You are welcome.

Waiting for your reply.
Thanks and regards,
Tanmoy

No such recipient

Hi Tanmoy,

It looks like your account address is wrong. If you want to contact me, you can use "info at smallsql.de".

Volker