Garbage collection

JNode uses a simple mark&sweep collector. You can read on the differences, used terms and some general implementation details at wikipedia. In these terms JNode uses a non-moving stop-the-world conservative garbage collector.

About the JNode memory manager you should know the following: There is org.jnode.vm.memmgr.def.VmBootHeap. This class manages all objects that got allocated during the bootimage creation. VmDefaultHeap contains objects allocated during runtime. Each Object on the heap has a header that contains some extra information about the heap. There's information about the object's type, a reference to a monitor (if present) and the object's color (see wikipedia). JNode objects can have one of 4 different colors and one extra finalization bit. The values are defined in org.jnode.vm.classmgr.ObjectFlags.

At the beginning of a gc cycle all objects are either WHITE (i.e. not visited/newly allocated) or YELLOW (this object is awaiting finalization).

The main entry point for the gc is org.jnode.vm.memmgr.def.GCManager#gc() which triggers the gc run. As you can see one of the first things in gc() is a call to "helper.stopThreadsAtSafePoint();" which stops all threads except the garbace collector. The collection then is divided into 3 phases: markHeap, sweep and cleanup. The two optional verify calls at the beginning and end are used for debugging, to watch if the heap is consistent.

The mark phase now has to mark all reachable objects. For that JNode uses org.jnode.vm.memmgr.def.GCStack (the so called mark stack) using a breadth-first search (BFS) on the reference graph. At the beginning all roots get marked where roots are all references in static variables or any reference on any Thread's stack. Using the visitor pattern the Method org.jnode.vm.memmgr.def.GCMarkVisitor#visit get called for each object. If the object is BLACK (object was visited before and all its children got visited. Mind: This does not mean that the children's children got visited!) we simply return and continue with the next reference in the 'root set'. If the object is GREY (object got visited before, but not all children) or in the 'root set' the object gets pushed on the mark stack and mark() gets called.
Let's make another step down and examine the mark() method. It first pops an Object of the mark stack and trys to get the object type. For all children (either references in Object arrays or fields of Objects) the processChild gets called and each WHITE (not visited yet) object gets modified to be GREY. After that the object gets pushed on the mark stack. It is important to understand at that point that the mark stack might overflow! If that happens the mark stack simply discards the object to push and remembers the overflow. Back at the mark method we know one thing for sure: All children of the current object are marked GREY (or even BLACK from a previous mark()) and this is even true if the mark stack had an overflow. After examining the object's Monitor and TIB it can be turned BLACK.
Back at GCManager#markHeap() we're either finished with marking the object or the mark stack had an overflow. In the case it had an overflow we have to repeat the mark phase. Since many objects now are allready BLACK it is less likly the stack will overflow again but there's one important point to consider: All roots got marked BLACK but as said above not all children's children need to be BLACK and might be GREY or even WHITE. That's why we have to walk all heaps too in the second iteration.
At the end of the mark phase all objects are either BLACK (reachable) or WHITE (not reachable) so the WHITE ones can be removed.

The sweep again walks the heap (this time without the 'root set' as they do not contain garbage by definition) and again visits each object via org.jnode.vm.memmgr.def.GCSweepVisitor#visit. As WHITE objects are not reachable anymore it first tests if the object got previously finalized. If it was it will be freed, if not and the object has a Finalizer it will be marked YELLOW (Awaiting finalization) else it will be freed too. If the object is neither WHITE nor YELLOW it will be marked WHITE for the next gc cycle.

The cleanup phase at the end sets all objects in the bootHeap to WHITE (as they will not be swept above) as they might be BLACK and afterwards calls defragment() for every heap.

Some other thoughts regarding the JNode implementation include:

It should be also noted that JNode does not know about the stack's details. I.e. if the mark phase visits all objects of a Thread's stack it never knows for a value if it is a reference or a simple int,float,.. value. This is why the JNode garbage collector can be called conservative. Every value on the stack might be a reference pointing to a valid object. So even if it is a float on the stack, as we don't know for sure we have to visit the object and run a mark() cycle. This means on the one hand that we might mark memory as reachable that in reality is garbage on the other hand it means that we might point to YELLOW objects from the stack. As YELLOW objects are awaiting finalization (and except the case the finalizer will reactivate the object) they are garbage and so they can not be in the 'root set' (except the case where we have a random value on the stack that we incorrectly consider to be a reference). This is also the reason for the current "workaround" in GCMarkVisitor#visit() where YELLOW objects in the 'root set' trigger error messages instead of killing JNode.

There is some primilary code for WriteBarrier support in JNode. This is a start to make the gc concurrent. If the WriteBarrier is enabled during build time, the JNode JIT will include some special code into the compiled native code. For each bytecode that sets a reference to any field or local the writebarrier gets called and the object gets marked GREY. So the gc will know that the heap changed during mark. It is very tricky to do all that with proper synchronization and the current code still has bugs, which is the reason why it's not activated yet.