7 May 2013
This follows on from my previous two garbage collection blog posts:
The parallel garbage collectors in HotSpot are designed to minimise the total amount of time that the application spends undertaking garbage collection, which maximises what is termed throughput. This isn't an appropriate tradeoff for all applications - some require individual pauses to be short as well, which is known as a latency requirement.
The Concurrent Mark Sweep (CMS) collector is designed to be a lower latency collector than the parallel collectors. The key part of this design is trying to do part of the garbage collection at the same time as the application is running. This means that when the collector needs to pause the application's execution it doesn't need to pause for as long.
At this point you're probably thinking 'don't parallel and concurrent mean something fairly similar?' Well, in the context of GC, parallel means "uses multiple threads to perform GC at the same time" and concurrent means "the GC runs at the same time as the application it's collecting".
The young gen collector in CMS is called ParNew and it actually uses the same basic algorithm as the Parallel Scavenge collector in the parallel collectors, which I described previously.
This is still a different collector from Parallel Scavenge in terms of the HotSpot codebase, though, because it needs to interleave its execution with the rest of CMS and also implements a different internal API. Parallel Scavenge makes assumptions about which tenured collectors it works with - specifically ParOld and SerialOld. Bear in mind that sharing the algorithm also means that the young generation collector is stop-the-world.
As with the ParOld collector, the CMS tenured collector uses a mark and sweep algorithm, in which live objects are marked and then dead objects are deleted. Deleted is really a strange term when it comes to memory management. The collector isn't actually deleting objects in the sense of blanking memory; it's merely returning the memory associated with each dead object to the space that the memory system can allocate from - the freelist. Even though it's termed a concurrent mark and sweep collector, not all phases run concurrently with the application's execution: two of them stop the world and four run concurrently.
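To make the "deleted just means returned to the freelist" point concrete, here is a minimal sketch of mark and sweep over a toy heap. It is an illustration only: the `ToyMarkSweep` and `Cell` names are made up for this example, and real CMS works on raw memory chunks rather than Java objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy model (hypothetical names): real CMS manages raw memory, not objects like these.
class ToyMarkSweep {
    static class Cell {
        final String name;
        final List<Cell> refs = new ArrayList<>();
        boolean marked;
        Cell(String name) { this.name = name; }
    }

    // Mark phase: flag every cell reachable from the roots as live.
    static void mark(List<Cell> roots) {
        Deque<Cell> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Cell cell = pending.pop();
            if (!cell.marked) {
                cell.marked = true;
                pending.addAll(cell.refs);
            }
        }
    }

    // Sweep phase: unmarked cells are not blanked, they are handed to the free list.
    static List<Cell> sweep(List<Cell> heap) {
        List<Cell> freeList = new ArrayList<>();
        for (Iterator<Cell> it = heap.iterator(); it.hasNext();) {
            Cell cell = it.next();
            if (cell.marked) {
                cell.marked = false; // reset the flag for the next cycle
            } else {
                it.remove();
                freeList.add(cell);
            }
        }
        return freeList;
    }

    // Builds a -> b with c unreachable, then returns the names that end up on the free list.
    static List<String> demo() {
        Cell a = new Cell("a"), b = new Cell("b"), c = new Cell("c");
        a.refs.add(b);
        List<Cell> heap = new ArrayList<>(Arrays.asList(a, b, c));
        mark(Arrays.asList(a));
        List<String> freed = new ArrayList<>();
        for (Cell cell : sweep(heap)) {
            freed.add(cell.name);
        }
        return freed;
    }
}
```

In this sketch both phases run while nothing else touches the heap; the interesting part of CMS is that most of this work happens while the application keeps running.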
In ParOld, garbage collection is triggered when you run out of space in the tenured heap. This approach works because ParOld simply pauses the application to collect. In order for the application to continue operating during a tenured collection, the CMS collector needs to start collecting while there is still enough working space left in tenured.
So CMS starts based upon how full up tenured is - the idea is that the amount of free space left is your window of opportunity to run GC. This is known as the initiating occupancy fraction and is described in terms of how full the heap is, so a fraction of 0.7 gives you a window of 30% of your heap to run the CMS GC before you run out of heap.
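The arithmetic behind that window can be sketched as below. The class and method names are invented for illustration, and the trigger check is a simplification of whatever heuristics the JVM really applies. Note that on the HotSpot command line the setting is written as a percentage, `-XX:CMSInitiatingOccupancyFraction=70`, rather than as 0.7.

```java
// Hypothetical helper: illustrates the arithmetic only, not HotSpot's actual logic.
class OccupancyWindow {
    // Bytes of tenured still free at the moment a CMS cycle would be triggered.
    static long headroomBytes(long tenuredCapacityBytes, double initiatingFraction) {
        return tenuredCapacityBytes - Math.round(tenuredCapacityBytes * initiatingFraction);
    }

    // True once tenured occupancy has reached the initiating fraction.
    static boolean shouldStartCycle(long usedBytes, long capacityBytes, double initiatingFraction) {
        return (double) usedBytes / capacityBytes >= initiatingFraction;
    }
}
```

With a tenured space of 1000 units and a fraction of 0.7, `headroomBytes` gives 300 units: the concurrent cycle has to finish before promotions consume that remainder.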
Once the GC is triggered, the CMS algorithm consists of a series of phases run in sequence.
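As a rough map of that sequence, the sketch below lists the phases in order and flags which ones pause the application. The phase names are an assumption on my part, based on the labels CMS commonly prints in GC logs rather than on anything stated above; the split of two pausing and four concurrent phases matches the earlier description.

```java
// Sketch only: phase names follow common CMS GC log labels (an assumption).
class CmsPhases {
    enum Phase {
        INITIAL_MARK(true),         // pauses the application
        CONCURRENT_MARK(false),
        CONCURRENT_PRECLEAN(false),
        REMARK(true),               // pauses the application
        CONCURRENT_SWEEP(false),
        CONCURRENT_RESET(false);

        final boolean stopTheWorld;

        Phase(boolean stopTheWorld) {
            this.stopTheWorld = stopTheWorld;
        }
    }

    // Counts how many phases stop the world.
    static int countPausing() {
        int count = 0;
        for (Phase phase : Phase.values()) {
            if (phase.stopTheWorld) {
                count++;
            }
        }
        return count;
    }
}
```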
Theoretically the objects marked during the preclean phase would get looked at during the next phase - remark - but the remark phase is stop the world, so the preclean phase exists to try and reduce remark pauses by doing part of the remark work concurrently. When CMS was originally added to HotSpot this phase didn't exist at all. It was added in Java 1.5 to address scenarios where a young generation scavenging collection causes a pause and is immediately followed by a remark. The remark also causes a pause, and the two combine into a single, more painful pause. This is why remarks are triggered by an occupancy threshold in Eden - the goal is to schedule the remark phase halfway between young gen pauses.
The remark phase pauses the application whilst the preclean doesn't, so any work moved into the preclean reduces the amount of time spent paused in GC.
Sometimes CMS is unable to meet the needs of the application and a stop-the-world Full GC needs to be run. This is called a concurrent mode failure, and usually results in a long pause. A concurrent mode failure happens when there isn't enough space in tenured to promote an object. There are two causes for this:
This might happen because the concurrent collection is unable to free space fast enough given the object promotion rates or because the continued use of the CMS collector has resulted in a fragmented heap and there's no individual space large enough to promote an object into. In order to properly 'defrag' the tenured heap space a full GC is required.
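The fragmentation case is easy to miss because the total free space can look healthy. A minimal sketch, assuming a free list modelled as nothing more than an array of chunk sizes (the `FragmentedFreeList` name is hypothetical):

```java
// Hypothetical model: a free list reduced to an array of free chunk sizes.
class FragmentedFreeList {
    // Total free space summed across all chunks.
    static long totalFree(long[] freeChunkSizes) {
        long total = 0;
        for (long size : freeChunkSizes) {
            total += size;
        }
        return total;
    }

    // Promotion needs one chunk that is big enough on its own; chunks cannot be combined.
    static boolean canPromote(long[] freeChunkSizes, long objectSize) {
        for (long size : freeChunkSizes) {
            if (size >= objectSize) {
                return true;
            }
        }
        return false;
    }
}
```

With free chunks of 64, 32 and 48 units there are 144 units free in total, yet a 100 unit object cannot be promoted, which is the situation that forces a compacting Full GC.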
CMS doesn't collect permgen space by default, and requires the -XX:+CMSClassUnloadingEnabled flag to be enabled in order to do so. If, whilst using CMS, you run out of permgen space without this flag switched on, it will trigger a Full GC. Furthermore, permgen space can hold references into the normal heap via things like classloaders, which means that until you collect permgen you may be leaking memory in the regular heap. In Java 7 String constants from class files are also allocated in the regular heap instead of permgen, which reduces permgen consumption but also adds to the set of object references coming into the regular heap from permgen.
At the end of a CMS collection it's possible for some dead objects not to have been deleted - this is called Floating Garbage. This happens when objects become dereferenced after the initial mark. The concurrent preclean and the remark phase ensure that all live objects are marked by looking at objects which have been created, mutated or promoted. If an object has become dereferenced between the initial mark and the remark phase, then finding it would require a complete retrace of the entire object graph. This is obviously very expensive, and the remark phase must be kept short since it's a pausing phase.
This isn't necessarily a problem for users of CMS since the next run of the CMS collector will clean up this garbage.
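Floating garbage can be reproduced in a toy model: mark an object as live, let the application drop its only reference before the sweep, and see that it survives one cycle and is collected in the next. The sketch below uses a map of object names to the names they reference; all names in it are invented, and it ignores everything the real preclean and remark phases do.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model (hypothetical): heap maps an object's name to the names it references.
class FloatingGarbageDemo {
    // Returns the set of names reachable from the roots at the time of the call.
    static Set<String> mark(Map<String, List<String>> heap, Set<String> roots) {
        Set<String> live = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String name = pending.pop();
            if (live.add(name)) {
                pending.addAll(heap.getOrDefault(name, Collections.emptyList()));
            }
        }
        return live;
    }

    // Frees everything that was not marked and returns the freed names.
    static Set<String> sweep(Map<String, List<String>> heap, Set<String> marked) {
        Set<String> freed = new TreeSet<>(heap.keySet());
        freed.removeAll(marked);
        heap.keySet().removeAll(freed);
        return freed;
    }

    // Runs two cycles; element 0 is what the first sweep freed, element 1 the second.
    static List<Set<String>> twoCycles() {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("a", new ArrayList<>(Collections.singletonList("b")));
        heap.put("b", new ArrayList<>());
        Set<String> roots = Collections.singleton("a");

        Set<String> marked = mark(heap, roots); // b is marked live here
        heap.get("a").clear();                  // the application drops b mid-cycle
        Set<String> freedFirst = sweep(heap, marked);

        Set<String> freedSecond = sweep(heap, mark(heap, roots));

        List<Set<String>> result = new ArrayList<>();
        result.add(freedFirst);
        result.add(freedSecond);
        return result;
    }
}
```

The first sweep frees nothing because `b` was already marked when it became unreachable; the second cycle no longer reaches it and frees it.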
Concurrent Mark and Sweep reduces the pause times observed in the parallel collector by performing some of the GC work at the same time as the application runs. It doesn't entirely remove the pauses, since part of its algorithm needs to pause the application in order to execute.