Saturday, September 12, 2009

.NET Best Practice No: 2:- Improve garbage collector performance using finalize/dispose pattern

Introduction and Goal

Ask any developer which is the best place in a .NET class to clean up unmanaged resources, and 70% of them will say the destructor. Although it looks like the most promising place for cleanup, it has a huge impact on performance and memory consumption. Writing clean-up code in the destructor leads to double GC visits, because the object must survive one collection before its memory can be reclaimed, and that slows the application down considerably.

To validate this we will start with a bit of theory and then see how GC performance is actually affected by a destructor. We will first understand the concept of generations and then look at the finalize/dispose pattern.
I am sure this article will change your thought process regarding destructors, Dispose and Finalize.




Is this Article worth reading ahead?


With this article you will understand how the performance of the GC algorithm can be improved using the finalize/dispose pattern. The figure below compares what we start with and what we will achieve by the end of this article. This article uses CLR profiler to profile how the GC works. In case you are new to CLR profiler, please read Best Practices - Part 1 before you go ahead with this one.




Assumptions

Thanks Mr. Jeffrey Richter and Peter Sollich

Let’s start this article by first thanking Mr. Jeffrey Richter for explaining in depth how the garbage collection algorithm works. He has written two legendary articles about the way the garbage collector works. I actually wanted to point to the MSDN magazine articles written by Jeffrey Richter, but for some reason they are not showing up on MSDN. So I am pointing to a different, unofficial location: you can download both articles in PDF format from http://www.cs.inf.ethz.ch/ssw/files/GC_in_NET.pdf.

Thanks also to Mr. Peter Sollich, the CLR Performance Architect, for writing such detailed help on CLR profiler. When you install CLR profiler, please do not forget to read the detailed help document written by Peter Sollich. In this article we will use CLR profiler to check how garbage collector performance is affected by Finalize.

Thanks a lot to both of you. There was no way I could have completed this article without reading the articles you have written. If you ever pass by this article, please do comment on it. I would love to hear from you.


Garbage collector – The unsung Hero


As said in the introduction, writing clean-up code in the destructor leads to double GC visits. Many developers would like to just shrug this off and say ‘Should we really worry about the GC and what it does behind the scenes?’. Actually, we should not need to worry about the GC if we write our code properly. The GC has a very good algorithm to ensure that your application is not impacted. But many times the way you have written your code and assigned or cleaned up memory resources affects the GC algorithm a lot. Sometimes this impact leads to bad GC performance and hence bad performance for your application.

So let’s first understand what tasks the garbage collector performs to allocate and clean up memory in an application. Let’s say we have 3 classes, where class ‘A’ uses class ‘B’ and class ‘B’ uses class ‘C’.

When your application first starts, a predefined block of memory (the managed heap) is allocated to it. When the application creates these 3 objects, they are placed on the managed heap one after another, each with its own memory address. You can see in the figure below how the memory looks before the objects are created and how it looks afterwards. If an object D were created next, it would be allocated from the address where object C ends.

Internally the GC maintains an object graph to know which objects are reachable. All objects are reached, directly or indirectly, from the application roots. The graph also records which object is allocated at which memory address. If an object uses other objects, then that object also holds the memory address of each object it uses. For example, in our case object A uses object B, so object A stores the memory address of object B.


Now let’s say object ‘A’ is removed from memory. The GC then compacts the heap: object ‘B’ is moved into the memory that object ‘A’ occupied and object ‘C’ is moved into the memory that object ‘B’ occupied. So the memory allocation internally looks something as shown below.


As the address pointers are updated, the GC also needs to ensure that its internal graph is updated with the new memory addresses. So the object graph becomes something as shown below. That is a fair bit of work for the GC: it needs to ensure that the removed object is taken out of the graph and that the new addresses of the surviving objects are updated throughout the object tree.



Besides its own custom objects, an application also has .NET framework objects which form part of the graph. The addresses of those objects need to be updated as well. The number of .NET runtime objects is very high. For instance, below is the number of objects created for a simple console-based hello world application. The number of objects runs into the thousands. Updating pointers for each of these objects is a huge task.





Generation algorithm – Today, yesterday and day before yesterday

The GC uses the concept of generations to improve performance. The concept of generations is based on the way human psychology handles tasks. Below are some points about how humans handle tasks and how the garbage collector algorithm works along the same lines:-
• If you decide on some task today, there is a high possibility that the task will be completed.
• If some task is pending from yesterday, then it has probably become low priority and can be delayed further.
• If some task is pending from the day before yesterday, then there is a high probability that it will stay pending forever.

The GC thinks along the same lines and makes the following assumptions:-

• If an object is new, its lifetime is likely to be short.
• If an object is old, its lifetime is likely to be long.

With that said, the GC supports three generations (Generation 0, Generation 1 and Generation 2).




Generation 0 has all the newly created objects. When the application creates objects they first fall into the Generation 0 bucket. A time comes when Generation 0 fills up, so the GC needs to run to free memory. The GC builds the object graph and eliminates any objects which are no longer used by the application. If the GC is not able to eliminate an object from generation 0, it promotes it to generation 1. If in later iterations it is not able to remove an object from generation 1, that object is promoted to generation 2. The maximum generation supported by the .NET runtime is 2.
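The promotion described above can be watched from code as well as from CLR profiler. Below is a minimal sketch, not taken from the original sample: the class name GenerationDemo and the method TrackGenerations are made up for illustration, while GC.GetGeneration, GC.Collect, GC.KeepAlive and GC.MaxGeneration are standard framework calls. It keeps one object alive across two forced collections and records which generation the runtime reports each time. The exact numbers depend on the runtime and GC settings, so none are claimed here.

```csharp
using System;

static class GenerationDemo
{
    // Hypothetical helper: records the generation of one live object
    // before any collection and after each of two forced collections.
    public static int[] TrackGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor); // freshly allocated
        GC.Collect();                         // survivor is still referenced, so it is kept
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor);               // make sure the object stays reachable up to here
        return seen;
    }

    static void Main()
    {
        Console.WriteLine("Max generation: " + GC.MaxGeneration);
        Console.WriteLine("Generations seen: " + string.Join(", ", TrackGenerations()));
    }
}
```

Forcing collections with GC.Collect is only done here to make the promotion visible; it is not something to copy into production code.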

Below is a sample display of how objects in each generation are shown when you run CLR profiler. In case you are new to CLR profiler you can catch up on the basics in Best Practices - Part 1.



Ok, so how do generations help in optimization?

As the objects are now grouped into generations, the GC can choose which generation it wants to clean. If you remember, in the previous section we talked about the assumptions the GC makes regarding object age: the GC assumes that new objects have a shorter lifetime. In other words, the GC will mostly go through generation 0 objects rather than going through all objects in all generations. If cleaning up generation 0 does not free enough memory, it then moves on to cleaning generation 1, and so on. This algorithm improves GC performance to a huge extent.
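A rough way to see that generation 0 is visited more often than the older generations is to read the per-generation collection counters. The sketch below is an illustration under my own assumptions: CollectionCountDemo, AllocateAndCount and LastBuffer are invented names, and GC.CollectionCount is the standard framework call. It creates a lot of short-lived garbage and then reports how many times each generation has been collected since the process started.

```csharp
using System;

static class CollectionCountDemo
{
    // Storing each buffer in a field keeps the allocation on the managed heap;
    // the previous buffer becomes garbage as soon as it is overwritten.
    public static byte[] LastBuffer;

    // Hypothetical helper: allocates short-lived objects and returns the
    // collection counts for generations 0, 1 and 2.
    public static int[] AllocateAndCount(int objectCount)
    {
        for (int i = 0; i < objectCount; i++)
        {
            LastBuffer = new byte[64];
        }
        return new[]
        {
            GC.CollectionCount(0),
            GC.CollectionCount(1),
            GC.CollectionCount(2)
        };
    }

    static void Main()
    {
        int[] counts = AllocateAndCount(100 * 10000);
        Console.WriteLine("Gen 0 collections: " + counts[0]);
        Console.WriteLine("Gen 1 collections: " + counts[1]);
        Console.WriteLine("Gen 2 collections: " + counts[2]);
    }
}
```

Because collecting an older generation also collects the younger ones, the gen 0 count can never be lower than the gen 1 or gen 2 count.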


Conclusion about generations
• A huge number of objects in Gen 1 and Gen 2 means memory utilization is not optimized.
• The larger the Gen 1 and Gen 2 regions are, the worse the GC algorithm performs.

Using finalize/destructor leads to more objects in Gen 1 and Gen 2

The C# compiler translates (renames) the destructor into Finalize. If you look at the IL code using ILDASM you can see that the destructor has been renamed to Finalize. So let’s try to understand why implementing a destructor leads to more objects in the gen 1 and gen 2 regions. Here’s how the process actually works:-
• When new objects are created they are allocated in gen 0. Objects which have a Finalize method are also registered in the finalization queue at this point.
• When gen 0 fills up, the GC runs and tries to reclaim memory.
• Unreachable objects which do not have a destructor are simply cleaned up.
• Unreachable objects which have a Finalize method cannot be cleaned up yet. The GC moves their entries from the finalization queue to the ‘Freachable’ queue. That queue keeps them alive, so they survive this collection and are promoted to the next generation.
• GC work is finished for this iteration.
• A separate finalizer thread then calls Finalize on each object in the Freachable queue and removes it from the queue.
• The next time the GC collects the generation where the object now lives, the object is no longer reachable from the Freachable queue and its memory is finally claimed back.
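The two visits described in the steps above can be observed with a weak reference that tracks resurrection. This is a sketch under my own assumptions, not code from the original sample: Tracked, FinalizationDemo, CreateTracked and Run are illustrative names, while WeakReference, GC.Collect and GC.WaitForPendingFinalizers are standard framework types and calls. The first flag reports whether the object was still in memory after the first collection, the second whether it was still there after the finalizer had run and a second collection had happened. The outcome depends on the runtime and build settings, so no output is claimed.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

class Tracked
{
    public static int FinalizedCount;

    // The destructor compiles to Finalize, so instances go through the finalization queue.
    ~Tracked()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

static class FinalizationDemo
{
    // Created in a separate, non-inlined method so that no local variable
    // in the caller keeps the object reachable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static WeakReference CreateTracked()
    {
        // trackResurrection = true: the weak reference stays valid until
        // the object's memory is actually reclaimed.
        return new WeakReference(new Tracked(), true);
    }

    public static bool[] Run()
    {
        WeakReference weak = CreateTracked();

        GC.Collect();                        // first visit: object is queued for finalization
        bool aliveAfterFirstCollect = weak.IsAlive;

        GC.WaitForPendingFinalizers();       // finalizer thread runs ~Tracked()
        GC.Collect();                        // second visit: memory can now be reclaimed
        bool aliveAfterSecondCollect = weak.IsAlive;

        return new[] { aliveAfterFirstCollect, aliveAfterSecondCollect };
    }

    static void Main()
    {
        bool[] result = Run();
        Console.WriteLine("Alive after first collection:  " + result[0]);
        Console.WriteLine("Alive after second collection: " + result[1]);
        Console.WriteLine("Finalizers run: " + Tracked.FinalizedCount);
    }
}
```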




In other words, objects which have a destructor stay in memory longer.
Let’s try to see the same thing in practice. Below is a simple class which has a destructor.
class clsMyClass
{
    public clsMyClass()
    {
    }

    ~clsMyClass()
    {
    }
}
We will create 100 * 10000 (one million) objects in a loop and monitor them using CLR profiler.
for (int i = 0; i < 100 * 10000; i++)
{
    clsMyClass obj = new clsMyClass();
}






If you look at the CLR profiler ‘memory by address’ report you will see a lot of objects in gen 1.

Now let’s remove the destructor and do the same exercise.
class clsMyClass
{
    public clsMyClass()
    {
    }
}




You can see that gen 0 has increased considerably while gen 1 and gen 2 hold far fewer objects.

A one-to-one comparison looks something like the figure below.
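If you do not have CLR profiler at hand, a rough version of the same comparison can be made from code. The sketch below is my own illustration, not part of the original sample: WithFinalizer, WithoutFinalizer, FinalizerCostDemo, TimeAllocations and Sink are invented names, while Stopwatch and GC.CollectionCount are standard framework APIs. It allocates the same number of objects with and without a destructor and prints the elapsed time and the number of gen 1 collections for each run. The numbers vary by machine and runtime, so treat them as an indication only.

```csharp
using System;
using System.Diagnostics;

class WithFinalizer
{
    ~WithFinalizer() { }   // empty destructor, as in the class above
}

class WithoutFinalizer
{
}

static class FinalizerCostDemo
{
    // Writing to a field keeps every allocation on the managed heap.
    public static object Sink;

    // Hypothetical helper: times 'count' allocations made by 'factory'.
    public static long TimeAllocations(Func<object> factory, int count)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            Sink = factory();
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int count = 100 * 10000;

        int gen1Start = GC.CollectionCount(1);
        long msWith = TimeAllocations(() => new WithFinalizer(), count);
        int gen1With = GC.CollectionCount(1) - gen1Start;

        gen1Start = GC.CollectionCount(1);
        long msWithout = TimeAllocations(() => new WithoutFinalizer(), count);
        int gen1Without = GC.CollectionCount(1) - gen1Start;

        Console.WriteLine("With destructor:    " + msWith + " ms, gen 1 collections: " + gen1With);
        Console.WriteLine("Without destructor: " + msWithout + " ms, gen 1 collections: " + gen1Without);
    }
}
```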


Get rid of the destructor by using Dispose

We can get rid of the destructor cost by implementing our clean-up code in the Dispose method. For that we need to implement the ‘IDisposable’ interface, write our clean-up code in its Dispose method and call the ‘SuppressFinalize’ method as shown in the code snippet below. ‘SuppressFinalize’ tells the GC not to call the Finalize method for this object, so the double GC visit does not happen.
class clsMyClass : IDisposable
{
    public clsMyClass()
    {
    }

    ~clsMyClass()
    {
    }

    public void Dispose()
    {
        // Clean up code goes here.

        // Tell the GC that Finalize no longer needs to run for this object.
        GC.SuppressFinalize(this);
    }
}




The client now needs to ensure that it calls the dispose method as shown below.
for (int i = 0; i < 100; i++)
{
    clsMyClass obj = new clsMyClass();
    obj.Dispose();
}
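Calling Dispose by hand works only as long as nothing throws between the creation of the object and the Dispose call. C# has the using statement for exactly this case: it calls Dispose when the block is left, even if an exception is thrown. Below is a small sketch with invented names (Resource, UsingDemo, UseAndFail) showing that the object is disposed although the block fails.

```csharp
using System;

class Resource : IDisposable
{
    public bool Disposed { get; private set; }

    public void Dispose()
    {
        // Clean up code would go here.
        Disposed = true;
    }
}

static class UsingDemo
{
    // Hypothetical helper: throws inside a using block and returns the
    // resource so the caller can check whether Dispose was called.
    public static Resource UseAndFail()
    {
        Resource captured = null;
        try
        {
            using (Resource r = new Resource())
            {
                captured = r;
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // The using block has already called Dispose by the time we get here.
        }
        return captured;
    }

    static void Main()
    {
        Console.WriteLine(UseAndFail().Disposed); // prints True
    }
}
```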




Below is a comparison of how the Gen 0 and Gen 1 distribution looks with the destructor and with Dispose. You can see there is a marked improvement in gen 0 allocation, which signifies good memory allocation.



What if developers forget to call Dispose?

It’s not a perfect world. We cannot ensure that the Dispose method is always called by the client. That is where the finalize/dispose pattern comes in, as explained in this section.

There is a detailed implementation of this pattern at
http://msdn.microsoft.com/en-us/library/b1yfkh5e(VS.71).aspx.

Below is what an implementation of the finalize/dispose pattern looks like.

class clsMyClass : IDisposable
{
    public clsMyClass()
    {
    }

    ~clsMyClass()
    {
        // In case the client forgets to call Dispose, the destructor
        // is invoked and frees the unmanaged resources only.
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            // Free managed objects.
        }

        // Free unmanaged objects.
    }

    public void Dispose()
    {
        Dispose(true);

        // Ensure that the destructor is not called.
        GC.SuppressFinalize(this);
    }
}






Explanation of the code:-
• We have defined a method called Dispose which takes a Boolean flag. The flag says whether the method was called from Dispose or from the destructor. If it is called from the ‘Dispose’ method then we can free both managed and unmanaged resources.
• If the method is called from the destructor then we free only the unmanaged resources.
• In the public Dispose method we call Dispose with true and then suppress finalization.
• In the destructor we call Dispose with false. In other words we assume that the GC will take care of the managed resources and let the destructor call clean up only the unmanaged resources.
In other words, if the client does not call Dispose, the destructor will still take care of cleaning up the unmanaged resources.
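To make the skeleton concrete, here is a minimal sketch of the same pattern wrapped around a real unmanaged allocation. The class name NativeBuffer, its IsDisposed property and the disposed flag are my own additions for illustration; Marshal.AllocHGlobal and Marshal.FreeHGlobal are standard framework calls for allocating and freeing unmanaged memory. The flag makes Dispose safe to call more than once, which the skeleton above does not guard against.

```csharp
using System;
using System.Runtime.InteropServices;

class NativeBuffer : IDisposable
{
    private IntPtr handle;   // unmanaged memory owned by this object
    private bool disposed;   // assumption: added so repeated Dispose calls are harmless

    public NativeBuffer(int bytes)
    {
        handle = Marshal.AllocHGlobal(bytes);
    }

    public bool IsDisposed
    {
        get { return disposed; }
    }

    ~NativeBuffer()
    {
        // Safety net: runs only if the client never called Dispose.
        Dispose(false);
    }

    public void Dispose()
    {
        Dispose(true);
        // Clean up is done, so the GC can skip finalization for this object.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed)
        {
            return;
        }

        if (disposing)
        {
            // Free managed objects here (none in this sketch).
        }

        // Free unmanaged objects.
        if (handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(handle);
            handle = IntPtr.Zero;
        }

        disposed = true;
    }
}

static class NativeBufferDemo
{
    static void Main()
    {
        NativeBuffer buffer = new NativeBuffer(1024);
        buffer.Dispose();
        buffer.Dispose();                         // second call does nothing
        Console.WriteLine(buffer.IsDisposed);     // prints True
    }
}
```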


Conclusion


• Do not have empty destructors in your classes.
• In case you need to clean up, use the finalize/dispose pattern and call the ‘SuppressFinalize’ method.
• If a class exposes a Dispose method, ensure that your client code calls it.
• An application should have more objects allocated in Gen 0 than in Gen 1 and Gen 2. A large number of objects in Gen 1 and Gen 2 is a sign that the GC algorithm is not working efficiently.


Source code

You can download the sample source code for the dispose pattern from here.
