Today we’re going to see how unmanaged resources are handled by .NET, what the finalization and fReachable queues are, and what role the garbage collector plays in all of this. We’ll also get to know the dispose pattern and see how to implement it.
Unmanaged resources and finalizers
First of all, we should get familiar with what unmanaged resources are. These can be files or folders on disk, database connections, GUI elements or network resources. You can also think of them as external elements that our applications interact with through some protocol (e.g. database queries or file system manipulation methods). They’re called unmanaged for a reason: unmanaged resources accessed from .NET code are not cleaned up automatically by the garbage collector. That is what makes them unmanaged.
Maybe you’ve heard about destructors in C++: methods named after the class (like its constructor) and prefixed with the ‘~’ character. The same syntax exists in .NET, but it doesn’t behave the way you’d expect coming from C++. In .NET, such methods are referred to as finalizers.
Finalizers and destructors in .NET
In C# there are two ways you might try to write one: using the “destructor-like” syntax, or by declaring a Finalize() method yourself.
If you examine the IL code of both versions, it turns out that the destructor can be seen as syntactic sugar for the finalizer. In fact, the ~MyTestClass() method is compiled into a Finalize() method.
The difference is that the Finalize() method the compiler generates from the destructor implicitly calls Finalize() on the object’s base class (in a finally block in the generated IL). So with the destructor-like syntax we really do create a finalizer by overriding Object.Finalize(), whereas a Finalize() method written by hand behaves as if it were declared with the new keyword: it hides the base class’s Finalize() instead of overriding it. The C# compiler warns about this (CS0465), and because such a method is not an override, the runtime does not treat it as a finalizer. The destructor-like syntax is therefore the one to use.
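One way to check the override-versus-hiding difference without reading IL is reflection. The sketch below is only an illustration: the class names WithDestructor and WithFinalizeMethod and the helper FinalizerInspector are hypothetical and not taken from the original listings.

```csharp
using System;
using System.Reflection;

// Destructor-like syntax: the compiler emits a protected override of Object.Finalize().
public class WithDestructor
{
    ~WithDestructor() { }
}

// Hand-written Finalize(): it only hides Object.Finalize().
// The pragma silences the hiding-related warnings (e.g. CS0465) for this demo.
#pragma warning disable CS0465, CS0114, CS0108
public class WithFinalizeMethod
{
    protected void Finalize() { }
}
#pragma warning restore CS0465, CS0114, CS0108

public static class FinalizerInspector
{
    // True when the type's own Finalize() sits in Object.Finalize()'s virtual slot,
    // i.e. when it is a real override rather than a method that merely shares the name.
    public static bool OverridesObjectFinalize(Type type)
    {
        MethodInfo method = type.GetMethod(
            "Finalize",
            BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.DeclaredOnly);

        return method != null
            && method.IsVirtual
            && method.GetBaseDefinition().DeclaringType == typeof(object);
    }
}
```

Calling FinalizerInspector.OverridesObjectFinalize(typeof(WithDestructor)) should report an override, while the same call for WithFinalizeMethod should not, because its Finalize() is a separate, non-virtual method.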
Will GC call the finalizer for me?
As you may know, finalizer methods are normally used to clean up unmanaged resources (e.g. to release a file handle or a database connection). You may think that as soon as your object is ready to be collected (nothing references it anymore), the garbage collector will first call the finalizer (to ensure everything is cleaned up) and then reclaim the memory used by the object.
Unfortunately not. If you think about it once more, the GC cannot know what’s inside your Finalize() method. You could put code there that downloads the whole Internet:
… and even though we have a bit more bandwidth nowadays than in the gif 🙂, you can clearly see it wouldn’t be a good idea to block garbage collection while some costly process runs before the object’s memory is reclaimed.
To avoid that scenario, which would slow down the GC’s work significantly, the .NET designers decided that finalizers are called periodically on a separate thread, independently of the GC.
This way the main GC thread is not blocked. However, it adds a bit of complexity to the garbage collection process – continue reading for more details.
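To see the separate thread in action, the sketch below records the managed thread id inside a finalizer and compares it with the caller’s. All names (ThreadReporter, FinalizerThreadDemo) are hypothetical. GC.Collect() and GC.WaitForPendingFinalizers() are used only to make the demo observable; in real code finalization timing is up to the runtime.

```csharp
using System;
using System.Runtime.CompilerServices;

public class ThreadReporter
{
    // Managed thread id observed inside the finalizer; stays -1 until it has run.
    public static volatile int FinalizerThreadId = -1;

    ~ThreadReporter()
    {
        FinalizerThreadId = Environment.CurrentManagedThreadId;
    }
}

public static class FinalizerThreadDemo
{
    // Allocate in a separate, non-inlined method so that no local variable
    // in Run() keeps the object alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void AllocateAndDrop()
    {
        new ThreadReporter();
    }

    public static (int CallerThreadId, int FinalizerThreadId) Run()
    {
        AllocateAndDrop();
        GC.Collect();                   // force a collection (demo only)
        GC.WaitForPendingFinalizers();  // block until pending finalizers have run
        return (Environment.CurrentManagedThreadId, ThreadReporter.FinalizerThreadId);
    }
}
```

Printing the tuple returned by FinalizerThreadDemo.Run() should show two different thread ids: the finalizer did not run on the thread that triggered the collection.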
As we said in one of the previous posts, one of the various sources of references to objects in our .NET application (GC roots) is “objects finalization references”. Let’s now make this mysterious term clear.
To prevent a finalizable object (one with a finalizer method) from being reclaimed before its Finalize() method is called (which, as you already know, happens independently of the GC), the GC maintains a separate list of references called the finalization queue. As soon as memory for a new finalizable object is allocated, a reference to it is also put on the finalization queue, which becomes a finalization root for this object (it’s not the same kind of root as “normal” GC roots and is treated a bit differently).
The diagram below presents the memory state just after a few new objects were allocated on the managed heap.
As you can see, objects ob2, ob3, ob5, ob6 and ob10 are all placed on the finalization queue, which means each of them defines a Finalize() method.
Within these objects, ob2 and ob5 don’t have any references (roots) – it’s visible on the left side (heap). It means that these two objects are not used (are not referenced by anything) so they’re ready to be garbage collected. However, each of them contains a finalization root on the finalization queue.
When the next garbage collection occurs, ob2 and ob5 cannot be reclaimed straight away, as there’s a Finalize() method to be called on them first. Instead, when the GC examines these two objects, it sees that they are not referenced from “normal” roots, but that they still have finalization references on the finalization queue. In that case, the GC moves each finalization reference to another data structure called the fReachable queue (also visible on the diagram above).
Afterwards, both queues and the heap look as follows:
That’s what I meant by saying that finalization roots are a “special” kind of GC root. Because of the references on the fReachable queue, ob2 and ob5 are still seen as “rooted” (still referenced), so the GC does not reclaim them. At that point, standard generational GC rules apply, so if the collection above was a gen 1 collection, both objects are promoted to gen 2.
Periodically, the finalization thread walks the fReachable queue, calls the Finalize() method on each object referenced from it and then removes that object’s reference from the queue. At this moment, the finalizable object becomes rootless and can be reclaimed in the next garbage collection cycle.
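The extra garbage collection round can be observed with a “long” weak reference, which stays valid until the object’s memory is actually reclaimed. This is a minimal sketch under that assumption. The names Finalizable and FinalizationRoundsDemo are hypothetical, and the forced collections are there only for demonstration.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Finalizable
{
    ~Finalizable() { }
}

public static class FinalizationRoundsDemo
{
    // Create the object in a non-inlined helper so that only the weak reference survives.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference CreateTracked()
    {
        // trackResurrection: true makes this a long weak reference, which keeps
        // tracking the object even after it has been queued for finalization.
        return new WeakReference(new Finalizable(), trackResurrection: true);
    }

    public static (bool AliveAfterFirstGc, bool AliveAfterSecondGc) Run()
    {
        WeakReference tracked = CreateTracked();

        GC.Collect();                    // 1st round: the reference moves to the fReachable queue
        bool afterFirst = tracked.IsAlive;

        GC.WaitForPendingFinalizers();   // the finalization thread runs ~Finalizable()
        GC.Collect();                    // 2nd round: nothing roots the object anymore
        bool afterSecond = tracked.IsAlive;

        return (afterFirst, afterSecond);
    }
}
```

FinalizationRoundsDemo.Run() should report the object as still alive after the first collection and gone after the second, matching the two-step process described above.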
Dispose to the rescue
To unify the usage of finalizers and allow developers to free their resources correctly without unnecessarily prolonging their objects’ lifetime (as you saw above, finalizable objects stay uncollected for at least one more GC round than “normal” objects), the programmer can (and should) implement the dispose pattern. Its purpose is to let the developer free unmanaged resources “manually” as soon as they are no longer needed.
The framework also provides a GC.SuppressFinalize method, which tells the GC that the object has been disposed manually and no further finalization is necessary. As soon as it’s called, the object’s reference is removed from the finalization/fReachable queue. The goal is to prevent the cleanup from running a second time in the finalizer.
.NET Framework provides a System.IDisposable interface which should be implemented by a class which works with unmanaged resources.
A sample implementation of the dispose pattern can look as follows:
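What follows is a minimal sketch rather than a definitive implementation. The names UnmanagedClass and CleanUp(bool disposing) follow the text; the unmanaged buffer allocated with Marshal.AllocHGlobal, the _disposed flag and the IsDisposed property are assumptions added for illustration. The finalizer is written with the destructor-like syntax, which the compiler turns into Finalize().

```csharp
using System;
using System.Runtime.InteropServices;

public class UnmanagedClass : IDisposable
{
    private IntPtr _buffer;   // hypothetical unmanaged resource
    private bool _disposed;   // guards against cleaning up twice

    public UnmanagedClass(int size)
    {
        _buffer = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed => _disposed;

    // Called explicitly from code (directly or through a using statement).
    public void Dispose()
    {
        CleanUp(disposing: true);
        GC.SuppressFinalize(this);   // the finalizer no longer needs to run
    }

    // Last resort: runs on the finalization thread if Dispose() was never called.
    ~UnmanagedClass()
    {
        CleanUp(disposing: false);
    }

    protected virtual void CleanUp(bool disposing)
    {
        if (_disposed)
        {
            return;
        }

        if (disposing)
        {
            // Thread-specific or context-dependent cleanup would go here;
            // it is skipped when we are called from the finalization thread.
        }

        // Unmanaged resources are released on both paths.
        if (_buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_buffer);
            _buffer = IntPtr.Zero;
        }

        _disposed = true;
    }
}
```

The _disposed flag makes a second Dispose() call a no-op, so callers do not have to track whether the object has already been cleaned up.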
Implementing the IDisposable interface only forces us to add a Dispose() method. Inside it, we perform the resource clean-up and call GC.SuppressFinalize(this) to tell the GC that our object no longer needs to be finalized. We also implemented a separate method, called CleanUp(bool disposing) in the example, which performs the actual release of unmanaged resources. But why do we need this bool disposing parameter?
It may be important to differentiate whether the cleanup is called directly from code (we’ll see how that can be done in the next section) or by the finalization thread. As you already know, finalization is executed periodically on a separate thread. This means that if our UnmanagedClass needs any thread-specific cleanup, it’s not a good idea to execute it on the totally independent finalization thread, which is out of context at that moment.
That’s why we added a boolean parameter to the CleanUp method, which is set to “true” when the Dispose() method is called directly from code. In that case we know that we can execute thread-specific cleanup code.
Additionally, we implemented the Finalize() method, which also calls CleanUp(), but passes “false” as the argument. In that case, as the finalization thread is out of context, we shouldn’t execute thread-specific cleanup code. We should treat the Finalize() method as a last resort in case Dispose() has not been called for some reason.
You can also check the Microsoft docs to see their recommendation on implementing the IDisposable interface.
As mentioned above, the programmer can explicitly call the Dispose() method on the object,
but there’s another, preferred way: wrapping the usage of the object in a using statement.
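Both variants can be sketched side by side. TrackedResource below is a hypothetical stand-in for any IDisposable type; it only records whether Dispose() was called, so the two call styles can be compared.

```csharp
using System;

// Hypothetical disposable type that only records whether Dispose() was called.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose() => IsDisposed = true;
}

public static class DisposeCallDemo
{
    // Variant 1: call Dispose() explicitly once the work is done.
    public static bool DisposeExplicitly()
    {
        var resource = new TrackedResource();
        // ... work with the resource ...
        resource.Dispose();
        return resource.IsDisposed;
    }

    // Variant 2: let the using statement call Dispose() when the block is left.
    public static bool DisposeWithUsing()
    {
        TrackedResource observed;
        using (var resource = new TrackedResource())
        {
            // ... work with the resource ...
            observed = resource;
        }
        return observed.IsDisposed;
    }
}
```

In both methods the resource ends up disposed. The explicit variant, however, skips Dispose() if the work in between throws, which is one reason to prefer using.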
How does the using statement ensure that Dispose() is called on the object created within it?
Let’s see the IL:
As you can see, the compiler translates the using statement into a try-finally block. In the finally part, the Dispose() method is called on our object’s instance.
This way the compiler ensures that even if there’s any exception thrown in the using block, the object will be disposed. That’s why we should always work with objects that may use unmanaged resources in such a way.
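The same translation can be written by hand in C#. The sketch below compares a using statement with a roughly equivalent try-finally block when an exception is thrown inside; Probe and UsingExpansionDemo are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical disposable type that records whether Dispose() was called.
public sealed class Probe : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose() => IsDisposed = true;
}

public static class UsingExpansionDemo
{
    // The using statement: Dispose() runs even though the block throws.
    public static bool UsingStatement()
    {
        var probe = new Probe();
        try
        {
            using (probe)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed here only so the demo can inspect the probe afterwards.
        }
        return probe.IsDisposed;
    }

    // Roughly what the compiler generates for the using block above.
    public static bool ManualTryFinally()
    {
        var probe = new Probe();
        try
        {
            try
            {
                throw new InvalidOperationException("simulated failure");
            }
            finally
            {
                if (probe != null)
                {
                    probe.Dispose();
                }
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed here only so the demo can inspect the probe afterwards.
        }
        return probe.IsDisposed;
    }
}
```

Both methods report the probe as disposed, even though the exception left the block early.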
In this 7th post of the .NET Internals series we examined how unmanaged resources are handled within the .NET Framework and what the GC’s role in that process is. We got to know two new GC-internal data structures: the finalization and fReachable queues. In the end, we got familiar with the dispose pattern and the using statement.
I hope it helped you to clarify these topics a bit 😉
See you next week! 🙂