Sunday, January 25, 2015

Troubleshooting Memory Leaks in Production


Overview

If you're a .NET developer, you may never need to investigate an application that is leaking memory. If so, consider yourself very fortunate. However, if you do find yourself with a memory leak in production that you can't reproduce on your development machine, this post will give you a good starting point.

Memory Leaks in .NET

I could write a whole blog post on just the different types of leaks in .NET. Instead, here is an excellent post that covers them.

http://alexatnet.com/articles/memory-leaks-in-net-applications

Monitoring For Leaks

Perfmon is your best friend here. It has performance counters for just about anything you would want to monitor. For capturing memory leaks, there are a few counters you will want to set up and watch. The ones this post relies on are Private Bytes (under Process), # Bytes in all Heaps, Gen 2 heap size and Large Object Heap size (under .NET CLR Memory), and Current Assemblies (under .NET CLR Loading).
Monitor first, because it narrows down the potential causes of the leak and lets you focus your search. I've listed the common behaviors of a leaking application below.



Private Bytes Increasing, Bytes in All Heaps remains constant

You have an unmanaged code leak. Are there any parts of the application that use COM or external non-.NET libraries?

Bytes in All Heaps Increasing

You have a leak in managed code. There are many different things that could be causing this. Take a look at both the Gen 2 heap size and the Large Object Heap size. Whichever one is growing tells you whether you are looking for a large object that is not being collected or for many smaller objects.

Current Assemblies Increasing

If your current assemblies are going up over time, you are leaking dynamically generated assemblies. There are a few different things that could be causing this, but the most obvious is XmlSerializer.
There are only two constructors, XmlSerializer(Type) and XmlSerializer(Type, String), where the dynamic assembly is cached correctly. If you use any of the other constructors, a new dynamic assembly is created every time you create a new instance of the serializer. So make sure you create each of those serializers once and reuse that single instance.
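
The three behaviors above boil down to a small decision procedure. Here is a rough sketch of it in Python, assuming you have copied a handful of counter readings out of a Perfmon log by hand. The Sample class, the classify function, and the 10% growth threshold are all made up for illustration, and comparing only the first and last reading is cruder than looking at the whole trend.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One reading of the counters, copied by hand from a Perfmon log (hypothetical shape)."""
    private_bytes: int   # Private Bytes
    heap_bytes: int      # Bytes in All Heaps
    gen2_bytes: int      # Gen 2 heap size
    loh_bytes: int       # Large Object Heap size
    assemblies: int      # Current Assemblies


def grew(first: int, last: int, threshold: float) -> bool:
    """True when the value rose by more than `threshold` (a fraction of the first reading)."""
    if first <= 0:
        return last > 0
    return (last - first) / first > threshold


def classify(samples: List[Sample], threshold: float = 0.10) -> List[str]:
    """Compare the first and last samples and name the behaviors that match."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to see a trend")
    first, last = samples[0], samples[-1]
    findings = []
    if grew(first.assemblies, last.assemblies, threshold):
        findings.append("dynamic assembly leak")
    if grew(first.heap_bytes, last.heap_bytes, threshold):
        # Managed leak: see which heap accounts for more of the growth.
        loh_growth = last.loh_bytes - first.loh_bytes
        gen2_growth = last.gen2_bytes - first.gen2_bytes
        where = "large objects" if loh_growth > gen2_growth else "smaller objects"
        findings.append("managed leak, mostly " + where)
    elif grew(first.private_bytes, last.private_bytes, threshold):
        # Private Bytes rose while the managed heaps stayed flat.
        findings.append("unmanaged leak")
    return findings or ["no obvious leak"]
```

Treat the result as a hint about where to look first, not as a diagnosis. The threshold in particular needs tuning for how long you monitored and how much your application's memory normally moves around.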

Memory Leak Tools

There are several tools out there, some paid and some free, that help find memory leaks. Redgate and SciTech have paid offerings, and those tools are very nice. You can go this route, if you have the money to spend, and do pretty well. However, I don't think you can ever have too many tools in your toolbox, and there are some free tools that are just as useful as their paid counterparts.

Process Explorer

Even though Process Explorer isn't a memory leak tool, it is worth mentioning, as every developer should be familiar with it. It has a wealth of information about running processes. Particularly useful are the ability to view real-time CPU usage per thread (nice when tracking down a hang) and the list of assemblies loaded in memory. If you suspect an assembly leak, you can use the latter to see how many dynamic assemblies have been loaded.

VMMap

Another useful Sysinternals tool. It shows you a live view of the memory your process is using, broken down by type of memory and by address range.

PerfView

This is another really nice performance tool from Microsoft. It's a bit complicated to use, but it makes up for that with the powerful features it offers. It can capture memory dumps from a running process without pausing the process (very nice in production). It can also profile more than just memory: it collects real-time information about GC and profiles the CPU usage of your application. Channel 9 has some great tutorials about it, and I would go watch them to get an understanding of how it works.

Debug Diag

This tool is a must-have in your toolbox, especially if you are trying to get good dumps from a service running under IIS. It allows you to dump processes, set up triggers to take dumps, and so on. It also has analysis tools that let you quickly get a report about a dump. If you set it to monitor for leaks, it uses a nifty heuristic to calculate the likelihood of a module causing the leak. It doesn't have the drill-down capabilities of the more complicated tools, but it's very easy to use and gives a nice overview of what's going on in your application at the time the dump was created.

WinDbg

This is the oldest and most well-known debugging tool. It's a powerful tool, but it's also the most complicated of all the tools here, and it is a challenge to get set up and running correctly. I personally would only use it as a last resort, if you need some information that isn't available from any of the other tools.


The Hard Part
Now you're at the hard part. You're pretty sure you have a leak somewhere, and you actually need to find it. If you look at some of the simple tutorials, they make it look so easy. I've never had a memory leak that was really easy to find, usually because I am looking at a WCF or ASP.NET application, which seems to make it more difficult to track down. Here is a rough idea of how to proceed if you're not using a commercial tool that has one-click profiling and gathering.

Learn your tools
Before you march off to production and start capturing information, spend some time learning about the tools you are using, especially how to capture dumps correctly. You want to make sure you are capturing a dump that you can actually use, so you don't waste time there. You also want to make sure that the method you use to capture your dump doesn't impact the business. Nobody wants to be responsible for killing the uptime chart because they didn't understand how their dump is captured. You also need to have a basic understanding of how GC works and how leaks can occur.

Capture dumps over time
If you are using DebugDiag, you want to enable the memory leak detection. You want to capture a few dumps spread out over some amount of time. Try to capture the first dump after your application or service has warmed up and initialized. That will reduce the amount of noise when you compare two dumps.

Diff the Dumps
Looking at a single dump only provides a snapshot in time. It gives you no idea of how the application's memory was changing when the dump was created. Tools like PerfView, DebugDiag, and the paid tools can take two dumps and diff them, which lets you see what has changed over time.
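
To make the idea concrete, here is a minimal sketch of what such a diff boils down to, written in Python. It assumes you have somehow gotten a per-type summary (type name, instance count, total bytes) out of each dump. The dictionary format, the diff_histograms function, and the type names in the usage example are all hypothetical, and the real tools do this for you with far more detail.

```python
from typing import Dict, List, Tuple

# Hypothetical per-dump summary: type name -> (instance count, total bytes).
Histogram = Dict[str, Tuple[int, int]]


def diff_histograms(before: Histogram, after: Histogram) -> List[Tuple[str, int, int]]:
    """Return (type name, change in count, change in bytes), biggest byte growth first."""
    rows = []
    for type_name in set(before) | set(after):
        count_before, bytes_before = before.get(type_name, (0, 0))
        count_after, bytes_after = after.get(type_name, (0, 0))
        delta_count = count_after - count_before
        delta_bytes = bytes_after - bytes_before
        if delta_count or delta_bytes:  # skip types that did not change
            rows.append((type_name, delta_count, delta_bytes))
    # Sort by byte growth descending, then by name so the order is stable.
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows


if __name__ == "__main__":
    first_dump = {"System.String": (1000, 40000), "MyApp.OrderCache": (10, 5000)}
    second_dump = {"System.String": (1100, 44000), "MyApp.OrderCache": (900, 450000)}
    for name, d_count, d_bytes in diff_histograms(first_dump, second_dump):
        print(name, d_count, d_bytes)
```

The types at the top of that list are where the memory went between the two dumps, which is usually the first thing you want to know.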

Filter out the Noise
This is the hard part. You need to have a good grasp of your application to understand what is noise and what is not. If the issue isn't obvious, you might spend a fair amount of time doing this. It is useful work though, because if you ever have a second memory leak, you will already know what is most likely still noise.
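
One way to keep that knowledge around is to write it down as a list of type-name prefixes you have already ruled out, and apply it to every new diff. A small Python sketch of that idea follows. The filter_noise function, the row layout (type name, change in count, change in bytes), and the prefixes in the usage example are illustrative assumptions, and deciding what counts as noise is still up to you.

```python
from typing import Iterable, List, Tuple

Row = Tuple[str, int, int]  # (type name, change in count, change in bytes)


def filter_noise(rows: Iterable[Row],
                 noise_prefixes: Iterable[str],
                 min_delta_bytes: int = 0) -> List[Row]:
    """Drop rows for types already known to be noise and rows that grew too little."""
    prefixes = tuple(noise_prefixes)
    return [
        row for row in rows
        if row[2] >= min_delta_bytes and not row[0].startswith(prefixes)
    ]


if __name__ == "__main__":
    diff = [
        ("MyApp.OrderCache", 890, 445000),
        ("MyApp.Logging.LogEntry", 300, 12000),
        ("System.Threading.Timer", 2, 96),
    ]
    # Prefixes ruled out during an earlier investigation (hypothetical).
    known_noise = ["MyApp.Logging.", "System.Threading."]
    print(filter_noise(diff, known_noise, min_delta_bytes=1024))
```

Keep the list in source control next to your notes from the investigation, so the next person hunting a leak starts from what you already learned.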



Conclusion
I hope you found this article useful as a starting point for investigating memory leaks. There is no magic bullet when it comes to finding and resolving leaks, and you may spend days or weeks tracking them down. Practice makes it easier, but hopefully you don't have much opportunity to practice. Like everything else, it just takes hard work, tenacity, and good Google skills.
