Debugging

When debugging, it is important to understand the bug you are hunting: it cannot be properly fixed until it is completely understood. The first step is always to find a way to reproduce the bug, unless you are lucky and the application crashed while you were running it in the debugger. Most of the time, however, someone else finds the bug and, worse, it did not crash the application but only caused unexpected behavior.

Once the bug is reproducible in a testing environment you might understand right away why it happens, but most of the time you won't. The next step is then to limit the domain, that is, to narrow down where the fault can be. If the application crashes, you already have a smaller domain. Otherwise, setting some breakpoints or watchpoints might help to find when and where the values go wrong.

Memory related debugging
A common symptom of corrupt memory is that the application crashes randomly, or that the debugger breaks at a different location on every crash. Another common symptom is a crash when a pointer is dereferenced. The first thing to check is always whether the pointer is valid! Later I will explain a technique for quickly checking any pointer, but some simpler tips first. Make sure that the pointer in question is initialized to NULL (preferably with an initializer list), and set the pointer to NULL after it has been freed if there is a chance it will be accessed again. That is almost always the case, unless the pointer is freed in a destructor. An uninitialized pointer can point to anything; such a pointer is often called a wild pointer. A pointer whose pointee was freed is called a dangling pointer or dangling reference, and this text uses the term dangling reference for both kinds. Accessing a dangling reference in any way is undefined behavior: it often causes a SIGSEGV, but it can just as well silently read or overwrite unrelated memory.
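
The two tips above can be sketched as follows. This is only an illustration: the class Widget and all of its members are hypothetical names, not taken from any real code base.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class, used only to illustrate the two tips.
class Widget {
public:
    // The initializer list guarantees data_ never holds a random value.
    Widget() : data_(NULL), size_(0) {}

    // No need to reset data_ here: the object is gone afterwards.
    ~Widget() { delete[] data_; }

    void allocate(std::size_t n) {
        release();
        data_ = new int[n]();  // value-initialized, so all elements are 0
        size_ = n;
    }

    void release() {
        delete[] data_;  // delete[] on NULL is a harmless no-op
        data_ = NULL;    // the object lives on, so the pointer must be reset
        size_ = 0;
    }

    bool has_data() const { return data_ != NULL; }

    int& at(std::size_t i) {
        // Because data_ is NULL whenever it is not valid, misuse is caught here
        // instead of somewhere random later on.
        assert(data_ != NULL && i < size_);
        return data_[i];
    }

private:
    Widget(const Widget&);             // copying disabled: two owners would
    Widget& operator=(const Widget&);  // free the same memory twice

    int*        data_;
    std::size_t size_;
};
```

With this discipline a pointer is either NULL or valid, so a simple NULL check in the debugger already rules out a whole class of errors.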

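Once you have made sure that the pointer is NULL unless it points to a valid pointee, the next step is to see whether it is NULL or not. If it is, there is your problem. If it is not NULL but still does not point to a valid pointee, chances are that the memory got corrupted in some way. The most common way is a buffer overflow. Take, for instance, a class Foo that holds a character array with room for 9 characters plus the terminating null, followed by a pointer member fred, and whose method Foo::bar copies a string into the array without checking its length.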
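
The example can be sketched as follows. Only the names Foo, bar and fred come from the text; the member name buffer, the type of fred and the method bar_checked are assumptions added for illustration. How the members are laid out in memory, including any padding between them, is up to the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class Foo {
public:
    Foo() : fred(NULL) { buffer[0] = '\0'; }

    // If fred was overwritten by an overflow, this frees a garbage address.
    ~Foo() { delete fred; }

    // Unsafe: copies without checking the length. Anything longer than
    // 9 characters writes past the end of buffer, which is undefined behavior.
    void bar(const char* s) { std::strcpy(buffer, s); }

    // Assumed safer variant: refuses strings that do not fit.
    bool bar_checked(const char* s) {
        assert(s != NULL);
        if (std::strlen(s) >= sizeof(buffer)) return false;
        std::strcpy(buffer, s);
        return true;
    }

    char buffer[10];  // 9 characters plus the terminating '\0'
    int* fred;        // typically placed right after buffer (and any padding)
};
```

Calling bar_checked with a 10-character string returns false and leaves fred untouched, whereas bar would silently write past the array.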

What would happen if one called Foo::bar with a string longer than 9 characters? The array would not be large enough to contain all the characters, so the copy would just keep writing outside of its space. The pointer Foo::fred would be overwritten with a garbage value, which would probably crash the application later, for instance when the destructor tries to free the memory held by the pointer.

Memory blocks
A great way of knowing whether a pointer is a dangling reference or not is to store some custom data in front of the actual data: a memory header. This requires overloading operator new. When allocating memory, allocate sizeof(header_t) + the requested size. Cast the allocated memory to a header_t pointer and fill in its fields: signature gets a predefined value (it can be anything), size gets the originally requested size, and data gets a pointer to the first byte after the header. Remember to return the value of the data pointer, not a pointer to the entire block of memory! Another good idea is to clear these values in operator delete, since a dangling reference might otherwise have its header intact while the data is corrupted.
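
A minimal sketch of such a header is shown below. It overloads operator new and operator delete per class, through a hypothetical base class Tracked, instead of replacing the global operators, so that only selected types are affected. The field types, the signature value, and the names Tracked and Point are assumptions. The sketch also assumes that the payload needs no stricter alignment than a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct header_t {
    unsigned long signature;  // predefined marker, any value will do
    std::size_t   size;       // size originally requested by the caller
    void*         data;       // first byte after the header
};

const unsigned long BLOCK_SIGNATURE = 0xB10CB10CUL;  // arbitrary, assumed value

// Hypothetical base class: every heap-allocated object of a derived
// type gets a header_t placed directly in front of it.
class Tracked {
public:
    static void* operator new(std::size_t size) {
        void* raw = std::malloc(sizeof(header_t) + size);
        if (raw == NULL) throw std::bad_alloc();
        header_t* h = static_cast<header_t*>(raw);
        h->signature = BLOCK_SIGNATURE;
        h->size = size;
        h->data = h + 1;  // one header_t further, i.e. just past the header
        return h->data;   // the caller gets the data, not the whole block
    }

    static void operator delete(void* p) {
        if (p == NULL) return;
        header_t* h = static_cast<header_t*>(p) - 1;
        assert(h->signature == BLOCK_SIGNATURE);  // catches many double deletes
        // Clear the header so that a stale pointer no longer verifies.
        h->signature = 0;
        h->size = 0;
        h->data = NULL;
        std::free(h);
    }
};

// Example payload type, also hypothetical.
struct Point : Tracked {
    int x, y;
    Point() : x(0), y(0) {}
};
```

After `Point* p = new Point;` the header sits sizeof(header_t) bytes before p, and `delete p;` wipes it again before the memory is released. An optimizing compiler is free to drop the stores made just before the free, so the cleared header is mainly reliable in debug builds.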

Now, once the application crashes and we believe the cause is a dangling reference, we can:


 * 1) Cast the pointer to a char pointer (or any pointer to a datatype which is one byte large)
 * 2) Subtract sizeof(header_t) from it
 * 3) Cast the result to a header_t pointer

Then one just checks whether the values are correct. The above steps are possible to do by hand in gdb, but I don't believe it is possible with MSVC. Anyway, it is usually a good idea to write one function which retrieves the header from a pointer and one which verifies the block. A couple of tests are suitable:


 * Correct signature
 * Correct size (if size of pointee is known)
 * data == pointer to check
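
The three steps and the tests above can be sketched as two small functions. To keep the sketch short, the plain functions tracked_alloc and tracked_free stand in for the overloaded operator new and operator delete. These names, as well as header_of, verify_block and the signature value, are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct header_t {
    unsigned long signature;
    std::size_t   size;
    void*         data;
};

const unsigned long BLOCK_SIGNATURE = 0xB10CB10CUL;  // arbitrary, assumed value

// Steps 1 to 3: cast to a byte pointer, step back over the header,
// cast the result to a header_t pointer.
header_t* header_of(void* p) {
    char* bytes = static_cast<char*>(p);
    return reinterpret_cast<header_t*>(bytes - sizeof(header_t));
}

// Stand-in for the overloaded operator new.
void* tracked_alloc(std::size_t size) {
    header_t* h = static_cast<header_t*>(std::malloc(sizeof(header_t) + size));
    if (h == NULL) return NULL;
    h->signature = BLOCK_SIGNATURE;
    h->size = size;
    h->data = h + 1;
    return h->data;
}

// Stand-in for the overloaded operator delete.
void tracked_free(void* p) {
    if (p == NULL) return;
    header_t* h = header_of(p);
    assert(h->signature == BLOCK_SIGNATURE);
    h->signature = 0;
    h->size = 0;
    h->data = NULL;
    std::free(h);
}

// Runs the three tests. Pass expected_size = 0 when the size of the
// pointee is not known.
bool verify_block(void* p, std::size_t expected_size) {
    if (p == NULL) return false;
    const header_t* h = header_of(p);
    if (h->signature != BLOCK_SIGNATURE) return false;
    if (expected_size != 0 && h->size != expected_size) return false;
    return h->data == p;
}
```

Note that verify_block itself reads memory in front of the pointer, so calling it on a completely wild pointer is still undefined behavior. It is a debugging aid for pointers that should have come from the tracked allocator, not a safe validity oracle.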

Padding
To further protect against various overflows, padding can be added in the same way as the header above, both before and after the data. It is also a good idea to add pointers to the padding to the header. Make sure the padding is set to a fixed value, and check the padding when verifying the block. If the pre-padding is overwritten, someone else most likely trashed your memory; this also means that there is probably another trashed pointer around with its post-padding corrupted. If the post-padding is overwritten, you (one of your methods or friends) have trashed your memory. This is really good to know: remember that we should limit the domain as much as possible.
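
A minimal sketch of a padded block follows, again using plain functions in place of the overloaded operators. The layout (header, pre-padding, data, post-padding), the padding size, the fill byte and all function names are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

const std::size_t   PAD_SIZE = 8;                    // assumed padding size
const unsigned char PAD_BYTE = 0xFD;                 // assumed fixed fill value
const unsigned long BLOCK_SIGNATURE = 0xB10CB10CUL;  // arbitrary, assumed value

struct header_t {
    unsigned long  signature;
    std::size_t    size;
    void*          data;
    unsigned char* pre_pad;   // padding between the header and the data
    unsigned char* post_pad;  // padding right after the data
};

enum block_state { BLOCK_OK, BAD_HEADER, PRE_PAD_TRASHED, POST_PAD_TRASHED };

header_t* header_of(void* p) {
    return reinterpret_cast<header_t*>(
        static_cast<unsigned char*>(p) - PAD_SIZE - sizeof(header_t));
}

bool pad_intact(const unsigned char* pad) {
    for (std::size_t i = 0; i < PAD_SIZE; ++i)
        if (pad[i] != PAD_BYTE) return false;
    return true;
}

// Tells which side of the block was trashed, which limits the domain.
block_state check_block(void* p) {
    const header_t* h = header_of(p);
    if (h->signature != BLOCK_SIGNATURE || h->data != p) return BAD_HEADER;
    if (!pad_intact(h->pre_pad))  return PRE_PAD_TRASHED;   // someone else
    if (!pad_intact(h->post_pad)) return POST_PAD_TRASHED;  // our own overflow
    return BLOCK_OK;
}

void* padded_alloc(std::size_t size) {
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(sizeof(header_t) + 2 * PAD_SIZE + size));
    if (raw == NULL) return NULL;
    header_t* h = reinterpret_cast<header_t*>(raw);
    h->signature = BLOCK_SIGNATURE;
    h->size = size;
    h->pre_pad = raw + sizeof(header_t);
    h->data = h->pre_pad + PAD_SIZE;
    h->post_pad = static_cast<unsigned char*>(h->data) + size;
    std::memset(h->pre_pad, PAD_BYTE, PAD_SIZE);
    std::memset(h->post_pad, PAD_BYTE, PAD_SIZE);
    return h->data;
}

void padded_free(void* p) {
    if (p == NULL) return;
    assert(check_block(p) == BLOCK_OK);  // last chance to notice an overflow
    std::free(header_of(p));
}
```

Writing one byte past a 10-byte block allocated this way lands in the post-padding, so check_block reports POST_PAD_TRASHED instead of the overflow silently damaging a neighbouring block.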