Archive for December, 2013

Preface

This article is intended for experienced C++ developers. It is assumed that the reader is familiar with Windows API programming.

Introduction

A memory overrun occurs, for example, when more bytes are written to a memory block than were actually allocated for it, possibly overwriting memory intended for another purpose. This can cause unpredictable behavior when the overwritten memory is accessed later.

While the symptom of a memory overrun is easily detected (usually a crash of some sort), the cause is much harder to find, because the overrun memory might be used long after the overrun actually happened.

In this article I will demonstrate a simple approach to pinpoint the cause (rather than the symptom) of a memory overrun as it happens: the debugger shows the line of code causing the breach.

A few words about memory allocation

The heap aims to allocate memory sequentially (that is, when it is not fragmented). This means that two sequential allocation requests will result in two adjacent blocks of memory on the heap. Each of these blocks consists of a small header used internally by the OS, followed by a block of bytes of the requested size. This is illustrated in Figure 1 below:


     [ header ][ pFirst[13] ][ header ][ pSecond[13] ]

     header: memory internally used by the OS to keep track of the allocated block
     pFirst[13], pSecond[13]: X amount of bytes requested by HeapAlloc/malloc/new, …

Figure 1

With the above in mind, when a memory overrun occurs, data written through ‘pFirst’ exceeds the size initially allocated and overruns part of the next memory block. That is, data written past the 13th byte of ‘pFirst’ (index 13 and beyond) will overwrite the header of the next memory block, and might also overwrite ‘pSecond’ if the breach is large enough.
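The effect can be sketched without corrupting a real heap. The snippet below is only a simulation: a single flat buffer stands in for the heap, and the 8-byte header size and all names (`SimulatedHeap`, `overrunFirst`, `countBytes`) are assumptions made for illustration. Real heap headers differ between OS versions and builds.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sizes, chosen only for illustration.
constexpr std::size_t kBlockSize  = 13;
constexpr std::size_t kHeaderSize = 8;

// One flat buffer standing in for the heap:
// [ header ][ pFirst: 13 bytes ][ header ][ pSecond: 13 bytes ]
struct SimulatedHeap {
    std::array<unsigned char, 2 * (kHeaderSize + kBlockSize)> bytes{};

    unsigned char* first()        { return bytes.data() + kHeaderSize; }
    unsigned char* secondHeader() { return first() + kBlockSize; }
    unsigned char* second()       { return secondHeader() + kHeaderSize; }
};

// Counts how many bytes in [p, p + n) hold the given value.
std::size_t countBytes(const unsigned char* p, std::size_t n, unsigned char value) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] == value) ++count;
    }
    return count;
}

// Writes `length` bytes of 0xAA starting at pFirst, as a careless memset would.
// The write stays inside the simulated buffer, so the behavior is well defined.
void overrunFirst(SimulatedHeap& heap, std::size_t length) {
    assert(length <= heap.bytes.size() - kHeaderSize);
    std::memset(heap.first(), 0xAA, length);
}
```

Writing 13 bytes leaves the second header untouched. Writing 26 bytes fills the whole simulated header of the second block and the first 5 bytes of ‘pSecond’, which is the kind of damage that only surfaces later, when the header is read.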

Example

The code example in Figure 2 below demonstrates a simple memory overrun scenario in which ‘memset’ is called to set 26 bytes of a memory block that was allocated with only 13 bytes, producing a memory overrun that breaches into the address space of the ‘pSecond’ array.

Figure 2

Section (A), at ‘memset(pSecond, … )’, causes the memory overrun. The impact is not immediate.
Section (B) is where the application fails. While HeapFree (called by the delete operator) is trying to deallocate the memory, it accesses the memory block header (managed internally by the OS). This header was corrupted at (A) and contains invalid data. This causes heap validation (RtlValidateHeap) to fail, which eventually leads to premature termination of the application.

So what is so special about memory overruns? As seen in the above example, the impact of a memory overrun is not immediate. In the above application, the actual overrun occurred at (A), while the first place it had an impact is at (B), a ‘long time’ after the overrun actually occurred. This is a simple example; there can be much more complex scenarios in which multiple classes and threads are involved.

What if we could intercept the memory overrun at the point where it happens (at A)? Then, obviously, it would be much easier to pinpoint the bug.

Digging deeper

The OS manages memory using pages. A page is 4096 bytes in size (for both x86 and x64), and a single allocated block of memory might span multiple pages. Each page has protection rights. The protection right given to a block of memory allocated on the heap is PAGE_READWRITE, which enables reading from and writing to that page. There are other protection rights. Of specific interest is the PAGE_GUARD right, which traps access to the page: if the page is accessed, an EXCEPTION_GUARD_PAGE exception is raised. ‘How is this related to memory overruns?’ you might ask.
Well, having a PAGE_GUARD page exactly at the end of each allocated block will immediately trigger an EXCEPTION_GUARD_PAGE if the allocated block boundary is breached, and hence indicate the overrun as it happens, giving control to the debugger (if attached) for further analysis.
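The page arithmetic behind this can be sketched with a few small helpers. This is a minimal illustration that assumes the 4096-byte page size mentioned above; the names `pageBase`, `pagesSpanned` and `endsOnPageBoundary` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed page size, as stated in the text for x86 and x64.
constexpr std::uintptr_t kPageSize = 4096;

// Address of the first byte of the page that contains `address`.
std::uintptr_t pageBase(std::uintptr_t address) {
    return address & ~(kPageSize - 1);
}

// Number of pages touched by the block [address, address + size).
std::size_t pagesSpanned(std::uintptr_t address, std::size_t size) {
    assert(size > 0);
    const std::uintptr_t firstPage = pageBase(address);
    const std::uintptr_t lastPage  = pageBase(address + size - 1);
    return static_cast<std::size_t>((lastPage - firstPage) / kPageSize) + 1;
}

// True when the byte right after the block is the first byte of a page.
// Only then does a guard page placed after the block catch an overrun
// of even a single byte.
bool endsOnPageBoundary(std::uintptr_t address, std::size_t size) {
    return (address + size) % kPageSize == 0;
}
```

A 13-byte block starting at a page boundary leaves 4083 bytes of slack before the next page, so a small overrun would go unnoticed by a guard page. The same block placed 13 bytes before the boundary has no slack at all.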

Implementation

So how can we do that? Obviously, this requires implementing custom allocation functions and/or overloading the existing ones (e.g. the new operator). The first step is to align the end (and not the start) of each memory block with the end of a memory page. The starting point is the _aligned_malloc function: with an alignment of PAGE_SIZE, it guarantees that the allocated memory block will start (and not end) on a page boundary. Rounding the requested byte count up to the next multiple of PAGE_SIZE, and placing the requested bytes at the end of that range, then aligns the end of the memory block with the end of a page. The next step is to have a guard page placed right after the end of the allocated memory block. This can be done by allocating one page more than is needed, so that the end of the memory block coincides with the start of that extra page. All that remains is to set the memory protection of this last ‘guard page’ to PAGE_GUARD, which can easily be done using the VirtualProtect API. Figure 3 below demonstrates how this can be achieved.

Figure 3

The above code snippet demonstrates an implementation of the ‘guard page’ concept with basic memory allocation functions.
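As a rough, independent sketch of the layout described above (not the listing from Figure 3), the snippet below computes where the user pointer and the guard page fall, and shows one way to apply it on Windows. Note the assumptions: it uses VirtualAlloc instead of the _aligned_malloc mentioned in the text, so that the block can be released without restoring the page protection first; the page size is hard-coded to 4096; and `GuardedLayout`, `computeLayout`, `guardedAlloc` and `guardedFree` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

constexpr std::size_t kPageSize = 4096;  // assumed; GetSystemInfo reports the real value

struct GuardedLayout {
    std::size_t dataBytes;    // requested size rounded up to whole pages
    std::size_t totalBytes;   // dataBytes plus one extra page for the guard
    std::size_t userOffset;   // offset of the pointer handed to the caller
    std::size_t guardOffset;  // offset of the guard page
};

constexpr std::size_t roundUpToPage(std::size_t n) {
    return (n + kPageSize - 1) / kPageSize * kPageSize;
}

// Places the requested bytes at the END of the writable pages, so that the
// byte right after the block is the first byte of the guard page.
GuardedLayout computeLayout(std::size_t requested) {
    assert(requested > 0);
    const std::size_t dataBytes = roundUpToPage(requested);
    return { dataBytes, dataBytes + kPageSize, dataBytes - requested, dataBytes };
}

#ifdef _WIN32
void* guardedAlloc(std::size_t requested) {
    const GuardedLayout layout = computeLayout(requested);
    char* base = static_cast<char*>(VirtualAlloc(
        nullptr, layout.totalBytes, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
    if (base == nullptr) return nullptr;

    DWORD oldProtect = 0;
    if (!VirtualProtect(base + layout.guardOffset, kPageSize,
                        PAGE_READWRITE | PAGE_GUARD, &oldProtect)) {
        VirtualFree(base, 0, MEM_RELEASE);
        return nullptr;
    }
    return base + layout.userOffset;
}

void guardedFree(void* user) {
    if (user == nullptr) return;
    // userOffset is always smaller than a page, so rounding the user pointer
    // down to a page boundary recovers the base returned by VirtualAlloc.
    const auto address = reinterpret_cast<std::uintptr_t>(user);
    const auto base = address & ~(static_cast<std::uintptr_t>(kPageSize) - 1);
    VirtualFree(reinterpret_cast<void*>(base), 0, MEM_RELEASE);
}
#endif
```

For a 13-byte request, the user pointer sits 4083 bytes into the first page and the guard page starts at offset 4096, so writing to index 13 touches the guard page directly. One trade-off of this layout is that the returned pointer is only as aligned as the requested size allows.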

Note that the ‘realloc’ function doesn’t really try to reallocate memory the way HeapReAlloc does. Rather, it checks whether the new byte count falls within the PAGE_SIZE boundary. If it does, internal structures are adjusted and memory is moved to reflect the newly requested byte count. Otherwise, a totally new block of memory is allocated at the requested size, the data is copied, the old memory block is freed and a pointer to the new block is returned.
Trying to reallocate memory the way HeapReAlloc does would require removing PAGE_GUARD from the guard page for a short period of time, and this might let memory overruns happen without being noticed.
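The in-place branch of such a ‘realloc’ might look like the sketch below. It reflects one reading of the description, namely that a block is reused only when the new size rounds up to the same number of pages, and it operates on a plain buffer so that the pointer arithmetic can be followed in isolation. `resizeWithinPages` is a hypothetical name, and the 4096-byte page size is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kPageSize = 4096;  // assumed page size

constexpr std::size_t roundUpToPage(std::size_t n) {
    return (n + kPageSize - 1) / kPageSize * kPageSize;
}

// `region` is the start of the writable pages of a block whose user data
// currently occupies the LAST `oldSize` bytes of roundUpToPage(oldSize) bytes.
// Returns the new user pointer when `newSize` fits in the same pages, or
// nullptr when the caller has to allocate a new block, copy and free.
unsigned char* resizeWithinPages(unsigned char* region,
                                 std::size_t oldSize, std::size_t newSize) {
    assert(region != nullptr);
    if (oldSize == 0 || newSize == 0) return nullptr;

    const std::size_t regionBytes = roundUpToPage(oldSize);
    if (roundUpToPage(newSize) != regionBytes) return nullptr;

    unsigned char* oldUser = region + (regionBytes - oldSize);
    unsigned char* newUser = region + (regionBytes - newSize);

    // Slide the surviving bytes so the block still ends where the guard page
    // would begin. memmove is used because the two ranges may overlap.
    std::memmove(newUser, oldUser, std::min(oldSize, newSize));
    return newUser;
}
```

The guard page itself is never touched here, which matches the motivation given above: the protection stays in place for the whole operation.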

The code snippet presented in Figure 4 below shows how to use the above memory management functions with some overloads of the new operator. These overloads must be defined at global scope to properly override the defaults.

Figure 4

Using the suite of functions presented in Figures 3 and 4 with the application presented in Figure 2 will cause a debugged application to break at (A) and not at (B) as with the standard CRT implementation. This enables identification of the problem as it happens, making it much easier to resolve.
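To illustrate how global overloads route every allocation through a custom allocator, here is a minimal sketch that is independent of the listing in Figure 4. The `GuardedMemory` namespace is a hypothetical stand-in: it forwards to malloc/free and only counts live blocks, so the example runs anywhere. In a real build its two functions would call the guard-page allocator instead.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the guard-page allocator.
namespace GuardedMemory {
    std::atomic<std::size_t> liveBlocks{0};

    void* alloc(std::size_t bytes) {
        void* p = std::malloc(bytes != 0 ? bytes : 1);
        if (p != nullptr) ++liveBlocks;
        return p;
    }

    void release(void* p) {
        if (p == nullptr) return;
        assert(liveBlocks > 0);
        --liveBlocks;
        std::free(p);
    }
}

// Replacements at global scope: every `new` and `delete` in the program now
// ends up in GuardedMemory.
void* operator new(std::size_t bytes) {
    if (void* p = GuardedMemory::alloc(bytes)) return p;
    throw std::bad_alloc();
}

void* operator new[](std::size_t bytes) {
    return ::operator new(bytes);
}

void operator delete(void* p) noexcept                { GuardedMemory::release(p); }
void operator delete[](void* p) noexcept              { GuardedMemory::release(p); }
void operator delete(void* p, std::size_t) noexcept   { GuardedMemory::release(p); }
void operator delete[](void* p, std::size_t) noexcept { GuardedMemory::release(p); }
```

Because the replacement is global, allocations made by the standard library are routed through it as well, which is why the counter should be compared against a baseline rather than against zero.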

Final words

The above suite of functions is useful when hunting for memory overruns. However, nothing comes for free: these functions carry a memory size and performance penalty, especially when reallocating memory using ‘Memory::realloc’. With that in mind, it is advisable to use this suite only in _DEBUG builds, and to use the standard CRT implementation in Release builds. This can easily be implemented with a few #ifdef statements.

It is worth noting that PAGE_GUARD raises an exception only once: if the exception is suppressed, no further PAGE_GUARD exceptions will be generated for that page. To support repeated exceptions, use e.g. PAGE_NOACCESS rather than PAGE_GUARD.