Archive for the ‘Windows Programing’ Category

Perspective

The intended audience of this article is Windows driver C++ developers and architects. It is assumed that the reader is familiar with object-oriented programming and design and is intimately acquainted with the Windows operating system.

For brevity and clarity, thread synchronization and error checking are omitted and not discussed in detail in this article.

Motivation

For a while I have been searching for a means of simulating Bluetooth HID devices on the Windows desktop. This, apparently, is not trivial, since the Bluetooth HID interface is reserved for operating system use.

This article provides a brief review of the Windows 8 Bluetooth stack and profile drivers, describes its limitations with HID, and presents a workaround enabling HID device simulation using the standard Windows Bluetooth stack.

High level overview

The diagram above presents the main modules related to our use case: in green are custom modules developed by a 3rd party; in blue are protocols/APIs provided by the operating system; in red are operating system modules we patch in order to achieve the desired functionality; and in orange are physical HW components.

A Profile Driver is a miniport driver implementing a specific Bluetooth service. In contrast to RFCOMM services, which can be implemented using Winsock in user mode, services that directly use L2CAP ( such as HID ) mandate a kernel-mode ( KMDF ) profile driver implementation.

The HCI layer provides a unified API for communicating with Bluetooth controllers. Of specific interest for us is the HCI CoD ( Class of Device/Service ), which indicates the type of Bluetooth peripheral. Unfortunately, with the Windows built-in Bluetooth stack the CoD is limited to COD_MAJOR_COMPUTER, and this limits connectivity with various Bluetooth devices such as iOS devices, which mandate a ‘Peripheral’ major class and a minor class of e.g. ‘Keyboard’ ( where the CoD is 0x540 ). I have found the Bluetooth Class of Device/Service (CoD) Generator listed in the References quite useful for generating proper Bluetooth CoDs.
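To make the 0x540 value less opaque, here is a minimal sketch of how a CoD is composed from its fields. The function and constant names are hypothetical ( they are not part of any Windows header ); the bit layout follows the Bluetooth Assigned Numbers definition of the Class of Device field.

```cpp
#include <cassert>
#include <cstdint>

// Class of Device (CoD) bit layout:
//   bits 0-1   : format type (00)
//   bits 2-7   : minor device class
//   bits 8-12  : major device class
//   bits 13-23 : major service classes
// MakeCod and the constants below are illustrative names, not a real API.
constexpr std::uint32_t MakeCod(std::uint32_t serviceClasses,
                                std::uint32_t majorClass,
                                std::uint32_t minorClass) {
    return ((serviceClasses & 0x7FFu) << 13) |
           ((majorClass     & 0x1Fu)  << 8)  |
           ((minorClass     & 0x3Fu)  << 2);
}

constexpr std::uint32_t kMajorComputer   = 0x01; // what the Windows stack reports
constexpr std::uint32_t kMajorPeripheral = 0x05; // what a HID device must report
constexpr std::uint32_t kMinorKeyboard   = 0x10; // 'keyboard' bit of the peripheral minor field

// Peripheral + Keyboard, no service class bits set, gives the value quoted above.
static_assert(MakeCod(0, kMajorPeripheral, kMinorKeyboard) == 0x540,
              "Peripheral/Keyboard CoD");
```

A computer-class device with no minor class set would produce 0x100 with the same helper, which shows why a peer that filters on the ‘Peripheral’ major class ignores the desktop machine.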

SDP stands for Service Discovery Protocol. It is used to report the type of services provided by the Bluetooth device; this is, for example, where HID devices report their descriptors or where a Bluetooth headset reports its audio interface.

L2CAP is a lower-level transport layer over which various other protocols are implemented ( e.g. RFCOMM ). It is responsible, among other things, for maintaining sequential packet connections to remote devices and multiplexing data from various Bluetooth services. This is the transport used for HID devices. With L2CAP, services are identified by a unique Protocol/Service Multiplexer ( PSM ) identifier. For HID, two specific, pre-defined PSMs are needed: Interrupt and Control ( 0x13 and 0x11 respectively ). The first is used for device-to-host communication and the latter for host-to-device. The Windows built-in Bluetooth stack reserves these PSMs for OS use, preventing HID device simulation; later in the article I explain how to work around this limitation.

bthport.sys is a kernel module encapsulating Bluetooth logic including, among others, HCI, SDP and L2CAP. It is not a driver; rather, it is a dynamic library directly used by profile drivers and other system components to implement Bluetooth services. bthport.sys is responsible for reserving the HID PSMs ( 0x13 and 0x11 ) using the ‘bthport!BthIsSystemPSM‘ internal method. I will show later in this article how to patch this method to work around PSM reservation.

Bluetooth L2CAP HID Connection Flow

The high-level diagram above presents the main steps in establishing a HID Bluetooth connection. In red is the initialization phase, where we register the PSMs to be used; once these are registered we are able to receive incoming L2CAP connections. The initialization phase is elaborated in the next chapter.

Once the L2CAP connection is established, HID reports are sent to the controlled device and back, indicating keystrokes and feedback from the device.

Driver Initialization

The diagram to the left illustrates the main steps in setting up an L2CAP HID profile driver. The first thing needed is to register a callback method to be invoked upon incoming L2CAP connections. This is done by querying for the profile driver interface using WdfFdoQueryForInterface, allocating a Bluetooth Request Block ( BRB ), setting up the BRB and dispatching it to the IoTarget.

Once the L2CAP callbacks are installed, the required PSMs are registered. This is done by dispatching a BRB_REGISTER_PSM with the desired PSMs, in our case: 0x1 for SDP, 0x11 for the Control channel, and 0x13 for Interrupt.

Registering PSMs reserved for OS use will fail, even if there is no connected/paired HID device. The next chapter discusses an approach to work around that limitation.

Once the PSMs are registered we need to use them in conjunction with the Keyboard HID Descriptor to set up the SDP record. For that purpose bthport.sys exposes “GUID_BTHDDI_SDP_PARSE_INTERFACE”, obtained using WdfFdoQueryForInterface.

Once the SDP record is ready it is published to the IoTarget to be listed among the available Bluetooth services of the desktop machine.

Reserved PSMs Workaround

As mentioned earlier, in Windows the HID PSMs are reserved and cannot be used by profile drivers. The PSM registration logic is implemented by bthport.sys, so to work around the PSM limitation a binary patch is applied.

bthport.sys implements an internal method called “BthIsSystemPSM”. This is where the magic happens and where the patch is applied. The process consists of the following steps:

  1. Upon driver startup, find bthport.sys!BthIsSystemPSM in the loaded binary image
    To do that, we need the address of a reference method in bthport.sys and the offset from it to BthIsSystemPSM; together they give access to the binary code responsible for OS PSM reservation.
    The reference method we use is bthport.sys!BthAllocateBrb. This method is exposed through the BTH_PROFILE_DRIVER_INTERFACE we have previously retrieved by calling WdfFdoQueryForInterface.
    bthport.sys!BthIsSystemPSM is not accessible through WdfFdoQueryForInterface. Getting its address is not straightforward and requires some low-level PE analysis: using IDA ( Interactive Disassembler ) we can resolve the PE offsets of bthport.sys!BthIsSystemPSM and bthport.sys!BthAllocateBrb, compute the relative distance ( which is identical to the distance when the PE is loaded in memory ), and use it to find the address of bthport.sys!BthIsSystemPSM in the loaded binary ( at runtime ).
  2. Binary-code modification
    The extracted binary code for bthport.sys!BthIsSystemPSM is the following:

    B8 ED FF 00 00 66 FF C9 66 85 C8 0F 94 C0 C3 CC

    Disassembling this results in the following. The comment on each line shows the values of the relevant registers before executing the instruction on that line:

    01> b8 ed ff 00 00    mov  eax,0FFEDh    // ZF:1 AL:0x09 AX:0x109  CX:0x11
    02> 66 ff c9          dec  cx            // ZF:1 AL:0xed AX:0xffed CX:0x11
    03> 66 85 c8          test ax,cx         // ZF:0 AL:0xed AX:0xffed CX:0x10
    04> 0f 94 c0          sete al            // ZF:1 AL:0xed AX:0xffed CX:0x10
    05> c3                ret                // ZF:1 AL:0x01 AX:0xff01 CX:0x10
    06> cc                int  3

    With the above assembly code, the PSM is passed in register CX ( 0x11 in our case ). This value is decremented by one on line #02 and then, on line #03, a ‘bitwise and’ is applied between it and the value in register AX ( 0xFFED ). This results in a zero value, which sets the Zero Flag ( ZF ) to 1; line #04 then copies ZF into register AL. The calling code evaluates AL to decide whether the PSM can be used by the caller; in our case a value of AL=1 causes BRB_REGISTER_PSM to fail.

    In order to work around this PSM verification we should cause the code to return with AL set to zero. This prevents the calling code from rejecting our PSMs and, in turn, enables L2CAP HID connections.

    Inspecting line #03 of the above, it is clear that “0 == (0xFFED & (0x11 – 1))” and also that “0 == (0xFFED & (0x13 – 1))”. Hence, we need to change the 0xFFED mask such that the result of the bitwise and operation is non-zero. This is achieved by changing the 0xFFED mask to 0xFFFD. With that set, the code executes as follows:

    01> b8 fd ff 00 00    mov  eax,0FFFDh    // ZF:1 AL:0x09 AX:0x109  CX:0x11
    02> 66 ff c9          dec  cx            // ZF:1 AL:0xfd AX:0xfffd CX:0x11
    03> 66 85 c8          test ax,cx         // ZF:0 AL:0xfd AX:0xfffd CX:0x10
    04> 0f 94 c0          sete al            // ZF:0 AL:0xfd AX:0xfffd CX:0x10
    05> c3                ret                // ZF:0 AL:0x00 AX:0xff00 CX:0x10
    06> cc                int  3

    The above returns AL=0, resulting in acceptance of the HID PSMs. The raw binary code after the update looks as follows:

    B8 FD FF 00 00 66 FF C9 66 85 C8 0F 94 C0 C3 CC

  3. Patch the binary code
    Once we have the offset between bthport.sys!BthIsSystemPSM and bthport.sys!BthAllocateBrb we need to update the binary code with the above-mentioned modification. We can’t, however, directly update the binary code: before doing so we need to clear the Write Protect ( WP ) bit of register cr0, apply the update, and then restore the original value of cr0. This is done using the __writecr0 and __readcr0 kernel intrinsics. Once we are done with the modification our code can freely register the HID PSMs and intercept incoming HID L2CAP connections.
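The effect of the mask change can be checked outside the kernel. The following is a small user-mode model of the four instructions discussed above, not the actual bthport.sys code; the function name and constants are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Models: mov eax,mask / dec cx / test ax,cx / sete al
// Returns true ( AL=1 ) when (psm - 1) shares no bits with the mask,
// which is the case in which the caller treats the PSM as reserved.
constexpr bool IsSystemPsmModel(std::uint16_t psm, std::uint16_t mask) {
    return (static_cast<std::uint16_t>(psm - 1) & mask) == 0;
}

constexpr std::uint16_t kOriginalMask = 0xFFED; // mask found in the binary
constexpr std::uint16_t kPatchedMask  = 0xFFFD; // mask after the one-byte patch

// With the original mask both HID PSMs are reported as reserved...
static_assert(IsSystemPsmModel(0x11, kOriginalMask), "HID Control reserved");
static_assert(IsSystemPsmModel(0x13, kOriginalMask), "HID Interrupt reserved");
// ...and with the patched mask they are not.
static_assert(!IsSystemPsmModel(0x11, kPatchedMask), "HID Control released");
static_assert(!IsSystemPsmModel(0x13, kPatchedMask), "HID Interrupt released");
```

In this model, PSMs 0x01 and 0x03 are still reported as reserved with the patched mask, so the one-byte change only releases the two HID PSMs.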

Sample Code

BOOL HackBthPort(IN const BTH_PROFILE_DRIVER_INTERFACE& itf) {
    // 000000000008B4D0 - BthIsSystemPSM PE Offset ( fixed )
    // 0000000000083698 - BthAllocateBrb PE Offset ( fixed )
    // bthport!BthIsSystemPSM: B8 ED FF 00 00 66 FF C9 66 85 C8 0F 94 C0 C3 CC
    UCHAR    pMachineCode[] = { 0xb8, 0xed, 0xff, 0x00,
                                0x00, 0x66, 0xff, 0xc9,
                                0x66, 0x85, 0xc8, 0x0f,
                                0x94, 0xc0, 0xc3, 0xcc };
    // The approximate distance between 'itf.BthAllocateBrb' and 'bthport!BthIsSystemPSM'
    INT64    qwOffset    = (INT64)(0x8B4D0 - 0x83698);
    PUCHAR    pAddr        = (PUCHAR)((UINT64)itf.BthAllocateBrb + qwOffset);
    
    // Make the start address page aligned
    PUCHAR    pAddrStart    = (PUCHAR)((UINT64)pAddr & 0xfffffffffffff000);
    // The last 8 bytes of the signature, used as a quick first-pass match
    UINT64    qwTrailer    = *(UINT64*)(pMachineCode + sizeof(pMachineCode)
                            - sizeof(qwTrailer));
    PUCHAR    pAddrEnd    = pAddrStart + PAGE_SIZE - sizeof(qwTrailer);

    // Scan the page for the trailer, then verify the leading 8 bytes
    pAddrStart += sizeof(pMachineCode) - sizeof(qwTrailer);
    while (pAddrStart <= pAddrEnd) {
        if (*(UINT64*)(pAddrStart) == qwTrailer) {
            if (0 == memcmp(pAddrStart + sizeof(qwTrailer) - sizeof(pMachineCode),
                            pMachineCode, sizeof(pMachineCode) - sizeof(qwTrailer)))
            {
                // pAddrStart[-7] is the second byte of the signature ( the mask's low byte )
                NT_ASSERT(0xed == pAddrStart[-7]);
                const auto cr0 = __readcr0();
                const auto cr0noWP = cr0 & 0xFFFFFFFFFFFEFFFF;// Clear the WP bit
                __writecr0(cr0noWP);
                pAddrStart[-7] = 0xfd;// Patch the code!!!
                __writecr0(cr0);
                return TRUE;
            }
        }
        pAddrStart++;
    }
    return FALSE;
}
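The scan-and-patch logic of the sample can be exercised safely in user mode against an ordinary byte buffer. The sketch below is an assumption-laden simplification: it compares the full 16-byte signature in one step instead of the trailer-first match used above, it does not touch cr0, and the names PatchMaskInBuffer, kSignature and kRelativeOffset are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Signature bytes of the routine, as listed in the article.
static const std::uint8_t kSignature[16] = {
    0xb8, 0xed, 0xff, 0x00, 0x00, 0x66, 0xff, 0xc9,
    0x66, 0x85, 0xc8, 0x0f, 0x94, 0xc0, 0xc3, 0xcc };

// Distance between the two PE offsets quoted in the sample code.
constexpr std::int64_t kRelativeOffset = 0x8B4D0 - 0x83698;
static_assert(kRelativeOffset == 0x7E38, "distance between the two routines");

// Scans [buf, buf + len) for kSignature. On a match, changes the low byte
// of the mask ( signature offset 1 ) from 0xED to 0xFD and returns the match
// offset; returns -1 when the signature is not found.
std::ptrdiff_t PatchMaskInBuffer(std::uint8_t* buf, std::size_t len) {
    for (std::size_t i = 0; i + sizeof(kSignature) <= len; ++i) {
        if (std::memcmp(buf + i, kSignature, sizeof(kSignature)) == 0) {
            buf[i + 1] = 0xFD;
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}
```

Because the patched bytes no longer match the signature, calling the function a second time on the same buffer reports "not found", which is a convenient way to confirm that a patch has already been applied.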

Risks & Limitations

  • All kernel modules run in a shared address space; any change made to bthport.sys will affect every other module/driver referring to or using it.
  • The HID PSMs are reserved by the OS for a reason. This patch should be used with care when other Bluetooth HID devices are connected.
  • This binary patch assumes a specific bthport.sys version with a fixed relative position of bthport.sys!BthIsSystemPSM. While the above sample code demonstrates some flexibility in finding the right offset, updated bthport.sys versions might require re-calculating the offsets.
  • As mentioned before, some devices expect specific HID CoD values, and Windows doesn’t support the HCI-level API required for changing the CoD. As a result, this kernel patch enables HID device simulation for e.g. Android devices but not for iOS devices. The reader is encouraged to use the approach described in this article to patch this as well.
  • The Windows kernel implements a mechanism called Kernel Patch Protection ( KPP ). This mechanism verifies that no binary changes were applied to core kernel modules at runtime. At the time this article was written bthport.sys wasn’t one of these modules; this may ( or may not ) change in the future.

Disclaimer

This article discusses implementing a HID device using the Windows desktop Bluetooth stack. This stack is limited and mandates a binary patch. When Windows is not a hard requirement, the reader is encouraged to use solutions where the above is natively supported, such as the Linux BlueZ stack.

The patch was implemented on Windows 8 ( x64 ) and should be verified before being used on other/newer OS versions.

References

  • KMDF Profile drivers
  • RFCOMM
  • Assigned CoD Numbers – Bluetooth Baseband
  • Bluetooth Class of Device/Service (CoD) Generator
  • PSMs reserved for OS use
  • HID: Human Interface Device Class
  • Bluetooth Request Block
  • L2CAP Bluetooth Echo Sample
  • Service Discovery Protocol
  • Keyboard HID Descriptor
  • Hex-Rays Interactive Disassembler (IDA)
  • A Guide to Kernel Exploitation
  • Kernel Patch Protection (KPP)
  • Linux BlueZ stack

Perspective

This article is intended for experienced C++ developers. It is assumed that the reader has basic experience with Windows driver development.

Introduction

WinUSB is Microsoft's user-mode framework for communicating with USB devices.

The main aim of WinUSB is to reduce development cost by exposing a user-mode, application-level USB API. This saves the time and effort of developing a USB driver and enables the developer to focus on application logic.

As of Windows 8.1, compared to a full-fledged USB kernel-mode driver, WinUSB has a few limitations. One is that while a USB device might expose multiple configurations to choose from, WinUSB supports only the default configuration ( the first one ).

This article describes an approach enabling WinUSB to use any of the configurations exposed by the USB device.

A few words about USB

“Universal Serial Bus (USB) is an industry standard developed in the mid-1990s that defines the cables, connectors and communications protocols used in a bus for connection, communication, and power supply between computers and electronic devices.” ( quoting Wikipedia )

A USB device can expose multiple configurations, but at any given moment only a single configuration can be used by the application communicating with the USB device.

While setting up a USB connection, the SW selects the configuration to be used for device communication.

See the References for a thorough USB explanation.

A few words about Windows drivers

In most cases, a driver is a module ( DLL ) implementing functionality related to operating a specialized HW device. In Windows, multiple drivers can be used to operate a single HW device, each implementing a subset of the required functionality. These drivers are grouped in stacks, where each IO request travels from the topmost driver to the one below it; each driver, in its turn, can modify the IO being executed and run additional logic. In addition there is a special type of driver called a filter driver, used to ~filter~ the IOs going in and out of another driver. We will get into more detail regarding this type of driver later on.

With USB, there are several types of drivers. The USB host-controller driver, the hub driver and more are provided by the OS and implement the specific USB HW bus and hub logic. A specialized USB device requires a function driver on top of the host-controller and hub drivers; this specialized driver then needs to implement only the functionality related to the specific HW device.

A detailed explanation of the Windows USB architecture can be found in the References.

The WinUSB.sys driver

Quoting MSDN, “Windows USB (WinUSB) is a generic driver for USB devices that was developed concurrently with the Windows Driver Frameworks (WDF) for Windows XP with SP2. The WinUSB architecture consists of a kernel-mode driver (Winusb.sys) and a user-mode dynamic link library (Winusb.dll) that exposes WinUSB functions. By using these functions, you can manage USB devices with user-mode software.”

USB 2.0 Driver stack

Implementation

We are going to make WinUSB think it is selecting the first/default configuration while, under the hood, we switch the default configuration with the desired one. To achieve that, we implement a lower-level filter driver, one that intercepts all URBs sent downwards by WinUSB and, when needed, changes them.

We connect to the default device queue and specifically intercept the URB_FUNCTION_CONTROL_TRANSFER and URB_FUNCTION_GET_DESCRIPTOR_FROM_DEVICE URBs, where we change the configuration index from the default one requested by WinUSB to the one we want:

switch (pUrb->UrbHeader.Function) {
	case URB_FUNCTION_CONTROL_TRANSFER:
		if ((USB_REQUEST_GET_DESCRIPTOR != pUrb->UrbControlTransfer.SetupPacket[1]) || 
		    (USB_CONFIGURATION_DESCRIPTOR_TYPE!=pUrb->UrbControlTransfer.SetupPacket[3]))
			break;
		if (USB_DEFAULT_CFG_INDEX == pUrb->UrbControlTransfer.SetupPacket[2])
			pUrb->UrbControlTransfer.SetupPacket[2] = m_btZeroCfgSwitch;
		else if (m_btZeroCfgSwitch == pUrb->UrbControlTransfer.SetupPacket[2])
			pUrb->UrbControlTransfer.SetupPacket[2] = USB_DEFAULT_CFG_INDEX;
		break;
	case URB_FUNCTION_GET_DESCRIPTOR_FROM_DEVICE:
		if(USB_CONFIGURATION_DESCRIPTOR_TYPE!=pUrb->UrbControlDescriptorRequest.DescriptorType)
			return TRUE;// This is not what we are looking for...
		if (USB_DEFAULT_CFG_INDEX == pUrb->UrbControlDescriptorRequest.Index) {
			pUrb->UrbControlDescriptorRequest.Index = m_btZeroCfgSwitch;
		} else if (m_btZeroCfgSwitch == pUrb->UrbControlDescriptorRequest.Index) {
			pUrb->UrbControlDescriptorRequest.Index = USB_DEFAULT_CFG_INDEX;
		}
		break;
}

The figure above shows what is needed to switch the default configuration with the desired one. m_btZeroCfgSwitch indicates the configuration index to replace the default with. The two requests we need to modify are the URB_FUNCTION_GET_DESCRIPTOR_FROM_DEVICE URB and the URB_FUNCTION_CONTROL_TRANSFER URB carrying a USB_REQUEST_GET_DESCRIPTOR request; in both we specifically intercept the retrieval of the configuration descriptor.

The SetupPacket format is in accordance with Table 9-3 of the USB_3_1 spec. For a configuration descriptor query, the low ( first ) byte of wValue indicates the descriptor index and the high byte indicates the descriptor type; the index is what we modify to make WinUSB use the configuration we want.
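The byte positions used in the switch statement follow directly from the setup packet layout. Here is a minimal, driver-independent sketch of the same swap on a raw 8-byte setup packet; the constants are defined locally with their USB-specified values, and the function name SwapConfigIndex is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Values defined by the USB specification, declared locally so the sketch
// does not depend on the WDK headers.
constexpr std::uint8_t kUsbRequestGetDescriptor        = 0x06; // bRequest
constexpr std::uint8_t kUsbConfigurationDescriptorType = 0x02; // wValue high byte
constexpr std::uint8_t kDefaultCfgIndex                = 0x00; // wValue low byte

// Setup packet bytes:
//   [0] bmRequestType  [1] bRequest
//   [2] wValue low  ( descriptor index )  [3] wValue high ( descriptor type )
//   [4..5] wIndex      [6..7] wLength
// Returns true when the packet is a GET_DESCRIPTOR( CONFIGURATION ) request.
// In that case index 0 and desiredIndex are exchanged, so the device's
// desired configuration is presented to the caller as the default one.
bool SwapConfigIndex(std::uint8_t (&setup)[8], std::uint8_t desiredIndex) {
    if (setup[1] != kUsbRequestGetDescriptor ||
        setup[3] != kUsbConfigurationDescriptorType) {
        return false; // not a configuration descriptor query, leave untouched
    }
    if (setup[2] == kDefaultCfgIndex) {
        setup[2] = desiredIndex;
    } else if (setup[2] == desiredIndex) {
        setup[2] = kDefaultCfgIndex;
    }
    return true;
}
```

Swapping in both directions keeps the mapping consistent: a caller that enumerates all configurations still sees every descriptor exactly once, only in a different order.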

Coding & Concepts

The driver is implemented as a C++ KMDF lower filter driver. It consists of a simple C++ class for the device, called UsbCfgDevice, where most of the logic is implemented, and a simple user-mode WinUSB application used to interact with the device.

The WinUSB user-mode app uses the WinUSB API to open the device and verify that the id of the selected configuration is equal to the id of the configuration at index zero ( the one we have overridden ). While this is true for any WinUSB application, when the default CFG is overridden the value of the selected configuration will be different than one ( assuming the CFGs are numbered incrementally by the HW device ).

Tracing is implemented using the built-in WPP framework. To monitor logging on the debuggee, create a monitoring session using ‘traceview.exe’ and the driver PDB ( make sure to set the logging level to verbose ).

The driver is HW specific; the HW for which it is installed ( along with WinUSB ) is defined in the associated INF ( explained next ).

Driver Installation

Driver installation is done using a standard INF file indicating the files and registry entries to be updated on the operating system.

Of specific importance are the following INF sections:

            [Version]
            ...
            Class     = USBDevice
            ClassGUID = {88BAE032-5A81-49f0-BC3D-A4FF138216D6}
            ...

The Class and ClassGUID indicate the type of the driver being installed, in accordance with the system-defined device setup classes.

            [Standard.NT$ARCH$]
            %DeviceName%=USB_Install, USB\VID_nnnn&PID_nnnn

Defines the HW ( using the Vendor Id and Product Id ) for which the driver is to be installed, this section can include multiple HW definitions, each having a specialized VID and PID ( ‘nnnn’ is replaced with the actual ids )

            [UsbCfgCtrl_AddReg]
            HKR,,"LowerFilters",0x00010000,"UsbCfgCtrl"
            HKR,,"DefaultCfgIndex",0x00010001,"1"

Defines the driver as a lower filter driver ( installed below winusb.sys on the driver stack ), and the configuration index we want to replace the default with.

Prerequisites

References

  • www.usb.org
  • USB_3_1 spec
  • USB Architecture
  • WinUSB
  • Lower filter drivers
  • USB Request Blocks
  • URB Header Structure
  • INF Files
  • Using traceview.exe
  • Source Code

Preface

This article is intended for experienced C++ developers. It is assumed that the reader is familiar with Windows API programming.

Introduction

An example of a memory overrun is writing more bytes to a memory block than were actually allocated, possibly overwriting memory intended for another purpose. This might cause unpredictable behavior when the overwritten memory is accessed at some future time.

While the symptom of a memory overrun is easily detected ( usually a crash of some sort ), the cause of a memory overrun is much harder to find, because the overrun memory might be used long after the overrun actually happened.

In this article I will demonstrate a simple approach to pin-point the cause ( rather than the symptom ) of the memory overrun as it happens: the debugger shows the line of code causing the breach.

A few words about memory allocation

Memory is intended to be allocated sequentially on the heap ( that is, when the heap is not fragmented ). This means that two sequential allocation requests will result in two adjacent blocks of memory on the heap. Each of these blocks is built of a small header, internally used by the OS, followed by a block of bytes of the requested size. This is illustrated in Figure 1 below:

Figure 1: two adjacent heap blocks, pFirst[13] and pSecond[13], each made of:

     – Memory internally used by OS to keep track of the allocated block

     – X amount of bytes requested by HeapAlloc/malloc/new, …

With the above in mind, when a memory overrun occurs, data written into ‘pFirst’ exceeds the size initially allocated and overruns part of the next memory block. That is, data written to ‘pFirst[13+1]’ will overwrite the header of the next memory block and might also overwrite ‘pSecond’ if the breach is large enough.

Example

The code example in Figure 2 below demonstrates a simple memory overrun scenario where ‘memset’ is called to set 26 bytes of a memory block allocated with only 13 bytes, producing a memory overrun that breaches into the address space of the ‘pSecond’ array.

Figure 2

Section (A), at ‘memset(pSecond, … )’, causes the memory overrun. The impact is not immediate.
Section (B) is where the application fails. While HeapFree ( called by the delete operator ) is trying to deallocate the memory, it accesses the memory block header ( internally managed by the OS ). This header was corrupted at (A) and contains invalid data, which causes heap validation ( RtlValidateHeap ) to fail and eventually leads to premature termination of the application.

So what is so special about memory overruns? Well, as seen in the above example, the impact of a memory overrun is not immediate. In the above application the actual overrun occurs at (A), while the first place it has an impact is at (B), a ‘long time’ after the overrun actually occurred. The above is a simple example; there might be much more complex scenarios where multiple classes and threads are involved.

What if we could intercept the memory overrun at the point where it happens ( at A )? Then, obviously, it would be much easier to pin-point the bug.

Digging deeper

The OS manages memory using pages. A page is 4096 bytes in size ( for both x86 and x64 ), and a single allocated block of memory might span multiple pages. Each page has protection rights. The protection right given to a block of memory allocated on the heap is PAGE_READWRITE, which enables reading from and writing to that page. There are other protection rights; of specific interest is the PAGE_GUARD right, which prevents any access to the page: if the page is accessed, EXCEPTION_GUARD_PAGE is raised. ‘How is this related to memory overruns?’ you might ask. Well, having a PAGE_GUARD page exactly at the end of each allocation block will immediately trigger an EXCEPTION_GUARD_PAGE if the allocated block boundary is breached, and hence indicate the overrun as it happens, giving control to the debugger ( if attached ) for further analysis.

Implementation

So how can we do that? Well, obviously this requires implementing custom allocation functions and/or overloading the existing ones ( e.g. the new operator ). The goal is to align the end ( and not the start ) of each memory block with the end of a memory page. We start with the _aligned_malloc method: using it with an alignment of PAGE_SIZE guarantees that the allocated memory block starts ( rather than ends ) on a new page of memory. Rounding the requested byte count up to the next multiple of PAGE_SIZE and placing the block at the end of that range then aligns the end of the memory block with the end of a page. Next, we need a guard page right after the end of the allocated memory block. This can be done by allocating one page more than what is needed, so that the end of the memory block meets the start of that extra page. All that remains is to set the memory protection of that last ‘guard page’ to PAGE_GUARD ( which is a modifier combined with a base protection such as PAGE_READWRITE ). This can easily be achieved using the VirtualProtect API. Figure 3 below demonstrates how this can be achieved.
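The address arithmetic behind this layout can be sketched independently of the Windows API. The following is only an illustration of the offsets involved, assuming a page-aligned base address; it does not allocate or protect anything, and the names GuardedLayout, ComputeGuardedLayout and kPageSize are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kPageSize = 4096; // page size on x86 and x64

// Rounds a byte count up to the next multiple of the page size.
constexpr std::size_t RoundUpToPage(std::size_t bytes) {
    return (bytes + kPageSize - 1) / kPageSize * kPageSize;
}

// Offsets are relative to the page-aligned base of the raw allocation.
struct GuardedLayout {
    std::size_t totalBytes;  // data pages plus one trailing guard page
    std::size_t userOffset;  // where the pointer returned to the caller starts
    std::size_t guardOffset; // where the guard page starts
};

// Places the user block so that its last byte is the last byte before
// the guard page: userOffset + requested == guardOffset.
constexpr GuardedLayout ComputeGuardedLayout(std::size_t requested) {
    return GuardedLayout{ RoundUpToPage(requested) + kPageSize,
                          RoundUpToPage(requested) - requested,
                          RoundUpToPage(requested) };
}
```

For a 13-byte request this yields an 8192-byte raw allocation, a user pointer at offset 4083 and a guard page at offset 4096, so writing a 14th byte touches the guard page. One design consequence worth noting is that the returned pointer is generally not aligned to more than one byte, which is the price of catching the overrun on its first byte.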

Figure 3

The above code snippet demonstrates an implementation of the ‘guard page’ concept with basic memory allocation functions.

Note that the ‘realloc‘ method doesn’t really try to reallocate memory the way HeapRealloc does. Rather, it checks whether the new byte count falls within the PAGE_SIZE boundary. If it does, internal structures are adjusted and memory is moved to reflect the newly requested byte count. Otherwise, a totally new block of memory is allocated at the requested size, data is copied, the old memory block is freed and a pointer to the new block is returned.
Trying to reallocate memory the way HeapRealloc does would force removing PAGE_GUARD from the guard page for a short period of time, and this might let memory overruns happen without being noticed.

The code snippet presented in Figure 4 below shows how to use the above memory management methods with some overloads of the new operator. These overloads must be defined at global scope to properly override the defaults.

Figure 4

Using the suite of methods presented in Figures 3 and 4 with the application presented in Figure 2 will cause a debugged application to break at (A) and not at (B) as with the standard CRT implementation, and hence enables identification of the problem as it happens, making it much easier to resolve.

Final words

The above suite of methods is good for hunting memory overruns; however, nothing comes for free. These methods carry a memory size and performance penalty, specifically when reallocating memory using ‘Memory::realloc’. With that in mind, it is advised to use this suite of methods only in _DEBUG mode and to use the standard CRT implementation in Release. This can easily be implemented with a few #ifdef statements.

It is worth noting that PAGE_GUARD raises its exception only once: if the exception is suppressed, no further PAGE_GUARD exceptions will be generated for the page. To support repeating exceptions use e.g. PAGE_NOACCESS rather than PAGE_GUARD.

Preface

This article is intended for developers experienced with C++ and low-level Windows API programming.

Introduction

The focus of this article is an autonomous method of generating dump files without the need for any installed development tool.

It starts by giving a high-level explanation of what dump files are and what they are used for. Then it presents a few of the most common development tools used to generate dump files and discusses the Windows exception model. Finally, a way of generating dump files without the need for any development tool is presented.

So what is a dump file?

A dump file is the image of a process at a certain point in time. This process image can include various information such as the call stack and stack variables, the loaded module list, and even an image of the raw memory used by the application. This valuable information can then be used to analyze the process state at the time the dump file was generated.

What is it used for?

In most cases ( but not only ) dump files are used to identify the root of an exceptional condition causing the process to terminate abnormally ( a 2nd chance exception ). Having a dump file generated just before the application crashes enables postmortem analysis of the process state at the time of the crash, and thus enables pin-pointing the root of the problem.

Using MS Visual Studio to generate memory dumps

Microsoft Visual Studio enables generation of memory dumps while the execution of a debugged process is paused. This can be done through the Debug->Save Dump As menu item, as illustrated in Figure 1 below.

Figure 1

Two dump file types are supported by the IDE: a ‘minidump’ that includes stack trace information ( resulting in small files ), and a ‘minidump with heap’ that includes the full memory image ( resulting in large files ).

Using ADPlus to generate dump files

Debugging Tools for Windows is a lightweight suite of tools for debugging applications. It is ideal for customer-site problem resolution and for scenarios where it is not possible to install heavy-duty development environments such as Microsoft Visual Studio.

ADPlus ( also known as ‘AutoDump+’ ) is a lightweight tool used to automatically generate dump files. That is, upon abnormal process termination a dump file is automatically generated, enabling postmortem analysis of the process state at the time of the crash. It also supports automatic dump generation upon deadlocks. Figure 2 below presents a sample ADPlus command line.

ADPlus.exe -crash -pn winword.exe -o d:\Dumps

Figure 2

The above attaches ADPlus to winword.exe and generates dump files at ‘d:\dumps’ when winword.exe crashes. See the ADPlus documentation for the full command line specification.

Analyzing Dump Files

Dump file analysis is the phase where the postmortem takes place. Starting with Microsoft Visual Studio 10, it is possible to analyze dump files of unmanaged applications directly in the IDE. This is done through the “File->Open->’File…’” menu and then by selecting the dump file to analyze ( ‘*.dmp’, ‘*.mdmp’, ‘*.hdmp’ extensions ).

Once the file is opened, click the ‘Play’ icon and the IDE will take you to the point where the application broke.

It is important to note that for dump analysis to work properly it is essential to keep the symbol files ( .pdb ) associated with the executable for which the dump was created; these should then be used during the analysis process.

Dump file analysis for managed applications is supported by Debugging Tools for Windows and will be covered in a separate article.

Process termination due to Exceptional condition

A process might be abnormally terminated due to an exceptional condition preventing normal process execution. Such an exceptional condition is usually due to a programming error ( a SW bug ). A list describing common exceptions can be found here.

The operating system uses structured exceptions to indicate such exceptional behavior. The executing application gets the first chance to deal with the exception; if it is not dealt with, or is dealt with but not suppressed, the operating system gets the second chance to deal with it. These are referred to as 1st and 2nd chance exceptions respectively. Most of the time, when a 2nd chance exception is generated the operating system terminates the application ( a crash ). Exceptions to this rule are e.g. debugger breakpoints ( DebugBreak() ), where, once intercepted, the OS opens a dialog letting the programmer choose whether to debug the application ( assuming a debugger is installed ) or suppress the exception.

Generating Dump files upon abnormal termination

No more than a few lines of code are needed to automatically generate dump files when the application crashes. Figure 3 below demonstrates what is needed.

Figure 3

The above code snippet uses structured exception handling to intercept 2nd chance exceptions. This is done by installing the unhandled exception handler ‘__TopLevelExceptionHandler’ ( using SetUnhandledExceptionFilter ), which intercepts all 2nd chance exceptions.

Once an exception has been intercepted, ‘__TopLevelExceptionHandler’ is invoked and does the actual dump file generation.

The unhandled exception handler ( in our case ‘__TopLevelExceptionHandler’ ) is executed in the context of the thread throwing the exception, and the thread's stack is not unwound while the handler executes. This might limit the exception handler implementation in stack overflow scenarios, where there might not be enough space left on the stack to execute the handler functionality. For this reason ‘__TopLevelExceptionHandler’ creates a separate thread where the actual ~dumping~ process executes, and waits for it to complete.

The actual dumping process is executed by the ‘__GenerateDumpFile’ method, specifically by using the MiniDumpWriteDump API.

By default the dump file is generated in the directory of the executing process; the name of the file includes the time, the exception code, and the name of the process.

The code can easily be integrated into any C++ application, enabling automatic dump file generation and reducing the cost of customer-site problem interception.
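As an illustration of such a naming scheme, here is a minimal sketch that composes a dump file name from the three components mentioned above. The exact format ( date_time_code_process.dmp ) and the function name BuildDumpFileName are assumptions made for this example; the original code may order or format the fields differently.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <string>

// Builds "<YYYYMMDD>_<HHMMSS>_<exception code in hex>_<process name>.dmp".
// The layout is hypothetical; only the three ingredients come from the article.
std::string BuildDumpFileName(const std::string& processName,
                              std::uint32_t exceptionCode,
                              const std::tm& when) {
    char prefix[64];
    std::snprintf(prefix, sizeof(prefix), "%04d%02d%02d_%02d%02d%02d_%08X",
                  when.tm_year + 1900, when.tm_mon + 1, when.tm_mday,
                  when.tm_hour, when.tm_min, when.tm_sec,
                  static_cast<unsigned>(exceptionCode));
    return std::string(prefix) + "_" + processName + ".dmp";
}
```

Putting the timestamp first makes the files sort chronologically in a directory listing, which helps when several crashes of the same process are collected at a customer site.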

Final words

I tried to keep the code provided with this article as clear and simple as possible. The generated dump files might take considerable disk space, so integrating this code into any commercial product will require implementing a dump file recycling mechanism.