Adding your own data to a Windows system crash dump

My work (the Windows kernel) regularly involves analyzing BSOD dumps. Quite often the end user's machine manages to write only minidumps, which contain little more than the processor registers and the stack of the crashing thread, and there is simply no other way to debug the client machine. But what if our driver is not on that stack, while the customer insists that the crashes began after the product was installed and stopped once the product's driver was disabled? In my case, keeping a small log of recent events in a circular buffer turned out to be a good solution. All that remains is to save this circular buffer in the dump.


Below the cut, I'll explain how to add data from a driver to the dump and then extract it using pykd.


Starting with Windows XP SP1 and Windows Server 2003, the system lets drivers add their own data to the kernel crash dump: Secondary Callback Data. For the system to request this data from the driver, you register a callback function by calling KeRegisterBugCheckReasonCallback. When registering, you specify the address of the function that will be called when the kernel crashes; in our case (BugCheckSecondaryDumpDataCallback) that function supplies the data to be added to the system dump. The callback function is called twice:


  1. The first time, the system calls the driver to determine the required buffer size. Already at this stage the OS reports, in the input data (KBUGCHECK_SECONDARY_DUMP_DATA.MaximumAllowed), the maximum amount of data that can be saved in the dump. This size depends on the type of system dump being generated. On Windows XP configured to write minidumps, the system provides 4096 bytes (one page of memory).
  2. The second time, the system requests the data itself.

Because the callback function is called while the operating system kernel is crashing, its code is subject to serious restrictions: it must not allocate memory (everything has to be allocated in advance), must not touch pageable memory (paging is not possible), and must not use synchronization mechanisms (risk of deadlock). See the MSDN article "Writing a Bug Check Callback Routine" for more details.


Strangely enough, the WDK sample collection contains no example of using KeRegisterBugCheckReasonCallback. An example can be found, however, in Microsoft's open-source KMDF (Kernel-Mode Driver Framework), in fxbugcheckcallback.cpp:


Handler registration: fragments of the FxInitializeBugCheckDriverInfo function
    //
    // The KeRegisterBugCheckReasonCallback exists for xp sp1 and above. So
    // check whether this function is defined on the current OS and register
    // for the bugcheck callback only if this function is defined.
    //
    RtlInitUnicodeString(&funcName, L"KeRegisterBugCheckReasonCallback");
    funcPtr = (PFN_KE_REGISTER_BUGCHECK_REASON_CALLBACK)
        MmGetSystemRoutineAddress(&funcName);

    if (NULL == funcPtr) {
        goto Done;
    }

    //
    // Initialize the callback record.
    //
    KeInitializeCallbackRecord(callbackRecord);

    //
    // Register the bugcheck callback.
    //
    funcPtr(callbackRecord,
            FxpLibraryBugCheckCallback,
            KbCallbackSecondaryDumpData,
            (PUCHAR)WdfLdrType);

    ASSERT(callbackRecord->CallbackRoutine != NULL);

Handler implementation: the FxpLibraryBugCheckCallback function
VOID
FxpLibraryBugCheckCallback(
    __in    KBUGCHECK_CALLBACK_REASON Reason,
    __in    PKBUGCHECK_REASON_CALLBACK_RECORD /* Record */,
    __inout PVOID ReasonSpecificData,
    __in    ULONG ReasonSpecificLength
    )

/*++

Routine Description:

    Global (framework-library) BugCheck callback routine for WDF

Arguments:

    Reason               - Must be KbCallbackSecondaryData
    Record               - Supplies the bugcheck record previously registered
    ReasonSpecificData   - Pointer to KBUGCHECK_SECONDARY_DUMP_DATA
    ReasonSpecificLength - Sizeof(ReasonSpecificData)

Return Value:

    None

Notes:
    When a bugcheck happens the kernel bugcheck processor will make two passes
    of all registered BugCheckCallbackRecord routines.  The first pass, called
    the "sizing pass" essentially queries all the callbacks to collect the
    total size of the secondary dump data. In the second pass the actual data
    is captured to the dump.

--*/

{
    PKBUGCHECK_SECONDARY_DUMP_DATA  dumpData;
    ULONG                           dumpSize;

    UNREFERENCED_PARAMETER(Reason);
    UNREFERENCED_PARAMETER(ReasonSpecificLength);

    ASSERT(ReasonSpecificLength >= sizeof(KBUGCHECK_SECONDARY_DUMP_DATA));
    ASSERT(Reason == KbCallbackSecondaryDumpData);

    dumpData = (PKBUGCHECK_SECONDARY_DUMP_DATA) ReasonSpecificData;
    dumpSize = FxLibraryGlobals.BugCheckDriverInfoIndex * 
                sizeof(FX_DUMP_DRIVER_INFO_ENTRY);
    //
    // See if the bugcheck driver info is more than can fit in the dump
    //
    if (dumpData->MaximumAllowed < dumpSize) {
        dumpSize = EXP_ALIGN_DOWN_ON_BOUNDARY( 
                        dumpData->MaximumAllowed,
                        sizeof(FX_DUMP_DRIVER_INFO_ENTRY));
    }

    if (0 == dumpSize) {
        goto Done;
    }

    //
    // Ok, provide the info about the bugcheck data.
    //
    dumpData->OutBuffer = FxLibraryGlobals.BugCheckDriverInfo;
    dumpData->OutBufferLength  = dumpSize;
    dumpData->Guid = WdfDumpGuid2;

Done:;
}
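
To make the size clamping in the handler concrete, here is a small Python sketch of the same arithmetic. It assumes that EXP_ALIGN_DOWN_ON_BOUNDARY rounds a length down to a multiple of the entry size, as its name and the surrounding comment suggest. It also uses a 56-byte entry, which is the x64 entry size inferred from the .enumtag listing in this article (0x508 bytes in total, entries 0x38 bytes apart). The function name clamp_dump_size is hypothetical.

```python
def clamp_dump_size(entry_count, maximum_allowed, entry_size):
    """Return how many bytes of whole entries fit into the dump.

    Mirrors the handler's logic: take everything if it fits, otherwise
    round MaximumAllowed down to a whole number of entries.
    """
    dump_size = entry_count * entry_size
    if maximum_allowed < dump_size:
        # Assumed behaviour of EXP_ALIGN_DOWN_ON_BOUNDARY: round down
        # to a multiple of entry_size.
        dump_size = maximum_allowed - (maximum_allowed % entry_size)
    return dump_size


ENTRY_SIZE = 56  # assumption: x64 entry size inferred from the listing

# 23 entries fit into one 4096-byte page without truncation.
print(clamp_dump_size(23, 4096, ENTRY_SIZE))   # 1288 (0x508)
# 100 entries do not fit; only 73 whole entries are kept.
print(clamp_dump_size(100, 4096, ENTRY_SIZE))  # 4088
```

When the limit is smaller than a single entry the result is 0, which corresponds to the handler's early exit through the Done label.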

For the demonstration, this is the data we will extract from the dump. The data is an array of FX_DUMP_DRIVER_INFO_ENTRY structures, each of which holds a driver name and version in its fields. The key to the data in the dump is the GUID specified when it was written, in our case {F87E4A4C-C5A1-4d2f-BFF0-D5DE63A5E4C3}.
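
The GUID appears in slightly different textual forms in this article: with mixed case in the script, and in the .enumtag listing below in upper case, in braces, and without the last hyphen. A small helper built on the standard uuid module can normalize such strings before comparing them. The helper name same_guid is hypothetical, and the sketch makes no assumption about which of these forms pykd.loadTaggedBuffer itself accepts.

```python
import uuid


def same_guid(a, b):
    """Compare two GUID strings, ignoring case, braces and hyphen placement."""
    return uuid.UUID(a) == uuid.UUID(b)


key = "F87E4A4C-C5A1-4d2f-BFF0-D5DE63A5E4C3"      # form used in the script
listed = "{F87E4A4C-C5A1-4D2F-BFF0D5DE63A5E4C3}"  # form printed by .enumtag

print(same_guid(key, listed))
print(str(uuid.UUID(listed)))  # canonical lower-case form
```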


To view the data stored in a dump, the debugger provides the .enumtag command. Its output is a raw memory dump. Here is an example with the data we are interested in:


1: kd> .enumtag
{65755A40-F146-43EA-8C9136B85728FD35} - 0x0 bytes
<...>
{F87E4A4C-C5A1-4D2F-BFF0D5DE63A5E4C3} - 0x508 bytes
  00 00 00 00 00 00 00 00 01 00 00 00 0D 00 00 00  ................
  00 00 00 00 57 64 66 30 31 30 30 30 00 00 00 00  ....Wdf01000....
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00 00 00 00 00 00 00 00 90 AC 55 00 00 E0 FF FF  ..........U.....
  01 00 00 00 0B 00 00 00 00 00 00 00 61 63 70 69  ............acpi
  65 78 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ex..............
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  30 81 F6 00 00 E0 FF FF 01 00 00 00 0B 00 00 00  0...............
  00 00 00 00 6D 73 69 73 61 64 72 76 00 00 00 00  ....msisadrv....
  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00 00 00 00 00 00 00 00 A0 D3 EB 00 00 E0 FF FF  ................
  01 00 00 00 0B 00 00 00 00 00 00 00 76 64 72 76  ............vdrv
<...>
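
The listing can also be decoded by hand, which helps to see what the bytes mean. The sketch below uses only the Python standard library and the first 0x70 bytes of the listing above. The layout it assumes (an 8-byte pointer, three 32-bit version fields, then a 32-byte name, 56 bytes per entry) is inferred from this x64 listing rather than taken from the symbols, so treat it as an assumption.

```python
import struct

# First 0x70 bytes of the tagged buffer, copied from the listing above.
RAW = bytes.fromhex(
    "00 00 00 00 00 00 00 00 01 00 00 00 0D 00 00 00 "
    "00 00 00 00 57 64 66 30 31 30 30 30 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 90 AC 55 00 00 E0 FF FF "
    "01 00 00 00 0B 00 00 00 00 00 00 00 61 63 70 69 "
    "65 78 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
)

HEADER = struct.Struct("<QIII")  # pointer, major, minor, build
NAME_OFFSET = HEADER.size        # the name follows the 20-byte header
NAME_SIZE = 32                   # assumption inferred from the listing
ENTRY_SIZE = 56                  # assumption: includes 4 bytes of padding


def decode_entries(raw):
    """Split the buffer into (name, major, minor, build, pointer) tuples."""
    entries = []
    for offset in range(0, len(raw) - ENTRY_SIZE + 1, ENTRY_SIZE):
        ptr, major, minor, build = HEADER.unpack_from(raw, offset)
        start = offset + NAME_OFFSET
        # The name is NUL-terminated inside a fixed-size field.
        name = raw[start:start + NAME_SIZE].split(b"\0", 1)[0]
        entries.append((name.decode("ascii", "replace"), major, minor, build, ptr))
    return entries


for name, major, minor, build, ptr in decode_entries(RAW):
    print("{:10} {}.{}.{}  0x{:x}".format(name, major, minor, build, ptr))
```

The two decoded entries are Wdf01000 (1.13.0, null pointer) and acpiex (1.11.0, pointer 0xffffe0000055ac90), which agrees with the script output shown later in the article.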

This format is usable, but not convenient. Microsoft suggests writing your own debugger extension:


To use this data in a more practical way, it is recommended that you write your own debugger extension.

But I am one of the developers of the pykd project. The pykd module can act as a debugger extension that lets you automate debugging in Python, so I will show how to use it to extract and visualize the data. A caveat up front: enumerating and extracting Secondary Callback Data was added in the latest release at the time of writing, 0.3.3.3. If you have an older version installed, you need to update pykd (Last Release).


As a test dump, I will use a file from the pykd unit tests: win8_x64_mem.cab


Here is the entire script for reading and formatting the data:


kmdf_tagged.py
import os
import sys
import pykd
import struct

def print_command(command):
    if pykd.getDebugOptions() & pykd.debugOptions.PreferDml:
        pykd.dprint( '<exec cmd="{}">{}</exec>'.format(command, command),
                     dml = True )
    else:
        pykd.dprint( command )

def parse():
    buff = bytearray( pykd.loadTaggedBuffer("F87E4A4C-C5A1-4d2f-BFF0-D5DE63A5E4C3") )
    entry_type = pykd.typeInfo("Wdf01000!_FX_DUMP_DRIVER_INFO_ENTRY")

    _struct = struct.Struct( "<{}III".format("Q" if pykd.is64bitSystem() else "L") )

    name_offset = entry_type.fieldOffset("DriverName")
    name_size = entry_type.DriverName.size()

    entry_size = entry_type.size()

    if len(buff) % entry_size:
        raise RuntimeError( "The buffer size ({}) is not a multiple of entry size ({})".format(len(buff), entry_size) )

    print("[FxLibraryGlobals.BugCheckDriverInfo]")

    while len(buff):
        ptr, mj, mn, build = _struct.unpack_from(buff)

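        # Decode the fixed-size name field explicitly: under Python 3,
        # str() on a bytearray yields "bytearray(b'...')", not the text.
        name = buff[name_offset : name_offset + name_size].decode("ascii", "replace").strip("\0")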

        command = "!drvobj {} 7".format(name)
        print_command( command )

        pykd.dprint( " " * (24 - len(name)) )

        pykd.dprint( " {:12} ".format("({}.{}.{})".format(mj, mn, build)) )
        if ptr:
            command = "dx ((Wdf01000!{})0x{:x})".format(entry_type.FxDriverGlobals.name(), ptr)
            print_command( command )

        pykd.dprintln( "" )

        buff = buff[entry_size:]

if __name__ == "__main__":
    if len(sys.argv) == 1:
        parse()
    else:
        for file_name in sys.argv[1:]:
            print(file_name)
            dump_id = pykd.loadDump(file_name)
            parse()
            pykd.closeDump(dump_id)

The script, in my opinion, is quite simple (see the parse function):


  • Calling pykd.loadTaggedBuffer with the GUID, passed as a string, reads the contents of the stored data.
  • Using information from the debugging symbols (by creating a pykd.typeInfo object), we get the offset of the driver name (name_offset), the size of the driver name buffer (name_size), and the size of one FX_DUMP_DRIVER_INFO_ENTRY structure (entry_size).
  • For each FX_DUMP_DRIVER_INFO_ENTRY structure in the buffer we read, the standard Python struct module unpacks the fields holding the pointer to the driver's global object and the version. We then get the driver name by converting it to a string and stripping the NUL characters, and print the result using DML if the current environment supports this markup language (the print_command function).

Running the script in the WinDbg debugger:
[Screenshot: script output in WinDbg]


If you look at the script after the parse function, you will notice that it accepts arguments. kmdf_tagged.py is written to also demonstrate stand-alone operation (outside the debugger) when command-line arguments are given. The script treats each argument as the path to a dump file, loads that dump, and extracts the target data from it. In particular, the script can process dump files in batch mode:


~> for /R .\dumps %i in (*.*) do @python.exe kmdf_tagged.py %i
~\dumps\win8_x64_mem.cab
[FxLibraryGlobals.BugCheckDriverInfo]
!drvobj Wdf01000 7                 (1.13.0)
!drvobj acpiex 7                   (1.11.0)     dx ((Wdf01000!_FX_DRIVER_GLOBALS*)0xffffe0000055ac90)
<...>
!drvobj PEAUTH 7                   (1.7.6001)   dx ((Wdf01000!_FX_DRIVER_GLOBALS*)0xffffe000022081c0)
~\dumps\win8_x64_mem2.cab
[FxLibraryGlobals.BugCheckDriverInfo]
!drvobj Wdf01000 7                 (1.13.0)
!drvobj acpiex 7                   (1.11.0)     dx ((Wdf01000!_FX_DRIVER_GLOBALS*)0xffffe0000055ac90)
<...>
!drvobj PEAUTH 7                   (1.7.6001)   dx ((Wdf01000!_FX_DRIVER_GLOBALS*)0xffffe000022081c0)
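
The cmd loop above can also be expressed in Python, for example to filter files before handing them to the script. The sketch below only collects candidate paths with os.walk. The function name find_dumps and the extension filter are assumptions made for illustration; each returned path would then be passed to pykd.loadDump as in kmdf_tagged.py.

```python
import os


def find_dumps(root, extensions=(".dmp", ".cab")):
    """Return sorted paths of files under root with a dump-like extension."""
    found = []
    for dir_path, _dir_names, file_names in os.walk(root):
        for file_name in file_names:
            # The extension list is an assumption; adjust it as needed.
            if file_name.lower().endswith(extensions):
                found.append(os.path.join(dir_path, file_name))
    return sorted(found)


if __name__ == "__main__":
    for path in find_dumps(os.path.join(".", "dumps")):
        print(path)
```

Unlike the `*.*` mask in the cmd loop, this skips files that are clearly not dumps, so a stray text file in the directory does not cause a load error.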

I hope that my experience (and this article) will be useful to someone, and that the number of BSODs whose cause remains a mystery will tend to zero.