How to do kernel debugging for Windows NT 3.1 build 196

Spoiler: it's fairly straightforward, especially if you've done NT kernel debugging before.

If you've been messing around with the recently uploaded build 196 of Windows NT 3.1 from September 1991 (now the earliest publicly available build of Windows NT, thanks to ReflectiaX for this one!), you've probably noticed that it likes to hang a lot. Since it is a checked (debug) build, this raises the question: is the system actually crashing completely, or is it just waiting for the debugger to do something?

Luckily, this build includes what I think is the earliest known version of the NT kernel debugger, kd (for a long time the i386 version was appropriately named i386kd). However, this version is so early that it's an OS/2 1.x program that won't run on Windows NT, and 196 is not even stable enough to debug itself anyway. To use that particular version you therefore need an OS/2 1.x (virtual) machine. Alternatively, some later versions of i386kd that run on Windows seem to work just as well: I tried the NT 4.0 version and it works fine, but the latest Windows 10 version doesn't.

Curiously, the setup batch file (SETUP.BAT for DOS or SETUP.CMD for OS/2) has an optional parameter called newdbg, described as: "newdbg: Optional specifying OS/2-based kernel debugger. Default old-style kernel debugger outputs to terminal connected to COM1.". However, the batch files themselves reveal that this parameter is meaningless in this build: all it does is copy NTOSKRNL.EXE from a different directory on the installation media, and that file is identical to the one installed without the parameter. In addition, this kernel always expects the debugger on COM2, not COM1 as the batch file claims. It seems the parameter and the older COM1 debugger became obsolete at some point before this build, and the same new debugger kernel was simply placed in both directories as a quick and easy upgrade that avoided reworking the setup batch files.

This leaves you with two options for debugging build 196: you can run the kernel debugger in a virtual machine, using an emulator or hypervisor that can connect the guest's serial (COM) port to a named pipe or serial port on the host (I used VirtualBox for this, but I'm sure you could get a working setup with others as well), or you can run the debugger on your host OS. Unless you're doing this with two virtual machines and named pipes, I recommend a program like com0com, which creates virtual COM port pairs on your host that you can then attach the debugger and debuggee to.

For OS/2 1.x, I went with Microsoft's release of OS/2 1.30. To get the COM ports working in VirtualBox, I had to add the line "DEVICE=C:\OS2\COM01.SYS (1,3F8,4) (2,2F8,3)" to my CONFIG.SYS file. This loads the serial port driver and configures the first two ports for the VirtualBox defaults (COM1 at address 0x3F8 with IRQ 4, COM2 at address 0x2F8 with IRQ 3).

You also need to connect one of the emulated COM ports (either 1 or 2, it doesn't matter which on the debugger's end) to a named pipe on the host (e.g. \\.\pipe\ntdbg) or to one of the host's virtual COM ports in the pair you configured with com0com (e.g. \\.\COM10). Do the same on the debuggee (196) side, except that there you must use the guest's COM2 port, and connect it to the same named pipe or to the other virtual COM port from the com0com pair. You can also run a version of Windows NT with i386kd inside a virtual machine, in which case the process is the same as long as the port configuration in the guest OS matches the one in VirtualBox. If you're using named pipes, keep in mind that one of the virtual machines must create the pipe (the one you start first; it doesn't really matter which of the two you pick), while the other simply opens it.
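As a rough sketch of the named-pipe wiring described above, the following Python snippet only builds and prints the VBoxManage command lines; it does not run them. The VM names "OS2-130" and "NT-196" are hypothetical placeholders, and the flag spelling is an assumption that depends on your VirtualBox version (older releases document --uartmodeN, newer manuals list --uart-modeN), so check it against your own installation.

```python
PIPE = r"\\.\pipe\ntdbg"  # pipe name taken from the example in the text


def uart_args(vm_name, port, io_base, irq, mode, pipe, new_style=False):
    """Build a VBoxManage argument list wiring one guest COM port to a host pipe.

    mode is "server" (VirtualBox creates the pipe) or "client" (it opens an
    existing one). new_style switches to the --uart-modeN spelling.
    """
    if mode not in ("server", "client"):
        raise ValueError("mode must be 'server' or 'client'")
    mode_flag = f"--uart-mode{port}" if new_style else f"--uartmode{port}"
    return [
        "VBoxManage", "modifyvm", vm_name,
        f"--uart{port}", hex(io_base), str(irq),
        mode_flag, mode, pipe,
    ]


# Debugger VM (hypothetical name): COM1 creates the pipe, so start this VM first.
debugger = uart_args("OS2-130", 1, 0x3F8, 4, "server", PIPE)
# Debuggee VM (hypothetical name): build 196 expects the debugger on COM2.
debuggee = uart_args("NT-196", 2, 0x2F8, 3, "client", PIPE)

for cmd in (debugger, debuggee):
    print(" ".join(cmd))
```

Which VM acts as the server is arbitrary here; swap the two modes if you would rather boot build 196 first.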

If you're going to run the debugger on your host OS, you can't use named pipes, since i386kd doesn't seem to support them. All you need to do is open CMD (or PowerShell, if that's your thing), set the _NT_DEBUG_PORT environment variable to one of the virtual COM ports, and run i386kd. Just like in the previous paragraph, the virtual machine running build 196 needs to connect the guest's COM2 to the other virtual COM port from the pair on the host.
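If you'd rather script the host-side launch than type it into a shell, a minimal Python sketch could look like this. The port name "com3" in the usage note is a hypothetical example, the exact spelling i386kd accepts for your virtual port is an assumption you'll need to verify, and the executable is assumed to be on your PATH.

```python
import os
import subprocess


def debugger_env(port, base_env=None):
    """Return a copy of the environment with _NT_DEBUG_PORT set to `port`."""
    env = dict(os.environ if base_env is None else base_env)
    env["_NT_DEBUG_PORT"] = port
    return env


def launch_debugger(port, exe="i386kd"):
    """Run the debugger with the port variable set; blocks until it exits."""
    return subprocess.run([exe], env=debugger_env(port))
```

Calling launch_debugger("com3") would then start the debugger on one end of the virtual pair, while the build 196 VM has its COM2 attached to the other end.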

The debugger will wait for a connection until 196 is past the boot loader, at which point you should see some output that gives you a glimpse of the early NT internals. Once the system is booted, you can make it crap itself (which shouldn't be too hard, honestly) and you'll perhaps see some output on the debugger. Sometimes the system really does break completely with nothing on the debugger, sometimes the error is only logged to the debugger and there's no recovery, and sometimes you can continue execution past the error (although the system is often even less stable after that). Doing this for some time revealed that quite a few of the hangs are caused by problems in the GDI and USER modules, which is understandable since the GUI was still a fairly recent addition to NT at this point.
