Recently I've been working on a rather difficult task, namely creating PDB debug database files from the Epoch Language (https://github.com/apoch/epoch-language) compiler toolchain.
This is difficult in part because the PDB format is not well understood in general, and is certainly poorly documented. I can't go much further without a hearty thanks to the LLVM project, and particularly their tool llvm-pdbdump, which makes it much easier to test whether a generated PDB is sane. When llvm-pdbdump has good information about the state of a given PDB, it is invaluable; when it falls short, as is inevitable with a format like PDB, it at least gives me a starting point for understanding why things have gone wrong.
However, there is another tool, from Microsoft themselves, called Dia2Dump.exe, which uses an authoritative implementation of the PDB format: the file MsDia140.dll that ships with Visual Studio 2015. This library is (as near as I can tell) close or identical to the code Visual Studio itself uses for debugging programs. It also seems to parallel the implementation in DbgHelp.dll; I use both extensively in my research.
Last but not least, I must mention the Microsoft-PDB repo on GitHub, which contains partial source for the implementation of the PDB format. It does not actually compile right now, so it's hard to use, but it serves a significant purpose for me: I can cross-reference functions in MsDia140.dll with this code, and use that for some serious reverse-engineering.
When feeding data into a black box like MsDia140.dll, it can be hard to know which code paths are taken and why. For example, let's look at the function GSI1::readHash (you can follow along in the Microsoft-PDB source).
This function does some stuff I still don't fully understand, so let's walk through the process of gaining more understanding.
First we need a partially malformed PDB. This is easy to produce, since PDB files are sensitive to tiny changes, often in non-obvious ways. In particular, I'm going to work on the Publics stream. This is a fragment of a Multi-Stream File (aka MSF) which contains, among other things, the publicly visible debug symbols for some program.
At the beginning of the stream, there is a structure which llvm-pdbdump is sadly cryptic about. Thankfully, llvm-pdbdump contains some sanity checks which seem to align well with the checks made by Microsoft's code, so it's at least easy to use the tool to verify what we're spitting out.
readHash is responsible for decoding part of this data structure, which appears to be some kind of hash table for accelerating symbol lookups. Inside the code for readHash there is a call to a pesky function called fixHashIn. By attaching WinDbg to a running copy of Dia2Dump.exe and setting a liberal number of breakpoints, I traced a failure in my PDB generation code to this single function.
fixHashIn is vomiting because I'm feeding it data it doesn't like.
The first thing to note is that fixHashIn begins with an instruction that decrements one of its parameters. This parameter is supposedly the number of buckets in the hash table, or so I extrapolate from the source.
In my case, the parameter has a value of zero! Clearly I don't want my hash table to have zero buckets, so it becomes apparent why fixHashIn is choking. What I don't immediately understand is why it thinks zero is the number of buckets. I had thought that I was passing in a value (8 bytes per entry * 16 entries = 128) that would work. Clearly I was wrong, but where was the zero coming from?
A little more background is in order. In an MSF file (MSF being the container format that PDB files are built on), data is divided into streams, each of which is made up of one or more blocks. Blocks can come in different sizes, but I'm using 1KB (1024 bytes) for convenience. Unused space in a block is filled with junk bytes.
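To make the block layout concrete, here is a minimal sketch of chopping one stream's data into fixed-size blocks. The function name and the fill parameter are illustrative assumptions, not part of my actual PDB writer, and a real MSF file also needs a directory describing which blocks belong to which stream.

```python
from typing import List

BLOCK_SIZE = 1024  # block size used in this post; other sizes are possible


def split_into_blocks(stream: bytes,
                      block_size: int = BLOCK_SIZE,
                      fill: bytes = b"\x00") -> List[bytes]:
    """Split stream data into fixed-size blocks, padding the last block.

    `fill` is the single junk byte used for the unused tail (hypothetical
    parameter; it defaults to zero here).
    """
    if len(fill) != 1:
        raise ValueError("fill must be exactly one byte")
    blocks = []
    for start in range(0, len(stream), block_size):
        chunk = stream[start:start + block_size]
        # Only the final chunk can be short; pad it up to a full block.
        chunk += fill * (block_size - len(chunk))
        blocks.append(chunk)
    return blocks
```

A 1500-byte stream, for example, ends up as two blocks, the second of which carries 548 bytes of padding.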
Crucially, I pad my blocks with zeroes. If the PDB interpreter is somehow reading my padding bytes, it might incorrectly assume I want to feed it a zero-size hash table, which is obviously a problem. So what to do?
And now the meat of everything!
Instead of padding my file with zeroes, I use carefully crafted poison values. For my purposes I'm working with 32-bit data, so a poison value is usually 4 bytes long. A good example is 0xfeedface, which is a funny but valid hex number that happens to be the right size.
The important thing is that we can't just pad every 32-bit slot with 0xfeedface. Instead, we want to make permutations of the poison value, one unique permutation per slot. Every 4-byte slot of my PDB's "padding" now holds a unique string of digits.
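The exact permutation scheme matters less than its uniqueness. As a minimal sketch, assume each slot simply gets 0xfeedface plus its slot index, written little-endian; this scheme and the helper names are illustrative assumptions, not necessarily what my generator does.

```python
import struct

POISON_BASE = 0xFEEDFACE  # the example poison value from the text
SLOT_SIZE = 4             # 32-bit slots


def poison_value(slot_index: int) -> int:
    """Return a distinct 32-bit poison value for a slot (assumed scheme)."""
    return (POISON_BASE + slot_index) & 0xFFFFFFFF


def poison_padding(byte_count: int, first_slot: int = 0) -> bytes:
    """Build `byte_count` bytes of padding in which every slot is unique.

    `first_slot` lets separate padding regions continue one numbering, so
    values stay unique across the whole file.
    """
    if byte_count % SLOT_SIZE:
        raise ValueError("padding must be a whole number of 4-byte slots")
    slots = byte_count // SLOT_SIZE
    return b"".join(
        struct.pack("<I", poison_value(first_slot + i)) for i in range(slots)
    )
```

Because the values are consecutive, they still look like 0xfeedfXXX in a hex dump for the first few thousand slots, which makes them easy to spot by eye.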
Here's the magic part: when I run this in the debugger, I can walk into the fixHashIn function and look at its parameters. If a parameter holds one of my poison values, I know exactly which slot of the file it was read from.
My first run of this process is surprising: despite poisoning a bunch of data around where I thought this zero was coming from, the value is still zero when we reach the fixHashIn function! This indicates one of two things.
The value is read from a place I didn't poison
The value might be computed somehow
To rule out the possibility that I'm not poisoning enough, I expand the poison to the entire file instead of just one block's worth of padding bytes. The debugger still stubbornly shows the parameter as zero, meaning that the zero is being computed from some other data being fed in, not read directly from my padding on disk.
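The check I am doing by eye in the debugger can be sketched as a small lookup. This assumes poison values were generated as 0xfeedface plus a slot index (an illustrative scheme, not necessarily the one my generator uses), and the function name is hypothetical.

```python
from typing import Optional

POISON_BASE = 0xFEEDFACE  # assumed base for the poison scheme


def poison_slot(observed: int,
                slot_count: int,
                base: int = POISON_BASE) -> Optional[int]:
    """Map a value seen in the debugger back to a padding slot index.

    Returns None when the value is not one of the `slot_count` poison
    values, which means it was computed or came from non-padding data.
    """
    index = (observed - base) & 0xFFFFFFFF
    return index if index < slot_count else None
```

With the whole file poisoned, a parameter of 0xfeedfad0 would point straight at slot 2, while the stubborn zero maps to no slot at all.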
This line of the Microsoft PDB source is illuminating, but this line (https://github.com/Microsoft/microsoft-pdb/blob/master/PDB/dbi/gsi.cpp#L65) is even more so. At line 65 is a comment stating that fixHashIn is called from two places. One of them is the loader for the Publics stream, but the other is for a totally unrelated stream.
It turns out I've been hitting breakpoints all evening in fixHashIn, but the call stack is wrong. The calls I've been seeing are from a totally different stream of data.
This post may not have a cheerful ending, but I hope the value of poisoning data is clear: without 100% proof that the evil zero was not coming from my Publics stream, it might have taken me days to realize my mistake.
In any event, I use the poison technique a lot, and this is just one sampling of my adventures with the PDB format. Maybe I'll have a better story of success tomorrow!