
Developing a Graphics Driver II

Promit


Ok, I missed a lot of days. It's been a bit of a rough week. XNA entry comes tomorrow. By the way, I have a GameDev article coming up soonish. I'll probably make a lot of noise once it actually shows up. This particular article is about some incredibly awesome work I did late last year.

Developing a Graphics Driver II

In part 1, I briefly covered how things are structured. This entry is all about debugging the driver. As I said, the driver lives partly in user mode and partly in kernel mode. As a practical matter, it's convenient to be able to access either part at will. In other words, a usermode debugger isn't enough -- we need a full kernel debugger. And because kernel debugging freezes the entire machine completely when you hit a breakpoint, it's absolutely necessary to have two machines. One runs all the programs, and the other debugs it. You can connect the two in any of several ways; FireWire is the most convenient. Serial works, if you're willing to suffer through it, and USB might work, with catches I'm not familiar with. The computers had FireWire ports, so I used those.
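For context, getting the target machine into this state amounts to a couple of boot-configuration commands. A hedged sketch: the lines below use the Vista-era bcdedit syntax (on XP the equivalent switches lived in boot.ini), and the channel number is an arbitrary choice that just has to match on both ends.

```
:: On the target ("slave") machine, from an elevated prompt:
bcdedit /debug on
bcdedit /dbgsettings 1394 channel:1

:: On the host ("master"), attach WinDbg over the same channel:
windbg -k 1394:channel=1
```

After a reboot of the target, WinDbg on the host picks up the connection on its own.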

Sadly, VS does not support kernel debugging. I have no idea why not. The tool of choice for most people who need to do Windows kernel debugging is WinDbg. This program is a psychotically powerful debugger, with a terribly awkward and irritating UI frontend attached. It's also kind of slow. Still, it does the job, and setting it up isn't too bad. Once you have the two machines connected, you need to configure the slave for kernel debugging. Once that's done, you can fire up WinDbg on the master and boot the slave. WinDbg will automatically establish a connection when the machine comes up. (Things don't need to happen in this order; WinDbg can connect to an already running machine.) After that, it functions like a normal debugger, except that everything on the machine is being debugged. Any process or module that invokes an int 3, the breakpoint interrupt, will be caught by WinDbg. There is one catch, though: the OS can't recover from a driver invoking a breakpoint. So if you accidentally run a driver build with debugging stuff enabled but without a debugger attached, there's a good chance you'll hard lock the machine.

I'll set breakpoints in the driver as necessary to inspect whatever I'm trying to debug. When a breakpoint is hit, it's pretty much like debugging in VS. All the same information is available, albeit in a much worse interface. It's particularly important to make sure that WinDbg knows where to find debug symbols. Sources of symbols include the PDBs generated by the build, and Microsoft's public symbol server. With those correctly set up, I can see correct call stacks through the NT kernel and the driver. WinDbg also knows how to find the driver's source files from those symbols, so when a breakpoint is hit, it can open the relevant code and point at what's going on. (Again, in a much worse way than VS. We're more at the WinDiff level of "prettiness" here.) What I don't usually get is symbols for the actual application being debugged. That's not surprising; we have access to a lot here, but debug builds of games plus all their symbols, let alone source code, are not really part of that. So I can see what the application is doing externally, but not what is going on inside its head. It's an interesting role reversal, actually. It's also shown me that a lot of games -- even major commercial AAA+ titles -- behave rather badly with respect to the driver.
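If anyone's curious, the symbol setup itself is only a couple of WinDbg commands. The paths here are made-up examples; the srv* syntax pointing at Microsoft's public symbol server is the standard incantation:

```
$$ PDBs from the driver build, plus Microsoft's public symbol server
.sympath C:\driver\build\symbols;srv*C:\symcache*http://msdl.microsoft.com/download/symbols
$$ Force a symbol reload with the new path, then dump the call stack
.reload /f
k
```

With that in place, the k command shows named frames all the way through the kernel and the driver instead of raw addresses.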

Most of the hard work is really in isolating the conditions that cause a bug to occur, and closing in on the source of the problem. Once you know why something has gone wrong, it's usually a fairly trivial change. Not always, of course. Occasionally, it's a real pain, especially when dealing with a badly behaved or cruel application that hits a soft spot, or expects certain behavior where no such behavior was guaranteed. (And of course, it worked on the small set of hardware the game developer has, which may not even be NVIDIA based.)

I wrote this up fairly quickly. Feel free to ask questions, but be aware that there are very strict limits on how much I can say. Don't let that stop you from asking; just don't be disappointed if I don't provide an answer.


2 Comments



Sounds like you've landed yourself a very interesting job there :)

I have a couple of questions.

1) Could you give some examples of the kind of mistreatment an application might give the driver? I.e., what sort of best practices can you suggest that might not be immediately obvious (and perhaps are specific to the NVIDIA driver)?


2) In D3D10 the Begin/End Scene API is (finally) removed. I'm curious whether this was ever any help to the D3D9 driver (there are some restrictions imposed in D3D9 on what must be done outside or inside Begin/End Scene), and how that affects the D3D10 driver (if at all)?


3) I understand you might not be at liberty to speak on this, but do you know much about how the driver internally handles D3D9 state blocks? For instance, I assume a lot of D3D states are compiled into the same GPU registers or register blocks. If a complete set of states for a GPU register is set in a state block, will the driver then store the GPU register rather than the individual states? If so, is there any way to find out which states one should group together to get maximum benefit from this? I would assume that grouping together states that in D3D10 are part of the same state object should be likely to work optimally on D3D10 hardware while running under D3D9.


Cheers,
Christian
