[win32] detect a crashed process


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

14 replies to this topic

#1 edd   Members   -  Reputation: 2105

Posted 30 January 2009 - 10:12 AM

Hi folks. If a process is created with CreateProcess(), is there a way of telling whether it crashed? I'll leave the definition of "crashed" open. I have named pipes connected to the standard streams of the child process, but looking for a broken pipe doesn't seem to do the trick (the offending child process writes to a null pointer).

#2 Drew_Benton   Crossbones+   -  Reputation: 1713

Posted 30 January 2009 - 11:07 AM

Here's a quick solution I've coded for fun [grin]


#include <windows.h>
#include <tlhelp32.h>
#include <shlwapi.h>
#include <cstdio>
#include <cstring>
#include <vector>

#pragma comment(lib, "shlwapi.lib")

// Note: the narrow string literals below assume a non-UNICODE (multi-byte) build.

std::vector<PROCESSENTRY32> GetProcessListByName(const char *process)
{
    PROCESSENTRY32 pe = {0};
    std::vector<PROCESSENTRY32> processList;

    // Try to create a toolhelp snapshot and verify that it was actually created
    HANDLE thSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
    if(thSnapshot == INVALID_HANDLE_VALUE)
    {
        MessageBox(NULL, "Error: Unable to create toolhelp snapshot!", "Loader", MB_ICONERROR);
        // (Vector is empty)
        return processList;
    }

    // Need to have this set for the WinAPI structures
    pe.dwSize = sizeof(PROCESSENTRY32);

    // Try to get the first process
    BOOL retval = Process32First(thSnapshot, &pe);

    // While we have processes to go through
    while(retval)
    {
        // Whenever the executable name matches, add the entry to the vector
        if(StrStrI(pe.szExeFile, process))
        {
            processList.push_back(pe);
        }

        // Then try to get the next process
        retval = Process32Next(thSnapshot, &pe);
    }

    // Release the snapshot handle now that we are done with it
    CloseHandle(thSnapshot);

    // Return the list of processes with this name
    return processList;
}

BOOL CALLBACK MyEnumWindowsProc(HWND hwnd, LPARAM lParam)
{
    DWORD pidwin = 0;
    GetWindowThreadProcessId(hwnd, &pidwin);
    if(pidwin == (DWORD)lParam)
    {
        char title[256] = {0};
        GetWindowText(hwnd, title, 255);
        if(strcmp(title, "CrashTest.exe") == 0)
        {
            printf("Application crash detected! We will now close it.\n");
            // We know our app crashed, but with this simple logic,
            // we could possibly detect it more than once, so we need
            // to do something to close out the crashed application
            // and get rid of DrWatson.
            HANDLE hProcess = OpenProcess(PROCESS_TERMINATE, FALSE, pidwin);
            if(hProcess != NULL)
            {
                TerminateProcess(hProcess, 0);
                CloseHandle(hProcess);
            }
        }
        // Stop enumerating once a window of this process has been seen
        return FALSE;
    }
    return TRUE;
}

int main(int argc, char * argv[])
{
    // Force a crash that invokes DrWatson (most crashes that don't BSOD the system do ;))
#ifdef _DEBUG
    __asm mov eax, 0
    __asm call eax
#endif

#ifndef _DEBUG
    // Check 3 times
    for(int z = 0; z < 3; ++z)
    {
        std::vector<PROCESSENTRY32> processes = GetProcessListByName("dwwin.exe");
        for(size_t x = 0; x < processes.size(); ++x)
        {
            EnumWindows(MyEnumWindowsProc, (LPARAM)processes[x].th32ProcessID);
        }
        // Check every 5 seconds
        Sleep(5000);
    }
#endif
    return 0;
}



It works pretty simply:
1. It searches the current process list for "dwwin.exe" -> DrWatson crash reporter that you usually get.
2. If the process was found, it enumerates all top-level windows, looking for the window handle that belongs to that process.
3. When the window handle is found, the window text is read to get the name of the process that crashed, since the crash reporter's standard behavior is to set its window title to the name of the application that crashed.
4. Your code goes in the section after the crash was detected. For this example I simply show a message to the console and then close the crashed application by terminating the DrWatson crash reporter.

You will need to clean up the code a little to make it work for your executable, change the checking so it happens as frequently as you want, etc. I just wrote it so you can test it easily in a new Visual Studio project, named "CrashTest" of course. Build the debug exe, run it, and leave it in the crashed state. Then build the release exe and run it, and you "should" see it work properly. I don't know how the program behaves when crash reporting is disabled, but those details you can look into [wink]

#3 edd   Members   -  Reputation: 2105

Posted 30 January 2009 - 11:37 AM

Quote:
Original post by Drew_Benton
It works pretty simply:
1. It searches the current process list for "dwwin.exe" -> DrWatson crash reporter that you usually get.

This means it won't work if a different post-mortem debugger is installed, right? Sometimes I swap in drmingw :/

Quote:

2. If the process was found, it enumerates all top-level windows, looking for the window handle that belongs to that process.


I would need this to work for console apps too. Do they have windows of any kind?

Either way, many thanks for your efforts, I appreciate it.

I think I have found another way in the meantime, though I still suspect there might be a much simpler way:

1. Pass CREATE_SUSPENDED, DEBUG_ONLY_THIS_PROCESS and DEBUG_PROCESS to CreateProcess.
2. Use DebugActiveProcess() to attach a makeshift debugger to the child while it's suspended.
3. Spin up another thread that calls WaitForDebugEvent() and listens for a RIP_EVENT.

Still horrible, but I think it's the best option so far (assuming it works!).

#4 edd   Members   -  Reputation: 2105

Posted 30 January 2009 - 02:37 PM

I can confirm that this does indeed work. However, it seems to take a much longer time to actually start the process. So I'm still looking for a better way.


void check_for_crash()
{
    DEBUG_EVENT de;
    zero_out(de); // helper not shown in the post; presumably clears the struct

    // A timeout of 0 makes this a non-blocking poll for pending debug events
    while (WaitForDebugEvent(&de, 0))
    {
        DWORD cont = DBG_CONTINUE;
        if (de.dwDebugEventCode == EXCEPTION_DEBUG_EVENT)
        {
            const EXCEPTION_DEBUG_INFO &info = de.u.Exception;
            const EXCEPTION_RECORD &rec = de.u.Exception.ExceptionRecord;

            // A second-chance exception means the child did not handle it
            if (!info.dwFirstChance)
                throw pexl::abnormal_child_termination(rec.ExceptionCode);

            // Pass first-chance exceptions other than breakpoints back to the child's own handlers
            if (rec.ExceptionCode != EXCEPTION_BREAKPOINT)
                cont = DBG_EXCEPTION_NOT_HANDLED;
        }
        if (!ContinueDebugEvent(de.dwProcessId, de.dwThreadId, cont))
            throw pexl::system_error("Failed to check child process health", "ContinueDebugEvent");
    }
}


I call this frequently in the loop in which I poll the named pipes. CreateProcess() had DEBUG_ONLY_THIS_PROCESS and DEBUG_PROCESS included in its creation flags.

#5 Codeka   Members   -  Reputation: 1157

Posted 30 January 2009 - 03:19 PM

Quote:
Original post by the_edd
I call this frequently in the loop in which I poll the named pipes. CreateProcess() had DEBUG_ONLY_THIS_PROCESS and DEBUG_PROCESS included in its creation flags.


Doing that also means you won't be able to debug the child process (you can only attach one debugger at a time).

I don't think there is any way, in general, to detect when a process crashes.

Do you control the child process? You could simply have it send a special message over the named pipe when it terminates "normally", so if the named pipe is ever disconnected without that message being sent, you know it crashed.
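
This only applies when you control the child, but in that case the parent-side check could look something like the sketch below. The marker string and the function name are hypothetical, and it assumes the child's output has already been read from the pipe until the pipe closed.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical marker; any line the child would never print otherwise will do.
const std::string kCleanExitMarker = "@@CLEAN_EXIT@@";

// Reads the child's output to end-of-stream and reports whether the final
// line was the clean-exit marker. If the stream ends without it, the child
// disappeared without announcing a normal exit, which we treat as a crash.
bool child_exited_cleanly(std::istream &child_output)
{
    std::string line;
    std::string last_line;
    while (std::getline(child_output, line))
        last_line = line;
    return last_line == kCleanExitMarker;
}

// Convenience overload for output that has already been captured in memory.
bool child_exited_cleanly(const std::string &captured_output)
{
    std::istringstream in(captured_output);
    return child_exited_cleanly(in);
}
```

The obvious weakness is the one raised in the next reply: it does nothing for children you did not write.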

#6 edd   Members   -  Reputation: 2105

Posted 31 January 2009 - 12:21 AM

Quote:
Original post by Codeka
Do you control the child process? You could simply have it send a special message over the named pipe when it terminates "normally", so if the named pipe is ever disconnected without that message being sent, you know it crashed.


Unfortunately, I do not control the child :( It can be an arbitrary application.

I came across a similar question on StackOverflow.

One of the answers suggested creating a mutex that is inherited by the child, waiting on it in the parent and considering it a crash if WaitForSingleObject returns WAIT_ABANDONED.

Since I am not the author of the child, in general, I can't have the child take the lock, but I wonder if I can do the same with an Event object that is initially unsignaled. It's something I will try shortly.

#7 edd   Members   -  Reputation: 2105

Posted 31 January 2009 - 01:01 AM

Quote:
Original post by the_edd
One of the answers suggested creating a mutex that is inherited by the child, waiting on it in the parent and considering it a crash if WaitForSingleObject returns WAIT_ABANDONED.

Since I am not the author of the child, in general, I can't have the child take the lock, but I wonder if I can do the same with an Event object that is initially unsignaled. It's something I will try shortly.


No dice :(

#8 Erik Rufelt   Crossbones+   -  Reputation: 3476

Posted 31 January 2009 - 02:07 AM

Unless I'm missing something, this is fairly easy..

Create the process with CreateProcess. You will get a handle to the process in your PROCESS_INFORMATION structure. Wait on the process handle with WaitForSingleObject and a timeout of INFINITE, and it won't return until the child process has exited (whether by crashing or by exiting gracefully). Then use GetExitCodeProcess to get the exit code. If your process has crashed you will usually get some strange exit code; I'm not sure if it's always the same, as it might depend on the crash circumstances. It won't be 0, however, or whatever exit code you normally use. You can return some defined value from the main function of the child process to compare against, so you know when the process has exited in a manner that it shouldn't have.
This can be extended to WaitForMultipleObjects on several child-processes. You will need to create a thread waiting for the processes, unless you want to poll them regularly.

EDIT: I missed that you don't write the child-process.. however it shouldn't matter, most programs return 0 on exit, and the crash-exit code (as far as I have noticed, haven't done extensive testing) is something like 0x??fffff. There might very well be definitions of this for windows, what exit code is used on a crash.
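
A minimal sketch of the wait-then-query approach described in this post. `run_and_wait` is a hypothetical helper name; the Win32 calls themselves (CreateProcessA, WaitForSingleObject, GetExitCodeProcess, CloseHandle) are the standard ones. It only fetches the exit code and makes no attempt to decide whether that code means "crashed".

```cpp
#ifdef _WIN32  // Win32-only sketch; the guard keeps it inert on other platforms
#include <windows.h>

// Starts command_line, blocks until the child exits, and stores its exit code.
// Returns false if the child could not be started or its exit code could not be read.
// command_line must point to a writable buffer.
bool run_and_wait(char *command_line, DWORD &exit_code)
{
    STARTUPINFOA si = {};
    si.cb = sizeof(si);
    PROCESS_INFORMATION pi = {};

    if (!CreateProcessA(NULL, command_line, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi))
        return false;

    // The process handle becomes signaled when the child terminates,
    // whether it exited normally or crashed.
    const bool waited = WaitForSingleObject(pi.hProcess, INFINITE) == WAIT_OBJECT_0;
    const bool got_code = waited && GetExitCodeProcess(pi.hProcess, &exit_code) != FALSE;

    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return got_code;
}
#endif
```

For several children, the same process handles could be passed to WaitForMultipleObjects instead, as the post suggests.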

#9 SiCrane   Moderators   -  Reputation: 9594

Posted 31 January 2009 - 02:20 AM

Quote:
Original post by Erik Rufelt
EDIT: I missed that you don't write the child-process.. however it shouldn't matter, most programs return 0 on exit, and the crash-exit code (as far as I have noticed, haven't done extensive testing) is something like 0x??fffff. There might very well be definitions of this for windows, what exit code is used on a crash.


Unfortunately, a large number of applications have non-zero return codes in non crash situations. For example, compilers will return non-zero when there's an error with the code. And the Windows crash return codes are not constant. For example this code:

int main(int, char **) {
    throw 0;
}

Run three different times on my computer just now returned 4062000, 4192680 and 3930676.

#10 Erik Rufelt   Crossbones+   -  Reputation: 3476

Posted 31 January 2009 - 02:34 AM

Ah, I see :(
http://msdn.microsoft.com/en-us/library/ms683189(VS.85).aspx says that the exit code will be the value of the exception that caused the crash. Doing a simple test for a buffer overflow gets me an Unhandled Exception dialog, with the value 0xc0000005, which is also the value returned by GetExitCodeProcess. Could this be used somehow, such as some bit always being set in unhandled exceptions?

#11 SiCrane   Moderators   -  Reputation: 9594

Posted 31 January 2009 - 02:49 AM

Nope. Windows exception codes use the following format:
- Bits 31 and 30 are the severity code: 00 for success, 01 for informational, 10 for warning and 11 for an error.
- Bit 29 is set for user exception codes.
- Bit 28 is reserved and should not be set.
- Bits 27 to 16 are a facility code. For example, Windows Update is assigned the facility code 36, while 0 represents no particular facility.
- The bottom 16 bits are the actual error code.

This is why exceptions like access violations start with 0xC: these codes represent non-user error exceptions. Now take a look at the three exit codes I posted above. None of them have any of the top three bits set, and any overlap with the lower bits will be coincidental, as they have no individual meaning.
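
The bit layout described in this post can be turned into a small decoder. This is a minimal sketch and not anything from the Windows headers: `ExceptionCodeFields`, `decode_exception_code` and `looks_like_system_error` are hypothetical names made up for illustration.

```cpp
#include <cstdint>

// Field layout as described in the post (bit 31 is the most significant).
struct ExceptionCodeFields {
    unsigned severity;   // bits 31-30: 0 success, 1 informational, 2 warning, 3 error
    bool     user;       // bit 29: set for user exception codes
    bool     reserved;   // bit 28: should be clear
    unsigned facility;   // bits 27-16
    unsigned code;       // bits 15-0
};

ExceptionCodeFields decode_exception_code(std::uint32_t value)
{
    ExceptionCodeFields f;
    f.severity = (value >> 30) & 0x3u;
    f.user     = ((value >> 29) & 0x1u) != 0;
    f.reserved = ((value >> 28) & 0x1u) != 0;
    f.facility = (value >> 16) & 0xFFFu;
    f.code     = value & 0xFFFFu;
    return f;
}

// Heuristic only: true when the value has the shape of a non-user error exception code.
bool looks_like_system_error(std::uint32_t value)
{
    const ExceptionCodeFields f = decode_exception_code(value);
    return f.severity == 3 && !f.user;
}
```

Feeding it 0xC0000005 gives severity 3, facility 0 and code 5, while the three exit codes from the `throw 0` test all decode to severity 0, so a severity check alone would miss that crash. A well-behaved program is also free to return a value that happens to match, so treat this as a hint at best.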


#12 Erik Rufelt   Crossbones+   -  Reputation: 3476

Posted 31 January 2009 - 04:14 AM

Too bad, good to know. =)

For the OP.. does it actually matter if it crashed, or exited some other way?
Reading the post again, it seems as though you only need to know if the process has exited, since the pipe must then no doubt be unusable?
In that case GetExitCodeProcess will report a special exit code (STILL_ACTIVE) if the process is still running.
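
One caveat with polling GetExitCodeProcess: the value it reports for a running process is STILL_ACTIVE (259), and nothing stops a child from exiting with 259 itself. A zero-timeout wait on the process handle avoids that ambiguity. A minimal sketch, where `has_exited` is a hypothetical helper name:

```cpp
#ifdef _WIN32  // Win32-only sketch; the guard keeps it inert on other platforms
#include <windows.h>

// Returns true once the process behind the handle has terminated.
// A timeout of 0 makes this a non-blocking poll.
bool has_exited(HANDLE process)
{
    return WaitForSingleObject(process, 0) == WAIT_OBJECT_0;
}
#endif
```

Once this returns true, the exit code from GetExitCodeProcess can be read without worrying that it means "still running".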

#13 edd   Members   -  Reputation: 2105

Posted 31 January 2009 - 05:16 AM

Quote:
Original post by Erik Rufelt
For the OP.. does it actually matter if it crashed, or exited some other way?
Reading the post again, it seems as though you only need to know if the process has exited, since the pipe must then no doubt be unusable?
In that case GetExitCodeProcess will report a special exit code (STILL_ACTIVE) if the process is still running.


Well the full story is that I'm trying to port a library I have made for UNIX to Windows, that simplifies the creation of processes and interacting with their streams.

I am aiming to provide the same functionality in the Windows port, including the ability to throw an exception when the child crashes, but the clumsiness of some of the Windows APIs is making this port extremely difficult. I accept that I might have to make some concessions, but I'd like to explore all possibilities before giving up.

Another idea I've had would be to attempt to replace the child's entry point (WinMain?) so that the child can acquire an inherited Mutex before calling the real WinMain. This would allow me to look for WAIT_ABANDONED in the parent.

I haven't tried this kind of "static monkey patching" before however, and I'm really not very clued up on the guts of Windows executables or even ASM so I might not have much success in this regard. I don't know if anyone might be able to lend a hand with this or provide some useful articles?

#14 Drew_Benton   Crossbones+   -  Reputation: 1713

Posted 31 January 2009 - 05:49 AM

Quote:
Original post by the_edd
Another idea I've had would be to attempt to replace the child's entry point (WinMain?) so that the child can acquire an inherited Mutex before calling the real WinMain. This would allow me to look for WAIT_ABANDONED in the parent.

I haven't tried this kind of "static monkey patching" before however, and I'm really not very clued up on the guts of Windows executables or even ASM so I might not have much success in this regard. I don't know if anyone might be able to lend a hand with this or provide some useful articles?


You can use this concept to accomplish that, but it would be more work than it might be worth. There are a lot of things that you could possibly try that route, but the problem is they are all more or less "hacks" that are really specific to the application you are using and it won't always work the same on different Windows operating systems.

In terms of "crashing", there is not much standard behavior that you can go by for the child process. You couldn't use pings to the server because for that to work, you would have to run it in the main thread or a secondary thread. The main thread can be legitimately blocked, screwing up the ping timing and a secondary thread will still run in a crashed application.

You could inject a DLL and set a Vectored Exception Handler, but if the application generates exceptions that it knows how to recover from, then you will be interfering with that.

Consider all the types of crashes there are (I can't list them all but the main ones are):
- Crashes that invoke debuggers (external dialog boxes being shown)
- Crashes that are a result from exceptions not being handled (runtime error dialog box)
- Crashes that are from asserts (assert dialog boxes)
- Crashes that clean exit the application (nothing shown, but process might still be in memory)

Your best bet would be to find something that you can identify from all of them and handle each type separately. In other languages, like Ada for example, problems like these are easily solvable since the language is designed to not allow such meltdowns that you cannot identify and recover from. In C/C++ though, you simply have too much to deal with and thus finding a way that works for everything is quite infeasible. Good luck though!

#15 edd   Members   -  Reputation: 2105

Posted 31 January 2009 - 07:27 AM

Quote:
Original post by Drew_Benton
You can use this concept to accomplish that, but it would be more work than it might be worth.

Ah that's the article I had in the back of my mind! I didn't have the link, though.

Quote:
There are a lot of things that you could possibly try that route, but the problem is they are all more or less "hacks" that are really specific to the application you are using and it won't always work the same on different Windows operating systems.

Yes. I think I will leave this route for a rainy day. There is a lot I would need to understand before I could even begin experimenting.

Quote:

In terms of "crashing", there is not much standard behavior that you can go by for the child process. You couldn't use pings to the server because for that to work, you would have to run it in the main thread or a secondary thread. The main thread can be legitimately blocked, screwing up the ping timing and a secondary thread will still run in a crashed application.

I'm curious as to how Windows can pop up a message box for a last-chance exception without attaching a debugger to the application. Is this something that's baked in to the OS?

Quote:

You could inject a DLL and set a Vectored Exception Handler, but if the application generates exceptions that it knows how to recover from, then you will be interfering with that.

I guess I could also use DLL injection to load a library that acquires a Mutex in DllMain? But again, I'd have to look into the intricacies of DLL injection anyway.

Quote:

Consider all the types of crashes there are (I can't list them all but the main ones are):
- Crashes that invoke debuggers (external dialog boxes being shown)
- Crashes that are a result from exceptions not being handled (runtime error dialog box)
- Crashes that are from asserts (assert dialog boxes)
- Crashes that clean exit the application (nothing shown, but process might still be in memory)

Your best bet would be to find something that you can identify from all of them and handle each type separately.

I keep coming back to this Mutex-in-the-child idea because I think it has a chance of working for most of those cases (even though it won't stop the various crash dialogs from appearing).



