dave164

[C#] Detecting crashed programs

21 posts in this topic

Is it possible to detect whether a program has crashed or not in C#?

If so, any tutorials or advice would be appreciated :)

Thanks,

David.


The C# program itself or a different program?


A different program. I'm trying to make a program that automatically restarts programs that have crashed or aren't running.

The part that handles programs that aren't running works like a charm, but it would be a nice feature for it to detect whether the program it is checking has crashed. If it has, it could kill it and restart it.


Um, if you can kill it, then it hasn't crashed. If it crashed, then it's dead, and the process would have been closed.

Are you trying to determine if a program has hung or become non-responsive? If so, no, there is no way to determine that. You can determine if a particular window has hung or become non-responsive by asynchronously sending it a benign window message and seeing how long it takes to respond; if it takes too long, then that particular window is probably deadlocked, caught in an infinite loop, etc. That's how the Task Manager's "Not Responding" heuristic works.

Are you trying to determine if a program has hung or become non-responsive? If so, no, there is no way to determine that.

???

Of course there are ways: you can do everything with the Win32 API.

(There are kernel APIs for that.)


Hey,

Yeah, I meant whether it has become non-responsive or hung. Basically what happens when you click on a program and it goes all white and breaks.

Thanks,

Garnett

???

Of course there are ways: you can do everything with the Win32 API.

(There are kernel APIs for that.)

Erm, no, you can't. Because it's impossible. In the "if this were possible, then Alan Turing and the laws of mathematics would be wrong" kind of impossible, not the "oh, there's just no API for it" kind of impossible.

What you can do is look at heuristics. You can measure the CPU usage, and if it remains high for a long period of time, then it might be stuck (or something else might be going on). Or you can peek at the process's activity and if it doesn't look like it's doing anything, then maybe it's stuck. Or if the process has a UI, you can send it a window message and see if it responds (which I already mentioned above; this, BTW, is the heuristic that Task Manager uses, and anyone familiar with the Task Manager's "Not Responding" status knows that it's imperfect and it can sometimes be wrong).
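As a rough illustration of the CPU-usage heuristic, here is a minimal C# sketch. The class and method names (`CpuHeuristic`, `CpuFraction`, `Sample`) are made up for this example; only `Process.TotalProcessorTime`, `Process.Refresh`, `Stopwatch` and `Environment.ProcessorCount` come from .NET. It only measures; what counts as "high for a long period" is left to the caller.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class CpuHeuristic
{
    // Fraction of the machine's total CPU capacity (0.0 to 1.0) that was
    // consumed between two TotalProcessorTime samples.
    public static double CpuFraction(TimeSpan cpuBefore, TimeSpan cpuAfter,
                                     TimeSpan wallElapsed, int logicalCores)
    {
        if (wallElapsed <= TimeSpan.Zero || logicalCores <= 0) return 0.0;
        double usedMs = (cpuAfter - cpuBefore).TotalMilliseconds;
        return usedMs / (wallElapsed.TotalMilliseconds * logicalCores);
    }

    // Samples the target process twice, sampleInterval apart, and returns
    // the CPU fraction it used in between. Blocks the calling thread.
    public static double Sample(Process target, TimeSpan sampleInterval)
    {
        target.Refresh();
        TimeSpan before = target.TotalProcessorTime;
        Stopwatch clock = Stopwatch.StartNew();
        Thread.Sleep(sampleInterval);
        target.Refresh();
        TimeSpan after = target.TotalProcessorTime;
        return CpuFraction(before, after, clock.Elapsed, Environment.ProcessorCount);
    }
}
```

Something like `CpuHeuristic.Sample(Process.GetProcessById(pid), TimeSpan.FromSeconds(1))` would take one reading. Keep in mind that a single thread spinning in an infinite loop on a four-core machine shows up as roughly 0.25, not 1.0, and that a deadlocked process uses almost no CPU at all, so neither a high nor a low reading proves anything on its own.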

Yeah, I meant whether it has become non-responsive or hung. Basically what happens when you click on a program and it goes all white and breaks.

When you click on a program (or more precisely, a window belonging to a program) and it "goes all white and breaks", that means that the window's main thread is not responding to the system's paint messages in a timely manner (or not at all). See my earlier post about how to heuristically detect such hangs.

Edited by kliu0x52


One issue, however: if you can't kill a hung or unresponsive program, then surely there isn't any point in detecting whether it has crashed? For my purpose anyway.

One issue, however: if you can't kill a hung or unresponsive program, then surely there isn't any point in detecting whether it has crashed? For my purpose anyway.

Oh, killing a process is easy... just use OpenProcess to get a process handle and then feed that to TerminateProcess. Afterwards, close the process handle with CloseHandle so that you don't leak the handle.
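From C# you may not need to call those functions directly. The sketch below uses the managed `System.Diagnostics.Process` class instead, on the assumption that on Windows `Process.Kill` ends up forcing termination in the same way; `Killer` and `KillByPid` are names invented for this example. Disposing the `Process` object is what releases the underlying handle.

```csharp
using System;
using System.Diagnostics;

static class Killer
{
    // Returns true if the process was found, asked to terminate, and was
    // gone within waitMs milliseconds. Returns false if no process with
    // that ID was running, or if it had not exited when the wait ran out.
    public static bool KillByPid(int pid, int waitMs)
    {
        Process target;
        try
        {
            target = Process.GetProcessById(pid);
        }
        catch (ArgumentException)
        {
            return false; // no running process has this ID
        }

        using (target) // Dispose releases the process handle
        {
            try
            {
                target.Kill();
            }
            catch (InvalidOperationException)
            {
                return false; // it exited between the lookup and the kill
            }
            return target.WaitForExit(waitMs);
        }
    }
}
```

`Kill` can also throw `Win32Exception` (for example when access is denied); this sketch deliberately lets that propagate so the caller finds out.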


What you need is a process that launches the program but doesn't do anything else: just a plain, simple and very small Windows service. After launching the program, that process can determine whether it has hung by looking at heuristics. A program won't look for problems within itself when it is hung!!! It needs to have a handler which can manage that for it.
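A minimal sketch of such a launcher in C#, written as a plain loop rather than a full Windows service. `Watchdog`, `Run` and all parameter names are hypothetical; the hang check relies on `Process.Responding`, which only looks at the process's main window and reports true for processes that have none.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class Watchdog
{
    // Launches exePath and relaunches it whenever it exits, or whenever it
    // stays unresponsive for maxHungPolls consecutive polls.
    // When 'stop' is cancelled the loop ends and the child is left running.
    public static void Run(string exePath, int pollMs, int maxHungPolls, CancellationToken stop)
    {
        while (!stop.IsCancellationRequested)
        {
            using (Process child = Process.Start(exePath))
            {
                if (child == null) return; // nothing was started

                int hungPolls = 0;
                while (!stop.IsCancellationRequested)
                {
                    // True as soon as the child exits, cleanly or by crashing.
                    if (child.WaitForExit(pollMs)) break;

                    bool responding;
                    try
                    {
                        child.Refresh();
                        responding = child.Responding;
                    }
                    catch (InvalidOperationException)
                    {
                        break; // it exited between the two checks
                    }

                    hungPolls = responding ? 0 : hungPolls + 1;
                    if (hungPolls >= maxHungPolls)
                    {
                        try { child.Kill(); }
                        catch (InvalidOperationException) { /* already gone */ }
                        child.WaitForExit();
                        break;
                    }
                }
            }
        }
    }
}
```

A real watchdog would probably want a delay or back-off before relaunching, since a program that crashes immediately on startup would otherwise be restarted in a tight loop.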


Without information returned from the application in a pre-defined fashion it's harder than you'd think.

It's very difficult to tell whether a program is busy and poorly programmed or has crashed (just as it is to tell whether a program will halt or not: the Halting Problem).

Why is it that you want to do this? Windows does about as good a job as is possible with the Win32 APIs (which I'd imagine you'll have to be using).

Chris


Oh, I had assumed that the OP meant process A detecting whether process X (where X != A) has hung. If you want to determine from process A whether or not process A is hung (or more precisely, a particular thread in process A has hung), then you need to have the hang-checker operating in its own thread with special care taken with the thread's use of shared resources and interthread communication to make sure that you preclude the possibility of it deadlocking.
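For the in-process case, one way to keep the checker from deadlocking is to have it share nothing with the monitored thread except a single timestamp that is read and written atomically, with no locks at all. The `Heartbeat` class below is a hypothetical sketch of that idea, not something from the Windows API.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

sealed class Heartbeat
{
    // Timestamp of the most recent beat, in Stopwatch ticks.
    private long _lastBeat = Stopwatch.GetTimestamp();

    // Called by the monitored thread every time it makes progress.
    public void Beat()
    {
        Interlocked.Exchange(ref _lastBeat, Stopwatch.GetTimestamp());
    }

    // Called by the checker thread. It takes no locks, so it cannot be
    // blocked by whatever the monitored thread is stuck on.
    public bool IsStale(TimeSpan limit)
    {
        long elapsedTicks = Stopwatch.GetTimestamp() - Interlocked.Read(ref _lastBeat);
        return elapsedTicks > limit.TotalSeconds * Stopwatch.Frequency;
    }
}
```

The monitored thread would call `Beat()` once per loop iteration, and a separate checker thread would poll `IsStale(...)` with a timeout that comfortably exceeds the longest legitimate iteration. As with the window-message approach, a stale heartbeat only means "no progress reported", which may be a hang or just a very long operation.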


I have the whole thing of killing and restarting processes that aren't running working fine :)

But what I meant is that someone above said you can't kill a crashed / hung process. So surely trying to kill a hung process is impossible? So you can tell that it has hung, but can't do anything about it?

My program at the minute does successfully restart the process when it isn't running, disappears, etc. I was just looking to advance it to cover when it gets hung too :)

Before I pursue it further, though, I wanted to just check that it was possible to kill a hung process :)

Oh, I had assumed that the OP meant process A detecting whether process X (where X != A) has hung. If you want to determine from process A whether or not process A is hung, then you need to have the hang-checker operating in its own thread with special care taken with the thread's use of shared resources and interthread communication to make sure that you preclude the possibility of it deadlocking.

Yes, I do mean A detecting whether process X (where X != A) has hung :).

A program won't look for problems within itself when it is hung!!! It needs to have a handler which can manage that for it.

Sorry if we hadn't gotten there yet; I was assuming this was the premise... if not, give up :p

I have the whole thing of killing and restarting processes that aren't running working fine :)

But what I meant is that someone above said you can't kill a crashed / hung process. So surely trying to kill a hung process is impossible? So you can tell that it has hung, but can't do anything about it?

My program at the minute does successfully restart the process when it isn't running, disappears, etc. I was just looking to advance it to cover when it gets hung too :)

Before I pursue it further, though, I wanted to just check that it was possible to kill a hung process :)

Yes, the process can be terminated: the system stops giving it processor time, frees its memory, etc. (the Win32 APIs do this automatically).

But what I meant is that someone above said you can't kill a crashed / hung process. So surely trying to kill a hung process is impossible? So you can tell that it has hung, but can't do anything about it?

I never said you can't kill it (I did say that you can't kill a crashed process because, well, a crashed process is already dead--killed by the kernel). But a hung process can be detected through imperfect heuristics and it can be killed. (I guess it depends on how you define "crashed" and "hung"... I define a crash as any sort of failure that results in the process dying, and a hang as some sort of failure that results in the process being unresponsive, but not dead either.)
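In C# terms, that distinction maps fairly naturally onto two properties of `System.Diagnostics.Process`: `HasExited` for "dead" and `Responding` for "alive but not answering". A small sketch, with `ProcessState` and `ProcessProbe` being names invented for this example:

```csharp
using System;
using System.Diagnostics;

enum ProcessState { Responsive, Unresponsive, Exited }

static class ProcessProbe
{
    // Pure decision table, kept separate so it can be checked without a
    // real process.
    public static ProcessState Classify(bool hasExited, bool responding)
    {
        if (hasExited) return ProcessState.Exited;
        return responding ? ProcessState.Responsive : ProcessState.Unresponsive;
    }

    // Applies the table to a live Process object. Responding only reflects
    // the main window and is true for processes without one.
    public static ProcessState Classify(Process p)
    {
        try
        {
            p.Refresh();
            if (p.HasExited) return Classify(true, false);
            return Classify(false, p.Responding);
        }
        catch (InvalidOperationException)
        {
            // Raised when the process is gone by the time we look at it;
            // this sketch treats that as exited.
            return ProcessState.Exited;
        }
    }
}
```

Note that `Exited` covers both a crash and a normal shutdown; telling those apart (for instance by looking at `ExitCode`) is a separate question that this sketch does not attempt.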


OK, whilst I'm taking a look at sending these magical messages cross-program :p could anyone point me in the direction of any useful tutorials / guides they've seen for it?

I never said you can't kill it (I did say that you can't kill a crashed process because, well, a crashed process is already dead--killed by the kernel). But a hung process can be detected through imperfect heuristics and it can be killed. (I guess it depends on how you define "crashed" and "hung"... I define a crash as any sort of failure that results in the process dying, and a hang as some sort of failure that results in the process being unresponsive, but not dead either.)

OK, so how do you class a process that is still on the process list in Task Manager, but all whited out, with the Windows box over it saying it's non-responsive and the progress bar just scrolling? Hung?

OK, so how do you class a process that is still on the process list in Task Manager, but all whited out, with the Windows box over it saying it's non-responsive and the progress bar just scrolling? Hung?

Unresponsive.


First, all this is going to be done via the native Windows API.

Second, you have to understand a little about how windows (lowercase "w") work. A window is owned by exactly one thread. The thread that owns the window is responsible for processing any "messages" that the window gets ("pumping messages", in the Windows lingo). These messages can be sent to the window's thread by the system (e.g., WM_KEYDOWN/WM_KEYUP when a key is pressed) or by other apps (e.g., apps that emulate key presses) or from within (e.g., as a way for two windows to send signals to each other).

So the theory here is that if you send a window a message and that window does not respond in a timely manner, then there is something up with the thread pumping that window's messages, and it may be that the thread has hung.

With that, the limitations of this heuristic should be obvious:

1) You don't know for sure why the thread is not pumping your message; it has probably hung, but it might just be really, really busy. You can't be 100% sure which it is. The longer your timeout period, the more you can be sure that the thread has hung instead of just being really, really busy, but a long timeout period also means that your watchdog process will be slow to act.

2) You can only determine if the thread that owns a particular window has hung (or is really busy).

2a) If the process that you are watching has no UI (no windows), then you can't use this method to detect a hang. Now you know why Task Manager's list of responsive/nonresponsive "tasks" never includes processes that have no windows.

2b) If the process that you are watching has multiple windows, then you might run into a situation where one window has hung but the others have not. Now you know why Task Manager's list of responsive/nonresponsive "tasks" has one entry per window, even if those windows all belong to the same process.

2c) One common programming practice is to do all the actual "work" in a separate "worker thread". That way, the UI can remain responsive even while the program's off churning away on a bunch of data. But that also means that it's possible to run into a situation where a program's UI thread is still responsive, but its worker thread has hung, and you won't be able to detect that.

Anyway, you can just send a simple WM_NULL message using either SendMessageCallback or SendMessageTimeout. SMC is asynchronous and is a bit more work to use, but your watchdog program can go off and do other stuff (such as polling another window) while it's waiting to see if the target window responds. SMT is a bit easier to use, but you can't do other stuff while waiting for a response, which means that you can only poll the windows one at a time: polling 3 hung windows with a 20s timeout for each will take a full minute to complete (unless of course you create a new thread for each SMT, but then it'll just be easier to use SMC). Under no circumstances should you use the regular SendMessage, because a SendMessage to a hung window will just hang your own thread (this mistake is actually a common cause of hangs).
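Since the original question was about C#, here is a minimal P/Invoke sketch of the SendMessageTimeout variant. `WindowPing` and `Responds` are names made up for this example; the constants and the function signature follow the Win32 declaration of SendMessageTimeout, and the code is Windows-only.

```csharp
using System;
using System.Runtime.InteropServices;

static class WindowPing
{
    private const uint WM_NULL = 0x0000;          // benign message with no effect
    private const uint SMTO_ABORTIFHUNG = 0x0002; // give up early if the target is flagged as hung

    [DllImport("user32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr SendMessageTimeout(
        IntPtr hWnd, uint msg, UIntPtr wParam, IntPtr lParam,
        uint flags, uint timeoutMs, out UIntPtr result);

    // Returns true if the window's thread processed WM_NULL within
    // timeoutMs. Returns false on timeout, and also if the call failed
    // for another reason (for example an invalid window handle).
    public static bool Responds(IntPtr hWnd, uint timeoutMs)
    {
        UIntPtr ignored;
        IntPtr ok = SendMessageTimeout(hWnd, WM_NULL, UIntPtr.Zero, IntPtr.Zero,
                                       SMTO_ABORTIFHUNG, timeoutMs, out ignored);
        return ok != IntPtr.Zero;
    }
}
```

A typical call might be `WindowPing.Responds(process.MainWindowHandle, 5000)`, which blocks the calling thread for up to five seconds. All the limitations listed above still apply: a false result means "did not answer in time", not "definitely hung".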

OK, so how do you class a process that is still on the process list in Task Manager, but all whited out, with the Windows box over it saying it's non-responsive and the progress bar just scrolling? Hung?

See the second half of post #7 in this thread.

Edited by kliu0x52

First, all this is going to be done via the native Windows API.

Second, you have to understand a little about how windows (lowercase "w") work. A window is owned by exactly one thread. The thread that owns the window is responsible for processing any "messages" that the window gets ("pumping messages", in the Windows lingo). These messages can be sent to the window's thread by the system (e.g., WM_KEYDOWN/WM_KEYUP when a key is pressed) or by other apps (e.g., apps that emulate key presses) or from within (e.g., as a way for two windows to send signals to each other).

So the theory here is that if you send a window a message and that window does not respond in a timely manner, then there is something up with the thread pumping that window's messages, and it may be that the thread has hung.

Makes sense!

With that, the limitations of this heuristic should be obvious:

1) You don't know for sure why the thread is not pumping your message; it has probably hung, but it might just be really, really busy. You can't be 100% sure which it is. The longer your timeout period, the more you can be sure that the thread has hung instead of just being really, really busy, but a long timeout period also means that your watchdog process will be slow to act.

Fair enough. I'll probably fine-tune it to a decent and reasonable time.

2) You can only determine if the thread that owns a particular window has hung (or is really busy).

2a) If the process that you are watching has no UI (no windows), then you can't use this method to detect a hang. Now you know why Task Manager's list of responsive/nonresponsive "tasks" never includes processes that have no windows.

2b) If the process that you are watching has multiple windows, then you might run into a situation where one window has hung but the others have not. Now you know why Task Manager's list of responsive/nonresponsive "tasks" has one entry per window, even if those windows all belong to the same process.

2c) One common programming practice is to do all the actual "work" in a separate "worker thread". That way, the UI can remain responsive even while the program's off churning away on a bunch of data. But that also means that it's possible to run into a situation where a program's UI thread is still responsive, but its worker thread has hung, and you won't be able to detect that.

Yup, I do 2c myself in all my programming :)

Anyway, you can just send a simple WM_NULL message using either SendMessageCallback or SendMessageTimeout. SMC is asynchronous and is a bit more work to use, but your watchdog program can go off and do other stuff (such as polling another window) while it's waiting to see if the target window responds. SMT is a bit easier to use, but you can't do other stuff while waiting for a response, which means that you can only poll the windows one at a time: polling 3 hung windows with a 20s timeout for each will take a full minute to complete (unless of course you create a new thread for each SMT, but then it'll just be easier to use SMC). Under no circumstances should you use the regular SendMessage, because a SendMessage to a hung window will just hang your own thread (this mistake is actually a common cause of hangs).

See the second half of post #7 in this thread.

Thanks, that is really useful. When I'm rested in the morning and have a clear head I will chase it up.

Thank you very much for that very useful post! It is really appreciated. SMC sounds like the one to use because you could have multiple instances of said program.

Many thanks

David.

but then it'll just be easier to use SMC

Minor correction: I'm running the code through my head, and I'm now thinking that the two are probably equal in hassle. (With SMC, the API takes care of making things asynchronous, but you have to manually deal with timeouts (starting and stopping timers, etc.). With SMT, the API takes care of the timing, but if you want to poll multiple windows at once, you have to provide the async operation yourself by creating a thread for each.)
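To make the SMT-plus-threads option concrete, here is a small C# sketch that runs one blocking check per target on its own thread and collects the ones that failed. It is deliberately generic: `ParallelPoll`, `FindUnresponsive` and the `ping` delegate are hypothetical names, and `ping` stands in for whatever blocking check you use (for example a SendMessageTimeout wrapper that returns true when the window answers in time).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class ParallelPoll
{
    // Runs ping(target) for every target concurrently, each on a dedicated
    // thread, and returns the targets whose ping returned false.
    // The total wait is roughly one timeout, not one timeout per target.
    public static List<T> FindUnresponsive<T>(IReadOnlyList<T> targets, Func<T, bool> ping)
    {
        Task<bool>[] pings = targets
            .Select(t => Task.Factory.StartNew(() => ping(t), TaskCreationOptions.LongRunning))
            .ToArray();

        Task.WaitAll(pings); // blocks until every ping has answered or timed out

        var unresponsive = new List<T>();
        for (int i = 0; i < targets.Count; i++)
        {
            if (!pings[i].Result) unresponsive.Add(targets[i]);
        }
        return unresponsive;
    }
}
```

With three hung windows and a 20s timeout each, this should come back after about 20 seconds instead of a full minute, at the cost of one thread per window while the poll is in flight.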
