Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c# - Untraceable Exceptions in Windows.Forms Application.Run()

I have an old Windows.Forms Application that I am trying to debug.

Sometimes, after running for a few minutes, it throws an ArithmeticException or an OverflowException. The source must be somewhere in the codebase, but the stack trace always points to the line Application.Run(mainForm);

The StackTrace is useless as it only shows Windows.Forms native calls:

 bei System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG& msg)
   bei System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32 dwComponentID, Int32 reason, Int32 pvLoopData)
   bei System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
   bei System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
   bei System.Windows.Forms.Application.Run(Form mainForm)
   bei Program.Main() in C:xyProgram.cs:Zeile 102.

To find the source of the exception I have added an exception handler to System.Windows.Forms.Application.ThreadException and to System.AppDomain.CurrentDomain.UnhandledException.

I have tried enabling and disabling catching exceptions with System.Windows.Forms.Application.SetUnhandledExceptionMode();

The ThreadException event handler is never called. The UnhandledException event handler just reports the same exception I see in Visual Studio.

In Visual Studio I have enabled breaking execution when an exception is thrown. This had no effect whatsoever.

What can I do to find the offending line of code?


edit: the full exception details:

[image: exception details]


If I start the process without any debugger attached, and wait for it to crash before attaching a debugger, I get the following exception:

Unhandled exception at 0x0c9f9e1b in program.exe: 0xC0000090: Floating-point invalid operation.

Debugging then leads to this piece of disassembly

0C9F9E12  add         esi,10h 
0C9F9E15  push        0CA1FD48h 
0C9F9E1A  push        eax  
0C9F9E1B  fmul        qword ptr ds:[0CA202E0h] 
0C9F9E21  fstp        dword ptr [esp+18h] 

I cannot parse this, but I suspect this is merely the DispatchMessageW function


1 Reply


The diagnosis here is that you have legacy unmanaged code in your process. Judging from the call stack you posted, that's likely to be an old ActiveX control.

These exceptions are hardware exceptions generated by the FPU, the floating point processor. It can be put in an operating mode where it reports problems by raising exceptions, like the STATUS_FLOAT_OVERFLOW and STATUS_FLOAT_INVALID_OPERATION exceptions that you are seeing, instead of generating infinity, NaN or denormals. The FMUL instruction can easily generate such an exception.

Software that changes the FPU operating mode is pretty fundamentally incompatible with managed code, which requires that FPU exceptions are always masked. Masking these exceptions is entirely normal and is what all modern software does. Back in the previous century, however, these exceptions were considered an asset for diagnosing floating point calculations going haywire. In particular, old Borland runtime libraries unmasked these exceptions.
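To illustrate what "masked" means in practice, here is a minimal sketch. It assumes the FPU is still in the default state the CLR expects; the class and method names are made up for this example. With exceptions masked, an overflow or an invalid operation quietly produces Infinity or NaN instead of raising a hardware exception:

```csharp
using System;

// Hypothetical demo class, not part of the original program.
public static class MaskedFpuDemo
{
    // Overflow: with FPU exceptions masked, the result is +Infinity.
    public static double Overflow()
    {
        double big = double.MaxValue;
        return big * 2.0;
    }

    // Invalid operation (0/0): with FPU exceptions masked, the result is NaN.
    public static double Invalid()
    {
        double zero = 0.0;
        return zero / zero;
    }
}
```

In a healthy managed process, double.IsPositiveInfinity(MaskedFpuDemo.Overflow()) and double.IsNaN(MaskedFpuDemo.Invalid()) are both true. If unmanaged code has unmasked the exceptions, the same arithmetic can instead surface as the ArithmeticException or OverflowException described in the question.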

Well, this is all rather bad news in case you didn't get that message yet. The first place to look is why this code is throwing floating point exceptions; bad data tends to be the most common reason. Secondly, you really have to do something about the FPU control register being changed, since this can easily cause managed code to fail as well. That is particularly a problem in WPF code, which likes using NaN.

Finding such code is pretty easy with the debugger. Use the Debug + Windows + Registers debugger window. Right-click the window and tick the "Floating point" option. The value of the CTRL register is crucial; it should be 027F in a managed program. Step through the program, coarsely at first; you have found the trouble-maker when the register changes. If it is a 64-bit program then also tick "SSE"; the MXCSR register should be 00001F80.
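If the raw register values are hard to read, a small decoder can help. This is only a sketch: the class name is invented, and the bit layout is the standard one (the six exception-mask bits are bits 0-5 of the x87 control word and bits 7-12 of MXCSR, in the order invalid, denormal, zero-divide, overflow, underflow, precision). A set bit means the exception is masked, so the code lists the ones whose bit is clear:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for interpreting values copied from the Registers window.
public static class FpuWordDecoder
{
    // Exception-mask bits, lowest bit first.
    private static readonly string[] Names =
        { "Invalid", "Denormal", "ZeroDivide", "Overflow", "Underflow", "Precision" };

    // x87 control word (CTRL): mask bits start at bit 0.
    public static List<string> UnmaskedX87(int ctrl)
    {
        return Unmasked(ctrl, 0);
    }

    // SSE control/status register (MXCSR): mask bits start at bit 7.
    public static List<string> UnmaskedMxcsr(int mxcsr)
    {
        return Unmasked(mxcsr, 7);
    }

    // Collects the names of exceptions whose mask bit is clear (unmasked).
    private static List<string> Unmasked(int value, int firstMaskBit)
    {
        var result = new List<string>();
        for (int i = 0; i < Names.Length; i++)
        {
            bool masked = ((value >> (firstMaskBit + i)) & 1) == 1;
            if (!masked) result.Add(Names[i]);
        }
        return result;
    }
}
```

For the expected values 0x027F and 0x1F80 both methods return an empty list. A made-up value such as 0x0272 decodes to Invalid, ZeroDivide and Overflow being unmasked, which is the kind of change to watch for while stepping.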

You cannot directly reset the FPU control register with managed code, but you can use a trick: the CLR resets it whenever it handles an exception. So a possible fix is to intentionally throw and catch an exception right after the statement that caused the control register value to change:

        // Handling any exception makes the CLR reset the FPU control register.
        try { throw new Exception("Resetting FPU control register, please ignore"); }
        catch { }

P/Invoking the _controlfp() function in msvcrt.dll is a more direct way. Both approaches have the side effect that the library now operates in a mode it wasn't designed for; it won't expect to encounter NaN and Infinity values. Long term, you really need to consider retiring that old component or library.
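A rough sketch of the _controlfp() route follows. The wrapper class and its method names are invented for illustration. The constant is _MCW_EM from the CRT's float.h, and _controlfp reports an abstract control word, not the raw 027F value shown in the debugger. Treat it as a starting point and check it against your own process:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around the CRT's _controlfp(); Windows only.
public static class FpuControl
{
    // _MCW_EM from <float.h>: all exception-mask bits in _controlfp's abstract format.
    public const uint ExceptionMask = 0x0008001F;

    [DllImport("msvcrt.dll", CallingConvention = CallingConvention.Cdecl)]
    private static extern uint _controlfp(uint newControl, uint mask);

    // Reads the current control word; a mask of 0 changes nothing.
    public static uint Read()
    {
        return _controlfp(0, 0);
    }

    // True when every exception-mask bit is set in the given control word.
    public static bool AllExceptionsMasked(uint controlWord)
    {
        return (controlWord & ExceptionMask) == ExceptionMask;
    }

    // Sets every exception-mask bit and returns the control word reported by the CRT.
    public static uint MaskAllExceptions()
    {
        return _controlfp(ExceptionMask, ExceptionMask);
    }
}
```

A typical use would be to call FpuControl.Read() right after the suspect component has run and, if AllExceptionsMasked() returns false, call MaskAllExceptions() before any further managed floating point work happens on that thread. The FPU control word is per-thread state, so this has to run on the thread the component ran on.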

