Discussion:
WFSO timed out after longjmp, data pass 3 failed
Dirk Reiners
2005-08-11 22:53:01 UTC
Hi everybody,

I'm getting the above error messages, and I'm not sure what they mean or
what I can do about them.

Context: I'm running doxygen to generate the documentation for my
project (OpenSG, www.opensg.org) in parallel to compiling it every
night. This takes a pretty long time (>5 hours). At the end of the
process, doxygen tries to run the html help compiler to create a .chm
file, and that fails with the given error message:

Generating graph info page...
Running html help compiler...
C:\cygwin\bin\doxygen.exe (2224): *** WFSO timed out after longjmp
3506 [main] doxygen 2472 fork_copy: user/cygwin data pass 3 failed,
0x22EAD0..0x230000, done 0, windows pid 2224, Win32 error 5
Error: failed to run html help compiler on index.hhp
make[2]: Leaving directory `/home/Administrator/OpenSG_db/OpenSG/Doc'

What does WFSO mean, and is there any way to change the timeout? It
seems to be not fully reproducible, sometimes it works, which might
depend on other things going on on the system that slow down the process
more or less. Therefore if I can extend the timeout somewhat I hope I
can get rid of this.

Thanks

Dirk
Christopher Faylor
2005-08-11 23:55:43 UTC
Post by Dirk Reiners
I'm getting the above error messages, and I'm not sure what they mean or
what I can do about them.
Context: I'm running doxygen to generate the documentation for my
project (OpenSG, www.opensg.org) in parallel to compiling it every
night. This takes a pretty long time (>5 hours). At the end of the
process, doxygen tries to run the html help compiler to create a .chm
Generating graph info page...
Running html help compiler...
C:\cygwin\bin\doxygen.exe (2224): *** WFSO timed out after longjmp
3506 [main] doxygen 2472 fork_copy: user/cygwin data pass 3 failed,
0x22EAD0..0x230000, done 0, windows pid 2224, Win32 error 5
Error: failed to run html help compiler on index.hhp
make[2]: Leaving directory `/home/Administrator/OpenSG_db/OpenSG/Doc'
Try installing the rebase package and then running "ash -c
/bin/rebaseall" from a Windows command prompt after all Cygwin processes
have been terminated.

Otherwise: http://cygwin.com/problems.html

cgf
Brian Dessent
2005-08-12 00:13:58 UTC
Post by Dirk Reiners
What does WFSO mean, and is there any way to change the timeout? It
seems to be not fully reproducible, sometimes it works, which might
depend on other things going on on the system that slow down the process
more or less. Therefore if I can extend the timeout somewhat I hope I
can get rid of this.
Disclaimer: There's a good chance that I'm way off base with the
following.

WFSO stands for WaitForSingleObject and the error means that something
went wrong during the dance that parent and child processes perform in
the fork emulation code.

To emulate fork(), the parent starts the child, waits for it to
initialize, and then the child blocks (using WFSO) on a semaphore while
the parent copies over all its data onto the child. The parent is then
supposed to signal the child to wake up again after it's done copying
all these sections (heap, stack, etc.)

The child needs a timeout for this wait, in case the parent flakes out
for some reason. But if it does wake up on account of this timeout,
there's a serious problem and all it can really do is exit, which is
what you're seeing. The parent was apparently still copying (in pass 3)
when this happened, and so the copying failed with Win32 error 5 (Access
denied) because the child had already terminated.
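
To make the handshake described above concrete, here is a minimal sketch of a waiter with a timeout. It is not Cygwin's fork code: a Python thread stands in for the child process, threading.Event stands in for the object the child blocks on with WaitForSingleObject, and the function name simulate_fork_handshake and its parameters are made up for illustration.

```python
import threading
import time


def simulate_fork_handshake(copy_seconds, timeout_seconds):
    """Model the parent/child wait; return "resumed" or "timed out".

    Hypothetical illustration only: a thread plays the child and
    time.sleep() plays the parent's copy of heap, stack, etc.
    """
    copy_done = threading.Event()  # stands in for the object the child waits on
    outcome = {}

    def child():
        # Rough analogue of WaitForSingleObject(handle, timeout):
        # wait() returns True if signalled, False if the timeout expired.
        signalled = copy_done.wait(timeout=timeout_seconds)
        outcome["child"] = "resumed" if signalled else "timed out"

    worker = threading.Thread(target=child)
    worker.start()

    time.sleep(copy_seconds)  # the parent is busy "copying" its data
    copy_done.set()           # the parent signals that copying is finished
    worker.join()
    return outcome["child"]


if __name__ == "__main__":
    print(simulate_fork_handshake(0.01, 1.0))  # copy finishes well in time
    print(simulate_fork_handshake(0.3, 0.05))  # copy outlasts the timeout
```

In the second call the waiter gives up before the signal arrives, which is the situation the error message reports: the child has already gone away while the parent is still copying.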

The timeout is 5 minutes, and it's hard to imagine that copying over
those sections would take anywhere near that long, but I suppose if the
system is heavily taxed then it's conceivable that the parent might be
starved for CPU by a higher priority process and not complete in time.
But that's really a stretch.

If this is the case then it should be intermittent, and not occur unless
the system is very busy. You should try the latest cygwin DLL snapshot,
and try rebasing (though I dunno if that would affect this or not.)

Brian
Dirk Reiners
2005-08-12 18:41:35 UTC
Hi Brian,

thanks for the explanation, it makes a lot more sense now.

Yes, the system is very busy. This is running on VMWare, and while the
doxygen is running there is a compiler or two running in the virtual
machine at the same time. In addition to that the doxygen process gets
pretty big (> 250 MB), so copying everything over might take a little
while. 5 minutes seems seriously excessive, but I guess I could hit that
limit if a lot of the main process's memory has gone into swap, and if
the Linux build (which is running on the main system) takes longer than
it should and runs into the VMWare build's time slot. So yes it is a
stretch, but in my situation I could actually see it happen.
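
As a rough back-of-the-envelope check, the numbers above give the average copy rate below which the timeout would be hit. This assumes, purely for illustration, that the whole 250 MB has to be copied inside the 5-minute window; the function name min_copy_rate_mb_per_s is made up for this sketch.

```python
def min_copy_rate_mb_per_s(size_mb, timeout_s):
    """Average rate needed to copy size_mb within timeout_s seconds."""
    return size_mb / timeout_s


# Assumed figures from this thread: ~250 MB process, 5-minute timeout.
rate = min_copy_rate_mb_per_s(250, 5 * 60)
print(f"{rate:.2f} MB/s")  # prints "0.83 MB/s"
```

So the copy would have to average under about 0.83 MB/s to time out, which is slow but not unthinkable if most of the pages have to come back from swap inside a busy virtual machine.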

I'll look into lightening the load on the system and give the doxygen a
little more processor time.

Thanks a bunch

Dirk
