wget, ssh, ssh-agent hang in socket_cleanup
Dobes Vandermeer
2009-02-22 07:27:45 UTC
Currently when using cygwin (other software works fine) to communicate over
the network, some network connections will hang. This affects rsync, ssh,
and wget (so far).

I ran an strace on wget http://www.google.com and it hangs here (ssh hangs
in the same place):

Connecting to www.google.com|74.125.19.103|:80... 1870 2067824 [main] wget 214768 fhandler_console::write: 50 = write_console (,..50)
3637 2071461 [main] wget 214768 cygwin_socket: socket (2, 1, 0)
3889 2075350 [main] wget 214768 fdsock: reset socket inheritance
2096 2077446 [main] wget 214768 build_fh_pc: fh 0x61169E10
1749 2079195 [main] wget 214768 fhandler_base::set_flags: flags 0x10002, supplied_bin 0x0
1920 2081115 [main] wget 214768 fhandler_base::set_flags: O_TEXT/O_BINARY set in flags 0x10000
1825 2082940 [main] wget 214768 fhandler_base::set_flags: filemode set to binary
1968 2084908 [main] wget 214768 fdsock: fd 3, name '', soc 0x1FC
1763 2086671 [main] wget 214768 cygwin_socket: 3 = socket (2, 1, 0)
2374 2089045 [main] wget 214768 sig_send: sendsig 0x13C, pid 214768, signal -34, its_me 1
1844 2090889 [main] wget 214768 sig_send: wakeup 0x1F4
2028 2092917 [main] wget 214768 sig_send: Waiting for pack.wakeup 0x1F4
1818 2094735 [sig] wget 214768 wait_sig: signalling pack.wakeup 0x1F4
1961 2096696 [main] wget 214768 sig_send: returning 0x0 from sending signal -34
1801 2098497 [main] wget 214768 fhandler_socket::ioctl: socket is now nonblocking
2020 2100517 [main] wget 214768 fhandler_socket::ioctl: 0 = ioctl_socket (8004667E, 27C43C)
3710 2104227 [main] wget 214768 __set_errno: void __set_winsock_errno(const char*, int):234 val 119
2775 2107002 [main] wget 214768 __set_winsock_errno: connect:788 - winsock error 10036 -> errno 119
2719 2109721 [main] wget 214768 cygwin_select: 4, 0x0, 0x27C3F0, 0x27C3D0, 0x0
2212 2111933 [main] wget 214768 dtable::select_write: fd 3
1774 2113707 [main] wget 214768 dtable::select_except: fd 3
2143 2115850 [main] wget 214768 cygwin_select: to NULL, ms FFFFFFFF
1763 2117613 [main] wget 214768 cygwin_select: sel.always_ready 0
1953 2119566 [main] wget 214768 start_thread_socket: Handle 0x1FC
1754 2121320 [main] wget 214768 start_thread_socket: Added to writefds
1931 2123251 [main] wget 214768 start_thread_socket: Added to exceptfds
2250 2125501 [main] wget 214768 start_thread_socket: opened new socket 0x208
1923 2127424 [main] wget 214768 start_thread_socket: exitsock 0x208
1810 2129234 [main] wget 214768 start_thread_socket: stuff_start 0x27C354
2635 2131869 [select_socket] wget 214768 cygthread::stub: thread 'select_socket', id 0x3421C, stack_ptr 0x1B2FCDA0
1790 2133659 [select_socket] wget 214768 thread_socket: stuff_start 0xF5582C
3188 2136847 [select_socket] wget 214768 thread_socket: Win32 select returned 1
2146 2138993 [select_socket] wget 214768 thread_socket: s 0xF527E8, testing fd 3 ()
5712 2144705 [main] wget 214768 select_stuff::wait: m 2, ms 4294967295
5 2144710 [select_socket] wget 214768 thread_socket: write_ready
3987 2148697 [main] wget 214768 select_stuff::wait: woke up. wait_ret 1. verifying
2044 2150741 [main] wget 214768 select_stuff::wait: gotone 1
2042 2152783 [main] wget 214768 select_stuff::wait: returning 0
1853 2154636 [main] wget 214768 select_stuff::cleanup: calling cleanup routines
2033 2156669 [main] wget 214768 socket_cleanup: si 0xF52818 si->thread 0x61106F30
2254 2158923 [main] wget 214768 socket_cleanup: sent a byte to exitsock 0x208, res -1
2038 2160961 [main] wget 214768 socket_cleanup: reading a byte from exitsock 0x208
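
For readers following the trace: the sequence it records (create a TCP socket, switch it to non-blocking mode, call connect, then select on the write and except sets) is the usual non-blocking connect pattern. The sketch below shows that pattern at the application level in Python. It is only an illustration: it is not Cygwin's code and it does not reproduce the hang. The name connect_nonblocking and its timeout parameter are assumptions introduced here; the trace shows select being called with no timeout at all.

```python
import errno
import select
import socket


def connect_nonblocking(host, port, timeout=10.0):
    """Non-blocking connect followed by select(), as in the trace."""
    # "socket (2, 1, 0)" in the trace: AF_INET, SOCK_STREAM, default protocol.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)
    sock.setblocking(False)  # "socket is now nonblocking"

    # connect_ex returns an error number instead of raising. On a
    # non-blocking socket, "in progress" is the normal answer, not a failure.
    # (The trace shows connect setting errno 119, which appears to be
    # EINPROGRESS in Cygwin's errno numbering.)
    rc = sock.connect_ex((host, port))
    in_progress = (errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EALREADY)
    if rc != 0 and rc not in in_progress:
        sock.close()
        raise OSError(rc, "connect failed immediately")

    # Wait until the socket is writable (connected) or exceptional (failed).
    # Assumption: a finite timeout, unlike the infinite wait in the trace.
    _, writable, exceptional = select.select([], [sock], [sock], timeout)
    if not writable and not exceptional:
        sock.close()
        raise TimeoutError("connect did not complete within %.1fs" % timeout)

    # SO_ERROR holds the final result of the asynchronous connect.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, "connect failed")
    return sock
```

In the trace, everything up to "select_stuff::wait: returning 0" corresponds to the select call returning here; the hang occurs afterwards, inside Cygwin's own cleanup of its helper thread, which application code like this never sees.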

For reasons I can't really guess at, wget works 3/4 times, ssh only 1/3
times. I can get through by pressing CTRL-C and re-running the command
(sometimes I have to do this several times), except with rsync which seems
to trap the CTRL-C and I have to kill it using the Task Manager before it
will stop and I can try again. This also affects ssh-add which seems to
hang trying to connect to the ssh-agent on localhost, so it's not related to
connecting to other machines.
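
To put a number on "works 3/4 times", one option is to run the command repeatedly under a watchdog timeout and count how often it has to be killed. The following is a minimal sketch under that assumption; count_hangs, runs, and timeout are hypothetical names chosen for illustration, and a run counts as "completed" whenever the process exits, whatever its exit status.

```python
import subprocess


def count_hangs(cmd, runs=20, timeout=30.0):
    """Run cmd repeatedly; return (completed, hung) counts."""
    completed = 0
    hung = 0
    for _ in range(runs):
        try:
            subprocess.run(
                cmd,
                stdin=subprocess.DEVNULL,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            completed += 1  # the process exited on its own
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this.
            hung += 1
    return completed, hung
```

A call such as count_hangs(["wget", "-q", "-O", "/dev/null", "http://www.google.com"]) would then give a rough hang rate to compare before and after a change. Because the timeout is enforced from a separate process, it still fires if the child is stuck somewhere it cannot interrupt itself.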

This is something that started happening recently, in the last few weeks.
Windows Update has run a few times and I thought it may have updated my
network driver; however, after I observed this problem I used Windows Update
to install a network card driver update, which didn't fix (or worsen) the
problem. Also, the fact that this affects connections to localhost, which
theoretically would bypass the network card driver, seems to discredit the
"network card driver" theory and cast suspicion on the entire networking
stack instead. This only affects cygwin, so far, though ... so maybe it's
something funny cygwin is doing.

I'm running Windows Vista Home Basic 64-bit, the network card is a Realtek
PCI-E GB NIC.

Any ideas? I'm not really sure where to go next with this, my best idea
right now is to try reinstalling windows since reinstalling cygwin didn't
fix it... any help will be appreciated!
Paul McFerrin
2009-02-22 23:24:45 UTC
I've been having the same problem for the past two weeks. Just didn't
know where to look. Thanks for sharing. Maybe we can get someone on this.

Wget would just HANG intermittently in the Connecting phase.

- Paul
Corinna Vinschen
2009-02-24 09:19:47 UTC
Post by Paul McFerrin
I've been having the same problem for the past two weeks. Just didn't
know where to look. Thanks for sharing. Maybe we can get someone on this.
Wget would just HANG intermittently in the Connecting phase.
I don't observe this problem. Does updating to Cygwin 1.7 help?


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat
Paul McFerrin
2009-02-24 21:08:06 UTC
I would be happy to try it. I'm currently running version
"1.5.25-15". How do I download version 1.7? Setup is giving me
1.5.25-14!

- Paul
Larry Hall (Cygwin)
2009-02-24 22:44:08 UTC
Post by Paul McFerrin
I would be happy to try it. I'm currently running version
"1.5.25-15". How do I download version 1.7? Setup is giving me
1.5.25-14!
As expected. If you want 1.7, go to the cygwin-announce mailing list
and read one of the (preferably the latest) 1.7 announcements to find
out how to install it.
--
Larry Hall http://www.rfk.com
RFK Partners, Inc. (508) 893-9779 - RFK Office
216 Dalton Rd. (508) 893-9889 - FAX
Holliston, MA 01746

_____________________________________________________________________

A: Yes.
Q: Are you sure?
A: Because it reverses the logical flow of conversation.
Q: Why is top posting annoying in email?
Paul McFerrin
2009-02-25 03:41:36 UTC
I tried updating cygwin1.dll to version 1.7 on my production system,
and the system failed to start. I have backed it out for now, unless
someone can give me some help in upgrading to 1.7, as I have NO
experience with it. Here are my errors:

Huh? No /etc/fstab file in \??\C:\cygwin\etc\fstab.d\Paul? Using
default root and cygdrive prefix...
cygwin warning:
MS-DOS style path detected: C:/cygwin/home/paul/.bash_login
Preferred POSIX equivalent is: /home/paul/.bash_login
CYGWIN environment variable option "nodosfilewarning" turns off this
warning.
Consult the user's guide for more details about POSIX paths:
http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
ksh: C:/cygwin/home/paul/.bash_login[5]: /usr/bin/ksh: not found
mount (1.5 output)
C:\cygwin\bin on /bin type system (binmode)
C:\cygwin\bin on /usr/bin type system (binmode)
C:\cygwin\etc on /etc type system (binmode)
C:\cygwin\lib on /usr/lib type system (binmode)
C:\cygwin\lib on /lib type system (binmode)
C:\cygwin\usr on /usr type system (binmode)
C:\cygwin\u on /u type system (binmode)
C:\cygwin on / type system (binmode)
\\linda\c on /pc type system (binmode)
C: on /c type system (binmode)
D: on /d type system (binmode)
E: on /e type system (binmode)
F: on /f type system (binmode)
G: on /g type system (binmode)
H: on /h type system (binmode)
I: on /i type system (binmode)
J: on /j type system (binmode)
K: on /k type system (binmode)
L: on /l type system (binmode)
M: on /m type system (binmode)
N: on /n type system (binmode)
O: on /o type system (binmode)
P: on /p type system (binmode)
Q: on /q type system (binmode)
a: on /a type system (binmode)

Questions:
1. Why is it looking for fstab under \??\C:\cygwin\etc\fstab.d\Paul?
2. Why do I get a warning when accessing a path by drive letter?
Looks like I have some more reading to do before trying again. How stable
is 1.7? I would like to leave it running in production mode.

Paul
P.S. I don't have the luxury of bringing down a production system to
debug a new release at this time.
Paul McFerrin
2009-02-26 05:28:13 UTC
Corinna:
I finally got cygwin 1.7 swapped in to try to reproduce my problem.
Good news: after a few hours of pinging a website, ZERO ERRORS for now.
Since it is an intermittent error, I'll let it run at 6-second intervals
until tomorrow.
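
A soak test of this kind could look roughly like the sketch below, which attempts a TCP connection at a fixed interval and records which attempts failed. The names probe_once and soak, and all parameters, are assumptions for illustration; the post does not say which tool was used for the "pinging".

```python
import socket
import time


def probe_once(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # includes timeouts and refused connections
        return False


def soak(host, port, attempts, interval=6.0, timeout=5.0, sleep=time.sleep):
    """Probe repeatedly; return the indices of the attempts that failed."""
    failures = []
    for i in range(attempts):
        if not probe_once(host, port, timeout):
            failures.append(i)
        if i + 1 < attempts:
            sleep(interval)  # injectable so the loop can be tested quickly
    return failures
```

One caveat: the timeout here is enforced inside the probing process. If the hang sits below the socket call, as the trace earlier in the thread suggests, an in-process timeout may never fire, and running each probe as a separate process with an external watchdog would be the safer design.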

Now some issues raised with 1.7 .................
$ ls -l /etc/passwd /etc/group
-rwxrwxrwx+ 1 UNKNOWN mkpasswd 74 Feb 25 23:42 /etc/group
-rwxrwxrwx+ 1 UNKNOWN mkpasswd 81 Jan 11 19:06 /etc/passwd

With BOTH files writable by everyone, any attempt to edit them with
"vim" results in a permanent block on OPEN. No EPERM error code.
Note that I did NOT upgrade anything else; I just replaced the
cygwin1.dll file. I was able to append a new-line to the group file.
When editing the group file, "vim group" became "vim -u NONE group" in
procps output.

As I was reading the 1.7 docs on mkpasswd & mkgroup, I didn't see any way to
specify a UID or GID when creating the entry. Otherwise, things are
looking up, with my two web servers running fine.
Corinna Vinschen
2009-02-26 09:03:05 UTC
Post by Paul McFerrin
I finally got cygwin 1.7 swapped-in to try to reproduce my problem.
Good news, after a few hours of pinging a website, ZERO ERRORS for now.
Since it is an intermittent error, I'll let it run at 6 seconds interval
until tomorrow.
Now some issues raised with 1.7 .................
$ ls -l /etc/passwd /etc/group
-rwxrwxrwx+ 1 UNKNOWN mkpasswd 74 Feb 25 23:42 /etc/group
-rwxrwxrwx+ 1 UNKNOWN mkpasswd 81 Jan 11 19:06 /etc/passwd
This looks like your passwd and group files are broken.
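
One quick way to check for obvious breakage is to verify that every entry has the conventional number of colon-separated fields (seven in passwd, four in group) and that the ID fields are numeric. The sketch below does only that; the function names are hypothetical, and it cannot tell whether the IDs and SIDs actually match the Windows accounts, which may well be the real problem here.

```python
def check_colon_file(text, n_fields, id_columns):
    """Return a list of (line_number, problem) for malformed entries."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) != n_fields:
            problems.append(
                (lineno, "expected %d fields, got %d" % (n_fields, len(fields)))
            )
            continue
        for col in id_columns:
            if not fields[col].isdigit():
                problems.append(
                    (lineno, "field %d is not a numeric id: %r" % (col + 1, fields[col]))
                )
    return problems


def check_passwd(text):
    # name:password:uid:gid:gecos:home:shell
    return check_colon_file(text, 7, (2, 3))


def check_group(text):
    # name:password-or-SID:gid:members
    return check_colon_file(text, 4, (2,))
```

For example, check_passwd(open("/etc/passwd").read()) returns an empty list when every entry is at least structurally well-formed.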
Post by Paul McFerrin
With BOTH files writable by everyone, any attempt to edit them with
"vim" results in a permanent block on OPEN. No EPERM error code.
Note that I did NOT upgrade anything else; I just replaced the
cygwin1.dll file. I was able to append a new-line to the group file.
When editing the group file, "vim group" became "vim -u NONE group" in
procps output.
I have never seen that before, and it works normally for me. procps shows
`vim group'. It looks like the influence of some profile.


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat