Discussion:
Shell script loop runs out of memory
Jordan
2012-05-31 17:42:30 UTC
Hi folks,

I've written a shell script running under CygWin, the purpose of which is to
monitor a file for changes. If the MD5 hash fails to match the previous hash, it
will execute a command to process the file. I used a 1-second delay between
checks of the hash. This works great for several hours, but then gives an "out
of memory" error and actually brings Windows 7 to its knees.

The script uses a loop within a loop; the outer loop is infinite by design, and
the inner loop ends when it finds a non-matching hash and processes the file. It
broke while running the inner loop, without the file having been modified at
that point in time. The file was modified numerous times previously, triggering
the code below the inner loop, but not around the time when the memory error
occurred.

I am just wondering why the loops here are consuming increasing amounts of
memory over time? I'm assigning new MD5 values into existing variables over and
over, not allocating new variables for each MD5 assignment. (Right??) Is 1
second perhaps too short a delay... does the system need time to deallocate
something between each iteration of the inner loop?

Here is the script:
------------------------
#!/bin/sh

FILE_TO_CHECK=/mypath/style.less

echo "Reading hash for $FILE_TO_CHECK with md5sum"
MD5PRINT=`md5sum $FILE_TO_CHECK | cut -d " " -f1`

MD5PRINTNEW=$MD5PRINT

while [[ 1 = 1 ]]
do
    echo "Waiting for file to change..."

    while [[ "$MD5PRINT" = "$MD5PRINTNEW" ]]
    do
        sleep 1

        MD5PRINTNEW=`md5sum $FILE_TO_CHECK | cut -d " " -f1`
    done

    echo "File was modified ... Running compiler..."

    /mypath/lessc $FILE_TO_CHECK /mypath/style.css -x

    echo "Reading hash for $FILE_TO_CHECK with md5sum"
    MD5PRINT=`md5sum $FILE_TO_CHECK | cut -d " " -f1`

    MD5PRINTNEW=$MD5PRINT
done
------------------------

Any help would be appreciated. I can provide the exact memory error if
requested, but I would need some help to know which logs (if any) in CygWin to
look at, to dig around and find the error text. (I'd rather not run it all day
to reproduce the error again. The error was definitely something related to my
CygWin shell running out of memory.)

(If you propose a solution which involves increasing the memory available to
CygWin, that seems illogical, because the script is gradually increasing its own
memory usage over time. Thus, such a solution would only delay the inevitable, I
think.)

Thanks!
Jordan
AZ 9901
2012-05-31 18:16:22 UTC
Post by Jordan
I am just wondering why the loops here are consuming increasing amounts of
memory over time?  I'm assigning new MD5 values into existing variables over and
over, not allocating new variables for each MD5 assignment. (Right??) Is 1
second perhaps too short a delay... does the system need time to deallocate
something between each iteration of the inner loop?
You are certainly under the effect of BLODA (Cygwin's "Big List Of Dodgy Apps").
Have a look at the BLODA section in the Cygwin online documentation.

Then, when (bash) scripting under Cygwin, you must take care to avoid
forking as much as possible.

You could try to improve the "sleep 1" loop by replacing it with the following:

while md5sum $FILE_TO_CHECK | cut -d " " -f1 | grep -q "^$MD5PRINT$"
do
    sleep 1
done

Note that MD5PRINTNEW is no longer needed here.
With this loop we avoid the fork done by
MD5PRINTNEW=`md5sum $FILE_TO_CHECK | cut -d " " -f1`

Ben
Thrall, Bryan
2012-05-31 18:43:47 UTC
Post by AZ 9901
Then, when (bash) scripting under Cygwin, you must take care to avoid
forking as much as possible.
while md5sum $FILE_TO_CHECK | cut -d " " -f1 | grep -q "^$MD5PRINT$"
do
sleep 1
done
Note that MD5PRINTNEW is no more useful here.
With this loop we avoid the fork done by
MD5PRINTNEW=`md5sum $FILE_TO_CHECK | cut -d " " -f1`
Doesn't that just replace the 2 MD5PRINTNEW forks (md5sum and cut) with 3 (md5sum, cut, and grep)?

Seems like the (untested) following would be better (in terms of fewer forks):

TMPFILE=$(mktemp)
md5sum $FILE_TO_CHECK > "$TMPFILE"
...
while md5sum -c "$TMPFILE"
do
    sleep 1
done
rm "$TMPFILE"
--
Bryan Thrall
Principal Software Engineer
FlightSafety International
***@flightsaf
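
A slightly fuller sketch of the checksum-file idea above, with the assumptions labeled. `--status` is the GNU coreutils md5sum option that suppresses the per-file "OK" line. The `on_change` hook, the function names and the `trap` cleanup are illustrative additions, not part of the original suggestion. Each pass still starts one md5sum and one sleep process, so this reduces forking per iteration rather than eliminating it.

```shell
#!/bin/sh
# Sketch: poll a file with `md5sum -c` instead of comparing hash strings.
# FILE_TO_CHECK is taken from the original script; everything else is
# a hypothetical name introduced for illustration.

FILE_TO_CHECK=/mypath/style.less

# Hypothetical placeholder for the real work (e.g. running lessc).
on_change() {
    echo "File was modified: $1"
}

# Block until the file no longer matches the recorded checksum.
# $1 = file to watch, $2 = checksum file written by md5sum.
wait_for_change() {
    md5sum "$1" > "$2"
    # --status: print nothing, report the result via exit status only.
    while md5sum -c --status "$2"
    do
        sleep 1
    done
}

# Outer loop: wait, react, record the new checksum, repeat.
watch_forever() {
    TMPFILE=$(mktemp) || exit 1
    trap 'rm -f "$TMPFILE"' EXIT
    while :
    do
        wait_for_change "$FILE_TO_CHECK" "$TMPFILE"
        on_change "$FILE_TO_CHECK"
    done
}

# Uncomment to start watching:
# watch_forever
```

Note that a deleted or unreadable file also makes `md5sum -c` fail, so this sketch would treat that as a change.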
Jordan
2012-05-31 19:23:28 UTC
Post by Thrall, Bryan
Post by AZ 9901
Then, when (bash) scripting under Cygwin, you must take care to avoid
forking as much as possible.
while md5sum $FILE_TO_CHECK | cut -d " " -f1 | grep -q "^$MD5PRINT$"
do
sleep 1
done
Note that MD5PRINTNEW is no more useful here.
With this loop we avoid the fork done by
MD5PRINTNEW=`md5sum $FILE_TO_CHECK | cut -d " " -f1`
Doesn't that just replace the 2 MD5PRINTNEW forks (md5sum and cut) with 3
(md5sum, cut, and grep)?
Post by Thrall, Bryan
TMPFILE=$(mktemp)
md5sum $FILE_TO_CHECK > "$TMPFILE"
...
while md5sum -c "$TMPFILE"
do
sleep 1
done
rm "$TMPFILE"
Ok... Two questions for you guys, then:

1. Does "fewer forks" mean that some forks are still occurring, thus the same
memory crash will still happen, but not right away? Just delaying the
inevitable, for longer than my original script does?

2. What is this I read about "rebasing" for BLODA-related issues ... Can
rebasing help me to completely resolve this script problem? I read the docs
about "rebase all" but don't understand whether it would be effective for my
situation. Or do I just need to close any of the offending software such as
anti-virus, then reopen CygWin and try my script again?

Thanks!
Eliot Moss
2012-05-31 19:36:57 UTC
Post by Jordan
1. Does "fewer forks" mean that some forks are still occurring, thus the same
memory crash will still happen, but not right away? Just delaying the
inevitable, for longer than my original script does?
I think so, assuming the problem is that forks are not getting fully reclaimed.
Post by Jordan
2. What is this I read about "rebasing" for BLODA-related issues ... Can
rebasing help me to completely resolve this script problem? I read the docs
about "rebase all" but don't understand whether it would be effective for my
situation. Or do I just need to close any of the offending software such as
anti-virus, then reopen CygWin and try my script again?
Rebasing may or may not be necessary, but it can certainly be important to
have BLODA inactive, which often means *removing* it, because of all the little
background things and services these packages throw in.

Eliot Moss
AZ 9901
2012-05-31 20:14:59 UTC
Post by Eliot Moss
1.  Does "fewer forks" mean that some forks are still occurring, thus the
same
memory crash will still happen, but not right away?  Just delaying the
inevitable, for longer than my original script does?
I think so, assuming the problem is that forks are not getting fully reclaimed.
I think so too; when I script intensively under Cygwin, I have to
reboot after some time because the machine runs out of memory.
I ran some tests, and forks are clearly the root cause: they are not
fully reclaimed.

Make an infinite loop with no fork, and look at the memory usage.
Then, make an infinite loop with one fork and look at the memory :-/

I really hope a solution will be found one day :-)
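
The experiment described above could be set up roughly as follows. This is only a sketch: the function names and iteration counts are made up for illustration, `sleep` is assumed to be an external program (as it is for bash and dash by default), and memory has to be watched separately, for example in Task Manager, while each loop runs with a much larger count.

```shell
#!/bin/sh
# Loop that uses only shell builtins: no child process is created.
loop_no_fork() {
    i=0
    while [ "$i" -lt "$1" ]
    do
        i=$((i + 1))
    done
    echo "$i"
}

# Same loop, but every pass runs one external program (one fork + exec).
loop_one_fork() {
    i=0
    while [ "$i" -lt "$1" ]
    do
        sleep 0
        i=$((i + 1))
    done
    echo "$i"
}

# Small counts so the sketch finishes quickly; raise them for a real test.
loop_no_fork 100000
loop_one_fork 500
```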
Jordan
2012-05-31 20:58:08 UTC
Post by AZ 9901
Make an infinite loop with no fork, and look at the memory usage.
Then, make an infinite loop with one fork and look at the memory
I really hope a solution will be found one day
Argh! And I really like CygWin, so I was hoping to learn that this is
resolvable.

(Of course I could start uninstalling BLODA programs, but it's a fairly
inconvenient solution.)

Maybe I'll just write a C program or Perl script to do the same thing.
A Perl script won't run into the same forking issue, will it?
(Assuming I use a Perl library to get the MD5 hash, rather than calling
out to execute md5sum?)

Thanks guys...
AZ 9901
2012-05-31 21:27:34 UTC
Post by AZ 9901
Make an infinite loop with no fork, and look at the memory usage.
Then, make an infinite loop with one fork and look at the memory
I really hope a solution will be found one day
Argh!  And I really like CygWin, so I was hoping to learn that this is
resolvable.
(Of course I could start uninstalling BLODA programs, but it's a fairly
inconvenient solution.)
Maybe I'll just write a C program or Perl script to do the same thing.
A Perl script won't run into the same forking issue, will it?
(Assuming I use a Perl library to get the MD5 hash, rather than calling
out to execute md5sum?)
Yes, using a library should be OK!
Using "system" to call md5sum will give you the bad fork effect.

Ben
Buchbinder, Barry (NIH/NIAID) [E]
2012-05-31 19:50:24 UTC
The following are just ideas - totally untested.

You might try changing
[[ condition ]]
to
[ condition ]
Perhaps single brackets use memory differently than double brackets.

If that doesn't work, try changing
#!/bin/sh
(which calls bash) to
#!/bin/dash
You will need to keep the double-to-single bracket change,
because dash does not have double brackets. Perhaps dash is more
efficient with memory than bash.

- Barry
Disclaimer: Statements made herein are not made on behalf of NIAID.
Adam Dinwoodie
2012-06-01 09:20:21 UTC
Post by Buchbinder, Barry (NIH/NIAID) [E]
You might try changing
[[ condition ]]
to
[ condition ]
Perhaps single brackets use memory differently than double brackets.
They do: [[ condition ]] is interpreted by the shell; [ condition ] forks to
call /usr/bin/[.exe. If forking is the problem, that'll make it worse.
Post by Buchbinder, Barry (NIH/NIAID) [E]
If that doesn't work, try changing
#!/bin/sh
(which calls bash) to
#!/bin/dash
You will have to have retained the double to single bracket change,
because dash does not have double brackets. Perhaps dash is more
efficient with memory than bash.
There's a whole bunch of other alternatives: ksh, zsh, ash, etc. If the
problem is forking, however, none of those are going to improve things.
AZ 9901
2012-06-01 09:36:57 UTC
Post by Adam Dinwoodie
Post by Buchbinder, Barry (NIH/NIAID) [E]
You might try changing
    [[ condition ]]
to
    [ condition ]
Perhaps single brackets use memory differently than double brackets.
They do: [[ condition ]] is interpreted by the shell; [ condition ] forks to
call /usr/bin/[.exe. If forking is the problem, that'll make it worse.
So some things to avoid while (bash)scripting under Cygwin to limit
BLODA effect :
- | : pipe stdout --> stdin
- $(...) : subshell fork
- `...` : same as before, subshell fork
- [ condition ] : prefer [[ condition ]] construction
- anything else ?

Ben
AZ 9901
2012-06-01 09:51:40 UTC
Post by AZ 9901
Post by Adam Dinwoodie
Post by Buchbinder, Barry (NIH/NIAID) [E]
You might try changing
    [[ condition ]]
to
    [ condition ]
Perhaps single brackets use memory differently than double brackets.
They do: [[ condition ]] is interpreted by the shell; [ condition ] forks to
call /usr/bin/[.exe. If forking is the problem, that'll make it worse.
So some things to avoid while (bash)scripting under Cygwin to limit
BLODA effect :
- | : pipe stdout --> stdin
- $(...) : subshell fork
- `...` : same as before, subshell fork
- [ condition ] : prefer [[ condition ]] construction
- ( instructions ) : prefer { instructions } construction if possible
- anything else from your point of view ?

Ben
Adam Dinwoodie
2012-06-01 10:06:14 UTC
Post by AZ 9901
So some things to avoid while (bash)scripting under Cygwin to limit
BLODA effect :
- | : pipe stdout --> stdin
- $(...) : subshell fork
- `...` : same as before, subshell fork
- [ condition ] : prefer [[ condition ]] construction
- anything else ?
By my understanding of the discussion: any sort of forking, i.e. anything that
requires the bash interpreter to start a new process. In particular,
including pipes in the above is somewhat of a red herring, since it's not the
pipe that's the problem, but the commands on either side of it.

Calling any sort of executable (script, binary, whatever) will cause a fork.
Anything that requires a subshell (in bash, that's the subshell forks, the
( ... ) command syntax, etc), will similarly require a fork.

Shell builtins (eg echo) almost certainly won't require a fork. Note that not
everything you might expect to be a builtin is, however: bash doesn't have a
"sleep" builtin, for example. You can check whether something's a builtin by
calling "type command" from the bash shell.
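
As a small illustration of the `type` check mentioned above, the following sketch reports which of a few commands are builtins in the shell that runs it. `is_builtin` is a hypothetical helper name, and it relies on the phrase "shell builtin" that bash and dash print from `type`. Results can differ between shells, so no expected output is shown.

```shell
#!/bin/sh
# Return 0 if the given command name is a builtin of the current shell.
# Relies on `type` printing "... is a shell builtin" (bash and dash do).
is_builtin() {
    case "$(type "$1" 2>/dev/null)" in
        *"shell builtin"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Commands that appear in this thread.
for cmd in echo test [ sleep md5sum cut
do
    if is_builtin "$cmd"
    then
        echo "$cmd: builtin (runs inside the shell)"
    else
        echo "$cmd: not a builtin (separate process, or not found)"
    fi
done
```

The helper itself uses a command substitution, so it is meant for a one-off check rather than for use inside a tight loop.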
Eric Blake
2012-06-01 13:00:21 UTC
Post by Adam Dinwoodie
Post by Buchbinder, Barry (NIH/NIAID) [E]
You might try changing
[[ condition ]]
to
[ condition ]
Perhaps single brackets use memory differently than double brackets.
They do: [[ condition ]] is interpreted by the shell; [ condition ] forks to
call /usr/bin/[.exe.
No, it doesn't. Bash has a built-in [ rather than forking, since
running tests is so common.
--
Eric Blake ***@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
Buchbinder, Barry (NIH/NIAID) [E]
2012-06-10 17:41:25 UTC
Just to complete this topic ...

This gets rid of all the fork-execs in the inner loop except
for sleep. Instead of comparing file contents, it uses
the test builtin to compare time stamps.

------------------------
#!/bin/dash

FILE_TO_CHECK=/mypath/style.less
COMPARE_FILE=/mypath/compare_file.tmp
echo -n > $COMPARE_FILE

while [ 1 = 1 ]
do
    echo "Waiting for file to change..."
    while [ 1 = 1 ]
    do
        if [ $FILE_TO_CHECK -nt $COMPARE_FILE ]
        then
            break
        fi
        sleep 1
    done
    echo -n > ${COMPARE_FILE}
    echo "File was modified ... Running compiler..."
    /mypath/lessc $FILE_TO_CHECK /mypath/style.css -x
done
------------------------

A question I have is whether using a DOS version of sleep would
avoid the consumption of resources. In other words, is the
problem caused by the fork-exec or by launching a new cygwin
process? If the latter, might that be avoided by use of a DOS
version of sleep? (I have one if the OP wants to try it.)

I apologize for taking so long to post this.

- Barry
Disclaimer: Statements made herein are not made on behalf of NIAID.
Andrey Repin
2012-06-11 13:58:41 UTC
Greetings, Buchbinder, Barry (NIH/NIAID) [E]!
Post by Buchbinder, Barry (NIH/NIAID) [E]
Just to complete this topic ...
This gets rid of all the fork-execs in the inner loop except
for sleep. Instead of comparing file contents, it uses
the test builtin to compare time stamps.
I /never ever/ rely on timestamps, except for a casual check of the file age.
If I want to know whether two files are equal, I always compare contents.
Always. Even on 4 GB+ files.


--
WBR,
Andrey Repin (***@freemail.ru) 11.06.2012, <17:57>

Sorry for my terrible english...
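
The two positions above (a cheap timestamp test versus comparing contents) can be combined: use the builtin `-nt` test as a fork-free trigger and compute a hash only when the timestamp has moved. A minimal sketch, assuming a `test` builtin that supports `-nt` (bash and dash do) and GNU md5sum's `--status` option; `content_changed` and its marker and hash files are hypothetical names.

```shell
#!/bin/sh
# content_changed FILE MARKER HASHFILE
# Returns 0 only if FILE is newer than MARKER *and* its MD5 differs from
# the one recorded in HASHFILE. While nothing touches FILE, only the
# builtin -nt test runs, so no process is forked.
content_changed() {
    file=$1 marker=$2 hashfile=$3
    [ "$file" -nt "$marker" ] || return 1   # timestamp unchanged: cheap exit
    touch "$marker"                          # reset the reference timestamp
    if md5sum -c --status "$hashfile" 2>/dev/null
    then
        return 1                             # touched, but same contents
    fi
    md5sum "$file" > "$hashfile"             # remember the new contents
    return 0
}
```

A polling loop would call `content_changed` once per second and run the compiler when it returns 0. On filesystems with one-second timestamp resolution, an edit made within the same second as the marker reset can go unnoticed until the next edit.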

Jordan
2012-05-31 22:53:01 UTC
Post by Jordan
This works great for several hours, but then gives an "out
of memory" error and actually brings Windows 7 to its knees.
I can provide the exact memory error if
requested
I reproduced it again. The error messages are as follows:

./myscript.sh: line 32: /usr/bin/cut: Cannot allocate memory
./myscript.sh: line 32: /usr/bin/md5sum: Cannot allocate memory
./myscript.sh: line 32: /usr/bin/grep: Cannot allocate memory
1 [main] sh 437152 fork: child -1 - CreateProcessW failed for
'C:\cygwin\bin\sh.exe', errno 12
./myscript.sh: fork: Cannot allocate memory

...and the script aborts at that point.

So, this totally confirms everyone's responses that the issue is related to
forking.

Thanks all...
w***@gmail.com
2012-05-31 23:55:31 UTC
I will not address the memory management problem, if there is one here.
I am addressing your method of determining a difference.

No offense, but it seems extremely inefficient, and the larger the file you are computing the md5sum on, the more inefficient it becomes.

Why not use the file system's "modified" flag or, depending on the types of changes you are expecting, just check for a file-size change? Using either of these means that you wouldn't even have to open the file, let alone slog through x bytes to compute an MD5 hash value. In terms of both computer resource usage and speed, this would in my opinion be a far better method than the one you currently use.

Regards
Corinna Vinschen
2012-06-01 10:31:48 UTC
Post by Jordan
Hi folks,
I've written a shell script running under CygWin, the purpose of which is to
monitor a file for changes. If the MD5 hash fails to match the previous hash, it
will execute a command to process the file. I used a 1-second delay between
checks of the hash. This works great for several hours, but then gives an "out
of memory" error and actually brings Windows 7 to its knees.
[...]
------------------------
#!/bin/sh
FILE_TO_CHECK=/mypath/style.less
echo "Reading hash for $FILE_TO_CHECK with md5sum"
MD5PRINT=`md5sum $FILE_TO_CHECK | cut -d " " -f1`
MD5PRINTNEW=$MD5PRINT
while [[ 1 = 1 ]]
do
echo "Waiting for file to change..."
while [[ "$MD5PRINT" = "$MD5PRINTNEW" ]]
do
sleep 1
MD5PRINTNEW=`md5sum $FILE_TO_CHECK | cut -d " " -f1`
done
echo "File was modified ... Running compiler..."
/mypath/lessc $FILE_TO_CHECK /mypath/style.css -x
echo "Reading hash for $FILE_TO_CHECK with md5sum"
MD5PRINT=`md5sum $FILE_TO_CHECK | cut -d " " -f1`
MD5PRINTNEW=$MD5PRINT
done
------------------------
I've been running your script with "sleep 1" and "/mypath/lessc" disabled
for about half an hour now on W7. Neither the system memory usage, nor
the process memory usage of the outermost shell, nor the handle count of
the outermost shell has changed during this time.

I'm running this under the last snapshot from http://cygwin.com/snapshots/
Either the bug is fixed there, or you're really under the influence of
some BLODA. Did you even check for BLODA? I didn't see a hint of that
in this thread.

Also, how often does the file change so that you have to run the
compiler? Is the compiler a native or a Cygwin tool? Does the memory
problem occur visibly in task manager? All the time or only when the
compiler runs?


Corinna
--
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Project Co-Leader cygwin AT cygwin DOT com
Red Hat
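
To check whether system memory visibly shrinks while the loop runs, free memory can be sampled from inside the script without starting an extra process. A rough sketch, assuming `/proc/meminfo` is readable (Cygwin emulates it) and contains a `MemFree:` line; `mem_free_kb` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the MemFree value (in kB) from /proc/meminfo using only builtins,
# so that taking the measurement does not itself fork.
mem_free_kb() {
    while read -r key value unit
    do
        if [ "$key" = "MemFree:" ]
        then
            echo "$value"
            return 0
        fi
    done < /proc/meminfo
    return 1
}

# One sample; inside the polling loop this could be logged every N passes.
mem_free_kb || echo "MemFree not available"
```

Call the function directly rather than through `$(...)` when the aim is to avoid forks, since a command substitution starts a subshell.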
Eric Blake
2012-06-01 13:03:27 UTC
Post by Jordan
Hi folks,
I've written a shell script running under CygWin, the purpose of which is to
monitor a file for changes. If the MD5 hash fails to match the previous hash, it
will execute a command to process the file. I used a 1-second delay between
checks of the hash. This works great for several hours, but then gives an "out
of memory" error and actually brings Windows 7 to its knees.
Have you ascertained whether the leak is in bash, cygwin1.dll, or in
Windows itself? If it is BLODA (and the leak is in windows itself),
then there is nothing we can do. If it is in cygwin1.dll, then the leak
would be present even if you used a different shell, and fixing it would
benefit all cygwin programs. I will also note that cygwin is currently
on bash 4.1, but upstream is at bash 4.2, so there may be a memory leak
patch for bash that we would get if I ever had time to upgrade to the
latest upstream bash.
--
Eric Blake ***@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org