ansaurus

Question

How to deal with NFS latency in shell scripts

Answer 1

A:

I'd say the way to check for a string in a text file is grep.

What's your exact problem with it?

Also you might adjust your NFS mount parameters, to get rid of the root problem. A sync might also help. See NFS docs.

TheBonsai 2009-07-03 06:53:20

I have no possibility to change mount parameters. My script should be robust enough to deal with the latency. The problem is not with grep itself, but rather that the grep test doesn't get evaluated as I would expect.

andreas buykx 2009-07-03 07:10:51

Okay, if you can't change your environment, then that's a problem. Anyway, have you tried a 'sync' after changing the file? According to the NFS docs, that should do it (as long as it calls sync(2)).

TheBonsai 2009-07-03 09:30:15

Answer 2

A:

If you're wanting to use waitLoop in an "if", you might want to change the "exit" to a "return", so the rest of the script can handle the error situation (there's not even a message to the user about what failed before the script dies otherwise).

The other issue is that using "$test" to hold a command means the shell only splits the expanded string into words when executing it; it does not re-parse the quotes inside it. So if you say test="grep \"foo\" \"bar baz\"", then rather than looking for the three-letter string foo in the file with the seven-character name bar baz, it'll look for the five-character string "foo" (quotes included) in two files named "bar and baz".

So you can either decide you don't need the shell magic, and set test='grep -sq ^sometext$ somefilename', or you can get the shell to handle the quoting explicitly with something like:

if /bin/sh -c "$test"
then
   ...
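
To make the difference concrete, here is a small sketch (the temporary directory and the file name "bar baz" are only for illustration) that runs the same command string both ways:

```shell
#!/bin/sh
# Illustration only: a scratch file whose name contains a space.
dir=$(mktemp -d)
printf 'foo\n' > "$dir/bar baz"

test="grep -sq \"foo\" \"$dir/bar baz\""

# Plain expansion: the quotes stay literal and the path is split at the space,
# so grep is handed a pattern and file names that do not match anything.
if $test; then plain=matched; else plain=failed; fi

# A fresh shell re-parses the string, so the quotes do their job.
if /bin/sh -c "$test"; then reparsed=matched; else reparsed=failed; fi

echo "plain=$plain reparsed=$reparsed"
rm -rf "$dir"
```

The first form fails and the second one matches, which is the behaviour described above.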

Anthony Towns 2009-07-03 07:29:50

Answer 3

+1 A:

You can set your test variable this way:

test='grep -sq "^sometext$" "$somefilename"'

The reason your grep isn't working is that quotes are really hard to pass in arguments. You'll need to use eval:

if ! eval "$test"
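
For what it's worth, a rough sketch of how eval could sit inside a retry loop. The function name wait_for_eval, the variable f and the one-second delay are made up for this example, not taken from the question:

```shell
#!/bin/bash
# wait_for_eval TRIES COMMAND_STRING   (hypothetical helper name)
# Re-evaluates the command string until it succeeds or the tries run out.
wait_for_eval() {
  local tries=$1 cmd=$2
  while (( tries > 0 )); do
    if eval "$cmd"; then
      return 0
    fi
    tries=$(( tries - 1 ))
    (( tries > 0 )) && sleep 1
  done
  return 1
}

# Demo: another process appends the expected line about one second later.
f=$(mktemp)
( sleep 1; echo sometext >> "$f" ) &

# Single quotes keep the inner quotes and $f intact until eval parses them.
test='grep -sq "^sometext$" "$f"'
if wait_for_eval 5 "$test"; then result=found; else result=timeout; fi
wait
echo "$result"
rm -f "$f"
```

Only use eval on strings you build yourself; it will run whatever the string contains.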

Dennis Williamson 2009-07-03 10:31:39

Answer 4

A:

Try using the file modification time to detect when it is written without opening it. Something like

# %Y is the modification time in seconds since the epoch (GNU stat).
old_mtime=$(stat --format="%Y" file)
# Write to file.
new_mtime=$old_mtime
while [[ "$old_mtime" -eq "$new_mtime" ]]; do
  sleep 2
  new_mtime=$(stat --format="%Y" file)
done

This won't work, however, if multiple processes try to access the file at the same time.
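
A loop like this never ends if the write never shows up, so here is a sketch of the same idea with a timeout. It assumes GNU stat (the -c option), and the helper name wait_for_change is invented for the example:

```shell
#!/bin/bash
# wait_for_change FILE OLD_MTIME TIMEOUT_SECONDS   (hypothetical helper)
# Returns 0 once the mtime differs from OLD_MTIME, 1 on timeout.
wait_for_change() {
  local file=$1 old=$2 timeout=$3 waited=0 now
  while (( waited < timeout )); do
    # Assumes GNU stat; %Y is the mtime in seconds since the epoch.
    now=$(stat -c %Y "$file" 2>/dev/null) || now=$old
    [[ $now != "$old" ]] && return 0
    sleep 1
    waited=$(( waited + 1 ))
  done
  return 1
}

# Demo: push the mtime into the past, then let a background writer touch it.
f=$(mktemp)
touch -t 200001010000 "$f"
old=$(stat -c %Y "$f")
( sleep 1; echo data >> "$f" ) &
if wait_for_change "$f" "$old" 5; then result=changed; else result=timeout; fi
wait
echo "$result"
rm -f "$f"
```

Note that %Y only has one-second resolution, so two writes within the same second look identical.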

Steve K 2009-07-14 23:47:45

Answer 5

A:

I just had the exact same problem. I used a similar approach to the timeout wait that you include in your OP; however, I also included a file-size check. I reset my timeout timer if the file had increased in size since last it was checked. The files I'm writing can be a few gig, so they take a while to write across NFS.

This may be overkill for your particular case, but I also had my writing process calculate a hash of the file after it was done writing. I used md5, but something like crc32 would work, too. This hash was broadcast from the writer to the (multiple) readers, and the reader waits until a) the file size stops increasing and b) the (freshly computed) hash of the file matches the hash sent by the writer.
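
A minimal sketch of that two-stage check, not the actual code I run. It assumes md5sum from GNU coreutils, and the helper name wait_for_complete and the one-second poll are placeholders:

```shell
#!/bin/bash
# wait_for_complete FILE EXPECTED_MD5 IDLE_LIMIT   (hypothetical helper)
# The cheap size check gates the expensive hash; the idle counter is reset
# whenever the size changes.
wait_for_complete() {
  local file=$1 expected=$2 idle_limit=$3
  local idle=0 last_size="" size
  while (( idle < idle_limit )); do
    size=$(wc -c 2>/dev/null < "$file") || size=missing
    size=${size// /}
    if [[ $size != "$last_size" ]]; then
      last_size=$size
      idle=0                      # still changing: reset the timer
    elif [[ $size != missing ]] &&
         [[ $(md5sum < "$file" | cut -d' ' -f1) == "$expected" ]]; then
      return 0                    # size is stable and the hash matches
    else
      idle=$(( idle + 1 ))
    fi
    sleep 1
  done
  return 1
}

# Demo: the writer produces the file in two chunks, two seconds apart.
f=$(mktemp)
expected=$(printf 'part1\npart2\n' | md5sum | cut -d' ' -f1)
( echo part1 >> "$f"; sleep 2; echo part2 >> "$f" ) &
if wait_for_complete "$f" "$expected" 5; then result=complete; else result=timeout; fi
wait
echo "$result"
rm -f "$f"
```

In the real setup the expected hash arrives in the writer's message; here it is simply computed up front.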

Andrew Barnett 2009-12-18 19:11:16

thanks. How do you broadcast the md5 hash? And wouldn't a wait-loop for that hash be sufficient by itself?

andreas buykx 2009-12-20 07:57:24

We're using an off the shelf network messaging protocol (POE for Perl; but JMS or AMQP, for this functionality, are the same). In our case, the writer and the readers are on different machines, and the writer broadcasts the hash as part of the message that indicates the write is complete. The readers pick up the message, wait for their view of the file to stop growing in size, and then check the hash.

Andrew Barnett 2009-12-21 13:50:36

And, yes, a wait loop on the hash would work as well -- but calculating the hash is relatively expensive vs. checking the file size, so I do the easy/cheap thing as a "gatekeeper" to the heavy/expensive thing.

Andrew Barnett 2009-12-21 13:51:28

Answer 6

A:

We have a similar issue, but for different reasons. We are reading a file, which is sent to an SFTP server. The machine running the script is not the SFTP server.

What I have done is set it up in cron (although a loop with a sleep would work too) to do a cksum of the file. When the old cksum matches the current cksum (the file has not changed for the determined amount of time) we know that the writes are complete, and transfer the file.

Just to be extra safe, we never overwrite a local file before making a backup, and only transfer at all when the remote file has two cksums in a row that match, and that cksum does not match the local file.

If you need code examples, I am sure I can dig them up.
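
In the meantime, a minimal sketch of the "two matching cksums in a row" idea written as a loop rather than a cron job. This is not our production script; the helper name wait_until_stable and its parameters are invented for illustration:

```shell
#!/bin/bash
# wait_until_stable FILE INTERVAL MAX_CHECKS   (hypothetical helper)
# Returns 0 once two consecutive cksum results are identical.
wait_until_stable() {
  local file=$1 interval=$2 checks=$3 prev="" cur
  while (( checks > 0 )); do
    # cksum on stdin prints "<crc> <size>"; empty if the file is unreadable.
    cur=$(cksum 2>/dev/null < "$file") || cur=""
    if [[ -n $cur && $cur == "$prev" ]]; then
      return 0
    fi
    prev=$cur
    checks=$(( checks - 1 ))
    sleep "$interval"
  done
  return 1
}

# Demo on a file that is already complete.
f=$(mktemp)
echo "payload" > "$f"
if wait_until_stable "$f" 1 5; then result=stable; else result=unstable; fi
echo "$result"
rm -f "$f"
```

The interval should be longer than the longest pause you expect from the writer, otherwise a stalled transfer looks finished.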

Grant Johnson 2009-12-21 21:49:24

Answer 7

A:

The shell was splitting your predicate into words. Grab it all with $@ as in the code below:

#! /bin/bash

waitFor()
{
  local tries=$1
  shift
  local predicate="$@"

  while [ $tries -ge 1 ]; do
    (( tries-- ))

    if $predicate >/dev/null 2>&1; then
      return
    else
      [ $tries -gt 0 ] && sleep 1
    fi
  done

  exit 1
}

pred='[ -e /etc/passwd ]'
waitFor 5 $pred
echo "$pred satisfied"

rm -f /tmp/baz
(sleep 2; echo blahblah >>/tmp/baz) &
(sleep 4; echo hasfoo   >>/tmp/baz) &

pred='grep ^hasfoo /tmp/baz'
waitFor 5 $pred
echo "$pred satisfied"

Output:

$ ./waitngo 
[ -e /etc/passwd ] satisfied
grep ^hasfoo /tmp/baz satisfied

Too bad the typescript isn't as interesting as watching it in real time.
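
One limitation of the function above: joining the arguments into a single string and splitting it again loses any argument that contains a space. A sketch of a variant that keeps the arguments in an array and returns instead of exiting (the name wait_for and the demo file name are made up here):

```shell
#!/bin/bash
# wait_for TRIES COMMAND [ARGS...]   (hypothetical variant of waitFor)
wait_for() {
  local tries=$1
  shift
  local -a predicate=("$@")       # one array element per original argument

  while (( tries > 0 )); do
    "${predicate[@]}" >/dev/null 2>&1 && return 0
    tries=$(( tries - 1 ))
    (( tries > 0 )) && sleep 1
  done
  return 1                        # let the caller decide what to do
}

# Demo: the file name contains a space and the line appears after a second.
dir=$(mktemp -d)
f="$dir/has space.txt"
( sleep 1; echo hasfoo >> "$f" ) &
if wait_for 5 grep -q '^hasfoo' "$f"; then result=satisfied; else result="timed out"; fi
wait
echo "$result"
rm -rf "$dir"
```

The predicate is passed as separate words on the command line, so nothing needs to be quoted twice.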

Greg Bacon 2009-12-22 14:47:50

Answer 8

A:

Ok...this is a bit whacky...

If you have control over the file: you might be able to create a 'named pipe' here. So (depending on how the writing program works) you can monitor the file in a synchronized fashion.

At its simplest:

Create the named pipe:

mkfifo file.txt

Set up the sync'd receiver:

while :
do
    process.sh < file.txt
done

Create a test sender:

echo "Hello There" > file.txt

The 'process.sh' is where your logic goes: this will block until the sender has written its output. In theory the writer program won't need modifying....

WARNING: if the receiver is not running for some reason, you may end up blocking the sender!
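
A self-contained sketch of the whole round trip, with reader and writer on the same machine. The temporary directory and the 'received' file are only there for the demo, and cat stands in for process.sh:

```shell
#!/bin/sh
# Illustration only: both ends run on one host in a scratch directory.
dir=$(mktemp -d)
mkfifo "$dir/file.txt"

# Receiver: opening the pipe for reading blocks until a writer shows up.
( cat < "$dir/file.txt" > "$dir/received" ) &

# Sender: this open blocks until the receiver has the pipe open.
echo "Hello There" > "$dir/file.txt"

wait
received=$(cat "$dir/received")
echo "$received"
rm -rf "$dir"
```

Starting the receiver first (in the background) is what keeps the sender from hanging.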

Not sure it fits your requirement here, but might be worth looking into.

Or, to avoid the synchronization, try 'lsof'?

http://en.wikipedia.org/wiki/Lsof

Assuming that you only want to read from the file when nothing else is writing to it (i.e., the writing process has finished), you could check whether anything else still has a file handle open on it.

monojohnny 2009-12-22 18:13:44

Good idea, but a point to note is that named pipes do not work if the sender and receiver are on two different hosts, even if they have the same NFS mount points.

dogbane 2009-12-23 20:49:07

doh! Should have read the title properly - yup won't work across NFS mounts !

monojohnny 2009-12-29 09:48:31
