Hi, I am stuck with this.

I want to replace every occurrence of include('./ with include(' in a set of files. I am trying to use awk as follows:

awk '{gsub("include\('"'"'./", "include\('"'"'", $0); print > FILENAME}' *.php

It throws this error:

awk: (FILENAME=xyz.php FNR=1) fatal: Unmatched ( or \(: /include('.//

Any help would be appreciated.

A: 

Try this:

awk '{gsub("include(\'"'"'./", "include\('"'"'", $0); print > FILENAME}' *.php

You misplaced the backslash.

Or this:

 awk '{gsub("include(\'./", "include(\'", $0); print > FILENAME}' *.php

How about this?

awk '{gsub("include(\47./", "include(\47", $0); print > FILENAME}' *.php

Did you try without escaping anything?

awk '{gsub("include('./", "include('", $0); print > FILENAME}' *.php
c0mrade
Not working; these errors cropped up: awk: warning: escape sequence `\'' treated as plain `'' awk: warning: escape sequence `\(' treated as plain `(' awk: (FILENAME=xyz.php FNR=1) fatal: Unmatched ( or \(: /include('.//
GeekTantra
@GeekTantra I don't have a console or awk installed at hand, otherwise I'd test the examples above.
c0mrade
None of these helped...
GeekTantra
That's because awk is writing back to the file while it is still processing it!
ghostdog74
+1  A: 

This works (without the I/O redirection on the 'print'):

awk '{gsub(/include\('"'"'.\//, "include\('"'"'", $0); print }' # Wrong
awk '{gsub(/include\('"'"'.\//, "include('"'"'", $0); print }'  # Right

It maps this input:

include('./abc')
include('x/abc')

to:

include('abc')
include('abc')

Empirically, it seems that the regular expression must be inside slashes; the replacement string must be a regular string. You will need to map the '.' to '\.' to stop the second replacement.
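To illustrate the point about escaping the dot, here is a minimal sketch. The function name strip_dot_slash is made up for this example, and \047 (the octal escape for a single quote) is assumed to be accepted inside a regex literal by your awk, as it is by gawk; it avoids the '"'"' shell-quoting dance.

```shell
# Hypothetical helper: same substitution, but with the dot escaped so that
# it matches only a literal '.', not any character.
# \047 is the octal escape for a single quote.
strip_dot_slash() {
  awk '{ gsub(/include\(\047\.\//, "include(\047"); print }' "$@"
}

# Reads stdin when no file arguments are given.
printf '%s\n' "include('./abc')" "include('x/abc')" | strip_dot_slash
```

With the dot escaped, the first line becomes include('abc') and the second line, include('x/abc'), is left alone.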

I'm not very happy with this explanation. The man page for 'awk' on MacOS X says:

/re/ is a constant regular expression; any string (constant or variable) may be used as a regular expression, except in the position of an isolated regular expression in a pattern.

So, in theory, the string form you used should work. Empirically, it didn't; I got substantially the same error message as you did with your code. And you had got the shell quotes correct, which is non-trivial.
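One likely explanation, offered as a sketch rather than a definitive diagnosis: a string used as a regex is scanned twice, first as a string literal and then as a regular expression. The string pass turns "\(" into a plain "(", so the regex compiler sees an unmatched parenthesis, which matches the error message. Doubling the backslashes should let the string form work. The function name below is made up for this example, and \047 is the octal escape for a single quote.

```shell
# Hypothetical helper: string (dynamic) regex form.
# awk processes string escapes first, then compiles the result as a regex,
# so a literal "(" needs "\\(" and a literal "." needs "\\.".
strip_dot_slash_str() {
  awk '{ gsub("include\\(\047\\./", "include(\047"); print }' "$@"
}

printf '%s\n' "include('./abc')" | strip_dot_slash_str
```

The regex-literal form with slashes needs only one level of backslashes, which is probably why it is the less error-prone choice here.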

There are times when Perl might be easier (because you can choose an arbitrary delimiter to mark the regex boundaries):

perl -pe "s%include\('\./%include('%g"
Jonathan Leffler
This command works, but only for really simple files containing one or two statements of this type. In a sufficiently large file with many single quotes and slashes it seems to mess up everything. I also get this error: awk: warning: escape sequence `\(' treated as plain `('
GeekTantra
@GeekTantra: this is where you need to use a script file: 'awk -f file *.php'. You then don't have to fight the shell's interpretation of quotes as well as awk's interpretation of quotes, which makes life a whole heap easier. Re warning: MacOS 'awk' does not give it, but the backslash in front of the parenthesis in the replacement string is unwanted - your 'awk' is correct.
Jonathan Leffler
Using double quotes in `gsub` has its uses. E.g., when substituting a forward slash `/`, one can use `gsub("/","")` instead of `gsub(/\//,"")`.
ghostdog74
+3  A: 

@OP, you can try using the octal codes for the single quote (\047) and the forward slash (\057), e.g.:

$ cat file
include('./
$ awk '{gsub(/include\(\047\.\057/ , "include(\047" ) }1' file
include('
ghostdog74
Finally things worked. Thanks a lot.
GeekTantra
A: 

You don't need to use awk if this is all you want to do. :) Also, writing to a file while you're reading from it, the way you did, will lead to data loss or corruption; try not to do it.

for file in *.php ; do
# or, to do this to all php files recursively:
# find . -name '*.php' | while IFS= read -r file ; do
  # make backup copy; do not overwrite backup if backup already exists
  test -f "$file.orig" || cp -p "$file" "$file.orig"
  # awk '{... print > NEWFILE}' NEWFILE="$file" "$file.orig"
  # read from the backup, write the result over the original
  sed -e "s:include('\./:include(':g" "$file.orig" >"$file"
done

Just to clarify the data loss aspect: when awk (or sed) starts processing a file and you ask it to read the first line, it will actually perform a buffered read, that is, it will read from the filesystem (let's simplify and say "from disk") a block of data as large as its internal read buffer (e.g. 4-65 KB) in order to get better performance (by reducing disk I/O). Assume that the file you're working with is larger than the buffer size. Further reads will continue to come from the buffer until the buffer is exhausted, at which point a second block of data will be loaded from disk into the buffer, etc.

However, just after you read the first line, i.e. after the first block of data is read from disk into the buffer, your awk script opens FILENAME, the input file itself, for writing with truncation, i.e. the file's size on disk is reset to 0. At this point all that remains of your original file are the first few kilobytes of data in awk's memory. Awk will merrily continue to read line after line from the in-memory buffer and produce output until the buffer is exhausted, at which point awk will probably stop and leave you with a 4-65k file.

As a side note, if you are actually using awk to expand (e.g. print "PREFIX: " $0), not shrink (gsub(/.../, "")), data, then you'll almost certainly end up with a non-responsive awk and a perpetually growing file. :)
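If you would rather stay with awk, the same "never write to the file you are reading" rule can be sketched like this. The function name rewrite_includes is made up for this example, and it assumes a mktemp that accepts a template argument, as the GNU and BSD versions do.

```shell
# Hypothetical helper: rewrite one file via a temporary file, then rename.
# awk never writes to the file it is reading, so nothing is truncated.
rewrite_includes() {
  file=$1
  # Create the temporary file next to the original so mv is a plain rename.
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if awk '{ gsub(/include\(\047\.\//, "include(\047"); print }' "$file" > "$tmp"; then
    mv "$tmp" "$file"      # replace the original only after awk succeeded
  else
    rm -f "$tmp"           # leave the original untouched on failure
    return 1
  fi
}

# Usage (run in the directory holding the PHP files):
#   for f in *.php; do rewrite_includes "$f"; done
```

Note that mktemp creates the file with restrictive permissions, so the rewritten file will not keep the original mode; if that matters, restore it afterwards or keep the backup-copy approach shown above.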

vladr
sed, awk, both can do the job.
ghostdog74
Arguably one less level of quoting/escaping. :) But the really unforgivable part, whichever (awk or sed) is used, is essentially truncating each PHP file after the first line is read, i.e. if the file is bigger than awk's/sed's read buffer he's just truncated his files to just as many bytes.
vladr
awk actually did the job for me. It's not about using sed or awk; it's about which one you are more comfortable with.
GeekTantra
Correct, either awk or sed will do the job. I do hope that none of your files were larger than 65k, though. :) Just for fun, take or make a 100K file and run the original `gawk { ... print > FILENAME } file` command. My `awk` stops flat at 68k. Ooops. :)
vladr
That's because it's writing back to the file that gawk is currently processing. It's the same concept as `cat file > file`. Providing a different file name (and renaming it back to the original if needed) is the way to go.
ghostdog74