ansaurus

Question

How to efficiently search/replace on a large txt file?

Answer 1

+3 A:

Your perl substitution seems to be wrong. Try:

grep -rl '\^' . | xargs perl -pi~ -e 's/\^/"/g'

Explanation:

grep : command to find matches
-r : search recursively
-l : print only the names of files that contain a match
'\^' : the caret is a regex metacharacter (start-of-line anchor), so it is escaped; the single quotes stop the shell from stripping the backslash
. : search in the current working directory
xargs : passes the file names printed by grep to perl as arguments
perl : used here to do the in-place replacement
-i~ : edit the file in place and keep a backup copy with the extension ~
-p : loop over the input and print each line after the substitution
-e : the program is given on the command line (a one-liner)
\^ : inside the substitution the caret is escaped for the same reason
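
One caveat with piping grep into xargs: file names that contain spaces get split into several arguments. A minimal sketch of a NUL-delimited variant follows, assuming GNU grep (for -Z) and an xargs that supports -0. Here grep looks for the caret itself, quoted as '\^'. The scratch directory and file names are made up for the demo.

```shell
# Demo in a throwaway directory (hypothetical file names).
demo_dir=$(mktemp -d)
printf 'a^b^c\n' > "$demo_dir/has caret.csv"
printf 'plain\n' > "$demo_dir/no_caret.csv"

# -Z ends each file name with a NUL byte; xargs -0 splits on NUL,
# so a name containing spaces reaches perl as a single argument.
grep -rlZ '\^' "$demo_dir" | xargs -0 perl -pi~ -e 's/\^/"/g'

cat "$demo_dir/has caret.csv"    # a"b"c
ls "$demo_dir"                   # only the file that matched gets a ~ backup
```

Note that a second run would also pick up the ~ backup files, since they still contain the caret, so it may be worth moving or deleting them first.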

codaddict 2010-08-23 15:51:41

That both worked and helped explain it clearly. Thank you very much!

Robert Pierce 2010-08-23 17:07:40

@codaddict. Oh, ok I didn't have enough 'points' to do that before. Thanks.

Robert Pierce 2010-08-23 18:04:15

Answer 2

+1 A:

sed -i.bak 's/\^/"/g' mylargefile.csv

Update: you can also use Perl, as reinierpost has suggested:

perl -i.bak -pe 's/\^/"/g' mylargefile.csv

On big files, though, sed may run noticeably faster than Perl. My timings on a file of about 6 million lines (roughly 14 s for sed versus 24 s for Perl):

$ tail -4 file
this is a line with ^
this is a line with ^
this is a line with ^

$ wc -l<file
6136650

$ time sed 's/\^/"/g' file  >/dev/null

real    0m14.210s
user    0m12.986s
sys     0m0.323s
$ time perl  -pe 's/\^/"/g' file >/dev/null

real    0m23.993s
user    0m22.608s
sys     0m0.630s
$ time sed 's/\^/"/g' file  >/dev/null

real    0m13.598s
user    0m12.680s
sys     0m0.362s

$ time perl  -pe 's/\^/"/g' file >/dev/null

real    0m23.690s
user    0m22.502s
sys     0m0.393s

ghostdog74 2010-08-24 00:59:52

Thanks for the help. I've never used sed, but if it's that concise it must be worth looking at. :)

Robert Pierce 2010-08-24 01:51:57

perl -i.bak -pe 's/\^/"/g' mylargefile.csv isn't all that much longer ...

reinierpost 2010-08-24 08:35:02
