I use this piece of code in my bash script to read a file containing several hex strings, do some substitution, and then write the result to a new file. It takes about 30 minutes for about 300 MB.
Can this be done faster?

sed 's,[0-9A-Z]\{2\},\\\\x&,g' ${in_file} | while read line; do
 printf "%b" ${line} >> ${out_file}
 printf '\000\000' >> ${out_file}
done
+3  A: 

This is slow because the loop runs in bash. If you can get sed/awk/perl/etc. to do the loop, it will be much faster. I can't see how you can do it in sed or awk, though. It's probably pretty easy in Perl, but I don't know enough Perl to answer that for you.

At the very least, you should be able to save a little time by refactoring what you have to:

sed 's,[0-9A-Z]\{2\},\\\\x&,g' ${in_file} | while read line; do
 printf '%b\000\000' ${line} 
done >> ${out_file}

This way, you run printf once per iteration and open and close ${out_file} only once.
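
To make the role of the format string concrete, here is a minimal sketch that runs the refactored loop on two sample lines and dumps the resulting bytes. It assumes bash (for `\x` support in `%b`). The function name convert_hex_lines and the sample input are made up for illustration, and the sketch departs from the original in two ways: it uses `read -r` with a single level of backslash escaping in sed, and it matches `[0-9A-F]` rather than `[0-9A-Z]`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): reads hex lines on
# stdin and writes the decoded bytes, plus two NULs per line, to stdout.
convert_hex_lines() {
    # Prefix every pair of hex digits with \x so that %b can expand it.
    sed 's,[0-9A-F]\{2\},\\x&,g' | while IFS= read -r line; do
        # '%b\000\000' is the format string; "$line" is the argument
        # consumed by %b, so one printf call emits data and terminator.
        printf '%b\000\000' "$line"
    done
}

# Sample input: two lines of hex digits, dumped back as hex for inspection.
# Expected bytes: 41 42 00 00 43 00 00
printf '4142\n43\n' | convert_hex_lines | od -An -tx1
```

In a real script the redirection would go on the call, for example `convert_hex_lines < ${in_file} > ${out_file}`, so the output file is still opened only once.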

camh
+1 for pointing out that multiple redirections to the same file are slower than just one (read: having common sense).
amphetamachine
You mean it should be **printf '%b${line}\000\000'**, because **'\000\000'** comes after **printf "%b" ${line}**?
Robertico
@Robertico: No, I meant it as I wrote it. '%b\000\000' is the format string, ${line} is the argument consumed by %b.
camh
@camh: Thx, I'll give it a try.
Robertico
+2  A: 

Switch to a full programming language? Here's a Ruby one-liner:

ruby -ne 'print "#{$_.chomp.gsub(/[0-9A-F]{2}/) { |s| s.to_i(16).chr }}\x00\x00"'
llasram
@llasram: Thx, but I love it here :-)
Robertico
+4  A: 

You need the xxd command, which ships with Vim.

export LANG=C
sed 's/$/0000/' ${in_file} | xxd -r -ps > ${out_file}
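
A quick way to check that this pipeline produces the same bytes as the bash loop is to feed it a couple of short lines and dump the result. This is only a sketch: the function name hex_to_bin_xxd and the sample input are hypothetical, and it spells the plain-hexdump option as `-p`, which xxd documents as equivalent to `-ps`.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the pipeline above (stdin to stdout).
hex_to_bin_xxd() {
    # Append "0000" to every line so each record ends in two NUL bytes,
    # then let xxd turn the plain hex back into binary in a single pass.
    sed 's/$/0000/' | xxd -r -p
}

# Sample input: two lines of hex digits, dumped back as hex for inspection.
# Expected bytes: 41 42 00 00 43 00 00
printf '4142\n43\n' | hex_to_bin_xxd | od -An -tx1
```

The whole file goes through two long-running processes instead of one printf call per line, which is where the speedup comes from.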
LatinSuD
@LatinSuD: Thx, I'll give it a try.
Robertico
+1: I had never considered that `xxd` could be used in reverse!
Johnsyweb
@LatinSuD: Thx !!. You're the winner.
Robertico
A: 

If you have Python, and assuming the data is simple:

$ cat file
99
AB

script:

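# Write each hex byte followed by two NUL bytes; binary mode avoids any text encoding.
with open("file") as src, open("outfile", "wb") as out:
    for line in src:
        # bytearray behaves the same under Python 2.7 and Python 3
        out.write(bytearray([int(line.strip(), 16), 0, 0]))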
ghostdog74
+1  A: 

And the winner is:

sed 's,[0-9A-Z]\{2\},\\\\x&,g' ${in_file} | while read line; do
    printf "%b" ${line} >> ${out_file}
    printf '\000\000' >> ${out_file}
done

real 44m27.021s
user 29m17.640s
sys 15m1.070s


sed 's,[0-9A-Z]\{2\},\\\\x&,g' ${in_file} | while read line; do
    printf '%b\000\000' ${line} 
done >> ${out_file}

real 18m50.288s
user 8m46.400s
sys 10m10.170s


export LANG=C
sed 's/$/0000/' ${in_file} | xxd -r -ps >> ${out_file}

real 0m31.528s
user 0m1.850s
sys 0m29.450s


Robertico