This should be simple for those of you who have some programming knowledge... Unfortunately I don't.

I'm trying to iterate through a text file of image captions and add them as alt text to an HTML file. The image captions file has 105 captions (each is separated by a carriage return) and the gallery file has a placeholder alt attribute on each a tag (set up like alt="#"). The order of the captions corresponds with the order of the images in the HTML file.

So in other words... the pseudocode would be: "Loop through every line in captions.txt and for every alt="#" inside the gallery.html file, replace the # with the corresponding caption."

I'm on a Mac so I'd like to use UNIX.

Any help is greatly appreciated!

Thanks, Mike

+4  A: 

If all the alt="#" are on separate lines, you can use ed:

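{
  # Emit one ed substitute command per caption; each search resumes after the previous match.
  while IFS= read -r cap; do
    printf '%s\n' "/alt=\"#\"/ s//alt=\"$cap\"/"
  done < captions.txt
  echo wq
} | ed -s gallery.html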

This assumes none of your captions contain a slash, an ampersand (which ed expands to the matched text), or a backslash.
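
If your captions might contain slashes or other special characters, here is a rough Python sketch of the same idea that sidesteps the delimiter problem. The function name fill_alts is made up for illustration, and escaping each caption with html.escape is an assumption about what you would want inside an attribute value. Unlike the ed version, it also copes with several alt="#" on one line.

```python
import html
import re


def fill_alts(page, captions):
    """Replace each alt="#" in page with the next caption, in order."""
    remaining = iter(captions)

    def next_alt(match):
        # A function replacement is inserted literally, so slashes,
        # ampersands and backslashes in a caption are not interpreted.
        caption = next(remaining, None)
        if caption is None:
            # Out of captions: leave the placeholder untouched.
            return match.group(0)
        # Assumption: captions are plain text, so escape &, < > and quotes.
        return 'alt="' + html.escape(caption, quote=True) + '"'

    return re.sub(r'alt="#"', next_alt, page)
```

You would pass it the contents of gallery.html and the list of lines from captions.txt, then write the returned string back to a file.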

jpalecek
Works like a charm - thank you so much!
+2  A: 

There are many ways to accomplish this goal. awk is the classic text manipulation program. (Well, awk and sed, for different purposes, but sed won't help here.)

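awk '
    BEGIN {
        caps = ARGV[1]      # first argument is the captions file
        delete ARGV[1]      # so awk does not also read it as input
    }
    /#/ {
        # Fetch the next caption; leave the line alone if none remain.
        # Note that gsub expands an & in the caption to the matched text.
        if ((getline cap < caps) > 0)
            gsub("#", cap)
    }
    { print }
' captions.txt gallery.html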

You could put it into a script to avoid having to type it more than once. Just start a plain text file with "#!/usr/bin/awk -f", put the "BEGIN ... { print }" below it, and give the file execute permissions.


This translates trivially into most scripting languages. Perl:

#!/usr/bin/perl -p
BEGIN { open CAPS, shift }
if (/#/) {
    chomp($cap = <CAPS>);
    s/#/$cap/g;
}

Almost the same in Ruby:

#!/usr/bin/ruby
caps = IO.readlines(ARGV.shift).each {|s| s.chomp!}
while gets
    $_.gsub!(/#/, caps.shift) if $_ =~ /#/
    print
end

And Python:

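#!/usr/bin/env python3
import sys
# Captions, one per line, with surrounding whitespace stripped.
caps = [s.strip() for s in open(sys.argv[1])]
# Read the named files, or standard input if none are given.
for f in [open(s) for s in sys.argv[2:]] or [sys.stdin]:
    for s in f:
        # Substitute the next caption into any line containing '#'.
        if '#' in s: s = s.replace('#', caps.pop(0))
        print(s, end='')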
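
All of these consume one caption per matching line, so a mismatch between the number of captions and the number of placeholders is easy to miss. A minimal sketch of a sanity check you could run first, in Python: check_counts is a hypothetical helper, and it assumes blank lines in the captions file are not captions. It uses splitlines() because the question mentions carriage returns, and a file with bare \r line endings would otherwise be read as one long line.

```python
def check_counts(page, caption_text, placeholder='alt="#"'):
    """Return (placeholders, captions); the two numbers should be equal."""
    # splitlines() treats \n, \r\n and a bare \r as line endings alike.
    captions = [line for line in caption_text.splitlines() if line.strip()]
    return page.count(placeholder), len(captions)
```

Pass it the full text of gallery.html and captions.txt; if the two numbers differ, fix the files before running any of the replacements.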
ephemient