I haven't implemented this yet; I'm still in the thinking stage. I have to go through a file and replace a certain string with another string. For example,

<img src="/images/logo.gif" ...

should become

<img src="/proxy/www.example.com/images/logo.gif" ...

Any advice on how I can approach this? Perhaps there exist some "string replace" C functions that would do this for me that I don't know about...?

Right now, if I had to write this function myself, I would pass it the file, the string to replace, and the replacement string. Then I would manually go through the file, look for each occurrence of the string, and recreate it. This, however, seems very inefficient. Are there better ways to do this?

Thanks, Hristo

+1  A: 

No, there is no function in C that replaces a string throughout a file. You must implement it yourself.

That said, what you're showing us is HTML, and HTML is tricky, because it's hierarchical. Are you required to parse it correctly? Because if you are, the task is much more difficult. Seeing that it's homework, I doubt it, so it is probably enough to:

  1. open the file and load it into memory (assuming it isn't too large - if it is, you can write into a temporary file and move it onto the original one after you've finished)
  2. repeatedly use strstr to find the anchor string where the replacement should start
  3. replace
  4. repeat 2 and 3 until finished with file
  5. write file back
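
To make steps 2 to 4 concrete, here is a minimal sketch in C that works on a string already loaded into memory. The function name replace_all is hypothetical, the file reading and writing of steps 1 and 5 are left out, and it assumes the text is NUL-terminated. It is plain string replacement, not HTML parsing.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of `text` with every occurrence of `from`
 * replaced by `to`. Returns NULL if `from` is empty or allocation fails.
 * The caller frees the result. (Hypothetical helper, for illustration.) */
char *replace_all(const char *text, const char *from, const char *to)
{
    size_t from_len = strlen(from);
    size_t to_len = strlen(to);
    if (from_len == 0)
        return NULL;

    /* First pass: count matches so the output can be sized exactly. */
    size_t count = 0;
    for (const char *p = text; (p = strstr(p, from)) != NULL; p += from_len)
        count++;

    size_t text_len = strlen(text);
    char *out = malloc(text_len - count * from_len + count * to_len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the text between matches, then the replacement. */
    char *dst = out;
    const char *src = text;
    const char *hit;
    while ((hit = strstr(src, from)) != NULL) {
        size_t gap = (size_t)(hit - src);
        memcpy(dst, src, gap);
        dst += gap;
        memcpy(dst, to, to_len);
        dst += to_len;
        src = hit + from_len;
    }
    strcpy(dst, src);   /* tail after the last match, including the NUL */
    return out;
}
```

Counting the matches first means the result is allocated once at the right size, so the buffer never has to grow while copying.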
Eli Bendersky
What do you mean by "hierarchical"... like containers inside containers? But what I was thinking is more or less what you outlined above. Thanks!
Hristo
@Hristo: HTML is hierarchical since tags can be nested in other tags. Also, it has a complex syntax - i.e. the string you're looking for may be part of another string, or part of a comment. However, as I said, seeing that it's homework, I doubt that they want you to implement a full-fledged HTML parser. It isn't good practice to use HTML this way in an assignment, because it teaches bad habits. Afterwards, hordes of programmers munge HTML with simple string ops and regexps, and we are all the worse for that.
Eli Bendersky
That makes sense. It's not a full HTML parse... I just need to make sure that my server handles the proxy information properly just for the <a> and <img> tags... so anything that has '/proxy/' will go through and change links accordingly.
Hristo
@Hristo: this is what I meant about people getting used to ad-hoc HTML munging. Depending on the enlightenment level of your teacher, you may or may not get extra points for using a proper HTML parser here. In any real-life scenario, **you must**.
Eli Bendersky
haha... extra points :) I mean I wish that was an option but there are a lot of problems with the structure of this class. Anywho, thanks again for your responses. I'll get this going as soon as I can, hopefully by 11:59pm Wed. I'll keep in mind to do a legit HTML parser when I'm in the real world. Thanks!
Hristo
A: 

Since this is homework, I won't give you an answer but I'll point out a classic issue that trips people up.

In C, it's easiest to read a fixed byte count (you can try to do line by line but if a line is too long, that reverts to reading a fixed number of bytes). If the string you are trying to replace ends up getting split between one buffer and a second buffer:

buf1 -> "...<img src=\"/ima"
buf2 -> "ges/logo.gif\"..."

you won't be able to do a simple search replace in memory.
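
One possible way around the split, sketched here as an illustration and not as the answer to the assignment: after searching each buffer, hold back the last strlen(from) - 1 bytes and search them again together with the next read, since a match that straddles the boundary can only begin in that tail. The names replace_stream and find_bytes and the bufsize parameter are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Naive byte search, so the buffer does not need to be NUL-terminated. */
static const char *find_bytes(const char *hay, size_t hay_len,
                              const char *needle, size_t needle_len)
{
    for (size_t i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    return NULL;
}

/* Copies `in` to `out`, replacing `from` with `to`, reading at most
 * `bufsize` bytes at a time. Returns 0 on success, -1 on bad arguments
 * or allocation failure. */
int replace_stream(FILE *in, FILE *out, const char *from, const char *to,
                   size_t bufsize)
{
    size_t from_len = strlen(from);
    if (from_len == 0 || bufsize < from_len)
        return -1;
    char *buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    size_t have = 0;                 /* bytes currently held in buf */
    for (;;) {
        size_t got = fread(buf + have, 1, bufsize - have, in);
        int at_end = (got == 0);     /* end of file or read error */
        have += got;

        /* Replace every complete match inside the buffer. */
        size_t pos = 0;
        const char *hit;
        while ((hit = find_bytes(buf + pos, have - pos, from, from_len)) != NULL) {
            fwrite(buf + pos, 1, (size_t)(hit - (buf + pos)), out);
            fputs(to, out);
            pos = (size_t)(hit - buf) + from_len;
        }

        /* Hold back a tail that could be the start of a split match. */
        size_t rest = have - pos;
        size_t keep = at_end ? 0 : (rest < from_len - 1 ? rest : from_len - 1);
        fwrite(buf + pos, 1, rest - keep, out);
        memmove(buf, buf + have - keep, keep);
        have = keep;
        if (at_end)
            break;
    }
    free(buf);
    return 0;
}
```

The held-back tail is always shorter than the search string, so it cannot itself contain a full match and nothing is replaced twice.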

R Samuel Klatchko
+1  A: 

Since it's homework, I'm going with the assumption that the string cannot span multiple lines. If this assumption is correct (and barring the complications with "replacing text in HTML") then:

  1. Read the next line
  2. Replace string and write line (to another file)
  3. If not at end, goto #1
  4. Win \o/

Or perhaps the teacher wants something else. *shrug*
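
A rough sketch of the line-by-line loop in C, under the same assumption that the string never spans lines, plus one more: every line fits in a fixed buffer (the MAX_LINE value is arbitrary). The function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 4096   /* assumption: no line is longer than this */

/* Writes `line` to `out`, substituting `to` for each occurrence of `from`. */
static void write_replaced(FILE *out, const char *line,
                           const char *from, const char *to)
{
    size_t from_len = strlen(from);
    const char *hit;
    while (from_len > 0 && (hit = strstr(line, from)) != NULL) {
        fwrite(line, 1, (size_t)(hit - line), out);   /* text before the match */
        fputs(to, out);
        line = hit + from_len;                        /* continue after the match */
    }
    fputs(line, out);                                 /* remainder of the line */
}

/* Reads `in` line by line and writes the replaced lines to `out`. */
void replace_lines(FILE *in, FILE *out, const char *from, const char *to)
{
    char line[MAX_LINE];
    while (fgets(line, sizeof line, in) != NULL)      /* read the next line */
        write_replaced(out, line, from, to);          /* replace and write */
}
```

If a line is longer than MAX_LINE, fgets returns it in pieces, and a match that falls across two pieces would be missed.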

pst
I'm not sure what you mean by Step 4... and I don't know if your assumption is correct either. But I guess what I was thinking was along the lines of what you outlined above. Thanks.
Hristo
A: 

Have you tried the strcpy function for this?

Store the URL in one string and replace it using the strcpy function.

Karthik
+1  A: 

First of all, C is an awesome language, but it is one of the most painful languages to do this type of operation in. Just had to say it.

Can you safely assume that the contents of the entire file can fit in memory? If so:

allocate a buffer big enough to hold the file contents plus a terminating '\0'
read the entire file into the buffer and append '\0' (strstr needs it)
inputPtr = start of buffer

while (inputPtr < end of buffer) {
    replacePosition = strstr(inputPtr, stringToReplace);
    if (replacePosition != NULL)
        writeUntil = replacePosition
    else
        writeUntil = end of buffer

    write out buffer from inputPtr up to, but not including, writeUntil (could be 0 bytes)

    if (replacePosition == NULL) break

    write out the replacement string

    inputPtr = replacePosition + strlen(stringToReplace)
}
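
For reference, the loop above could be written in real C roughly as follows. This is only a sketch: read_all and replace_file are hypothetical names, and it assumes the input is a seekable file opened in binary mode that fits in memory and contains no embedded NUL bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads the whole of `in` into a NUL-terminated heap buffer; NULL on failure. */
static char *read_all(FILE *in)
{
    if (fseek(in, 0, SEEK_END) != 0)
        return NULL;
    long size = ftell(in);
    if (size < 0)
        return NULL;
    rewind(in);
    char *buf = malloc((size_t)size + 1);
    if (buf == NULL)
        return NULL;
    size_t got = fread(buf, 1, (size_t)size, in);
    buf[got] = '\0';                 /* strstr needs a terminated string */
    return buf;
}

/* Writes the contents of `in` to `out` with `from` replaced by `to`.
 * Returns 0 on success, -1 on failure. */
int replace_file(FILE *in, FILE *out, const char *from, const char *to)
{
    size_t from_len = strlen(from);
    char *buf = read_all(in);
    if (buf == NULL || from_len == 0) {
        free(buf);
        return -1;
    }

    const char *input = buf;
    for (;;) {
        const char *hit = strstr(input, from);
        /* Write everything up to the match, or to the end if there is none. */
        size_t n = hit ? (size_t)(hit - input) : strlen(input);
        fwrite(input, 1, n, out);
        if (hit == NULL)
            break;
        fputs(to, out);
        input = hit + from_len;
    }
    free(buf);
    return 0;
}
```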
RarrRarrRarr
haha... I was thinking the same thing. String operations in C are not fun. Thanks for your response. I'll give it a try tomorrow.
Hristo
A: 

You should investigate the sed command. See what it does and do something similar.

It works as a filter, so when using it to replace something in a file what you often do is capture the output into a file and then replace the old file with the new file.
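
In C, that filter-then-replace pattern might look like the sketch below. The function name replace_in_file and the ".tmp" suffix are made up for illustration, lines are assumed to fit in a fixed buffer, and it relies on rename replacing an existing file, which POSIX guarantees but the C standard leaves implementation-defined.

```c
#include <stdio.h>
#include <string.h>

/* Rewrites the file at `path`: filters it into a temporary file, then
 * renames the temporary file over the original. Returns 0 on success. */
int replace_in_file(const char *path, const char *from, const char *to)
{
    char tmp_path[1024];
    char line[4096];                 /* assumption: lines fit in this buffer */
    size_t from_len = strlen(from);

    int len = snprintf(tmp_path, sizeof tmp_path, "%s.tmp", path);
    if (from_len == 0 || len < 0 || (size_t)len >= sizeof tmp_path)
        return -1;

    FILE *in = fopen(path, "r");
    if (in == NULL)
        return -1;
    FILE *out = fopen(tmp_path, "w");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    /* The filter: copy each line, substituting `to` for `from`. */
    while (fgets(line, sizeof line, in) != NULL) {
        const char *rest = line, *hit;
        while ((hit = strstr(rest, from)) != NULL) {
            fwrite(rest, 1, (size_t)(hit - rest), out);
            fputs(to, out);
            rest = hit + from_len;
        }
        fputs(rest, out);
    }

    int failed = ferror(in) || ferror(out);
    fclose(in);
    if (fclose(out) != 0)
        failed = 1;

    /* Only replace the original once the new file is complete. */
    if (failed || rename(tmp_path, path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```

Because the original is only replaced after the new file has been written and closed, a failure partway through leaves the old file untouched.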

nategoose