Hi,

Is there any way to remove everything between two lines that contain two specific strings?

I mean: I want to remove everything between 'heaven' and 'hell' (including those two lines) in a text file that contains this text:

I'm in heaven
foobar
I'm in hell

After running the script/function I'm asking for, the text file would be empty.

Regards

Javi

+3  A: 

Use a flag to indicate whether you're writing or not.

from __future__ import with_statement
import os

writing = True

with open('myfile.txt') as f:
    with open('output.txt', 'w') as out:
        for line in f:
            if writing:
                if "heaven" in line:
                    writing = False  # opening line found: stop copying
                else:
                    out.write(line)
            elif "hell" in line:
                writing = True  # closing line found: resume copying

# Replace the original file with the filtered copy.
os.remove('myfile.txt')
os.rename('output.txt', 'myfile.txt')

EDIT

As extraneon pointed out in the comments, the requirement is to remove the lines between two specific strings. That means that if the second (closing) string is never found, nothing should be removed. That can be achieved by keeping a buffer of lines. The buffer gets discarded if the closing string "I'm in hell" is found, but if the end of the file is reached without finding it, the buffered lines must be written to the output as well, so the whole contents are kept.

Example:

I'm in heaven
foo
bar

This should keep the whole contents, since there's no closing line and the question says between two lines.

Here's an example that does that, for completeness:

from __future__ import with_statement
import os

writing = True
with open('myfile.txt') as f:
    with open('output.txt', 'w') as out:
        for line in f:
            if writing:
                if "heaven" in line:
                    writing = False
                    buffer = [line]  # hold lines back until we know the block closes
                else:
                    out.write(line)
            elif "hell" in line:
                writing = True  # block closed: the buffered lines are dropped
            else:
                buffer.append(line)
        else:
            if not writing:
                # There wasn't a closing "I'm in hell", so write buffer contents
                out.writelines(buffer)

# Replace the original file with the filtered copy.
os.remove('myfile.txt')
os.rename('output.txt', 'myfile.txt')
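
The same buffering idea can be tried without touching any files. This is only a sketch: the function name `remove_between_lines` and its parameters are made up for illustration, and it works on a list of lines instead of on 'myfile.txt'.

```python
def remove_between_lines(lines, start="heaven", end="hell"):
    """Return the lines with each start..end block (inclusive) removed.

    An opening line that is never followed by a closing line is kept,
    together with everything after it.
    """
    kept = []
    pending = None  # lines held back since an opening line, or None
    for line in lines:
        if pending is None:
            if start in line:
                pending = [line]      # start holding lines back
            else:
                kept.append(line)
        elif end in line:
            pending = None            # block is complete: discard it
        else:
            pending.append(line)
    if pending is not None:
        kept.extend(pending)          # never closed: restore held lines
    return kept
```

With the text from the question this returns an empty list, while the unclosed example (heaven, foo, bar) comes back unchanged.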
nosklo
Thanks, but I have to use 2.5.2.
@user249959: edited to allow python 2.5.2
nosklo
@nosklo This is probably what's intended, but not what is asked. If there is no closing "I'm in hell" there is no _between_ to speak of, and those contents should thus be in the output. Yeah, I'm a nitpicker when it comes to requirements :)
extraneon
@extraneon: You're right!! hm... Perhaps one has to buffer the lines and write them back if the end of file is reached without closing...
nosklo
@nosklo I wish I could upvote twice :)
extraneon
A: 

I apologize but this sounds like a homework problem. We have a policy on these: http://meta.stackoverflow.com/questions/10811/homework-on-stackoverflow

However, what I can say is that the feature @nosklo wrote about is available in any Python 2.5.x (or newer), but you need to learn enough Python to enable it. :-)

My solution would involve creating a new string with the undesired stuff stripped out, using str.find() or str.index() (or some relative of those two) to locate it.
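
A minimal sketch of what that could look like, assuming the whole file fits in memory as one string. This is not wescpy's code; the function name `remove_between_find` and its parameters are invented for illustration.

```python
def remove_between_find(text, start_marker="heaven", end_marker="hell"):
    """Remove every run of whole lines that starts at a line containing
    start_marker and ends at the next line containing end_marker."""
    pieces = []
    pos = 0  # everything before pos has already been handled
    while True:
        start = text.find(start_marker, pos)
        if start == -1:
            break
        end = text.find(end_marker, start + len(start_marker))
        if end == -1:
            break  # no closing marker: leave the rest untouched
        # Widen the span to whole lines.
        line_start = text.rfind("\n", 0, start) + 1  # 0 if on the first line
        line_end = text.find("\n", end)
        line_end = len(text) if line_end == -1 else line_end + 1
        pieces.append(text[pos:line_start])
        pos = line_end
    pieces.append(text[pos:])
    return "".join(pieces)
```

str.find() is used here because it returns -1 when nothing is found; str.index() would raise ValueError instead, which would need a try/except.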

Best of luck!

wescpy
A: 

You could do something like the following with regular expressions. There are probably more efficient ways to do it, since I'm still learning a lot of Python, but this should work.

import re

f = open('hh_remove.txt')
lines = f.readlines()
f.close()

pattern1 = re.compile("heaven", re.I)
pattern2 = re.compile("hell", re.I)

kept = []     # lines that survive so far
start = None  # index in kept of the pending "heaven" line, if any

# Build a new list instead of deleting from `lines` while iterating over it,
# which would shift the indices and skip lines.
for line in lines:
    kept.append(line)
    if start is None and pattern1.search(line) is not None:
        start = len(kept) - 1
    if start is not None and pattern2.search(line) is not None:
        # Closing line found: drop everything from the opening line on.
        del kept[start:]
        start = None

out = open('hh_remove.txt', 'w')
out.write("".join(kept))
out.close()
Jacinda S
+1  A: 

Looks like by "remove" you mean "rewrite the input file in-place" (or make it look like you're so doing;-), in which case fileinput.input helps:

import fileinput
writing = True
for line in fileinput.input(['thefile.txt'], inplace=True):
    if writing:
        if 'heaven' in line: writing = False
        else: print line,
    else:
        if 'hell' in line: writing = True
Alex Martelli
@nosklo: I understand the question as Alex does, and I believe it's your understanding that is wrong (i.e. "I'm NOT in hell" should match, as I understand the original poster's wording). I don't see any problem with spaces here... voting back up.
kriss
@kriss: You're right. But I didn't downvote myself.
nosklo
A: 

Hi again,

See below. I don't know if it's right, but it seems to be working.

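import os
import re

# Compile once, outside the loops. re.S lets '.' match newlines, so the
# pattern can span several lines.
pattern = re.compile('Im in heaven.*?Im in hell', re.I | re.S)

# `path` is the root directory to process; it is not defined in this snippet.
for dirpath, dirs, files in os.walk(path):
    for filename in files:
        fullpath = os.path.join(dirpath, filename)

        f = open(fullpath, 'r')
        data = f.read()
        f.close()

        data = pattern.sub("", data)

        f = open(fullpath, 'w')
        f.write(data)
        f.close()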

However, when I run it, it leaves a blank line. I mean, if I have this function:

public function preFetchAll(Doctrine_Event $event){ 
//Im in heaven
$a = sfContext::getInstance()->getUser()->getAttribute("passw.formulario");
var_dump($a);
//Im in hell
foreach ($this->_listeners as $listener) {
    $listener->preFetchAll($event);
}
}

and I run my script, I get this:

public function preFetchAll(Doctrine_Event $event){ 

foreach ($this->_listeners as $listener) {
    $listener->preFetchAll($event);
}
}

As you can see there is an empty line between "public..." and "foreach...".

Why?

Javi

Maybe sub() is leaving the newline after 'Im in hell'?
Jacinda S
That's another question; it should be asked separately.
nosklo
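
On the leftover blank line: the pattern 'Im in heaven.*?Im in hell' stops right after "hell", so the newline that ends that line (and anything before "Im in heaven" on its line, such as the "//") stays in the file. One possible fix, shown only as a sketch, is to widen the pattern to whole lines. The name `strip_marked_block` is made up for illustration.

```python
import re

# From the start of the line holding the opening marker through the end of
# the line holding the closing marker, including that line's newline.
# re.M makes '^' match at every line start; re.S lets '.' cross newlines.
BLOCK = re.compile(r'^[^\n]*Im in heaven.*?Im in hell[^\n]*\n?',
                   re.I | re.S | re.M)

def strip_marked_block(text):
    """Remove each marked block together with its two marker lines."""
    return BLOCK.sub('', text)
```

Applied to the PHP function above, this leaves "foreach" directly after the "public function" line, with no empty line in between.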