I have a text file containing sets of text that I need to extract. It looks something like this:
ITEM A blah blah blah ITEM B bloo bloo bloo ITEM A blee blee blee ITEM B
Here is the working code I have so far:
import re

finda = r'(Item\sA)'
findb = r'(Item\sB)'
# re.IGNORECASE is the named form of the numeric flag 2
match_a = re.finditer(finda, usefile, re.IGNORECASE)
match_b = re.finditer(findb, usefile, re.IGNORECASE)
I know that I can use match object methods like span(), start(), and end() to find the text positions of my matches. But I need to do this many times, so what I need is:
- Start writing at ITEM A and stop writing at ITEM B.
- If that section is less than 50 characters long, discard it and move on to the next one.
- Once you find a section that starts with ITEM A, ends with ITEM B, and is longer than 50 characters, write it to a file.
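To make the steps above concrete, here is a rough sketch of the behavior I am after. It is not tested against my real data and rests on a few assumptions: the 50-character length counts the ITEM A and ITEM B markers themselves, a section of exactly 50 characters is kept, and every qualifying section gets written rather than only the first. The names extract_sections, write_sections, and min_length are made up for illustration.

```python
import re

# Assumption: a section runs from "ITEM A" through the next "ITEM B",
# markers included. DOTALL lets a section span several lines.
SECTION = re.compile(r'Item\sA.*?Item\sB', re.IGNORECASE | re.DOTALL)


def extract_sections(text, min_length=50):
    """Yield each ITEM A ... ITEM B span of at least min_length characters."""
    for match in SECTION.finditer(text):
        section = match.group(0)
        # Shorter sections are discarded; the search resumes after their ITEM B.
        if len(section) >= min_length:
            yield section


def write_sections(text, out_path, min_length=50):
    """Write every qualifying section to out_path, one per line; return the count."""
    sections = list(extract_sections(text, min_length))
    with open(out_path, 'w') as out:
        for section in sections:
            out.write(section + '\n')
    return len(sections)
```

With usefile holding the file contents as a string, write_sections(usefile, 'output.txt') would write the sections, and next(extract_sections(usefile), None) would give just the first qualifying one (or None if there is none).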
Thanks a ton in advance! I have been spinning my wheels for a while.