Say I have a string like str1 = "IWantToMasterPython"

If I want to extract "Py" from that string, I would like to write something like:

extractedString = foo("Master","thon")

I want to do this because I am trying to extract lyrics from an HTML page. The lyrics are written like <div class = "lyricbox"> ....lyrics go here....</div>.

Any suggestions on how I can implement this?

+5  A: 

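One solution is to use a regular expression:

import re
# Non-greedy group: capture as little as possible between "Master" and "thon"
r = re.compile(r'Master(.*?)thon')
m = r.search(str1)
if m:
    # group(1) is the text matched by the parentheses, here "Py"
    lyrics = m.group(1)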
tonfa
Nicely answered. Exactly what I wanted to know. Thanks.
shadyabhi
+5  A: 
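def foo(s, leader, trailer):
    # Position just past the first occurrence of leader
    end_of_leader = s.index(leader) + len(leader)
    # First occurrence of trailer at or after that position
    start_of_trailer = s.index(trailer, end_of_leader)
    return s[end_of_leader:start_of_trailer]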

This raises ValueError if the leader is not present in string s, or if the trailer is not present after it. You have not specified what behavior you want in such anomalous conditions; raising an exception is a pretty natural and Pythonic thing to do, letting the caller handle it with a try/except if it knows what to do in such cases.
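
As a minimal sketch of what the caller-side handling could look like: extract_or_default is a hypothetical wrapper name, and returning a default value is only one possible policy, not something the question asked for.

```python
def foo(s, leader, trailer):
    # Same logic as above: slice out the text between leader and trailer.
    end_of_leader = s.index(leader) + len(leader)
    start_of_trailer = s.index(trailer, end_of_leader)
    return s[end_of_leader:start_of_trailer]


def extract_or_default(s, leader, trailer, default=None):
    # Hypothetical wrapper: turn the ValueError into a default value.
    try:
        return foo(s, leader, trailer)
    except ValueError:
        return default


str1 = "IWantToMasterPython"
print(extract_or_default(str1, "Master", "thon"))  # Py
print(extract_or_default(str1, "Expert", "thon"))  # None
```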

A RE-based approach is also possible, but I think this pure-string approach is simpler.
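
For comparison, an RE-based version might look roughly like the sketch below. The name foo_re and the choice to return None when nothing matches are assumptions made for illustration; re.escape is used so that characters such as "." or "[" in the leader or trailer are matched literally.

```python
import re


def foo_re(s, leader, trailer):
    # Escape both delimiters so they are treated as plain text, not regex syntax.
    pattern = re.escape(leader) + r"(.*?)" + re.escape(trailer)
    # re.DOTALL lets the captured text span several lines.
    m = re.search(pattern, s, re.DOTALL)
    # Assumed policy: return None instead of raising when there is no match.
    return m.group(1) if m else None


print(foo_re("IWantToMasterPython", "Master", "thon"))  # Py
```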

Alex Martelli
+2  A: 

If you're extracting any data from an HTML page, I'd strongly suggest using the BeautifulSoup library. I have also used it for extracting data from HTML and it works great.

paffnucy
+2  A: 

BeautifulSoup is the easiest way to do what you want. The current release (Beautiful Soup 4) can be installed like:

pip install beautifulsoup4

The sample code to do what you want is:

from bs4 import BeautifulSoup

doc = ['<div class="lyricbox">Hey You</div>']
# 'html.parser' selects the parser bundled with Python
soup = BeautifulSoup(''.join(doc), 'html.parser')
# Find the first <div> whose class is "lyricbox" and print its text
print(soup.find('div', {'class': 'lyricbox'}).string)

You can use Python's urllib to grab content from the URL directly. The Beautiful Soup documentation is helpful too if you want to do some more parsing.
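
If installing a third-party package is not an option, the same idea can be sketched with the standard library alone. This swaps in html.parser for Beautiful Soup; LyricBoxParser, extract_lyrics and fetch_html are hypothetical names, and the sketch assumes the lyrics sit in a single div whose class is "lyricbox".

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LyricBoxParser(HTMLParser):
    """Collects the text inside the first-level <div class="lyricbox">."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while we are inside the lyricbox div
        self.chunks = []  # pieces of text found inside it

    def handle_starttag(self, tag, attrs):
        if tag == "br" and self.depth:
            self.chunks.append("\n")  # keep line breaks between lyric lines
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # nested div inside the lyricbox
        elif "lyricbox" in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_lyrics(html):
    parser = LyricBoxParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


def fetch_html(url):
    # Download a page with urllib and decode it (UTF-8 assumed if unspecified).
    with urlopen(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


print(extract_lyrics('<div class="lyricbox">Hey You</div>'))  # Hey You
```

For a real page, the two pieces would be combined as extract_lyrics(fetch_html(url)), with url being whatever lyrics page you are reading.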

Thierry Lam
This is definitely the correct way to go about it for what he says he's using it for.
wxs
+1, That helps greatly...thanks
mshsayem
Nicely put. That's what my purpose was. It really helps.
shadyabhi