Hi,
Based on a previous Stack Overflow question and a contribution by cgoldberg, I came up with this regex using the Python re module:
import re
urls = re.finditer('http://(.*?).mp3', htmlcode)
The variable urls is an iterator of match objects, so I can use a loop to access each mp3 file URL individually if there is more than one:
for url in urls:
    mp3fileurl = url.group(0)
This technique, however, only works some of the time. I realize regular expressions will not be as reliable as a full-fledged parser module, but the results are inconsistent even for the same page.
For some URL entries, the match also includes the text that comes before the http:// of the mp3 URL.
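Here is a minimal sketch of what I suspect is happening. The sample HTML, the example.com URLs, and the tighter pattern are my own assumptions for illustration, not taken from the real page. If a non-mp3 link appears before an mp3 link, the lazy group still starts at the first http:// and keeps expanding until it finds "mp3", so the match spans both links. The dot before mp3 is also unescaped, so it matches any character.

```python
import re

# Hypothetical sample page: a non-mp3 link appears before the mp3 link.
htmlcode = (
    '<a href="http://example.com/index.html">home</a> '
    '<a href="http://example.com/song.mp3">song</a>'
)

# Original pattern: the match begins at the FIRST "http://" and grows
# until "mp3" is found, so it swallows the earlier link as well.
loose = [m.group(0) for m in re.finditer('http://(.*?).mp3', htmlcode)]

# Tighter pattern (an assumption, not a full URL grammar): disallow
# whitespace, quotes and angle brackets inside the URL, escape the dot.
strict_pattern = r'http://[^\s"\'<>]+?\.mp3'
strict = [m.group(0) for m in re.finditer(strict_pattern, htmlcode)]

print(loose)
print(strict)
```

With this sample, the first pattern returns one long string that starts at the index.html link, while the second returns only the song.mp3 URL. It is still only a regex, so unusual markup could break it.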
I am relatively new to regular expressions, so I am wondering if there is a more reliable way to go about this.
Thanks in advance. I am new to Stack Overflow and looking forward to contributing some answers as well.