I'm trying to parse the title tag in an RSS 2.0 feed into three different variables for each entry in that feed. Using ElementTree I've already parsed the RSS so that I can print each title [minus the trailing )
] with the code below:
feed = getfeed("http://www.tourfilter.com/dallas/rss/by_concert_date") for item in feed: print repr(item.title[0:-1])
I include that because, as you can see, the item.title is a repr() data type, which I don't know much about.
A particular repr(item.title[0:-1])
print
ed in the interactive window looks like this:
'randy travis (Billy Bobs 3/21' 'Michael Schenker Group (House of Blues Dallas 3/26'
The user selects a band and I hope to, after parsing each item.title
into 3 variables (one each for band, venue, and date... or possibly an array or I don't know...) select only those related to the band selected. Then they are sent to Google for geocoding, but that's another story.
I've seen some examples of regex
and I'm reading about them, but it seems very complicated. Is it? I thought maybe someone here would have some insight as to exactly how to do this in an intelligent way. Should I use the re
module? Does it matter that the output is currently is repr()
s? Is there a better way? I was thinking I'd use a loop like (and this is my pseudoPython, just kind of notes I'm writing):
list = bandRaw,venue,date,latLong for item in feed: parse item.title for bandRaw, venue, date if bandRaw == str(band) send venue name + ", Dallas, TX" to google for geocoding return lat,long list = list + return character + bandRaw + "," + venue + "," + date + "," + lat + "," + long else
In the end, I need to have the chosen entries in a .csv (comma-delimited) file looking like this:
band,venue,date,lat,long randy travis,Billy Bobs,3/21,1234.5678,1234.5678 Michael Schenker Group,House of Blues Dallas,3/26,4321.8765,4321.8765
I hope this isn't too much to ask. I'll be looking into it on my own, just thought I should post here to make sure it got answered.
So, the question is, how do I best parse each repr(item.title[0:-1])
in the feed
into the 3 separate values that I can then concatenate into a .csv file?