ansaurus

Question

Answer 1

+1 A:

That looks fine, and BeautifulSoup is useful for this (although I personally tend to use lxml). You should be able to take the data you get and write it to a CSV file with the csv module without any obvious problems.

I think you need to actually tell us what the problem is. "It still doesn't work" is not a problem description.

Lennart Regebro 2009-07-06 10:02:33

Answer 2

+3 A:

You don't really explain why you are stuck - what's not working exactly?

The following line may well be your problem:

soup = BeautifulSoup(open(filename["r"]))

It looks to me like this should be:

soup = BeautifulSoup(open(filename, "r"))
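
To see why the original form fails before BeautifulSoup is even called, here is a minimal sketch. The filename is only an illustrative stand-in for a local copy of a report; nothing is opened or read.

```python
# Hypothetical local filename, used only for illustration.
filename = "PL020808.HTM"

try:
    # The original line evaluates this expression first: it indexes
    # the string with another string, which Python does not allow.
    filename["r"]
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)
```

Passing the mode as a second argument, as in open(filename, "r"), avoids the indexing entirely.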

The following line:

for row in soup.findAll("tr", attrs={ "class" : "evenColor" }):

looks like it will only pick out even rows (assuming your even rows have the class 'evenColor' and odd rows have 'oddColor'). Assuming you want all rows with a class of either evenColor or oddColor, you can use a regular expression to match the class value (this needs import re at the top of your file):

for row in soup.findAll("tr", attrs={ "class" : re.compile(r"evenColor|oddColor") }):
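
As a quick sanity check of that pattern, here is a minimal sketch using only the standard re module. It assumes BeautifulSoup tests a compiled pattern against the attribute value with search(); the sample class names (including headerRow) are made up for illustration and are not taken from the asker's file.

```python
import re

# Same pattern as in the findAll() call above.
row_class = re.compile(r"evenColor|oddColor")

# Hypothetical class values a <tr> might carry.
samples = ["evenColor", "oddColor", "headerRow"]

# Keep only the class values the pattern finds a match in.
matching = [name for name in samples if row_class.search(name)]
print(matching)
```

Because the pattern is unanchored, it would also match a longer class name that merely contains one of the two words; anchoring it as r"^(evenColor|oddColor)$" is one way to tighten it if that matters.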

Judy2K 2009-07-06 10:06:04

Re: "It looks to me like this should be: soup = BeautifulSoup(open(filename, "r"))" -- thanks, I changed it.

northnodewolf 2009-07-06 12:57:34

Answer 3

+2 A:

You need to import the csv module by adding import csv to the top of your file.

Then you'll need to create the csv writer once, before your loop over the rows, like so:

writer = csv.writer(open("%s.csv" % filename, "wb"))

Then you need to actually pull the data out of the html row in your loop, similar to

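# join the text pieces inside each <td> cell of this row into one string
values = ["".join(td.findAll(text=True)).strip() for td in row.findAll("td")]
writer.writerow(values)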
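
Putting the steps together, here is a rough end-to-end sketch. To keep it runnable without extra installs it uses the standard library's html.parser instead of BeautifulSoup, so it is a stand-in for the approach above rather than the same code. The RowCollector class, the html_rows_to_csv function, and the sample markup are all hypothetical names and data made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect the cell text of every <tr> whose class is in `wanted`."""

    def __init__(self, wanted=("evenColor", "oddColor")):
        super().__init__()
        self.wanted = set(wanted)
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            classes = (dict(attrs).get("class") or "").split()
            # Start a new row only if it carries one of the wanted classes.
            self._row = [] if self.wanted.intersection(classes) else None
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_rows_to_csv(html, out):
    """Write the matching table rows in `html` to the file-like `out`."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    csv.writer(out).writerows(parser.rows)
    return parser.rows


# Made-up sample markup, not the actual report.
sample = """
<table>
  <tr class="heading"><td>#</td><td>Event</td></tr>
  <tr class="evenColor"><td>1</td><td>FAC</td></tr>
  <tr class="oddColor"><td>2</td><td>HIT, hard</td></tr>
</table>
"""

buffer = io.StringIO()
rows = html_rows_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Note that on Python 3 a real output file should be opened in text mode with newline="", for example open("report.csv", "w", newline="") with a filename of your choosing; the "wb" mode shown earlier is the Python 2 idiom.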

Hank Gay 2009-07-06 11:17:50

Yes, yes. This is what I am talking about. Thanks. I also realized that maybe I need to import re for regex? We are using '*' and '%', so we need to import re, maybe? For the second part, you say I need to pull the data out of the HTML row, but if the rows have been written from HTML to CSV, what's the point? I am probably not wrapping my head around something, but even if there are 1230 .csv files, that's good enough for me right now. Here's a link to the files I am working with: http://www.nhl.com/scores/htmlreports/20082009/PL020808.HTM

northnodewolf 2009-07-06 13:04:21


Parsing HTML rows into CSV
