           "Type","Name","Description","Designation","First-term assessment","Second-term assessment","Total"
           "Subject","Nick","D1234","F4321",10,19,29
           "Unit","HTML","D1234-1","F4321",18,,
           "Topic","Tags","First Term","F4321",18,,
           "Subtopic","Review of representation of HTML",,,,,

All of the above are values from an Excel sheet that was converted to CSV; the CSV is what is shown above.

The header, as you can see, contains seven columns; the data in the rows below it varies.

I have the following Python script to read this file:

from django.db import transaction
import sys
import csv
import StringIO

file = sys.argv[1]
no_cols_flag=0
flag=0
header_arr=[]

print file
f = open(file, 'r')



while (f.readline() != ""):
  for i in [line.split(',') for line in open(file)]: # split on the separator
    print "==========================================================="
    row_flag=0
    row_d=""
    for j in i: # for each token in the split string
      row_flag=1
      print j


      if j:
        no_cols_flag=no_cols_flag+1
        data=j.strip()
        print j

    break

How can I modify the above script so that each value is associated with its column header?

Thanks.

+5  A: 

You're importing the csv module but never use it. Why?

If you do

import csv
reader = csv.reader(open(file, "rb"), dialect="excel") # Python 2.x
# Python 3: reader = csv.reader(open(file, newline=""), dialect="excel")

you get a reader object that yields each row as a list of strings: the first row contains the headers, and each subsequent row contains the data in the corresponding positions.
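
To tie each value back to its column with a plain reader, one option is to read the header row once and zip it with every following row. This is only a minimal sketch in Python 3: the `SAMPLE` string and the `label_rows` helper are made up for illustration and stand in for the real file.

```python
import csv
import io

# Hypothetical in-memory sample standing in for the converted Excel file.
SAMPLE = (
    '"Type","Name","Description","Designation",'
    '"First-term assessment","Second-term assessment","Total"\n'
    '"Subject","Nick","D1234","F4321",10,19,29\n'
    '"Unit","HTML","D1234-1","F4321",18,,\n'
)


def label_rows(csvfile):
    """Yield a list of (header, value) pairs for each data row."""
    reader = csv.reader(csvfile, dialect="excel")
    header = next(reader)  # the first row holds the column names
    for row in reader:
        yield list(zip(header, row))


labelled = list(label_rows(io.StringIO(SAMPLE, newline="")))
for pairs in labelled:
    for name, value in pairs:
        print("%s: %s" % (name, value))
    print("-" * 40)
```

With a real file you would pass `open(file, newline="")` instead of the `io.StringIO` object. Empty cells come back as empty strings, so every row still lines up with all seven headers.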

Even better might be (if I understand you correctly):

import csv
reader = csv.DictReader(open(file, "rb"), dialect="excel") # Python 2.x
# Python 3: reader = csv.DictReader(open(file, newline=""), dialect="excel")

Iterating over this DictReader yields one dict per data row, with the column headers as keys and that row's values as values, so

for row in reader:
    print(row)

will output

{'Name': 'Nick', 'Designation': 'F4321', 'Type': 'Subject', 'Total': '29', 'First-term assessment': '10', 'Second-term assessment': '19', 'Description': 'D1234'}
{'Name': 'HTML', 'Designation': 'F4321', 'Type': 'Unit', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'D1234-1'}
{'Name': 'Tags', 'Designation': 'F4321', 'Type': 'Topic', 'Total': '', 'First-term assessment': '18', 'Second-term assessment': '', 'Description': 'First Term'}
{'Name': 'Review of representation of HTML', 'Designation': '', 'Type': 'Subtopic', 'Total': '', 'First-term assessment': '', 'Second-term assessment': '', 'Description': ''}
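
Since every value arrives as a string, you will probably want to look fields up by header name and convert the numeric ones yourself. A rough sketch in Python 3, assuming the same data as in the question; `SAMPLE` and the `to_int` helper are hypothetical names used only for this example:

```python
import csv
import io

# Hypothetical in-memory copy of the CSV from the question.
SAMPLE = (
    '"Type","Name","Description","Designation",'
    '"First-term assessment","Second-term assessment","Total"\n'
    '"Subject","Nick","D1234","F4321",10,19,29\n'
    '"Unit","HTML","D1234-1","F4321",18,,\n'
    '"Topic","Tags","First Term","F4321",18,,\n'
    '"Subtopic","Review of representation of HTML",,,,,\n'
)


def to_int(text):
    """Return int(text), or None when the cell is empty."""
    return int(text) if text.strip() else None


reader = csv.DictReader(io.StringIO(SAMPLE, newline=""), dialect="excel")
rows = list(reader)

# Each value is addressed by its column header rather than by position.
first_term = [(row["Name"], to_int(row["First-term assessment"])) for row in rows]
print(first_term)
```

Looking values up by header keeps the code working even if the column order in the spreadsheet changes later.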
Tim Pietzcker
I have fixed the indentation.
Hulk
In Python 2.x, *ALWAYS* open the file in binary mode ('rb' or 'wb', as appropriate).
John Machin
@John Machin: Why? The csv module docs say nothing about this, and I've never had problems opening files without the `b` flag. Some examples use it, some don't. You may be very right, but I'd like to know the rationale behind this.
Tim Pietzcker
@Tim: 2.x docs http://docs.python.org/library/csv.html#csv.reader say something: "If csvfile is a file object, it must be opened with the ‘b’ flag on platforms where that makes a difference." i.e. Windows platforms. So for platform independence, one should use 'rb' always. The same applies when writing even though the docs don't say so. CSV records are terminated by CRLF independent of platform -- it's in essence a BINARY format. If you don't supply 'wb' on Windows, you get CRCRLF.
John Machin
@John: Must have been blind (I just grepped the page for `rb`), thanks. Will edit.
Tim Pietzcker
Nice one. Thanks.
Hulk