Suppose a file contains values that are either enclosed in double quotes (") or separated by commas (,):

         "Name" "Tom" "CODE 041" "Has"
         "Address" "NSYSTEMS c/o" "First Term" "123" 18  
         "Occ" "Engineer" "Level1" "JT" 18

How should a Python script be written to extract each of the above values individually?

+1  A: 

An alternative approach to using the csv reader.

in.txt

"Name" "Tom" "CODE 041" "Has"
"Address" "NSYSTEMS c/o" "First Term" "123" 18  
"Occ" "Engineer" "Level1" "JT" 18

parse.py

with open("in.txt") as f:
    for line in f:
        for token in line.split('"'):  # split on the double-quote character
            token = token.strip()
            if token:  # skip the whitespace-only pieces between quoted values
                print(token)

out.txt

Name
Tom
CODE 041
Has
Address
NSYSTEMS c/o
First Term
123
18
Occ
Engineer
Level1
JT
18

But I would call your values " enclosed, and I can't see any that are , separated. Could you expand your test data? Show some rows with , separated values and I'll expand my code.
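For values that are separated by spaces and optionally wrapped in double quotes, the standard library's shlex.split may also be worth a look, since a plain whitespace split would break "CODE 041" in two. The sketch below is only an illustration: the helper name tokenize_line is made up for the example, and it assumes shell-style quoting rules are acceptable for the data (for instance, a backslash is treated as an escape character).

```python
import shlex

def tokenize_line(line):
    # Hypothetical helper: split on whitespace, keeping "quoted strings" whole.
    # Assumes shell-style quoting is acceptable for the input.
    return shlex.split(line)

lines = [
    '"Name" "Tom" "CODE 041" "Has"',
    '"Address" "NSYSTEMS c/o" "First Term" "123" 18  ',
    '"Occ" "Engineer" "Level1" "JT" 18',
]

for line in lines:
    for token in tokenize_line(line):
        print(token)
```

Trailing spaces are ignored, and the unquoted 18 comes back as the string '18'.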

mizipzor
this will fail for "CODE 041"
Anurag Uniyal
I just realized that, but since using the csv reader is a better approach I didn't update the question. But leaving broken code here is bad, so I'll update it now.
mizipzor
+3  A: 

Your question is a little vague, and there are no commas in your example, so it's a bit hard to provide a good answer.

On your example file containing

"Name" "Tom" "CODE 041" "Has"
"Address" "NSYSTEMS c/o" "First Term" "123" 18  
"Occ" "Engineer" "Level1" "JT" 18

this script

import csv

# Treat a single space as the field delimiter and " as the quote character
with open('test.txt', newline='') as f:
    for row in csv.reader(f, delimiter=' ', quotechar='"'):
        print(row)

produces

['Name', 'Tom', 'CODE 041', 'Has']
['Address', 'NSYSTEMS c/o', 'First Term', '123', '18']
['Occ', 'Engineer', 'Level1', 'JT', '18']

This assumes that the delimiter between values is a space. If it's a tab, use delimiter='\t' instead.
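One consequence of a single-space delimiter is worth illustrating: csv.reader treats every space as a field boundary, so trailing spaces or runs of spaces show up as empty strings in the row. The following is a small sketch, assuming such empty fields are noise that can safely be dropped; the sample line (with two trailing spaces) and the variable names are invented for the illustration.

```python
import csv
import io

# Assumed sample: one line of the example data with two trailing spaces.
sample = '"Address" "NSYSTEMS c/o" "First Term" "123" 18  \n'

reader = csv.reader(io.StringIO(sample), delimiter=' ', quotechar='"')

# Drop the empty strings produced by the extra spaces.
rows = [[field for field in row if field] for row in reader]
print(rows)
```

Note that this filter would also discard fields that are intentionally empty (such as ""), so it only fits data where empty values never occur.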

You're out of luck with this approach if delimiters change throughout the file - in this case they are not valid CSV/TSV files anymore. But all this is just speculation until you can provide some actual and relevant examples of the data you want to analyse.
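If the file really does mix spaces, tabs, and commas, one possible workaround is to leave the csv module aside and tokenize each line with a regular expression. This is a different technique from the one above and only a rough sketch: the names TOKEN and tokenize are hypothetical, and it assumes that quoted values never contain an escaped double quote and that empty unquoted fields do not need to be preserved.

```python
import re

# A token is either a double-quoted string (captured without the quotes)
# or a run of characters that are not whitespace, commas, or quotes.
TOKEN = re.compile(r'"([^"]*)"|([^\s,"]+)')

def tokenize(line):
    # Hypothetical helper: return the values on one line, whatever mix of
    # spaces, tabs, or commas separates them.
    return [bare or quoted for quoted, bare in TOKEN.findall(line)]

print(tokenize('"Name","Tom" "CODE 041"\t18, 19'))
```

Because separators are simply skipped, two consecutive commas do not produce an empty value here, which is where this sketch departs from real CSV semantics.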

Tim Pietzcker
+1, of course one should use the csv reader for this. I should have thought of that.
mizipzor
A: 

Use the csv module; it handles delimiters and quoting properly. Writing such code by hand with split and the like isn't trivial.

import csv
import io

data = '''"Name" "Tom" "CODE 041" "Has"
"Address" "NSYSTEMS c/o" "First Term" "123" 18
"Occ" "Engineer" "Level1" "JT" 18
'''

# Wrap the string in a file-like object so csv.reader can iterate over its lines
reader = csv.reader(io.StringIO(data), delimiter=' ')
for row in reader:
    print(row)

Output:

['Name', 'Tom', 'CODE 041', 'Has']
['Address', 'NSYSTEMS c/o', 'First Term', '123', '18']
['Occ', 'Engineer', 'Level1', 'JT', '18']
Anurag Uniyal
I get the output as `['Subtopic\t"Explaining', 'that', 'every', 'real', 'number', 'is', 'represented', 'by', 'a', 'unique', 'point', 'on', 'the', 'number', 'line', 'and', 'conversely,', 'every', 'point', 'on', 'the', 'number', 'line', 'represents', 'a', 'unique', 'real', 'number."\t\t\t\t\t']`...
Hulk
@Hulk: If you don't provide correct examples, we can't write correct code. Now it looks as though the records are tab separated, not space or comma separated.
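If the records are indeed tab separated, a sketch along these lines may be closer to what is needed. The sample line is an assumption reconstructed (and abridged) from the output quoted above, not the actual file contents.

```python
import csv
import io

# Assumed raw line: two tab-separated fields, the second one quoted,
# followed by five trailing tabs.
sample = ('Subtopic\t"Explaining that every real number is represented '
          'by a unique point on the number line."\t\t\t\t\t\n')

reader = csv.reader(io.StringIO(sample), delimiter='\t', quotechar='"')
row = next(reader)
print(row)
```

With the tab as delimiter, the quoted sentence stays in one field and the trailing tabs become empty strings at the end of the row.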
Tim Pietzcker