I have a file like:

<space>
<space>
line1
<space>
column 1    column 2    column 3   ...

.
.
.


<space>
<space>

How do I remove these extra blank lines and spaces?

I need to extract the heading which will be on line1. Also, I need to extract column 1, column 2, column 3 etc.

At the end of the last column's content there is a '\n'. How do I get rid of it?

Help me with this...

Thank you

+4  A: 

Start by opening the file and reading all the lines:

# the with block closes the file once the lines have been read
with open('filename string') as f:
    lines = f.readlines()

Then...

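# remove empty lines
lines = [l for l in lines if len(l.strip()) > 0]
# the first non-empty line is the heading; strip() drops the trailing '\n'
header = lines[0].strip()
# split() with no argument splits on any run of whitespace
line = lines[1].split()
column1 = line[0]
column2 = line[1]
...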

Also:

total_lines = len(lines)
total_columns = len(line)
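
As a rough illustration of how the column line gets split, here is a small sketch comparing split(' ') with split(). The sample line is made up for the example and is not taken from the asker's file.

```python
# Hypothetical sample line: columns padded with several spaces, ending in '\n'.
sample = "column1    column2    column3 \n"

# split(' ') cuts at every single space, so runs of spaces produce empty
# strings and the trailing newline survives as its own item.
by_single_space = sample.split(' ')

# split() with no argument cuts on any run of whitespace and ignores
# leading/trailing whitespace, including the newline.
by_whitespace = sample.split()

print(by_single_space)
print(by_whitespace)
```

With padded columns like these, split() with no argument is usually the more convenient choice, since it yields only the column values.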
João da Silva
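"At the end of the last column's content there is a '\n'. How do I get rid of it?"
You can use the strip() method of strings. Note that the l.strip() in the list comprehension is only used to test for empty lines; it does not change the stored lines, so you still need to call strip() (or split() with no argument) on the line itself.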
João da Silva
To be more precise: strip() removes both leading and trailing whitespace, including the newline. If (for some reason) you want to preserve the spaces in front of the first column, use rstrip() instead.
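
A quick sketch of the difference, using an invented line with leading spaces and a trailing newline:

```python
# Hypothetical line: three leading spaces, padded columns, trailing newline.
raw = "   column1    column2\n"

stripped = raw.strip()      # whitespace removed from both ends
right_only = raw.rstrip()   # leading spaces kept, trailing newline removed

print(repr(stripped))
print(repr(right_only))
```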
Abgan
+1  A: 

A straightforward solution, using strip() to drop spaces and split() to separate column data:

>>> mylines
[' \n', ' \n', 'line1\n', ' \n', ' \n', 'column1    column2    column3 \n']
>>> def parser(lines):
...     header = ""
...     data = []
...     for x in lines:
...         line = x.strip()
...         if line == "":
...             continue
...         if header == "":
...             header = line
...         else:
...             data.append(line.split())
...     return {"header": header, "data": data}
... 
>>> parser(mylines)
{'header': 'line1', 'data': [['column1', 'column2', 'column3']]}
>>>
gimel
A: 

Using generator functions to handle each stage of parsing:

def nonEmptyLines( aFile ):
    """Discard blank lines, yield only lines with content."""
    for line in aFile:
        # lines read from a file still end in '\n', so test the stripped text
        if line.strip():
            yield line

def splitFields( aFile ):
    """Split a non-empty line into fields."""
    for line in nonEmptyLines(aFile):
        yield line.split()

def dictReader( aFile ):
    """Turn non-empty lines file with header and data into dictionaries.
    Like the ``csv`` module."""
    # avoid naming this variable "iter": that would shadow the builtin
    rows= splitFields( aFile )
    heading= next( rows )
    for line in rows:
        yield dict( zip( heading, line ) )

# dictReader expects an open file (or any iterable of lines), not a file name
with open( "myFile", "r" ) as source:
    for d in dictReader( source ):
        print( d )
S.Lott