I have a CSV file that has only 1 column, but has close to 1500 records.

I'd like to extract information from each record, e.g.,

"The sample battery has a Voltage: 11.1V, and capacity: 4500mAh"

I'd like to extract 11.1 (the text after "Voltage: " and before "V") and write it to another file. If a record does not contain "Voltage: ", I would like an empty line in its place.

I'm in a Linux environment. What's the easiest way to do it?

+2  A: 

Python

import csv
source = open( "myfile.csv", "rb" )
rdr= csv.reader( source )
for row in rdr:
    print "The sample battery has a Voltage: %.1fV, and capacity: %dmAh" % ( float(row[0]), int(row[1]), )

This will get you started with pulling data from a CSV file.


Apparently (based on comments) the file looks like this.

"The sample battery has a Voltage: 11.1V, and capacity: 4500mAh"

Which could be a 1-column CSV. Or a single row with bonus quotes. Let's pretend it's a 1-column CSV.

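import csv
import re

# Python 2 syntax. Each pattern is anchored on its label so that an
# unrelated number followed by "V" is not picked up by mistake.
# The fractional part is optional, so "12V" matches as well as "11.1V".
v_pat = re.compile(r'Voltage:\s*(\d+(?:\.\d+)?)V', re.IGNORECASE)
mah_pat = re.compile(r'capacity:\s*(\d+)mAh', re.IGNORECASE)
source = open("myfile.csv", "rb")
rdr = csv.reader(source)
for row in rdr:
    v_match = v_pat.search(row[0])
    mah_match = mah_pat.search(row[0])
    if v_match and mah_match:
        print v_match.group(1), mah_match.group(1)
    else:
        print  # empty line when either value is missing
source.close()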

Something like that might be appropriate.
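
For the voltage-only case described in the question, a minimal Python 3 sketch might look like the following. It assumes each record sits in the first column of the CSV, that the label is spelled "Voltage:" in any letter case, and that the number is followed by "V". The names extract_voltage, extract_file and VOLTAGE_RE are made up for illustration.

```python
import csv
import re

# Assumed pattern: the number after "Voltage:" and before "V".
# The fractional part is optional; the label is matched in any case.
VOLTAGE_RE = re.compile(r"voltage:\s*(\d+(?:\.\d+)?)\s*V", re.IGNORECASE)


def extract_voltage(text):
    """Return the voltage as a string, or "" if the label is absent."""
    match = VOLTAGE_RE.search(text)
    return match.group(1) if match else ""


def extract_file(src_path, dst_path):
    """Write one line per input record: the voltage, or an empty line."""
    with open(src_path, newline="") as src, open(dst_path, "w") as dst:
        for row in csv.reader(src):
            # A blank record yields an empty row; treat it as "no voltage".
            text = row[0] if row else ""
            dst.write(extract_voltage(text) + "\n")
```

Calling extract_file("myfile.csv", "voltages.txt") would then write one line per record to voltages.txt (a placeholder file name), leaving the line empty wherever no voltage was found.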

S.Lott
Hi, this is exactly the opposite of what I intend to do. Basically, it's a CSV file from a shopping cart (exported with phpMyAdmin), and I want to extract the numbers from it. So whenever the program sees "Voltage:", it should extract the real number (floating point in this case) that comes right after it.
Bo Tian
Please clarify your question, to include this new information.
S.Lott
+2  A: 
simon