I have a CSV file with data like this:

Zoos, Sanctuaries & Animal Parks,7469,3.00

Unfortunately this is not correct, as the first section should be a single field, like this:

"Zoos, Sanctuaries & Animal Parks","7469","3.00"

As this is just a one-off import, I would be happy just to transform it to

Zoos, Sanctuaries & Animal Parks|7469|3.00

with the last and second-to-last commas converted to pipes. Is there an easy way to do this with a regex?

+1  A: 

Something like this should work:

s/(\S),(\S)/\1|\2/g

(Replaces every comma that is surrounded on both sides by non-space characters with a pipe.)

clee
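A note on the pattern above: each match consumes the non-space character on both sides of the comma, so two commas separated by a single character cannot both match. Below is a minimal Python sketch of that behaviour, with a lookaround variant as one possible adjustment. The function names are hypothetical and only for illustration.

```python
import re

def pipes_consuming(line):
    # Same pattern as in the answer: the characters next to the comma
    # are part of the match, so they cannot be reused by the next match.
    return re.sub(r"(\S),(\S)", r"\1|\2", line)

def pipes_lookaround(line):
    # Lookbehind/lookahead test the neighbours without consuming them.
    return re.sub(r"(?<=\S),(?=\S)", "|", line)

print(pipes_consuming("Zoos, Sanctuaries & Animal Parks,7469,3.00"))
print(pipes_consuming("a,1,2"))   # second comma is left alone
print(pipes_lookaround("a,1,2"))  # both commas are converted
```

Either variant also converts a comma inside the first field when no space follows it (as in "a,b and c"), so this only fits data where every comma in the name is followed by a space.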
+2  A: 

To convert the commas before the last two items to pipes, you could do this:

>>> import re
>>> re.sub(r",(\d+),([\d.]+)$", r"|\1|\2", "Zoos, Sanctuaries & Animal Parks,7469,3.00")
'Zoos, Sanctuaries & Animal Parks|7469|3.00'
S.Mark
Wait until a record turns up that has a decimal point in the first number (or no decimal point in the second number).
Amarghosh
Or a number with a comma in it...
clee
@clee That would break my regex too - but that's against the question specs ;)
Amarghosh
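If the only firm assumption is that the last two fields never contain a comma, the digit classes can be avoided altogether. Here is a minimal sketch that swaps the regex for Python's `str.rsplit`; the function name is hypothetical. It does not help with the case raised above of a number that itself contains a comma.

```python
def last_two_commas_to_pipes(line):
    # rsplit with maxsplit=2 splits at most twice, working from the right,
    # so any commas inside the first field are left untouched.
    parts = line.rsplit(",", 2)
    return "|".join(parts)

print(last_two_commas_to_pipes("Zoos, Sanctuaries & Animal Parks,7469,3.00"))
print(last_two_commas_to_pipes("a,b and c, 100,300"))
```

Because nothing is assumed about the contents of the last two fields, values such as "74.5" or "3" go through unchanged. A line with fewer than two commas would have all of its commas converted, so it may be worth checking `len(parts) == 3` on real data.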
+1  A: 

You can convert to pipes this way. Just feed your text through this command:

sed 's/,\([^,]*\),\([^,]*\)$/|\1|\2/'
Dietrich Epp
+1  A: 
$ cat test.csv 
Zoos, Sanctuaries & Animal Parks,7469,3.00
a,100,2000
a,b and c, 100,300

$ perl -pe 's/^(.*),(.*),(.*)$/$1|$2|$3/' test.csv
Zoos, Sanctuaries & Animal Parks|7469|3.00
a|100|2000
a,b and c| 100|300
Damodharan R
A: 

To convert the last two commas into pipes:

Replace ^(.*?),([^,]*?),([^,]*?)$ with $1|$2|$3

Or, even better, to convert them into the correct format:

Replace ^(.*?),([^,]*?),([^,]*?)$ with "$1","$2","$3"

Amarghosh
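For the quoted form, a rough Python sketch could reuse the same pattern to find the three fields and leave the quoting to the standard `csv` module, which also doubles any quote characters already present in a field. The function name is hypothetical, and lines that do not match are passed through unchanged as an assumption about what is wanted.

```python
import csv
import io
import re

# Same pattern as in the answer above.
PATTERN = re.compile(r"^(.*?),([^,]*?),([^,]*?)$")

def quote_fields(lines):
    out = io.StringIO()
    # QUOTE_ALL wraps every field in double quotes.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for line in lines:
        line = line.rstrip("\n")
        match = PATTERN.match(line)
        if match:
            writer.writerow(match.groups())
        else:
            # Assumption: keep lines with fewer than two commas as they are.
            out.write(line + "\n")
    return out.getvalue()

print(quote_fields(["Zoos, Sanctuaries & Animal Parks,7469,3.00"]), end="")
```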