views:

173

answers:

3

I am using a simple Perl script to parse XML and convert it to usable SQL. My current SQL lines go something like this:

INSERT INTO table VALUES ('data1', 'data2', 'data3', );

Obviously I need to remove the comma at the end there. Sounds simple but I just can't get regex to find it. I tried s/,\s+)/)/ but that doesn't change anything when I run it. Strangely, s/,\s+/WTF/ doesn't modify anything either, when it should be replacing all the commas and the spaces next to them. BUT when I run s/\s+)/something/ it correctly finds and replaces the close parentheses at the end of the line. So apparently the whitespace character right after the commas is some strange ghost character that I can't find by any means. Not even with the . expression.

What's really weird though is when I use Find on the document in Notepad++ with the Regular Expression option, it finds all of them perfectly when I enter ,\s+) yet the exact same sequence in Perl regex will not find them.

I suspected it was something with \r (I'm using Windows) since I previously removed the \n characters but it won't find a \r in the whole sql file.

Thank you in advance for your help this is really puzzling me.

+7  A: 

First off,

$ perl -E 'my $foo = "bar, baz"; $foo =~ s/,\s+/WTF/; say $foo'
barWTFbaz

It does work. (For perl 5.8 and before, change that to -e and print "$foo\n")

Second, you're doing it wrong. Instead of doing something like:

$values = "'$values[0]', ";
$values .= "'$values[1]', ";
⋮

you should do:

$values = join(q{,}, map("'$_'", @values)); # map adds 'quotes'; join adds commas

Third, you shouldn't even do that, you should use placeholders:

# note specifying the column names, its a good idea! Schema change.
my $query = "INSERT INTO table (col1, col2, col3) VALUES (?,?,?)";
my $sth = $dbh->prepare($query);
$sth->execute(@values);
derobert
Thank you, that was very helpful.
RobbR
+2  A: 

what about s/,\W+)/)

jcopenha
A: 

Do you need to escape the parenthesis in your regex?

s/,\s+)/)/

won't compile for me, but

s/,\s+\)/\)/

does (and seems to do what you want).

mobrule