Hi people.

I need a regular expression that escapes or captures (if not already escaped) ALL the double-quote characters INSIDE a single-quoted string, and then converts the enclosing single quotes to double quotes.

We are refactoring files that have a lot (and I mean a lot!) of single-quoted strings, in both PHP and JS files. The only thing they have in common is that the strings each fit on one line and are concatenated with = in both languages.

Here is an example (it is ugly legacy code, so please don't judge it; I already did :) ). We have a file that starts like this:

var baseUrl = $("#baseurl").html();
var head = '<div id="finishingDiv" style="background-image:url({baseUrl}css/userAd/images/out_main.jpg); background-repeat: repeat-y; ">'+
'<div id="buttonbar" style="width:810px; text-align:right">';

and I want it to look like this:

var baseUrl = $("#baseurl").html();
var head = "<div id=\"finishingDiv\" style=\"background-image:url({baseUrl}css/userAd/images/out_main.jpg); background-repeat: repeat-y; \">" +
"<div id=\"buttonbar\" style=\"width:810px; text-align:right\">";

As you can see, the strings that are already correctly double-quoted are not touched.

So my basic question: how do I capture all characters of one kind (in my case the character " ) between a certain start and end character (in my case the character ' )?

The regex '.*(").*' or '[^']*(")[^']*' always captures just one " per match for me. If it needs more than one step, that's also OK; it just has to work. I would be happy with any solution, IDE-specific, language-specific or shell-specific, that actually works.

Please help, I'm desperate. Thanks a lot.

A: 

That regex only captures one " because you're only asking for one. If you want to capture all the quotes, you need something more like (".*)+ in the middle. That says, "Capture one or more of this pattern: a double quote followed by zero or more of any characters."
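
One caveat worth seeing in code: in most regex engines a capturing group inside a quantifier keeps only its last repetition, so a single match still hands back a single capture. The sketch below uses a made-up sample line and variable names; it first shows that behaviour, then a two-pass alternative that isolates the string body and scans it for every quote.

```ruby
# Made-up sample line containing four double quotes inside single quotes.
line = %q{var s = '<div id="a" class="b">';}

# The quantified group (") repeats four times, but only its final
# repetition is retained as capture 1.
m = line.match(/'(?:[^'"]*(")[^'"]*)+'/)
captures = m.captures

# Two passes instead: pull out the body of the single-quoted string,
# then scan that body for every double quote.
body = line[/'([^']*)'/, 1]
quote_count = body.scan('"').length
```

Here captures holds one element while quote_count is 4, which is why a nested substitution (match the string, then substitute inside it) tends to be easier than a single pattern.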

Swordgleam
'.*(".*)+.*' does not deliver what I want: it captures some other characters after the " in each match, but doesn't produce more matches :(
Tschef
A: 

The biggest problem is going to be figuring out where all the strings are, since you can't parse all of JS or PHP with a regex. However, if I assume that you don't care about comments, this Ruby code will catch most cases (but you should review its output):

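#!/usr/bin/ruby -p

# -p puts each input line in $_ and prints it after this block runs.
# Match a single-quoted string whose only escapes are \\ and \'.
$_.gsub!(/'((?:[^\\']|\\[\\'])+)'/) do
  # Unescape \' (a bare ' is fine inside "..."), then escape every ".
  %Q{"#{$1.gsub("\\'", "'").gsub('"', '\\"')}"}
end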

This code reads stdin, or the files named as arguments, line by line. It finds each single-quoted string (taking into account the possible presence of \\ and \') and, for its replacement, runs a series of substitutions within the matched string (sanitizing backslashes, etc.). The result is printed to stdout. If you want a more automated approach, replace the first line with #!/usr/bin/ruby -pi.bak; the substitution is then run destructively, in place, on whatever file arguments are given. The old files are kept with an additional .bak extension.
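
To try the substitution on a single line before pointing it at real files, it can be wrapped in a method. This is only a sketch: the name requote is made up for illustration, the outer regex is the one from the script, and the body applies just two substitutions (unescaping \' and escaping "), on the assumption that nothing else can occur in a match because the regex admits only \\ and \' as escapes.

```ruby
# Hypothetical helper: converts single-quoted strings on one line
# to double-quoted ones, escaping any double quotes inside them.
def requote(line)
  line.gsub(/'((?:[^\\']|\\[\\'])+)'/) do
    # Unescape \' first, then escape every double quote.
    body = $1.gsub("\\'", "'").gsub('"', '\\"')
    %Q{"#{body}"}
  end
end

# Second line of the example from the question.
sample = %q{'<div id="buttonbar" style="width:810px; text-align:right">';}
converted = requote(sample)
puts converted
```

For that sample line the result matches the desired output in the question: the outer quotes become double quotes and each inner " gains a backslash.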

To run this code, if you haven't used Ruby before: save it as anything, such as fix-sq.rb; run chmod +x fix-sq.rb; and then run ./fix-sq.rb file1 file2 file3.
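
Because the regex knows nothing about comments or double-quoted strings, one thing to look for when reviewing the result is a stray apostrophe pairing up with the next single quote. A small illustration, using a made-up comment line as input and the same pattern as the script:

```ruby
PATTERN = /'((?:[^\\']|\\[\\'])+)'/

# Made-up input: the apostrophe in "don't" opens a bogus "string".
comment = %q{// don't touch 'this'}

# The first span the pattern finds, i.e. what would get rewritten.
first_match = comment[PATTERN]
```

Here first_match is 't touch ' rather than 'this', so comparing the converted files against the originals before committing them is worthwhile.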

Antal S-Z