OK, I need to scan many HTML / XHTML documents to see if a particular file has been embedded with SWFObject. If so, I need to replace the call with something else.
So far I have extracted the <script> contents where the calls can be made. Now I need to scan this string to check whether the call is there and, if it is, replace it.
I know this is a bit odd, but the content comes from a third party that we have no control over.
Since the call can be written in many different syntaxes, I will need a regular expression to find and replace the calls.
OK, imagine the following scenario:
I'm checking whether the file test.swf is embedded with SWFObject in the document.
The <script> content looks like this:
alert('test.swf');
//some other random stuff here
swfobject.embedSWF("test.swf",
"The alternative content can screw the regexp with );", "300", "120",
"9.0.0", false, flashvars, params, attributes);
Now I would like to replace swfobject.embedSWF (and all its parameters) with something else.
Is there a not-too-horrible way to do this? Don't forget that the call can span one or many lines, that the parameters can be wrapped in single quotes (') or double quotes ("), and that whitespace can appear anywhere...
EDIT: OK, since catching every kind of JS syntax is a bit overkill, I will simplify the requirement.
The regular expression can assume only the following:
1. The call is always on a single line.
2. It always starts with swfobject.embedSWF (case sensitive).
3. It is then followed by optional whitespace and then a (
4. It is then followed by optional whitespace and then a " or a ' (either one, but one of the two is required).
5. It is then followed by the filename.
6. It is then followed by a " or a ' (if we can ensure that it's the same character as in 4, good; if not, too bad).
7. It is then followed by optional whitespace and then a ,
8. It is then followed by anything.
9. It is then followed by a ) then optional whitespace, then a ; then the end of the line.
It should be much simpler to parse this way (I guess).
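To make the simplified rules concrete, here is a rough sketch of how they might translate into a PCRE pattern, one piece per rule. The helper name buildEmbedPattern is made up for illustration, and the m flag is an assumption so that "end of line" also works when the whole script content is scanned at once. Note that \s can still cross a line break, so this is slightly looser than rule 1.

```php
<?php
// Hypothetical helper (not part of SWFObject or PHP): builds a pattern
// from the simplified rules, one fragment per rule.
function buildEmbedPattern(string $filename): string
{
    return '/swfobject\.embedSWF'     // rule 2: literal, case-sensitive name
        . '\s*\('                     // rule 3: optional whitespace, then (
        . '\s*(["\'])'                // rule 4: optional whitespace, then " or ' (captured)
        . preg_quote($filename, '/')  // rule 5: the filename, escaped for the pattern
        . '\1'                        // rule 6: the same quote character as in rule 4
        . '\s*,'                      // rule 7: optional whitespace, then a comma
        . '.*'                        // rule 8: anything, without crossing a newline
        . '\)\s*;\s*$/m';             // rule 9: ), optional whitespace, ;, end of line
}

$line = 'swfobject.embedSWF("test.swf", "The alternative content can screw the regexp with );", "300", "120", "9.0.0", false, flashvars, params, attributes);';

// preg_match returns 1 on a match, 0 on no match, false on error.
var_dump(preg_match(buildEmbedPattern('test.swf'), $line));
```

Because .* is greedy and the pattern is anchored at the end of the line, a ); inside the alternative content should not end the match early.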
EDIT 2: I've cooked up a solution. I think I'm close, but it's not working. Can anyone help? Test case 0 should match, but it doesn't...
<?php
$myFilename = 'test.swf';
$testCases = array();
$testCases[] = 'swfobject.embedSWF("test.swf", "The alternative content can screw the regexp with );", "300", "120", "9.0.0", false, flashvars, params, attributes);';
foreach ($testCases as $i => $currTest)
{
    $currResult = preg_match('/\s*swfobject\.embedSWF\s*\(\s*(["\'])(' . preg_quote($myFilename) . ')[^"\']+\1\s*,[\s\S]+?\)\s*;\s*$/', $currTest);
    if ($currResult === false || $currResult < 1)
        echo $i, ' Not matching', PHP_EOL;
    else
        echo $i, ' Matching', PHP_EOL;
}
?>
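One possible explanation for case 0 failing: the [^"\']+ right after the filename demands at least one extra non-quote character before the closing quote, and in "test.swf" there is none. The sketch below is the same pattern with two changes, both my own guesses rather than a confirmed fix: that piece is removed, and '/' is passed to preg_quote so the delimiter gets escaped too. It also shows the replacement step; myEmbed is a made-up placeholder for whatever the call should become.

```php
<?php
$myFilename = 'test.swf';
$line = 'swfobject.embedSWF("test.swf", "The alternative content can screw the regexp with );", "300", "120", "9.0.0", false, flashvars, params, attributes);';

// Pattern from the attempt above, without the [^"\']+ between the filename
// and the closing quote, and with the '/' delimiter passed to preg_quote.
$pattern = '/\s*swfobject\.embedSWF\s*\(\s*(["\'])(' . preg_quote($myFilename, '/') . ')\1\s*,[\s\S]+?\)\s*;\s*$/';

// $1 is the captured quote character, $2 the filename.
// myEmbed is a placeholder name, not a real API.
$replaced = preg_replace($pattern, 'myEmbed($1$2$1);', $line);

echo $replaced, PHP_EOL;
```

The lazy [\s\S]+? still ends up consuming the rest of the line here, because the ); inside the alternative content is not followed by the end of the string.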