I am trying to search through the .htm files for our intranet to find out which network files are being linked to on which pages of the site. What I would like to do is have PowerShell go through each .htm file and return any string that begins with "file:///" and runs up to the closing double quote. For instance:

<td colspan="3"><a href="file:///X:/Name of Document.doc" style="text-decoration: none">

Would return:

file:///X:/Name of Document.doc

As for the PowerShell commands, I have been using this:

select-string -Path [Document Path] -Pattern '[Pattern]' -AllMatches | % { $_.Matches } | % { $_.Value }

The only trouble I am running into is that I cannot figure out the regular expression that I should be using to pull the strings that I am looking for. Any ideas?

+3  A: 

This pattern should work: `file:///[^"]*`, e.g.:

$str = @'
<td colspan="3">
    <a href="file:///X:/Name of Document.doc" style="text-decoration: none"> 
'@
$str | select-string '(file:///[^"]*)' | %{$_.Matches[0].Value}

file:///X:/Name of Document.doc
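
To cover the "which pages link to which files" part of the question, the same pattern can be run over a whole folder with `-AllMatches`, since `Matches[0]` only returns the first hit on a line. The following is a rough sketch rather than a tested solution for the intranet in question: the temp folder, the `page1.htm` sample file, and the `Page`/`Link` property names are all made up for illustration.

```powershell
# Hypothetical sample page written to a temp folder so the sketch runs on its own.
# In practice $root would point at the folder holding the intranet .htm files.
$root = Join-Path ([System.IO.Path]::GetTempPath()) 'intranet-sample'
New-Item -ItemType Directory -Path $root -Force | Out-Null
$sample = '<a href="file:///X:/Name of Document.doc">A</a> <a href="file:///Y:/Other.xls">B</a>'
Set-Content -Path (Join-Path $root 'page1.htm') -Value $sample

# Scan every .htm file under $root and emit one object per link found.
$results = Get-ChildItem -Path $root -Filter *.htm -Recurse |
    Select-String -Pattern 'file:///[^"]*' -AllMatches |
    ForEach-Object {
        $hit = $_   # one MatchInfo per matching line; .Path is the source file
        foreach ($m in $hit.Matches) {
            New-Object PSObject -Property @{ Page = $hit.Path; Link = $m.Value }
        }
    }

$results | Format-Table Page, Link -AutoSize
```

Because each result carries both the page path and the link, the output can be piped to `Sort-Object` or `Export-Csv` if a report is needed.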
Keith Hill
Worked like a champ, thanks!
Psycho Bob