Hello, I am looking for help creating a regular expression that extracts a substring from a string like the following:

test123 #AMMA-TestFileName File's.xml

to...

AMMA-TestFileName File's

Basically, I want to remove the first "#" and everything before it, as well as the ".xml" file extension.

Any help is appreciated as I am just getting started with regex. This is going to be used in a Nintex workflow action that supports the .NET regular expressions API.

A: 

The string you are looking for will be in the first group.

/#(.+?)\.xml$/

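In C#:

// Requires: using System.Text.RegularExpressions;
static string ExtractFilename(string s)
{
    // Capture everything after the first '#' up to the trailing ".xml".
    Match m = Regex.Match(s, @"#(?<filename>.+?)\.xml$");
    // Read the group directly: on a failed match Value is an empty string,
    // whereas Match.Result would throw.
    return m.Groups["filename"].Value;
}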

edits: removed escaping the #, added end of line qualifier for extension, and added C# example

Jesse
Fast response from all, thank you very much. Nintex doesn't allow me to write code directly into the workflow action to get group 1, so I used the example above to create a web service to call and do the work.
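
For cases where the capture group cannot be read, one possible alternative is a pattern whose entire match is already the wanted text, using .NET lookbehind and lookahead. This is only a sketch: the class and method names are made up for illustration, and whether a given Nintex action returns the whole match is an assumption that would need checking.

```csharp
using System;
using System.Text.RegularExpressions;

static class WholeMatchExample
{
    // Hypothetical helper; the pattern uses only standard .NET lookarounds.
    public static string Extract(string s)
    {
        // (?<=#)      the match must start right after a '#'
        // .+?         as few characters as possible...
        // (?=\.xml$)  ...up to a ".xml" that ends the string
        Match m = Regex.Match(s, @"(?<=#).+?(?=\.xml$)");
        return m.Success ? m.Value : string.Empty;
    }
}
```

With the string from the question, WholeMatchExample.Extract("test123 #AMMA-TestFileName File's.xml") returns AMMA-TestFileName File's, with no group lookup needed.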
+2  A: 

Anchor the pattern at the end of the string:

/#(.+)\.xml$/
Sinan Ünür
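
To see what the anchor changes, here is a small sketch comparing the pattern with and without the trailing $. The class and method names are hypothetical and exist only for this comparison.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorExample
{
    // Without '$', ".xml" may appear anywhere after the '#'.
    public static bool MatchesUnanchored(string s) => Regex.IsMatch(s, @"#(.+)\.xml");

    // With '$', ".xml" must be the end of the string.
    public static bool MatchesAnchored(string s) => Regex.IsMatch(s, @"#(.+)\.xml$");
}
```

For an input such as "#notes.xml.txt" the unanchored pattern still matches, even though the string does not end in .xml, while the anchored pattern rejects it. Both accept "#notes.xml".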
+1  A: 

If you want to handle extensions other than xml:

/#(.+)\..*$/
Ratnesh Maurya
Should that end match be \.[^\.]+$ so that file..ext only has the last .ext cut off?
Simeon Pilgrim
Yeah, that's right.
Ratnesh Maurya
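
A minimal sketch of the variant suggested in the comment above, wrapped in a hypothetical helper so its behaviour on a few inputs is easy to check. Only the final extension is removed, because [^.]+ cannot cross another dot.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnyExtensionExample
{
    // Hypothetical helper: text after the first '#', minus the last ".ext".
    public static string Extract(string s)
    {
        // The greedy (.+) backtracks to the last dot, and \.[^.]+$ requires
        // a non-empty, dot-free extension at the end of the string.
        Match m = Regex.Match(s, @"#(.+)\.[^.]+$");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

Under these assumptions, "#archive.tar.gz" yields archive.tar and "x #file..ext" yields file. (with the trailing dot kept), while the string from the question still yields AMMA-TestFileName File's.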
A: 

Edit: obviously this is not Regex, but it is faster than regex for this use so I included it.

Path.GetFileNameWithoutExtension(s.Substring(s.IndexOf('#') + 1));
280Z28
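
One thing to keep in mind with the one-liner above: when the string has no '#', IndexOf returns -1, so Substring(0) passes the whole string through, and Path.GetFileNameWithoutExtension also drops anything before a directory separator. A hedged sketch of the same idea using only string operations (the helper name and the choice to return an empty string when no '#' is present are assumptions for illustration):

```csharp
using System;

static class PlainStringExample
{
    // Hypothetical helper: text after the first '#', minus the final extension.
    public static string Extract(string s)
    {
        int hash = s.IndexOf('#');
        if (hash < 0)
        {
            return string.Empty; // assumption: no '#' means nothing to extract
        }

        string tail = s.Substring(hash + 1);
        int dot = tail.LastIndexOf('.');
        // Keep the whole tail when there is no extension to strip.
        return dot < 0 ? tail : tail.Substring(0, dot);
    }
}
```

PlainStringExample.Extract("test123 #AMMA-TestFileName File's.xml") returns AMMA-TestFileName File's.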