I am new to regular expressions.

I want to do multiline search. Here is the example of what I want to do:

Suppose I have the following text:

*Project #1:
CVC – Customer Value Creation (Sep 2007 – till now)
Time Warner Cable is the world's leading media and entertainment company, Time Warner Cable (TWC) makes coaxial quiver.
Client   : Time Warner Cable, US.
ETL Tool  : Informatica 7.1.4
Database  : Oracle 9i.
Role   : ETL Developer/Team Lead.
O/S   : UNIX.
Responsibilities:
Created Test Plan and Test Case Book.
Peer reviewed team members Mappings.
Documented Mappings.
Leading the Development Team.
Sending Reports to onsite.
Bug fixing for Defects, Data and Performance related.                                                                                                     
Project #2:
MYER – Sales Analysis system (Nov 2005 – till now)
            Coles Myer is one of Australia's largest retailers with more than 2,000 stores throughout Australia,
Client   : Coles Myer Retail, Australia.
ETL Tool  : Informatica 7.1.3
Database  : Oracle 8i.
Role   : ETL Developer.
O/S   : UNIX.
Responsibilities:
Extraction, Transformation and Loading of the data using Informatica.
Understanding the entire source system.                                                                                     
Created and Run Sessions and Workflows.
Created Sort files using Syncsort Application.*

I want to write a regex that first tries to match the word "Project", which can be in either lower or upper case.

If "project" matches, the regex should then try to match any one of client, role, or environment. If it matches ANY ONE of these, the match is complete. (The words client, role, and environment can be in any case, and they may or may not be on the same line as the word "project".)

I have written this regular expression for the task:

^((P|p)roject.*\s*.*((((E|e)nviornment)|((P|p)latform)|((R|r)ole(s)?)|((R|r)esponsibilit(y|ies))|((C|c)lient)|((C|c)ustomer)|((P|p)eriod)))

This regex matches Project #1 but does not match Project #2.

Can anyone please tell me what is wrong with this regex, or how to write a regex for this kind of text?

A: 

Since you didn't specify a programming language, here are some commonly used patterns to accomplish this:

/yourRegexpattern/m  <-- the m stands for multiline

You could also use

/yourRegexpattern/im <-- the i stands for case insensitivity

to remove the need for those (P|p) alternations.

In C#, you specify these flags as RegexOptions in the Regex constructor; autocompletion will list them.
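
As a rough C# sketch of the two flags (the sample string is a shortened, made-up stand-in for the question's text): RegexOptions.Multiline only changes where ^ and $ match, and RegexOptions.IgnoreCase covers the (P|p) alternations.

```csharp
using System;
using System.Text.RegularExpressions;

class FlagsDemo
{
    static void Main()
    {
        // Hypothetical, shortened sample in the shape of the question's text.
        string text = "Project #1:\nClient : Time Warner Cable, US.\n" +
                      "PROJECT #2:\nRole : ETL Developer.";

        // No options: ^ matches only at the very start of the input,
        // and the lowercase pattern does not match "Project".
        int plain = Regex.Matches(text, "^project").Count;

        // IgnoreCase replaces (P|p); Multiline lets ^ match after every newline.
        int flagged = Regex.Matches(text, "^project",
            RegexOptions.IgnoreCase | RegexOptions.Multiline).Count;

        Console.WriteLine(plain);   // 0
        Console.WriteLine(flagged); // 2
    }
}
```

Note that Multiline does not make . match a newline, so on its own it will not let a pattern run across several lines.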

Etan
Thanks for such a quick response. I am using C# for this task, but to test the regular expression I am using the Expresso editor. In Expresso, this regular expression is not working. It finds "Project #1", but not "Project #2".
Shekhar
A: 

In C#, you can pass the Multiline option as a parameter to the Regex constructor:

Regex r = new Regex("(var matches = new Array\\([^\\)]*\\);)",  
          RegexOptions.IgnoreCase | RegexOptions.Compiled 
          | RegexOptions.Multiline);

For more code details, please refer to the link: C# and Regex: How to extract strings between quotation marks

Rashmi Pandit
quotation marks?
Rubens Farias
Thanks a lot for quick reply !!
Shekhar
+2  A: 

Try this:

Regex project = new Regex(
   @"^(Project [\s\S]*?" + 
   @"(Environment|Platform|Roles?|Responsibilit(y|ies)|Client|Customer|Period))",
   RegexOptions.ECMAScript | RegexOptions.IgnoreCase | RegexOptions.Multiline);
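
A small sketch of why the lazy [\s\S]*? helps, assuming a shortened, made-up version of the Project #2 block and a reduced keyword list: . never matches a newline, so the question's .*\s*.* cannot get past the line after "Project" here, while [\s\S] keeps going until the first keyword.

```csharp
using System;
using System.Text.RegularExpressions;

class SpanDemo
{
    static void Main()
    {
        // Hypothetical sample: the keyword sits three lines below "Project".
        string text = "Project #2:\nMYER - Sales Analysis system\n" +
                      "Coles Myer is a retailer\nClient : Coles Myer Retail";

        RegexOptions opts = RegexOptions.IgnoreCase | RegexOptions.Multiline;

        // '.' stops at '\n', so this pattern only sees "Project #2:" and the next line.
        bool dotVersion = Regex.IsMatch(text, @"^project.*\s*.*(client|role)", opts);

        // [\s\S] matches any character including '\n'; *? stops at the first keyword.
        Match m = Regex.Match(text, @"^project[\s\S]*?(client|role)", opts);

        Console.WriteLine(dotVersion);        // False
        Console.WriteLine(m.Success);         // True
        Console.WriteLine(m.Groups[1].Value); // Client
    }
}
```

Because the quantifier is lazy, each match ends at the first keyword after "Project" instead of running on to the last keyword in the whole text.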
Rubens Farias
Thanks, Rubens, for helping me out. Your regex works.
Shekhar
A: 

Wow, this community is really active and fast !!!

Thank you everyone for helping me out.

Shekhar
just accept one of the answers :)
Etan