Let's try again. I need to get the VALUE directly before blahX. blahX is EVERYWHERE; however, I have known KEYs. So I would like to use a regex to find the VALUE directly before the blahX that precedes a known KEY. The way I am doing this now matches DUMMY, which is directly before the first blahX, instead of the VALUE, which is directly before the blahX that precedes the known KEY.


I have been given a file whose lines look like this:

blah blah DUMMY blahX blah DUMMY blah blah VALUE blahX junk random junk  blahY KEY
blah blah DUMMY blahX blah DUMMY blah blah junk  blahX junk random junk  
blah blah DUMMY blahX blah DUMMY blah blah VALUE blahX junk random junk  blahY KEY

Some values/keys are optional, and I cannot depend on order. Using regular expressions in C#, how do I write a regex that takes the value that comes before the key? I know I could write it in such a way that it matches the first DUMMY, but I can't think of how to make it match the VALUE instead.

-edit- I didn't realize the question was unclear. The example above is less complex than mine (edited to make it more complex). Basically, VALUE can be any number of words before KEY. However, there are known keys, and the value is ALWAYS directly before blahX.

+3  A: 

Why do you have to use regular expressions? I think you're asking the wrong question. You should ask "I have this string, how can I parse it? I thought of regular expressions, is that the best way?"

See http://catb.org/~esr/faqs/smart-questions.html#goal

By the way, I think regular expressions are the wrong tool for your task. Maybe you could use them just to split the string into words (['blah', 'blah', 'DUMMY'...]) but the rest of the parsing should be done by reading that array yourself.
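
A minimal sketch of that approach in C#, under the assumption that the wanted value is the word immediately before the last blahX that precedes the known KEY. `TokenScan` and `ValueBeforeMarker` are made-up names for illustration, and the real data (XML/HTML in a DB entry) may need a smarter tokenizer than whitespace splitting.

```csharp
using System;

static class TokenScan
{
    // Splits the line into words, finds the known key, then walks backwards
    // to the nearest marker (e.g. "blahX") and returns the word before it.
    // Returns null when the key, the marker, or a preceding word is missing.
    public static string ValueBeforeMarker(string line, string marker, string key)
    {
        string[] tokens = line.Split(
            new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

        int keyIndex = Array.IndexOf(tokens, key);
        if (keyIndex < 0)
            return null; // the key is optional, so it may be absent

        // Stop at index 1 so that tokens[i - 1] always exists.
        for (int i = keyIndex - 1; i >= 1; i--)
        {
            if (tokens[i] == marker)
                return tokens[i - 1];
        }
        return null;
    }

    static void Main()
    {
        string line = "blah blah DUMMY blahX blah DUMMY blah blah VALUE blahX junk random junk  blahY KEY";
        Console.WriteLine(ValueBeforeMarker(line, "blahX", "KEY") ?? "(no match)");
    }
}
```

Unlike a regex, this does not care whether a DUMMY appears anywhere; it only looks at the position of the key and the marker.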

Nicolás
It's not an array. It's a DB entry with some XML and some HTML. It's very nasty. I need to generate a clean output and export the data. -edit- Some data they don't need, and they are archiving the data. I think regex is a fine solution; I prefer not to use so many indexOf calls in this code.
acidzombie24
The following answer is censored due to harsh language: What the #%=!?!
Filip Ekberg
I think the OP's username is more realistic than I would like.
Tim Pietzcker
@Nicolás: *Thank you* for the link! Excellent. Saved my day :)
Tim Pietzcker
A: 

You need to read up on C# and regular expressions. There are great resources for this, and your question is a little too localized / fuzzy to give a better example.

If you want everything inside of VALUE and KEY you can use VALUE(.*)KEY.
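
A small sketch of how that pattern behaves in C#, treating VALUE and KEY as literal delimiter words. `BetweenDelimiters` and `Between` are hypothetical names. Note that `.*` is greedy, so if KEY occurs more than once the capture runs up to the last KEY; the lazy form `.*?` stops at the first one.

```csharp
using System;
using System.Text.RegularExpressions;

static class BetweenDelimiters
{
    // Returns the text captured between the literal words VALUE and KEY,
    // or null when the input has no match.
    public static string Between(string input, bool lazy)
    {
        string pattern = lazy ? "VALUE(.*?)KEY" : "VALUE(.*)KEY";
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        string input = "x VALUE a KEY b KEY y";
        Console.WriteLine("[" + Between(input, false) + "]"); // greedy: [ a KEY b ]
        Console.WriteLine("[" + Between(input, true) + "]");  // lazy:   [ a ]
    }
}
```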

But regex isn't always the answer. If it's XML / HTML that you are parsing look into HTML Agility Pack / XML Parsers.

Filip Ekberg
+1  A: 

Assuming your VALUE can be identified by

  • Being a word immediately before blahX
  • Being behind DUMMY and before KEY, with no DUMMYs between it and the KEY,

the following Regex will capture it:

"DUMMY(?:.(?!UMMY))*?(\w+)\sblahX(?:.(?!UMMY))*KEY"
Jens
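
For reference, a sketch of how the regex from this answer could be run from C# against the three sample lines in the question. `ValueBeforeBlahX` and `Extract` are illustrative names, and the literal KEY stands in for whichever known key is being searched for.

```csharp
using System;
using System.Text.RegularExpressions;

static class ValueBeforeBlahX
{
    // Pattern copied from the answer; group 1 is the word before blahX.
    static readonly Regex Pattern = new Regex(
        @"DUMMY(?:.(?!UMMY))*?(\w+)\sblahX(?:.(?!UMMY))*KEY");

    // Returns the captured word, or null when the line does not match.
    public static string Extract(string line)
    {
        Match m = Pattern.Match(line);
        return m.Success ? m.Groups[1].Value : null;
    }

    static void Main()
    {
        string[] lines =
        {
            "blah blah DUMMY blahX blah DUMMY blah blah VALUE blahX junk random junk  blahY KEY",
            "blah blah DUMMY blahX blah DUMMY blah blah junk  blahX junk random junk  ",
            "blah blah DUMMY blahX blah DUMMY blah blah VALUE blahX junk random junk  blahY KEY",
        };

        // Prints VALUE, (no match), VALUE.
        foreach (string line in lines)
            Console.WriteLine(Extract(line) ?? "(no match)");
    }
}
```

The `(?:.(?!UMMY))` pieces stop the match from running across another DUMMY, which is why the first DUMMY on each line cannot pair with the later blahX. The second line fails simply because it has no KEY.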
+1 and accepted.
acidzombie24