I am starting to get a grip on RegEx thanks to all the great help here on SO with my other questions. But I am still stuck on this one:

My code is:

   StreamReader reader = new StreamReader(fDialog.FileName.ToString());
   string content = reader.ReadToEnd();
   reader.Close();


I am reading in a text file and I want to search for this text and change it (the X and Y value always follow each other in my text file):

X17.8Y-1.

But this text can also be X16.1Y2.3 (the values will always be different after X and Y)


I want to change it to this

X17.8Y-1.G54
or
X(value)Y(value)G54



My RegEx statement follows but it is not working.

content = Regex.Replace(content, @"(X(?:\d*\.)?\d+)*(Y(?:\d*\.)?\d+)", "$1$2G54");


Can someone please modify it for me so it works and will search for X(wildcard) Y(Wildcard) and replace it with X(value)Y(value)G54?

A: 

I know this doesn't answer your question directly, but you should check out Expresso. It's a .NET Regular Expression Tool that allows you to debug, test, and fine-tune your complex expressions. It's a life-saver.

More of a do-it-yourself answer, but it'll be helpful even if someone gives you an answer here.

Aren
Awesome. Thanks for the tip.
fraXis
A: 

It looks like you just need to support optional negative values:

content = Regex.Replace(content, @"(X-?(?:\d*\.)?\d+)*(Y-?(?:\d*\.)?\d+)", "$1$2G54");
Chris Schmich
When run on my test case this gives output `"foo X17.8Y-1G54. bar"`.
Mark Byers
@Mark Byers: see my comment on your answer, but yes, mine is incorrect. I based it off of the original regex and just added support for negatives. There were other issues with the original regex. gbacon's answer is the most accurate.
Chris Schmich
+1  A: 

To be picky about the input, you could use

string num = @"-?(?:\d+\.\d+|\d+\.|\.\d+|\d+)";
content = Regex.Replace(content, "(?<x>X" + num + ")(?<y>Y" + num + ")", "${x}${y}G54");

Is there a reliable terminator for the Y value? Say it's whitespace:

content = Regex.Replace(content, @"(X.+?)(Y.+?)(\s)", "$1$2G54$3");

How robust does the code need to be? If it's rewriting debugging output or some other quick-and-dirty task, keep it simple.
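For reference, here is a minimal, self-contained sketch of the strict variant above, run against the sample from the question plus a malformed string. The class name `StrictG54` and the helper `AppendG54` are made up for illustration; only the pattern and the replacement string come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictG54
{
    // One coordinate value: optional minus sign, then 1.5, 1., .5, or 1.
    private const string Num = @"-?(?:\d+\.\d+|\d+\.|\.\d+|\d+)";

    // An X value immediately followed by a Y value, captured by name.
    private static readonly Regex Pair =
        new Regex("(?<x>X" + Num + ")(?<y>Y" + Num + ")");

    // Hypothetical helper: appends G54 after every well-formed X/Y pair.
    public static string AppendG54(string content)
    {
        return Pair.Replace(content, "${x}${y}G54");
    }

    public static void Main()
    {
        Console.WriteLine(AppendG54("foo X17.8Y-1. bar"));   // foo X17.8Y-1.G54 bar
        Console.WriteLine(AppendG54("X16.1Y2.3"));           // X16.1Y2.3G54
        Console.WriteLine(AppendG54("X-17--2..3.-Y-...1.")); // unchanged, not a valid pair
    }
}
```

The last call shows the point of being picky: input that only looks like a coordinate pair is left alone.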

Greg Bacon
+2  A: 

The regular expression you need is:

X[-\d.]+Y[-\d.]+

Here is how to use it in C#:

string content = "foo X17.8Y-1. bar";
content = Regex.Replace(content, @"X[-\d.]+Y[-\d.]+", "$0G54");
Console.WriteLine(content);

Output:

foo X17.8Y-1.G54 bar
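One thing to keep in mind: the character class accepts any run of digits, dots, and dashes, so it does not check that each value is a well-formed number. A small sketch of that trade-off, assuming a made-up wrapper class `LooseG54` with a helper `AppendG54`:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LooseG54
{
    // Character-class pattern: any run of '-', digits, or '.' after X and after Y.
    private static readonly Regex Pair = new Regex(@"X[-\d.]+Y[-\d.]+");

    // Hypothetical helper: $0 is the whole match, so G54 lands right after it.
    public static string AppendG54(string content)
    {
        return Pair.Replace(content, "$0G54");
    }

    public static void Main()
    {
        Console.WriteLine(AppendG54("foo X17.8Y-1. bar"));   // foo X17.8Y-1.G54 bar
        Console.WriteLine(AppendG54("X-17--2..3.-Y-...1.")); // X-17--2..3.-Y-...1.G54
    }
}
```

If the file is known to contain only well-formed values, this looseness is harmless and the shorter pattern is easier to read.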
Mark Byers
Works perfectly. Thank you so much.
fraXis
or `X[0-9.-]+Y[0-9.-]+`
Atømix
This now also matches strings like `"X-17--2..3.-Y-...1."`, which I don't think he wants. His original regex in his question specifically filters out non-numeric expressions (it only accepts numbers like .5 or 2.3 or 44). Wouldn't this introduce false positives?
Chris Schmich
Thanks for the change. You're right, it is a lot more readable (and understandable for me) doing it that way, instead of the \d.
fraXis
@Chris Schmich: Yes the original expression doesn't match all the strings that my regular expression does... and this is why it doesn't work. :) If you carefully look at his example you can see it doesn't fit the pattern that his regular expression is looking for.
Mark Byers
@Mark Byers: But now you're also matching strings that I don't think the original regex ever *intended* to match. It just seems like the original regex needed tweaking to support optional negative values, i.e. `@"(X-?(?:\d*\.)?\d+)*(Y-?(?:\d*\.)?\d+)"` which supports the two examples in the question and still rejects other cases that I think should probably still be rejected (see my example above). In other words, I could propose a regex of `X.*?Y.*?`, which would fix his problem but also introduce lots of new ones.
Chris Schmich
@Mark Byers: I based mine off of the original regex which has more issues than just handling the negative. I think gbacon's answer below is the most accurate. Basically, it's `content = Regex.Replace(content, @"(X-?(\d+\.\d+|\d+\.|\.\d+|\d+))(Y-?(\d+\.\d+|\d+\.|\.\d+|\d+))", "$0G54");`. It's more verbose, but it covers everything well and rejects the rest.
Chris Schmich
+1  A: 

What comes after "X(value)Y(value)" in your text file? A space? A newline? Another "X(value)Y(value)" value pair?

I'm not very good at using shortcuts in regexes, like \d, nor am I familiar with .NET, but I used the following regex to match the value pair:

(X[0-9.-]+Y[0-9.-]+)

And the replacement is

$1G54

This will work as long as a value pair is not directly followed by a digit, period or a dash.
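A rough C# sketch of that caveat, using the pattern and replacement above. The class name `BracketG54` and the helper `AppendG54` are invented for the example. It also shows one .NET detail worth knowing: by default `\d` matches any Unicode decimal digit, while `[0-9]` matches ASCII digits only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BracketG54
{
    // Same idea as above, written with an explicit [0-9] class instead of \d.
    private static readonly Regex Pair = new Regex(@"(X[0-9.-]+Y[0-9.-]+)");

    // Hypothetical helper: appends G54 after each captured pair.
    public static string AppendG54(string content)
    {
        return Pair.Replace(content, "$1G54");
    }

    public static void Main()
    {
        Console.WriteLine(AppendG54("X1Y2 X3Y4")); // X1Y2G54 X3Y4G54
        Console.WriteLine(AppendG54("X1Y2X3Y4"));  // X1Y2G54X3Y4G54
        // A trailing dash and digit are swallowed into the Y value:
        Console.WriteLine(AppendG54("X1Y2-3"));    // X1Y2-3G54

        // U+0663 is ARABIC-INDIC DIGIT THREE.
        Console.WriteLine(Regex.IsMatch("\u0663", @"\d"));    // True
        Console.WriteLine(Regex.IsMatch("\u0663", "[0-9]"));  // False
    }
}
```

Pairs separated by whitespace, or directly followed by another X, come out as expected; only a trailing digit, period, or dash gets absorbed into the match.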

Mecki
I for one am a huge fan of the `[0-9]` syntax over `\d`. It's far more readable and with regular expressions, ANYTHING that enhances readability is a good thing.
Atømix