Hi, I want to transform the following string in C#: replace the first word with the last word. I also have to remove the '[' and ']' from the last word.

string oldString = "200 abc def abc [a18943]";

Output should be

string newString="a18943 abc def abc";

Thanks

A: 

Here's one way to do it. Note that I'm assuming that the string is at least 1 word long.

string oldString = "200 abc def abc [a18943]";
string[] words = oldString.Split(' ');
StringBuilder sb = new StringBuilder(words[words.Length-1].Trim('[', ']'));
for (int i = 1; i < words.Length-1; i++)
{
    sb.Append (" ");
    sb.Append (words[i]);
}
string newString = sb.ToString();

Oops, fixed a typo above. Serves me right for writing code without compiling it first. :-)

Justin Grant
Could also use .Trim('[', ']') to get rid of the brackets, which may be a little cleaner, assuming the last word always starts and ends with the brackets.
Wil P
Good idea-- Trim is also much, much faster-- cuts the time required to run this almost in half. Updating now.
Justin Grant
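The assumption stated in this answer (that the string has at least one word) can be made explicit. The following is a minimal sketch, not the answerer's code: the names `WordMover` and `MoveLastWordToFront` are hypothetical, and returning an empty string for null or blank input is an assumed policy rather than something the question specifies.

```csharp
using System;

static class WordMover
{
    // Replaces the first word with the last word (brackets trimmed)
    // and drops the last word from its original position.
    public static string MoveLastWordToFront(string input)
    {
        // Assumed policy: null input yields an empty string.
        if (input == null) return string.Empty;

        // RemoveEmptyEntries guards against repeated or trailing spaces.
        string[] words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (words.Length == 0) return string.Empty;

        string last = words[words.Length - 1].Trim('[', ']');
        if (words.Length == 1) return last;

        // Overwrite the first word, then join everything except the last word.
        words[0] = last;
        return string.Join(" ", words, 0, words.Length - 1);
    }
}
```

Calling `WordMover.MoveLastWordToFront("200 abc def abc [a18943]")` gives `"a18943 abc def abc"`, and a single bracketed word such as `"[a18943]"` comes back as `"a18943"`.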
A: 

It's ugly, but it works.

string[] a = oldString.Split(' ');
var result = a.Skip( a.Length-1)
            .Select(w => w.Replace("[","").Replace("]",""))
            .Concat( a.Take( a.Length -1 ).Skip(1)).ToArray();

var newString = string.Join(" ", result);
Winston Smith
Also, result is not a string; it's a System.Linq.Enumerable.ConcatIterator.
Justin Grant
I'd since edited it - newString will of course contain the flattened string.
Winston Smith
Got it. BTW, perf of this is over 3x worse than the winning (regex) solution. Probably won't make much of a difference since even the slowest solution ran 400k per second on my (slow) PC, but food for thought that a regex solution beats a LINQ solution by a wide margin, at least in this case.
Justin Grant
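The comments above touch on the fact that a LINQ query is a lazy sequence, not a string, until something enumerates it. Here is a small sketch of that distinction, using the same query as the answer. The class and method names (`LinqSwapDemo`, `BuildQuery`, `Swap`) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LinqSwapDemo
{
    // Builds the query only; nothing is enumerated yet (deferred execution).
    public static IEnumerable<string> BuildQuery(string[] words)
    {
        return words.Skip(words.Length - 1)                              // last word only
                    .Select(w => w.Replace("[", "").Replace("]", ""))    // strip brackets
                    .Concat(words.Take(words.Length - 1).Skip(1));       // middle words
    }

    // Enumerates the query and flattens it into a single string.
    public static string Swap(string input)
    {
        IEnumerable<string> query = BuildQuery(input.Split(' '));
        return string.Join(" ", query.ToArray());
    }
}
```

`BuildQuery` returns an `IEnumerable<string>`; only the `ToArray` and `string.Join` calls in `Swap` turn it into the final string.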
+14  A: 
string newString = Regex.Replace(oldString, @"^(\w+)(.+) \[(\w+)\]$", "$3$2");
Matajon
+1, very concise.
Winston Smith
It worked. I just want to understand how it's working.
NETQuestion
Very clean. I like it. I just wonder if RegEx is overkill here?
Wil P
@NetQuestion: The regex pattern defines 3 groups within the parentheses. It matches a string that starts with a word (\w+), followed by one or more arbitrary characters (.+), and ends with a space and a word surrounded by [ and ] (\[(\w+)\]$). The replacement is a concatenation of the backreferences to the 3rd group and the 2nd group.
Rob van Groenewoud
And it's fast too-- I wrote a quick benchmark of the solutions on this thread, and the regex was within 15% of the winner, but was by far the cleanest code. This surprised me... I'd have expected regexes to be slower. Nice work!
Justin Grant
I would say concise, but obtuse. Who could ever debug that if there was an issue with it!? Maybe it's just me. I'd rather one of my devs give me the wilpeck code than regex any day.
Scott P
@Scott P, I have encountered this argument often. I can appreciate concise code like this answer. I think the argument could go both ways with respect to maintenance: my solution may be easier to understand at first glance, but there is more code to consider if requirements change. When people ask these kinds of questions, I don't always see regex or LINQ as my first solution, mostly because I don't always understand what is happening under the covers with LINQ or the Regex classes. I'd rather use LINQ or Regex when I really need it than use it out of habit.
Wil P
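To make the group-by-group explanation in the comments above concrete, here is a short sketch that exposes what each of the three groups captures for the sample input. The helper names (`RegexGroupsDemo`, `Capture`, `Swap`) are hypothetical; the pattern and replacement are the ones from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexGroupsDemo
{
    private const string Pattern = @"^(\w+)(.+) \[(\w+)\]$";

    // Returns the text captured by groups 1, 2 and 3,
    // or an empty array if the input does not match.
    public static string[] Capture(string input)
    {
        Match m = Regex.Match(input, Pattern);
        if (!m.Success) return new string[0];
        return new[] { m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value };
    }

    // Group 3 (the bracketed word) followed by group 2 (the middle part).
    public static string Swap(string input)
    {
        return Regex.Replace(input, Pattern, "$3$2");
    }
}
```

For `"200 abc def abc [a18943]"`, group 1 is `"200"`, group 2 is `" abc def abc"` (note the leading space, which is why `$3$2` needs no separator), and group 3 is `"a18943"`.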
+1  A: 

Try:

 Regex       : ^\w+(.*)\s\[(\w+)]$
 Replacement : $2$1
Bart Kiers
+3  A: 
string oldString = "200 abc def abc [a18943]";
string[] values = oldString.Split(' ');
string lastWord = values[values.Length - 1].Trim('[', ']');
values[0] = lastWord;
string newString = string.Join(" ", values, 0, values.Length - 1);
Wil P
+2  A: 

Just for fun, I wrote a little benchmark to perf-test all these answers (including my other answer above). Here are the results on my workstation (32-bit Core 2 Duo @ 2.66GHz) for 5M repetitions using a Release build:

  • LINQ : 10.545 seconds
  • my Split + StringBuilder way : 3.633 seconds
  • wipeck's Split-and-Join way! : 3.32 seconds
  • (uncompiled) regex : 3.845 seconds
  • (compiled) regex : 12.431 seconds

Results: wipeck's Split-and-Join solution wins, but the (OP-selected) regex solution was only 15% slower, which surprised me. I was expecting it to be at least 100% slower. Kudos to the .NET Regex developers for speed.

My own solution (using Split and StringBuilder) was, I thought, optimized for speed, but requires a lot more code and doesn't actually make it fast. Doh!

Most surprisingly, I tried a compiled regex solution and it was more than 3x slower than the uncompiled regex (and I didn't include the compilation time in the results-- including compilation it'd be even worse). So much for compiled regex perf advantage.

LINQ was, as I expected, really slow-- the overhead of all those extra objects and method calls really adds up.

Here's the test code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

class Timer : IDisposable
{
    private DateTime _start;
    private string _name;

    public Timer(string name)
    {
        _name = name;
        _start = DateTime.Now;
    }
    public void Dispose()
    {
        TimeSpan taken = DateTime.Now - _start;
        Console.WriteLine(string.Format ("{0} : {1} seconds", _name, taken.TotalMilliseconds / 1000.0));
    }
}
class Program
{
    static void Main(string[] args)
    {
        int reps = 5000000;
        string oldString = "200 abc def abc [a18943]";

        using (new Timer("LINQ"))
        {
            for (int n = 0; n < reps; n++)
            {
                string[] a = oldString.Split(' ');
                var result = a.Skip(a.Length - 1)
                            .Select(w => w.Replace("[", "").Replace("]", ""))
                            .Concat(a.Take(a.Length - 1).Skip(1)).ToArray();

                var newString = string.Join(" ", result);
            }
        }

        using (new Timer("my Split + StringBuilder way"))
        {
            for (int n = 0; n < reps; n++)
            {
                string[] words = oldString.Split(' ');
                StringBuilder sb = new StringBuilder(words[words.Length - 1].Trim('[', ']'));
                for (int i = 1; i < words.Length - 1; i++)
                {
                    sb.Append(' ');
                    sb.Append(words[i]);
                }
                string newString = sb.ToString();
            }
        }

        using (new Timer("wipeck's Split-and-Join way!"))
        {
            for (int n = 0; n < reps; n++)
            {
                string valueString = "200 abc def abc [a18943]";
                string[] values = valueString.Split(' ');
                string lastWord = values[values.Length - 1];
                lastWord = lastWord.Trim('[', ']');
                values[0] = lastWord;
                string movedValueString = string.Join(" ", values, 0, values.Length - 1);
            }
        }

        using (new Timer("(uncompiled) regex"))
        {
            for (int n = 0; n < reps; n++)
            {
                // Regex.Replace takes the input first, then the pattern.
                string newString = Regex.Replace(oldString, @"^(\w+)(.+) \[(\w+)\]$", "$3$2");
            }
        }

        Regex regex = new Regex(@"^(\w+)(.+) \[(\w+)\]$", RegexOptions.Compiled);
        string newStringPreload = regex.Replace(oldString, "$3$2");
        using (new Timer("(compiled) regex"))
        {
            for (int n = 0; n < reps; n++)
            {
                string newString = regex.Replace(oldString, "$3$2");
            }
        }
    }
}
Justin Grant
Nice! Thanks for posting this. While my code may not be all that elegant it gets the job done :)
Wil P
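For timing loops like the ones in this benchmark, `System.Diagnostics.Stopwatch` is generally a better fit than subtracting `DateTime.Now` values, since it is designed for measuring elapsed time. Below is a hedged sketch of a drop-in variant of the `Timer` class. The name `StopwatchTimer` and the `Elapsed` property are additions for illustration, and this variant was not used to produce the numbers reported above.

```csharp
using System;
using System.Diagnostics;

sealed class StopwatchTimer : IDisposable
{
    private readonly string _name;
    private readonly Stopwatch _watch;

    public StopwatchTimer(string name)
    {
        _name = name;
        // StartNew creates the stopwatch and starts it in one call.
        _watch = Stopwatch.StartNew();
    }

    // Time elapsed so far; exposed so callers can inspect it if needed.
    public TimeSpan Elapsed
    {
        get { return _watch.Elapsed; }
    }

    // Stops the stopwatch and prints the elapsed time in seconds.
    public void Dispose()
    {
        _watch.Stop();
        Console.WriteLine("{0} : {1} seconds", _name, _watch.Elapsed.TotalSeconds);
    }
}
```

It is used the same way as the original class: wrap each measured loop in `using (new StopwatchTimer("LINQ")) { ... }`.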