tags:

views:

222

answers:

11

I wrote the following method to remove the namespace in brackets from strings.

I would like to make this as fast as possible.

Is there a way to speed up the following code?

using System;

namespace TestRemoveFast
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] tests = {
            "{http://company.com/Services/Types}ModifiedAt",
            "{http://company.com/Services/Types}CreatedAt"
                             };

            foreach (var test in tests)
            {
                Console.WriteLine(Clean(test));
            }

            Console.ReadLine();
        }

        static string Clean(string line)
        {
            int pos = line.IndexOf('}');
            if (pos > 0)
                return line.Substring(pos + 1, line.Length - pos - 1);
            else
                return line;
        }
    }
}
+1  A: 

Have you tried this with Regex and/or use a stringbuilder instead of a string?

Lucas B
Have you profiled this? I doubt that this is faster.
Jason
+3  A: 

You could try parallelism since it doesn't look like you need a synchronous treatment. A parallel foreach with PLINQ would do the trick.

But if you cannot wait until VS2010 is officially out, you could try Poor Man's Parallel.ForEach Iterator by Emre Aydinceren

Dynami Le Savard
+1 Parallel.ForEach is perfect for this example.
Ian
A: 

instead of your foreach...

for( int i=0;i<tests.Length;i++)
     Console.WriteLine( tests[i].Replace("}",null ) );
DRapp
Isn't that just replacing the closing brace, not everything preceding it?
Matt Grande
yeah... my bad...However, from your code, you dont need the secondary parm on Substring for the length. If you only supply the starting position, it will return all the rest of the string from that point.
DRapp
+1  A: 

If you changed speed for space, you can loop once in the given array, and copy chars other than '{.*}'. That would save two calls (.IndexOf() and .Substring()).

Ariel
Have you profiled this? I doubt that this is faster.
Jason
+3  A: 

The approach seems to be quiet fast. But from the string you have, I conclude that the name is usually smaller then the {URL}. You can use .LastIndexOf() method. I think it starts from the end of string

cornerback84
I tried LastIndexOf combined with Remove and got a few percent gain, but only when the delimiter was towards the end of the input string. When I moved it towards the beginning, things slowed down again. Of course, if the actual data always fits this restriction, then the OP can take advantage of it.
Bob Moore
+2  A: 

Let's say you do find an answer here. I think you need to consider what the "solution" is going to look like for the next guy that looks at your code.

I'll take more readable code vs. a couple milliseconds any day.

Kris Krause
Unless you need to process a few million records per second.
Matthew Whited
@Matthew Whited: The given code successfully processed 1,000,000 strings on my pathetic laptop (http://ark.intel.com/Product.aspx?id=29761) in 0.0875438 seconds implying that it can process 11,422,853 in one second. So, more than a few million per second.
Jason
A: 

The slowest thing in there, by far, is Console.WriteLine(). Take the following example:

    public void TestCleanSpeed()
    {
        var start = DateTime.Now;
        for (var i = 0; i < 10000; i++)
        {
            string[] tests = {
                                 "{http://company.com/Services/Types}ModifiedAt",
                                 "{http://company.com/Services/Types}CreatedAt"
                             };

            foreach (var test in tests)
            {
                Console.WriteLine(Clean(test));
            }
        }
        var end = DateTime.Now;

        var ts = end - start;
        Console.WriteLine(ts);
    }

If you run it as-is, it takes nearly six seconds. Then, remove the Console.WriteLine and instead assign var newTest = Clean(test);. In my test, it took less 0.02 seconds for the 10000 executions.

Matt Grande
The `Console.WriteLine` is irrelevant. It's not even part of the `Clean` method that he asked about.
Jason
The for(var i=0 .... loop is also irrelevant. I guess the array of string[] tests can be big. you describe a different scenario altogether.
Johannes
@Jason - I read "Is there a way to speed up the following code?" and assumed he meant _all_ the code that he posted. I guess I misunderstood.
Matt Grande
@Johannes - I put the for loop in so it would run multiple times. It didn't register the times of either execution when it only ran once.
Matt Grande
A: 

The only other approach would be to use line.Remove(0, pos + 1);, but i think internally is Remove more complex then Substring, due to the fact, that Remove can also cut something out of the middle.

So Substring() should be the fastes one.

Oliver
A: 

You can convert your string to XName and get the name section as follows.

((System.Xml.Linq.XName)"{http://company.com/Services/Types}ModifiedAt").LocalName
Mehdi Golchin
interesting, may be able to use that elsewhere, but in this example it makes it about 3 times as slow
Edward Tanguay
+1  A: 

Are you sure this is a bottleneck in your code? What is your specification for the time requirements of this method? Have you profiled the code you have now to see if it meets those specifications?

I would guess that the code that you have is nearly optimal. Both IndexOf and Substring invoke unsafe code to do all sorts of fancy optimizations that won't be available to you unless you also go the unsafe route. If you do that, you're just going to end up rewriting IndexOf and Substring anyway.

So unless this code is definitively a bottleneck in your code and you have a reasonable specification of the time requirements for this method I would focus your efforts elsewhere.

Jason
A: 
line.Substring(line.IndexOf('}') + 1);

Marginally faster.

Terje