views:

105

answers:

5

Ok,

We have a lot of where clauses in our code. We have just as many ways to generate a string to represent the in condition. I am trying to come up with a clean way as follows:

public static string Join<T>(this IEnumerable<T> items, string separator)
{
    var strings = from item in items select item.ToString();
    return string.Join(separator, strings.ToArray());
}

it can be used as follows:

var values = new []{1, 2, 3, 4, 5, 6};
values.StringJoin(",");
// result should be:
// "1,2,3,4,5,6"

So this is a nice extension method that does a very basic job. I know that simple code does not always turn into fast or efficient execution, but I am just curious as to what could I have missed with this simple code. Other members of our team are arguing that:

  • it is not flexible enough (no control of the string representation)
  • may not be memory efficient
  • may not be fast

Any expert to chime in?

Regards,

Eric.

A: 

this would work also:

public static string Test(IEnumerable<T> items, string separator)
{
    var builder = new StringBuilder();
    bool appendSeperator = false;
    if(null != items)
    {
        foreach(var item in items)
        {
            if(appendSeperator)
            {
                builder.Append(separator)
            }

            builder.Append(item.ToString());

            appendSeperator = true;
        }
   }

   return builder.ToString();
}
Kevin
A: 

You are missing null checks for the sequence and the items of the sequence. And yes, it is not the fastest and most memory efficient way. One would probably just enumerate the sequence and render the string representations of the items into a StringBuilder. But does this really matter? Are you experiencing performance problems? Do you need to optimize?

Daniel Brückner
Seeing as it's an extension method, checking for a `null` collection isn't making sense. Checking the number of items in the collection might be nice though.
Cloud
I don't get the first point - if it would be an instance method, checking this for null does not make sense, but you have to check the sequence if it is an extension method because it might get called on a null reference.
Daniel Brückner
A: 

Why don't you use StringBuilder, and iterate through the collection yourself, appending. Otherwise you are creating an array of strings (var strings) and then doing the Join.

Nestor
+3  A: 

For some reason, I thought that String.Join is implemented in terms of a StringBuilder class. But if it isn't, then the following is likely to perform better for large inputs since it doesn't recreate a String object for each join in the iteration.

public static string Join<T>(this IEnumerable<T> items, string separator)
{
    // TODO: check for null arguments.
    StringBuilder builder = new StringBuilder();
    foreach(T t in items)
    {
        builder.Append(t.ToString()).Append(separator);
    }

    builder.Length -= separator.Length;
    return builder.ToString();
}

EDIT: Here is an analysis of when it is appropriate to use StringBuilder and String.Join.

Steve Guidi
Join() uses direct acces to the string's char array and will be at least as fast.
Henk Holterman
Steve, it's interesting, you are exhibiting the exact concern I had about the StringBuilder approach. If you look around, there are dozens of variations on your algorithm. Actually, inspired by the String.Join() method (thank you Reflector), I would probably follow their pattern. I was hoping that more people would post their algorithm here and back their preference with some hard data. But it's an interesting topic.
Eric Liprandi
I found a question on SO that analyzes the efficiency of `StringBuilder` and `String.Join` -- see my edit for the link.
Steve Guidi
+1 for the link http://stackoverflow.com/questions/585860/string-join-vs-stringbuilder-which-is-faster
cottsak
+5  A: 

Regarding the first issue, you could add another 'formatter' parameter to control the conversion of each item into a string:

public static string Join<T>(this IEnumerable<T> items, string separator)
{
    return items.Join(separator, i => i.ToString());
}

public static string Join<T>(this IEnumerable<T> items, string separator, Func<T, string> formatter)
{
    return String.Join(separator, items.Select(i => formatter(i)).ToArray());
}

Regarding the second two issues, I wouldn't worry about it unless you later run into performance issues and find it to be a problem. It's unlikely to much of a bottleneck however...

Lee
Lee, I agree with you. That was my original argument with my team. We have yet to show any actual performance bottleneck. At least with this approach, one can easily update the implementation in one place and our entire app will benefit from it.
Eric Liprandi
+1 for the `Join` preference as the links below suggest that the performance difference is negligible and readability/simplicity is preferred.
cottsak