ansaurus

Question

Other ways to deal with "loop initialization" in C#

Answer 1

+10 A:

For your specific example there is a standard solution: string.Join. This handles adding the delimiter correctly so that you don't have to write the loop yourself.

If you really want to write this yourself an approach you can use is the following:

string delimiter = "";
foreach (var element in array)
{
    append(delimiter);
    append(element);
    delimiter = ",";
}

This should be reasonably efficient and I think it is reasonable to read. The constant string "," is interned so this won't result in a new string being created on each iteration. Of course if performance is critical for your application you should benchmark rather than guess.

Mark Byers 2010-08-29 20:00:13

@Mark: He's just using `String.Join` as an example...

Matti Virkkunen 2010-08-29 20:02:37

to use `String.Join()` you first need to have an array of strings, if you don't have that given then you are looking at building `List<string>` then converting it to to `string[]` which is more ugliness than he started with.

liho1eye 2010-08-29 20:11:21

Like this. Always annoyed by "first"... How sure can you be that the compiler will optimize away the repeated assignments to delimiter?

Nicolas78 2010-08-29 20:13:12

@liho1eye: With .NET 4.0 it is no longer true that you need an array of strings: http://msdn.microsoft.com/en-us/library/dd992421.aspx

Mark Byers 2010-08-29 20:13:52

+1 This is a great solution for strings. However, there are other examples where using the loop I described would be helpful than just strings. This requires that whatever type you're dealing with in your loop have a signal value like the empty string that will effectively perform the no-op that you're getting for your initial loop. I don't know that this always is true but when it is it's a great answer.

Peter Oehlert 2010-08-29 20:44:55

Answer 2

A:

I prefer first variable method. It is probably not cleanest but most efficient way. Alternatively you could use Length of the thing you appending to and compare it to zero. Works well with StringBuilder.

liho1eye 2010-08-29 20:06:12

Answer 3

+17 A:

Personally I like Mark Byer's option, but you could always write your own generic method for this:

public static void IterateWithSpecialFirst<T>(this IEnumerable<T> source,
    Action<T> firstAction,
    Action<T> subsequentActions)
{
    using (IEnumerator<T> iterator = source.GetEnumerator())
    {
        if (iterator.MoveNext())
        {
            firstAction(iterator.Current);
        }
        while (iterator.MoveNext())
        {
            subsequentActions(iterator.Current);
        }
    }
}

That's relatively straightforward... giving a special last action is slightly harder:

public static void IterateWithSpecialLast<T>(this IEnumerable<T> source,
    Action<T> allButLastAction,
    Action<T> lastAction)
{
    using (IEnumerator<T> iterator = source.GetEnumerator())
    {
        if (!iterator.MoveNext())
        {
            return;
        }            
        T previous = iterator.Current;
        while (iterator.MoveNext())
        {
            allButLastAction(previous);
            previous = iterator.Current;
        }
        lastAction(previous);
    }
}

EDIT: As your comment was concerned with the performance of this, I'll reiterate my comment in this answer: while this general problem is reasonably common, it's not common for it to be such a performance bottleneck that it's worth micro-optimizing around. Indeed, I can't remember ever coming across a situation where the looping machinery became a bottleneck. I'm sure it happens, but that isn't "common". If I ever run into it, I'll special-case that particular code, and the best solution will depend on exactly what the code needs to do.

In general, however, I value readability and reusability much more than micro-optimization.

Jon Skeet 2010-08-29 20:07:04

+1 for showing the correct way to use an enumerator directly: `using`.

Martinho Fernandes 2010-08-29 20:14:19

But calling a delegate is rather expensive action (delegate `Action`) so this solution is meant to be legible but it is on the expense of performance. Am I right or is there a trick?

MartyIX 2010-08-29 20:16:33

@MartyIX: Your question talks about a *common* coding problem, but also specifies performance as a key requirement. Just how often does this crop up in such a way that the cost of executing a delegate is going to be a bottleneck? I can't think of that *ever* happening to me. If I ever ran into it, I'd treat that specific situation as an anomaly, and write code specific to it. I wouldn't try to think of a general purpose solution to such an unusual situation. "General purpose" is often counter to the kind of micro-optimization it looks like you're after.

Jon Skeet 2010-08-29 20:28:26

Is calling a delegate an expensive action? Since .NET 2 it is on par with calling a method on an interface.

Martinho Fernandes 2010-08-29 20:29:07

I was just curious I though that delegates are way slower than direct method calls but it seems they're not: http://stackoverflow.com/questions/304770/does-using-delegates-slow-down-my-net-programs (Also from Jon :))

MartyIX 2010-08-29 20:38:08

@Jon: To justify my question note that in OP question there is the performance clearly the requirement.

MartyIX 2010-08-29 20:43:08

@MartyIX: Oops, for some reason I thought you were the OP. It was a reasonable comment given the question... but I don't think the question's entirely reasonable to start with :)

Jon Skeet 2010-08-29 20:43:59

Answer 4

A:

Why don't move dealing with first element outside a loop ?

StringBuilder sb = new StrindBuilder()
sb.append(array.first)
foreach (var elem in array.skip(1)) {
  sb.append(",")
  sb.append(elem)
}

Bart 2010-08-29 20:13:23

Actually, Peter briefly mentions this solution in his question already.

Martinho Fernandes 2010-08-29 20:16:40

If `array` is actually a DB query, won't this method execute the query *twice*?

Gabe 2010-08-29 20:18:55

@Gabe: yes, as written, it would. In that case, you should probably save the query result beforehand with a call to `ToList()`.

Martinho Fernandes 2010-08-29 20:25:42

if array is DB query it shouldn't be hard to execute it only once. Storing in temporary variable or something. But I may be wrong.

Bart 2010-08-29 20:25:49

The reason I strongly dislike this answer is because of the duplication of code. If your line sb.append(...) were to change in any way, you have to remember to change it twice. In your example above it may not be difficult but with more complex real world code my experience is that it happens frequently. I've fixed a good share of colleagues bugs around this exact problem.

Peter Oehlert 2010-08-29 20:35:34

Ok, so I was confused by performance thing. Because I still think that this is the fastest way (modulo buffering), even overriding delimiter by "," in every loop is wasteing some MSIL instructions. And I agree that it could make some troubles. However you removed your performance demand..

Bart 2010-08-30 07:02:46

Answer 5

+5 A:

You are already willing to give up on foreach. So this ought to be suitable:

        using (var enumerator = array.GetEnumerator()) {
            if (enumerator.MoveNext()) {
                for (;;) {
                    append(enumerator.Current);
                    if (!enumerator.MoveNext()) break;
                    append(delimiter);
                }
            }
        }

Hans Passant 2010-08-29 20:23:10

I remember seeing this construct in a language somewhere: a loop with the check, not at the beginning, not at the end, but in the middle. Can't remember what language it was though...

Martinho Fernandes 2010-08-29 20:27:01

This works in almost any language.

Hans Passant 2010-08-29 20:33:50

+1 This is called a loop-with-exit loop in Code Complete and is my preferred solution for this kind of problem.

Jordão 2010-08-29 21:29:58

@Martinho: it is possible to have a loop condition (as a distinct construct, rather than `if`/`break`) in the middle of the loop in Sather: `loop do_something; while!(condition); do_something_else; end;`. This is because loop conditions in Sather are not special language constructs, but are a limited form of coroutines: http://www.icsi.berkeley.edu/~sather/Documentation/LanguageDescription/webmaker/DescriptionX2Eiterators-chapte-1.html

Pavel Minaev 2010-08-30 19:42:08

Answer 6

+4 A:

Sometimes I use LINQ .First() and .Skip(1) to handle this ... This can give a relatively clean (and very readable) solution.

Using you example,

append(array.First());
foreach(var x in array.Skip(1))
{
  append(delimiter);
  append (x);
}

[This assumes there is at least one element in the array, an easy test to add if that's to be avoided.]

Use F# would be another suggestion :-)

Hightechrider 2010-08-29 20:31:20

I've used a foreach to "test" for the first, and used "Take(1)" instead of first. a little funky, but gets the job done.

Merlyn Morgan-Graham 2010-08-29 20:45:53

I confess I haven't quite gotten around to learning F# (it's on the list); how would this be done in F#?

Peter Oehlert 2010-08-29 20:56:34

F# has pattern matching. The `Cons Pattern` allows you to decompose a list into `head :: tail`. See http://msdn.microsoft.com/en-us/library/dd547125.aspx

Hightechrider 2010-08-29 21:10:05

Answer 7

+2 A:

There are ways you "can" get around the doubled code, but in most cases the duplicated code is much less ugly/dangerous than the possible solutions. The "goto" solution you quote doesn't seem like an improvement to me - I don't really think you really gain anything significant (compactness, readability or efficiency) by using it, while you increase the risk of a programmer getting something wrong at some point in the code's lifetime.

In general I tend to go for the approach:

A special case for the first (or last) action
loop for the other actions.

This removes the inefficiencies introduced by checking if the loop is in the first iteration on every time around, and is really easy to understand. For non-trivial cases, using a delegate or helper method to apply the action can minimise code duplication.

Or another approach I use sometimes where efficiency isn't important:

loop, and test if the string is empty to determine if a delimiter is required.

This can be written to be more compact and readable than the goto approach, and doesn't require any extra variables/storage/tests to detect the "special case" iteraiton.

But I think Mark Byers' approach is a good clean solution for your particular example.

Jason Williams 2010-08-29 20:34:45

Answer 8

+4 A:

You can certainly create a goto solution in C# (note: I didn't add null checks):

string Join(string[] array, string delimiter) {
  var sb = new StringBuilder();
  var enumerator = array.GetEnumerator();
  if (enumerator.MoveNext()) {
    goto start;
    loop:
      sb.Append(delimiter);
      start: sb.Append(enumerator.Current);
      if (enumerator.MoveNext()) goto loop;
  }
  return sb.ToString();
}

For your specific example, this looks pretty straighforward to me (and it's one of the solutions you described):

string Join(string[] array, string delimiter) {
  var sb = new StringBuilder();
  foreach (string element in array) {
    sb.Append(element);
    sb.Append(delimiter);
  }
  if (sb.Length >= delimiter.Length) sb.Length -= delimiter.Length;
  return sb.ToString();
}

If you want to get functional, you can try using this folding approach:

string Join(string[] array, string delimiter) {
  return array.Aggregate((left, right) => left + delimiter + right);
}

Although it reads really nice, it's not using a StringBuilder, so you might want to abuse Aggregate a little to use it:

string Join(string[] array, string delimiter) {
  var sb = new StringBuilder();
  array.Aggregate((left, right) => {
    sb.Append(left).Append(delimiter).Append(right);
    return "";
  });
  return sb.ToString();
}

Or you can use this (borrowing the idea from other answers here):

string Join(string[] array, string delimiter) {
  return array.
    Skip(1).
    Aggregate(new StringBuilder(array.FirstOrDefault()),
      (acc, s) => acc.Append(delimiter).Append(s)).
    ToString();
}

Jordão 2010-08-29 21:54:33

Replacing a loop with a goto is terribly ugly and should never be used without a *really* good reason, but +1 for thinking outside the box and finding a way to implement the construct in C#.

Heinzi 2010-08-29 22:27:13

Answer 9

A:

If you want to go the functional route, you could define String.Join like LINQ construct that is reusable across types.

Personally, I would almost always go for code clarity over saving a few opcode executions.

EG:

namespace Play
{
    public static class LinqExtensions {
        public static U JoinElements<T, U>(this IEnumerable<T> list, Func<T, U> initializer, Func<U, T, U> joiner)
        {
            U joined = default(U);
            bool first = true;
            foreach (var item in list)
            {
                if (first)
                {
                    joined = initializer(item);
                    first = false;
                }
                else
                {
                    joined = joiner(joined, item);
                }
            }
            return joined;
        }
    }

    class Program
    {

        static void Main(string[] args)
        {
            List<int> nums = new List<int>() { 1, 2, 3 };
            var sum = nums.JoinElements(a => a, (a, b) => a + b);
            Console.WriteLine(sum); // outputs 6

            List<string> words = new List<string>() { "a", "b", "c" };
            var buffer = words.JoinElements(
                a => new StringBuilder(a), 
                (a, b) => a.Append(",").Append(b)
                );

            Console.WriteLine(buffer); // outputs "a,b,c"

            Console.ReadKey();
        }

    }
}

Sam Saffron 2010-08-30 02:01:57

Linq has this, it's called `Aggregate`

joshperry 2010-08-30 04:11:50

@josh, this is subtly different to Aggregate: words.Aggregate( new StringBuilder(), (a, b) => a.Append(",").Append(b) ); ... returns (",a,b,c")

Sam Saffron 2010-08-30 04:41:18

ansaurus

tags:

views:

answers:

Other ways to deal with "loop initialization" in C#

related questions