tags:

views:

603

answers:

7

Is it possible to write the following 'foreach' as a LINQ statement, and I guess the more general question can any for loop be replaced by a LINQ statement.

I'm not interested in any potential performance cost just the potential of using declarative approaches in what is traditionally imperative code.

    private static string SomeMethod()
    {
        if (ListOfResources .Count == 0)
            return string.Empty;

        var sb = new StringBuilder();
        foreach (var resource in ListOfResources )
        {
            if (sb.Length != 0)
                sb.Append(", ");

            sb.Append(resource.Id);
        }

        return sb.ToString();
    }

Cheers

AWC

+2  A: 

Technically, yes.

Any foreach loop can be converted to LINQ by using a ForEach extension method,such as the one in MoreLinq.

If you only want to use "pure" LINQ (only the built-in extension methods), you can abuse the Aggregate extension method, like this:

foreach(type item in collection { statements }

type item;
collection.Aggregate(true, (j, itemTemp) => {
    item = itemTemp;
    statements
    return true;
);

This will correctly handle any foreach loop, even JaredPar's answer. EDIT: Unless it uses ref / out parameters, unsafe code, or yield return.
Don't you dare use this trick in real code.


In your specific case, you should use a string Join extension method, such as this one:

///<summary>Appends a list of strings to a StringBuilder, separated by a separator string.</summary>
///<param name="builder">The StringBuilder to append to.</param>
///<param name="strings">The strings to append.</param>
///<param name="separator">A string to append between the strings.</param>
public static StringBuilder AppendJoin(this StringBuilder builder, IEnumerable<string> strings, string separator) {
    if (builder == null) throw new ArgumentNullException("builder");
    if (strings == null) throw new ArgumentNullException("strings");
    if (separator == null) throw new ArgumentNullException("separator");

    bool first = true;

    foreach (var str in strings) {
        if (first)
            first = false;
        else
            builder.Append(separator);

        builder.Append(str);
    }

    return builder;
}

///<summary>Combines a collection of strings into a single string.</summary>
public static string Join<T>(this IEnumerable<T> strings, string separator, Func<T, string> selector) { return strings.Select(selector).Join(separator); }
///<summary>Combines a collection of strings into a single string.</summary>
public static string Join(this IEnumerable<string> strings, string separator) { return new StringBuilder().AppendJoin(strings, separator).ToString(); }
SLaks
There's not one single LINQ query in your code so it's not real answering the question is it? Language INtegrated Query does not mean Extension methods :) it's build around extension methods but that does not make every extension method part of the Language INtegrated Query syntax
Rune FS
@Rune: I'm well aware of that. However, common usage, including every other answer here, refers to any extension methods that follow LINQ's style. (eg, MoreLinq)
SLaks
+8  A: 

In general yes, but there are specific cases that are extremely difficult. For instance, the following code in the general case does not port to a LINQ expression without a good deal of hacking.

var list = new List<Func<int>>();
foreach ( var cur in (new int[] {1,2,3})) {
  list.Add(() => cur);
}

The reason why is that with a for loop, it's possible to see the side effects of how the iteration variable is captured in a closure. LINQ expressions hide the lifetime semantics of the iteration variable and prevent you from seeing side effects of capturing it's value.

Note. The above code is not equivalent to the following LINQ expression.

var list = Enumerable.Range(1,3).Select(x => () => x).ToList();

The foreach sample produces a list of Func<int> objects which all return 3. The LINQ version produces a list of Func<int> which return 1,2 and 3 respectively. This is what makes this style of capture difficult to port.

JaredPar
@Sean, your counter example produces a different result. My sample will create a list of `Func<int>` which all return 3. Yours will return 1,2 and 3.
JaredPar
Yeah, good point. Deleted.
Sean Devlin
+1  A: 

In general, you can write a lambda expression using a delegate which represents the body of a foreach cycle, in your case something like :

resource => { if (sb.Length != 0) sb.Append(", "); sb.Append(resource.Id); }

and then simply use within a ForEach extension method. Whether this is a good idea depends on the complexity of the body, in case it's too big and complex you probably don't gain anything from it except for possible confusion ;)

Thomas Wanner
+1  A: 

The specific loop in your question can be done declaratively like this:

var result = ListOfResources
            .Select<Resource, string>(r => r.Id.ToString())
            .Aggregate<string, StringBuilder>(new StringBuilder(), (sb, s) => sb.Append(sb.Length > 0 ? ", " : String.Empty).Append(s))
            .ToString(); 

As to performance, you can expect a performance drop but this is acceptable for most applications.

ChaosPandion
+2  A: 

I think what's most important here is that to avoid semantic confusion, your code should only be superficially functional when it is actually functional. In other words, please don't use side effects in LINQ expressions.

Max Strini
Are you hinting at the fact that for each item of the loop any equivalent LINQ statement should do exactly the same for all items...
AWC
No, I'm just saying that a LINQ expression shouldn't be thought of as "doing" anything, but as "being" something, i.e. having a value. If you can't find a natural way to think of it that way, best not to formulate it as a LINQ expression.As it happens, the specific example you gave *does* make sense as a functional expression, as Konrad showed above.edit: er, wait, sort of. It should do "exactly the same" because it should be stateless (shouldn't depend on statements that update variables etc.), if that's what you meant.
Max Strini
+16  A: 

Sure. Heck, you can replace arithmetic with LINQ queries:

http://blogs.msdn.com/ericlippert/archive/2009/12/07/query-transformations-are-syntactic.aspx

But you shouldn't.

The purpose of a query expression is to represent a query operation. The purpose of a "for" loop is to iterate over a particular statement so as to have its side-effects executed multiple times. Those are frequently very different. I encourage replacing loops whose purpose is merely to query data with higher-level constructs that more clearly query the data. I strongly discourage replacing side-effect-generating code with query comprehensions, though doing so is possible.

Eric Lippert
not everything is a nail...
AWC
@Eric: Do you see anything wrong with using a ForEach extension method for executing non-side-effect-generating code e.g. doing a WriteLn on every result from the query?
Tom Bushell
@Tom: I'm not following your train of thought here. Surely writing out a line is a side effect, no?
Eric Lippert
@Eric: The only two arguments I've seen that hold any water with me for turning `for` loops into query-like extensions, are: 1) Lazy evaluation of side-effects at the consume site, and 2) converting loops into functional constructs for parallel evaluation (as in PLINQ). It's hard to achieve either with a normal loop.
LBushkin
@Eric: Yeah, I suppose that's true in the strictest definition, but it doesn't change any values in the result set - just dumps them to output. Almost every LINQ example you see ends with a separate foreach loop that dumps the values, and this has always bothered me - it looks clumsy after using the nice LINQ syntax. Could you give a few more details about why you "strongly discourage" this approach? Your point seems to be that "LINQ is for queries only", but is there any potential harm, in the simple cases at least? Not trying to argue, just to understand.
Tom Bushell
@Tom: An operation without side effects should be able to be (1) evaluated lazily, (2) evaluated out of order with respect to other lazy operations, (3) evaluated once and cached, (4) evaluated twice if the cache policy throws out the cached result. If you want to write something to the console, you want it to be written ONCE, in the correct order with respect to all other writes.
Eric Lippert
@LBushkin: parallelization is a good example of a loop being used to evaluate a query result; it seems perfectly reasonable to eschew the loop and move that to a higher level of abstraction. Lazy evaluation is good precisely in those circumstances when there are no side effects; see my previous comment.
Eric Lippert
@Eric: I largely agree. With regard to lazy evaluation being useful only when there are no side effects, I would argue that if the side effects are completely independent of each other - meaning that order of evaluation has no effect on outcome, then lazy evaluation can still be appropriate. I think an example of poorly thought through lazy evaluation would be the answer to this question, which you responded to earlier. http://stackoverflow.com/questions/2222292/optimize-this-slow-linq-to-objects-query/2222375#2222375
LBushkin
+3  A: 

In fact, your code does something which is fundamentally very functional, namely it reduces a list of strings to a single string by concatenating the list items. The only imperative thing about the code is the use of a StringBuilder.

The functional code makes this much easier, actually, because it doesn’t require a special case like your code does. Better still, .NET already has this particular operation implemented, and probably more efficient than your code1):

return String.Join(", ", ListOfResources.Select(s => s.Id.ToString()).ToArray());

(Yes, the call to ToArray() is annoying but Join is a very old method and predates LINQ.)

Of course, a “better” version of Join could be used like this:

return ListOfResources.Select(s => s.Id).Join(", ");

The implementation is rather straightforward – but once again, using the StringBuilder (for performance) makes it imperative.

public static String Join<T>(this IEnumerable<T> items, String delimiter) {
    if (items == null)
        throw new ArgumentNullException("items");
    if (delimiter == null)
        throw new ArgumentNullException("delimiter");

    var strings = items.Select(item => item.ToString()).ToList();
    if (strings.Count == 0)
        return string.Empty;

    int length = strings.Sum(str => str.Length) +
                 delimiter.Length * (strings.Count - 1);
    var result = new StringBuilder(length);

    bool first = true;

    foreach (string str in strings) {
        if (first)
            first = false;
        else
            result.Append(delimiter);
        result.Append(str);
    }

    return result.ToString();
}

1) Without having looked at the implementation in the reflector, I’d guess that String.Join makes a first pass over the strings to determine the overall length. This can be used to initialize the StringBuilder accordingly, thus saving expensive copy operations later on.

EDIT by SLaks: Here is the reference source for the relevant part of String.Join from .Net 3.5:

string jointString = FastAllocateString( jointLength );
fixed (char * pointerToJointString = &jointString.m_firstChar) {
    UnSafeCharBuffer charBuffer = new UnSafeCharBuffer( pointerToJointString, jointLength); 

    // Append the first string first and then append each following string prefixed by the separator. 
    charBuffer.AppendString( value[startIndex] ); 
    for (int stringToJoinIndex = startIndex + 1; stringToJoinIndex <= endIndex; stringToJoinIndex++) {
        charBuffer.AppendString( separator ); 
        charBuffer.AppendString( value[stringToJoinIndex] );
    }
    BCLDebug.Assert(*(pointerToJointString + charBuffer.Length) == '\0', "String must be null-terminated!");
} 
Konrad Rudolph
+1. The only sane way to handle this specific situation in a clear, concise way.
jeroenh
In .NET4, `String.Join` also takes an IEnumerable :)
JulianR
@Julian: good to hear that. *Finally*!
Konrad Rudolph
String.Join actually uses a `char[]` with some internal unsafe copy tricks to achieve even better performance.
SLaks
@SLaks: yes it does but I wanted to keep it simple and still get reasonable performance.
Konrad Rudolph
@Konrad: Since your code isn't in mscorlib, you can't use those tricks anyway. I don't think there's any way to make your code faster without using internal members.
SLaks