+9  Q: 

LINQ for beginners

Hello,

I love C#, I love the framework, and I love learning as much as possible. Today I began reading articles about LINQ in C#, but I couldn't find anything good for a beginner who has never worked with SQL in his life.

I found this article very helpful and understood small parts of it, but I'd like to see more examples.

After reading it a couple of times, I tried to use LINQ in a function of mine, but I failed.

    private void Filter(string filename)
    {
        using (TextWriter writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
        {
            using(TextReader reader = File.OpenText(filename))
            {
                string line;
                while((line = reader.ReadLine()) != null)
                {
                    string[] items = line.Split('\t');
                    int myInteger = int.Parse(items[1]);
                    if (myInteger == 24809) writer.WriteLine(line); 
                }
            }
        }
    }

This is what I did, and it did not work; the result was always false.

    private void Filter(string filename)
    {
        using (TextWriter writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
        {
            using(TextReader reader = File.OpenText(filename))
            {
                string line;
                while((line = reader.ReadLine()) != null)
                {
                    string[] items = line.Split('\t');
                    var Linqi = from item in items
                                where int.Parse(items[1]) == 24809
                                select true;
                    if (Linqi == true) writer.WriteLine(line); 
                }
            }
        }
    }

I'm asking for two things:

  1. What would the function look like using as much LINQ as possible?
  2. A website/book/article about LINQ, but please note I'm a decent beginner in SQL/LINQ.

Thank you in advance!

+5  A: 

For a website as a starting point, you can try Hooked on LINQ

cbeuker
+6  A: 

101 LINQ Samples is certainly a good collection of examples. Also LINQPad might be a good way to play around with LINQ.

Joey
+1  A: 

MSDN LINQ Examples: http://msdn.microsoft.com/en-us/vcsharp/aa336746.aspx

Stephen Wrighton
A: 

You cannot just check whether Linqi is true. Linqi is an IEnumerable<bool> (in this case), so you have to check it with something like Linqi.First() == true.

Here is a small example:

    string[] items = { "12121", "2222", "24809", "23445", "24809" };

    var Linqi = from item in items
                where Convert.ToInt32(item) == 24809
                select true;
    if (Linqi.First() == true) Console.WriteLine("Got a true");

You could also iterate over Linqi, and in my example there are 2 items in the collection.
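
One caveat: First() throws an InvalidOperationException when the sequence is empty, so it is only safe when at least one item matches. If the question is simply "did anything match?", Any() is probably the safer call. A minimal sketch; the class and method names (AnyExample, ContainsValue) are made up for illustration:

```csharp
using System;
using System.Linq;

public static class AnyExample
{
    // Hypothetical helper: true if at least one item parses to the target value.
    // Unlike First(), Any() simply returns false when nothing matches.
    public static bool ContainsValue(string[] items, int target)
    {
        return items.Any(item => int.Parse(item) == target);
    }
}
```

With the array from the example, `AnyExample.ContainsValue(items, 24809)` is true, and searching for a value that is not present returns false instead of throwing.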

CSharpAtl
An example of using as much linq as possible in my function will be very well appreciated. :)
John
love when there is no explanation for a down vote...that should be a requirement.
CSharpAtl
since the question was about SQL and LINQ I did not try to completely rewrite his code.
CSharpAtl
A: 

If I were to rewrite your filter function using LINQ where possible, it'd look like this:

private void Filter(string filename)
{
    using (TextWriter writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
    {
        var lines = File.ReadAllLines(filename);
        var matches = from line in lines
                      let items = line.Split('\t')
                      let myInteger = int.Parse(items[1])
                      where myInteger == 24809
                      select line;

        foreach (var match in matches)
        {
            writer.WriteLine(match);
        }
    }
}
Judah Himango
Hmm...looks familiar :) (although not sure why you're selecting the int when you want to print the line)
lc
Note that reading all the lines in one go is a bit of a limiting factor - and unnecessarily so. (See my answer :)
Jon Skeet
It's an issue for big files, obviously. Jon Skeet's answer is better...sigh, what else is new. :-)
Judah Himango
@lc, separation of query and output is nice. I declaratively grabbed the data via a LINQ query, then did output afterwards. I like this clean separation.
Judah Himango
Will this function compile? foreach loop is trying to access "line", but line is only available in the Linq statement...
Milan Gardian
@Judah, Ok, but where are you getting the "line" in "writer.WriteLine(line)" then?
lc
My fault -- I didn't compile this before posting. I'll update the post with proper code.
Judah Himango
A: 

To answer the first question, there frankly isn't too much reason to use LINQ the way you suggest in the above function except as an exercise. In fact, it probably just makes the function harder to read.

LINQ is more useful for operating on a collection than on a single element, and I would use it that way instead. So here's my attempt at using as much LINQ as possible in the function (ignoring efficiency; I don't recommend reading the whole file into memory like this):

private void Filter(string filename)
{
    using (TextWriter writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
    {
        using(TextReader reader = File.OpenText(filename))
        {
            List<string> lines = new List<string>(); // must be initialized before Add is called
            string line;
            while((line = reader.ReadLine()) != null)
                lines.Add(line);

            var query = from l in lines
                        let splitLine = l.Split('\t')
                        where int.Parse(splitLine.Skip(1).First()) == 24809
                        select l;

            foreach(var l in query)               
                writer.WriteLine(l); 
        }
    }
}
lc
+18  A: 

Well one thing that would make your sample more "LINQy" is an IEnumerable<string> for reading lines from a file. Here's a somewhat simplified version of my LineReader class from MiscUtil:

using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;

public sealed class LineReader : IEnumerable<string>
{
    readonly Func<TextReader> dataSource;

    public LineReader(string filename)
        : this(() => File.OpenText(filename))
    {
    }

    public LineReader(Func<TextReader> dataSource)
    {
        this.dataSource = dataSource;
    }

    public IEnumerator<string> GetEnumerator()
    {
        using (TextReader reader = dataSource())
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }


    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Now you can use that:

    var query = from line in new LineReader(filename)
                let items = line.Split('\t')
                let myInteger = int.Parse(items[1])
                where myInteger == 24809
                select line;

    using (TextWriter writer = File.CreateText(Application.StartupPath 
                                               + "\\temp\\test.txt"))
    {
        foreach (string line in query)
        {
            writer.WriteLine(line);
        }
    }

Note that it would probably be more efficient to not have the let clauses:

    var query = from line in new LineReader(filename)
                where int.Parse(line.Split('\t')[1]) == 24809
                select line;

at which point you could reasonably do it all in "dot notation":

    var query = new LineReader(filename)
                        .Where(line => int.Parse(line.Split('\t')[1]) == 24809);

However, I far prefer the readability of the original query :)
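
For readers on .NET 4 or later (an assumption; nothing in this thread requires it), the framework's File.ReadLines returns a lazily evaluated IEnumerable<string>, so it can probably stand in for LineReader in the same query shape. A minimal sketch; the class name, the method name, and the target parameter are illustrative, and the Length check is an extra guard that the queries above do not have:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ReadLinesSketch
{
    // Lazily yields the lines whose second tab-separated column equals target.
    // The file is opened only when the result is enumerated.
    public static IEnumerable<string> MatchingLines(string filename, int target)
    {
        return from line in File.ReadLines(filename)
               let items = line.Split('\t')
               where items.Length > 1 && int.Parse(items[1]) == target
               select line;
    }
}
```

The result can be consumed with the same foreach/WriteLine loop as before. Note that int.Parse still throws on a non-numeric second column.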

Jon Skeet
@Jon, I've got your book. It's really great to have such a piece of art. I have one more question: could you tell me which pages of the book explain LINQ in depth?
John
You're too damn fast, Jon. :)
Matthew Whited
On a similar note, here's a blog post on making Streams enumerable: http://www.atalasoft.com/cs/blogs/stevehawley/archive/2009/01/30/making-streams-enumerable.aspx
plinth
@John: Glad you like the book :) Chapters 11 and 12 explain LINQ, but chapter 6 covers iterators so that's what you want to understand the LineReader class. (I'm going to use it as an example for the second edition.) If there are any specific details of LINQ which you reckon are missing, please let me know so I can include them in the second edition :)
Jon Skeet
@plinth: That's interesting, but I'd prefer the enumerable to take an () => Stream. That way the iterator can close the stream itself, and potentially open multiple streams (if GetEnumerator is called multiple times). I'm also disturbed at the use of *Stream* (binary data) to enumerate *characters* (text data). It should be IEnumerable<byte> or use a TextReader instead of a stream.
Jon Skeet
@plinth: Whoops, I meant a Func<Stream>, not a "() => Stream" :)
Jon Skeet
+1  A: 

First, I would introduce this method:

private IEnumerable<string> ReadLines(StreamReader reader)
{
    while(!reader.EndOfStream)
    {
        yield return reader.ReadLine();
    }
}

Then, I would refactor the main method to use it. I put both using statements above the same block, and also added a range check to ensure items[1] doesn't fail:

private void Filter(string fileName)
{
    using(var writer = File.CreateText(Application.StartupPath + "\\temp\\test.txt"))
    using(var reader = File.OpenText(fileName))
    {
        var myIntegers =
            from line in ReadLines(reader)
            let items = line.Split('\t')
            where items.Length > 1
            let myInteger = Int32.Parse(items[1])
            where myInteger == 24809
            select myInteger;

        foreach(var myInteger in myIntegers)
        {
            writer.WriteLine(myInteger);
        }
    }
}
Bryan Watts
In what way does it not dispose of the TextReader? It's in a using statement.
Jon Skeet
(Note that it *won't* dispose it if a caller manually calls MoveNext() and then abandons the iterator, but foreach calls Dispose automatically.)
Jon Skeet
You're right, I mis-read. You aren't using the constructor which takes a Func<TextReader>. I meant that when you use that function, the LineReader class doesn't fully encapsulate the lifetime of the TextReader instance.
Bryan Watts
As in your example, there are two "where" clauses. If the first where clause is not true, does it continue evaluating the LINQ statement, or does it stop and leave myIntegers = null?
John
@Bryan: No, even if you use the constructor which takes a Func<TextReader> my class is responsible for the lifetime of the TextReader itself - the function is only called within the GetEnumerator call. And I *am* using that constructor, chained from the string constructor. It's the fact that I only keep a Func<TextReader> instead of an actual TextReader which makes it safe.
Jon Skeet
@Jon: You're right, I am being too pedantic. I wasn't thinking about your example, but more the general design of that constructor. What if someone passed you a local variable, or member variable? It's not that you don't encapsulate the lifetime, but that the lifetime cannot be guaranteed to be encapsulated. I like the idea, but it seems dangerous in a general-purpose API.
Bryan Watts
@John: the where clause will filter out elements which don't match. This means that any line with a length of 0 or 1 will not have an integer represented in the final sequence. The rest of the lines will still be considered. If they all happen to fail the where clause, the result would be an empty sequence, not null.
Bryan Watts
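
To make that last point concrete, here is a small sketch (the class name, method name, and sample data are invented for illustration) of a query with the same two where clauses. When every line is rejected, the result is still a sequence, just an empty one:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EmptyResultSketch
{
    public static IEnumerable<int> SecondColumnMatches(IEnumerable<string> lines, int target)
    {
        return from line in lines
               let items = line.Split('\t')
               where items.Length > 1              // lines without a second column are skipped
               let myInteger = int.Parse(items[1]) // only evaluated for lines that passed the first where
               where myInteger == target
               select myInteger;
    }
}
```

Calling `EmptyResultSketch.SecondColumnMatches(new[] { "a", "b\t7" }, 24809)` returns a non-null sequence for which `Any()` is false; the first line fails the length check and the second fails the value check.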
@Bryan: It doesn't matter if they pass a local variable or a member variable. They're passing in a *function*, not a TextReader. If they pass in a function which *returns* a local variable via a lambda expression, then that will be captured. In what way is this any different to your method taking a StreamReader? In both cases, if the client is foolish they'll end up with a leak. However, it's *easier* to get it right with my code IMO, as the client doesn't need to do any clean-up themselves.
Jon Skeet
I was talking about closure captures. The function cannot guarantee a new instance is created every time the delegate is invoked, so it cannot reasonably dispose of whatever instance it gets. Calling a constructor in the function can be a tacit part of the contract, but it can't be enforced. The difference between the LineReader class and my method is that my method is private, and thus the client is known and trusted. It's my own dumb fault if a leak is caused. With LineReader, a simple misunderstanding of the API makes the client vulnerable to a leak: "Hey, great, I already have a reader!"
Bryan Watts
Well yes, your method is private and can't be reused. You could make LineReader a private nested class too if you wanted to go down that route. As soon as you introduce reuse into the equation, you need to specify the behaviour in the documentation etc. I don't think that's unreasonable - I think people expect StreamReader.Close to close the underlying stream, for example.
Jon Skeet
The perfect is the enemy of the good, to be certain. I was seeking a way to express the function's intent. Perhaps using "factory" in the variable name, or even defining a factory interface. One implementation could use a function, another a file name. LineReader would no longer require knowledge of which File.OpenText overload to use, and its API would be uniform and clear. Just some thoughts.
Bryan Watts
+3  A: 

If you're after a book, I found LINQ in Action from Manning Publications a good place to start.

Nick
+1  A: 

As for Linq books, I would recommend:

  

Both are excellent books that drill into Linq in detail.

To add yet another variation to the as-much-linq-as-possible topic, here's my take:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace LinqDemo
{
    class Program
    {
        static void Main()
        {
            var baseDir = AppDomain.CurrentDomain.BaseDirectory;
            File.WriteAllLines(
                Path.Combine(baseDir, "out.txt"),
                File.ReadAllLines(Path.Combine(baseDir, "in.txt"))
                    .Select(line => new KeyValuePair<string, string[]>(line, line.Split(','))) // split each line into columns, also carry the original line forward
                    .Where(info => info.Value.Length > 1) // filter out lines that don't have 2nd column
                    .Select(info => new KeyValuePair<string, int>(info.Key, int.Parse(info.Value[1]))) // convert 2nd column to int, still carrying the original line forward
                    .Where(info => info.Value == 24809) // apply the filtering criteria
                    .Select(info => info.Key) // restore original lines
                    .ToArray());
        }
    }
}

Note that I changed your tab-delimited-columns to comma-delimited columns (easier to author in my editor that converts tabs to spaces ;-) ). When this program is run against an input file:

A1,2
B,24809,C
C

E
G,24809

The output will be:

B,24809,C
G,24809

You could improve memory requirements of this solution by replacing "File.ReadAllLines" and "File.WriteAllLines" with Jon Skeet's LineReader (and LineWriter in a similar vein, taking IEnumerable and writing each returned item to the output file as a new line). This would transform the solution above from "get all lines into memory as an array, filter them down, create another array in memory for result and write this result to output file" to "read lines from input file one by one, and if that line meets our criteria, write it to output file immediately" (pipeline approach).
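
A rough sketch of that pipeline variant follows. It assumes .NET 4 or later for File.ReadLines (which streams lines lazily) in place of LineReader, and WriteLinesTo is a hypothetical extension method standing in for the LineWriter idea; neither name comes from the code above:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PipelineSketch
{
    // Hypothetical stand-in for "LineWriter": writes each item as soon as it is produced.
    public static void WriteLinesTo(this IEnumerable<string> lines, string path)
    {
        using (TextWriter writer = File.CreateText(path))
        {
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
        }
    }

    public static void Filter(string inputPath, string outputPath)
    {
        File.ReadLines(inputPath)                                      // one line at a time, not an array
            .Select(line => new { line, columns = line.Split(',') })   // carry the original line forward
            .Where(x => x.columns.Length > 1)                          // skip lines without a 2nd column
            .Where(x => int.Parse(x.columns[1]) == 24809)              // apply the filtering criteria
            .Select(x => x.line)                                       // restore original lines
            .WriteLinesTo(outputPath);
    }
}
```

An anonymous type replaces the KeyValuePair plumbing here, which is purely a readability choice. The input and output paths must differ, since the input file is still open while the output is being written.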

Milan Gardian
+1  A: 

I found this article crucial to understanding LINQ, which builds on many of the new constructs introduced in .NET 3.0 and 3.5:

I'll warn you it's a long read, but if you really want to understand what LINQ is and does, I believe it is essential.

http://blogs.msdn.com/ericwhite/pages/FP-Tutorial.aspx

Happy reading

Jose