views:

131

answers:

3

I wrote these helper functions a few weeks ago and I feel like I'm not taking advantage of some C# language features that would keep me from writing these same loops again for other similar helpers.

Can anyone suggest what I'm missing?

public static IList<string> GetListOfLinesThatContainTextFromList(
        Stream aTextStream, IList<string> aListOfStringsToFind)
{
    IList<string> aList = new List<string>();

    using (var aReader = new StreamReader(aTextStream))
    {
        while (!aReader.EndOfStream)
        {
            var currLine = aReader.ReadLine();

            foreach (var aToken in aListOfStringsToFind)
                if (currLine.Contains(aToken))
                    aList.Add(currLine);
        }
    }

    return aList;
}

public static DataTable GetDataTableFromDelimitedTextFile(
        Stream aTextStream, string aDelimiter)
{
    var aTable = new DataTable();
    Regex aRegEx = new Regex(aDelimiter);

    using (var aReader = new StreamReader(aTextStream))
    {
        while (!aReader.EndOfStream)
        {
            // -snip-
            // build a DataTable based on the textstream infos
        }
    }

    return aTable;
}
+1  A: 

Unfortunately, in this case, I'm not sure the refactoring would be worth the "savings" in terms of logic. You could refactor this so that your using statement and while loop were moved into a separate method, which took an Action<T> delegate. This could be passed in as a lambda expression. This would work very much like List<T>.ForEach.

However, I'd argue that, in this case, it's going to make your code less maintainable. The using statement + while loop, while duplicated, is a very small amount of shared code. The complexity of the lambda is probably not worth the single duplicated loop construct.

If there was more shared logic involved, refactoring this may make sense - but with nothing but a while loop (with the internals of the loop differing), I think you're making it more complicated by refactoring. This would be particularly important if one of your cases needs to break out differently - refactoring this may actually make your life more complicated.

Reed Copsey
In general I feel the same way; that refactoring isn't going to save me much from a performance perspective, but it will make my code less readable. Still, my question was more focused on discovering language features that I may have overlooked.
JMP
+6  A: 

I'd use composition here. For example, for the first one, first split out the bit reading lines - and make it lazy...

public static IEnumerable<string> ReadLines(Stream input)
{
    // Note: don't close the StreamReader - we don't own the stream!
    StreamReader reader = new StreamReader(input);
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        yield return line;
    }
}

(You can find a more fully-featured version of this in MiscUtil.)

Now you can use LINQ to filter out lines not in an appropriate set (and I would use a HashSet rather than a list, unless it's a really short list):

var acceptedLines = new HashSet<string>();
// Populate acceptedLines here
var query = ReadLines(input).Where(line => acceptedLines.Contains(line))
                            .ToList();

You can use the same line-reading method when populating the DataTable. You need to be careful though - this is lazily evaluated so you need to keep the stream open until you've finished reading, and then close it yourself. I prefer to pass in something which lets you get at a Stream (or a TextReader) as then the closing logic can be in the ReadLines method.

Jon Skeet
Is there any particular reason that you chose "ReadLines(Stream input)" over "Lines(this Stream input)"? - it seems like the latter would be more readable: var query = input.Lines.Where...
Robert Venables
Well for one thing you can't write extension properties, and for another I prefer method names that include a verb.
Jon Skeet
*input.Lines().Where =P
Robert Venables
A: 

I agree with Reed on this, that refactoring that little bit of code may not be worth it. But, you seemed to imply that this bit of code will be in several places.

My approach on developing is that I make the application work. Once it is working, with unit tests, then I will go back and refactor. Refactoring while coding runs the risks of harder to maintain code, as you have added some complexity and any more refactoring will continue adding more.

But, I have used the Action delegate to pass in functions, for refactoring, to remove duplicate code, but your two examples are sufficiently different that the use of the Action delegate would not gain you anything.

If you can rework the functions so that the only difference is the processing in the loop then it may make sense to try using an Action delegate, but, if you want to use LINQ then I would strongly go with Jon's suggestion (I love it when others respond while I am typing my response :).

James Black