Do you have a blind spot in programming?

I mean: is there a common technique or language feature that you can't really get used to? Well, I have one (or probably more than one), and mine is the use of delegates. Hands up! Who else doesn't feel comfortable with delegates? Be honest!

So what's a delegate?

Since my courses at university introduced me to C, I know about function pointers. Function pointers are handy if you want to pass methods as arguments. So in my mind a delegate is something like a function pointer. Eureka! I got it. I have not!
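
As a rough sketch of the function-pointer analogy: a delegate type fixes both the parameter list and the return type, and any method with exactly that shape can be assigned to it. The names below (DelegateBasics, IsBlank, CountMatches) are made up for illustration and are not part of the code in this question.

```csharp
using System;
using System.Collections.Generic;

public static class DelegateBasics
{
    // Has the shape of Predicate<string>: one string in, bool out.
    public static bool IsBlank(string line)
    {
        return line.Trim().Length == 0;
    }

    // Accepts any method (or anonymous method) shaped like Predicate<string>.
    public static int CountMatches(IEnumerable<string> lines, Predicate<string> test)
    {
        int count = 0;
        foreach (string line in lines)
        {
            if (test(line)) // invoking the delegate calls the method it refers to
            {
                count++;
            }
        }
        return count;
    }
}
```

A call would look like DelegateBasics.CountMatches(lines, DelegateBasics.IsBlank): the method name is passed without parentheses, much like a function pointer in C.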

A concrete scenario?

I would like to remove any line from a text file that matches a regular expression. Assuming I have a collection of lines, List<T> has a RemoveAll method which seems perfectly suited to that purpose. RemoveAll expects an evaluation method as its argument to decide whether to remove or keep each list element. And there it is: The function pointer!

Any code here?

public static int RemoveLinesFromFile(string path, string pattern)
{
  List<string> lines = new List<string>(File.ReadAllLines(path));
  int result = lines.RemoveAll(DoesLineMatch);
  File.WriteAllLines(path, lines.ToArray());
  return result;
}

So I'm looking for a function DoesLineMatch which evaluates if a line matches a pattern.

Do you see the problem?

RemoveAll expects a delegate Predicate<string> match as argument. I would have coded it like this:

private static bool DoesLineMatch(string line, string pattern)
{
  return Regex.IsMatch(line, pattern);
}

But then I'm getting an error "Expected a method with 'bool DoesLineMatch(string)' signature". What am I missing here?

Does it work at all?

This is how I finally got it working:

public static int RemoveLinesFromFile(string path, string pattern)
{
  List<string> lines = new List<string>(File.ReadAllLines(path));
  int result = lines.RemoveAll(delegate(string line)
    {
      return Regex.IsMatch(line, pattern);
    });
  File.WriteAllLines(path, lines.ToArray());
  return result;
}

I'm happy that it works but I don't understand it.

And what is the question?

What I did to get it working is simply inlining the method. As far as I understand inlining, it is just some kind of use-once-and-destroy-code. If you use a variable or method only once you may inline it, but inlining is always equivalent to declaring it explicitly.

Is there a way to declare the method explicitly? How would I do it?

PS.: Pardon me that my question is somewhat lengthy.

PPS.: As soon as I get this delegate thing I will make the leap from 2.0 to 3.0 and learn lambdas.

PPPS.: Following Jon's hint on efficiency of Regex.IsMatch(string, string) I modified my code:

  int result = lines.RemoveAll(delegate(string line)
    {
      Regex regex = new Regex(pattern);
      return regex.IsMatch(line);
    });

That isn't of much help regarding efficiency matters. So I followed ReSharper's proposal and moved the Regex instantiation to the outer scope:

  Regex regex = new Regex(pattern);
  int result = lines.RemoveAll(delegate(string line)
    {
      return regex.IsMatch(line);
    });

Now ReSharper urged me to replace this with a method group:

  Regex regex = new Regex(pattern);
  int result = lines.RemoveAll(regex.IsMatch);

And that is quite similar to the answers proposed here. Not what I asked for, but again I'm amazed how ReSharper (and Stack Overflow of course) helps learning.
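
For anyone who wants to try the method-group version without touching a file, here is a minimal sketch that works on an in-memory list. LineFilter and RemoveMatching are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LineFilter
{
    // Removes every line matching the pattern and returns how many were removed.
    public static int RemoveMatching(List<string> lines, string pattern)
    {
        Regex regex = new Regex(pattern); // built once, reused for every line
        // Method group: the bound instance (regex) carries the pattern,
        // so what is left to supply has the shape bool(string).
        return lines.RemoveAll(regex.IsMatch);
    }
}
```

The Regex instance plays the same role here as the captured pattern variable in the anonymous method: it holds the extra state that Predicate<string> has no parameter for.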

+7  A: 

You're trying to use a method with a signature of:

bool DoesLineMatch(string line, string pattern)

for a delegate with signature:

bool Predicate(string value)

Where would it get the second string value (the pattern) from?

The only way to do this with an explicitly declared method would be something like this:

public sealed class RegexHolder
{
    private readonly string pattern;

    public RegexHolder(string pattern)
    {
        this.pattern = pattern;
    }

    public bool DoesLineMatch(string line)
    {
        return Regex.IsMatch(line, pattern);
    }
}

Then:

public static int RemoveLinesFromFile(string path, string pattern)
{
    List<string> lines = new List<string>(File.ReadAllLines(path));
    RegexHolder holder = new RegexHolder(pattern);
    int result = lines.RemoveAll(holder.DoesLineMatch);
    File.WriteAllLines(path, lines.ToArray());
    return result;
}

That's close to what the compiler's doing for you with the anonymous method - it will have created a nested class to hold the captured variable (pattern in this case).

(Note that I've avoided any discussion of the efficiency of calling Regex.IsMatch(string, string) rather than creating a single instance of the Regex... that's a different matter.)

Jon Skeet
The answer is currying: http://en.wikipedia.org/wiki/Currying Is there a simple way to do that in C#?
Joachim Sauer
Jon I think you need to remove the second arg from `RegexHolder.DoesLineMatch`.
Drew Noakes
@Drew: Whoops, yes, thanks.
Jon Skeet
@Joachim, there is a way to do currying in C#, even in C# 2.0, because anonymous methods support closures; see my answer. (shameless plug :P)
Pop Catalin
@Joachim: With more shenanigans, yes. It's not as clean as it might be though, and to be honest I'd want the OP to understand what's going on with my suggestion before moving on to currying and lambdas. Personally I'd just use a lambda expression or an anonymous method to call it directly and not bother with the separate method - but I believe the point here is *understanding* what's going on.
Jon Skeet
By the way, why the downvote folks?
Jon Skeet
@Jon, I downvoted you because you had an upvote but the code did not compile, as Drew mentioned. I undid the downvote once you corrected the code.
Pop Catalin
I was disturbed by Jon's comment on Regex and wanted to know more, so here's the link: http://stackoverflow.com/questions/1155694/static-vs-instance-versions-of-regex-match-in-c
Philippe
@Pop: You downvote for basically a typo? That's pretty harsh, IMO... especially when you could have just edited it to make it work instead.
Jon Skeet
That seems to be the norm of late, I got down-voted the other day for using HTML formatting styles from the "90's" (uppercase tags), which I found interesting.
Kyle Rozendo
@Jon Skeet, no, not because of a typo, but because it was the only upvoted answer and it had exactly the same issue (a compile error) that the asker was trying to avoid. I thought it was an unfair upvote given the circumstances: the code would fail with the same error, and correct answers had already been posted. I don't want to penalize you or anyone else for typos; however, I didn't think it was fair for the top answer to have the same error that the asker had.
Pop Catalin
It was never meant to be a permanent downvote; it was meant to move the answer off the top until it was fixed, so it wouldn't appear to be the correct answer in the meantime. As for editing someone else's code that's not CW, I refrain from doing it.
Pop Catalin
A: 

Wot Jon Says.

Furthermore, in C# 3 you might choose to use a lambda, assuming you still want to pass pattern to your method:

int result = lines.RemoveAll(l => DoesLineMatch(l, pattern));
Drew Noakes
A: 

You could declare it like this:

bool DoesLineMatch(string line)
{
  return Regex.IsMatch(line, pattern);
}

Where pattern is a private variable in your class. But that's a bit ugly; that's why you can declare the delegate inline and use a closure over the pattern variable that is declared locally in your RemoveLinesFromFile method.
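
A minimal sketch of that field-based variant, assuming a static field because the calling method in the question is static. FieldBasedFilter, RemoveMatching and patternToRemove are names invented for this example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldBasedFilter
{
    // Shared state standing in for the missing second argument.
    private static string pattern;

    // One string parameter, so this fits Predicate<string>.
    private static bool DoesLineMatch(string line)
    {
        return Regex.IsMatch(line, pattern);
    }

    public static int RemoveMatching(List<string> lines, string patternToRemove)
    {
        pattern = patternToRemove;             // must be set before RemoveAll runs
        return lines.RemoveAll(DoesLineMatch); // explicitly declared method, no anonymous delegate
    }
}
```

The ugliness is visible here: the field is mutable shared state, so two threads calling RemoveMatching at the same time could overwrite each other's pattern. The closure avoids that by keeping the pattern local to each call.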

Philippe
+1  A: 

In C# 2.0 you can create an anonymous delegate, which you can use to capture your pattern variable:

        int result = lines.RemoveAll( delegate (string s) {return DoesLineMatch(s, pattern);});
Pop Catalin
But that's just substituting one anonymous method for another. The way I read the question, the OP wants to avoid using the anonymous method, or at least understand why it's necessary.
Jon Skeet
(I wouldn't really describe that as currying - or even partial function application - either btw. It's more using a closure than currying.)
Jon Skeet
@Jon, Yes you're right, I was confusing terms ...
Pop Catalin
+2  A: 

Basically, your anonymous delegate causes the compiler to do the following: generate a class with an unpronounceable name that has a field 'pattern' and a method similar to the one you wrote in the delegate. The generated class looks something like this:

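class Matcher {
    public string Pattern;
    public Matcher(string pattern) {
        Pattern = pattern;
    }
    // Input first, pattern second, as Regex.IsMatch(string, string) expects.
    public bool IsMatch(string value) {
       return Regex.IsMatch(value, Pattern);
    }
}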

You see, this class converts a two-argument function into a one-argument function.

Your code is converted to something like

public static int RemoveLinesFromFile(string path, string pattern)
{
  List<string> lines = new List<string>(File.ReadAllLines(path));
  Matcher matcher = new Matcher(pattern);
  int result = lines.RemoveAll(matcher.IsMatch);
  File.WriteAllLines(path, lines.ToArray());
  return result;
}

You see, the compiler takes a variable from the enclosing scope and binds it to the function. Now you have a function with the required signature that encloses an additional variable. That's why anonymous methods that capture variables are called closures from a CS point of view. Of course, everything mentioned here can be done manually; the anonymous method is just a simpler way of doing it.
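
One detail worth seeing in code is that the variable itself is captured, not a copy of its value at the time the delegate is created. The sketch below is only an illustration; ClosureDemo, CountTwice and the prefix variable are hypothetical and unrelated to the file-filtering code.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureDemo
{
    // Returns the match counts before and after the captured variable changes.
    public static int[] CountTwice(List<string> lines)
    {
        string prefix = "a";
        // The anonymous method captures the variable 'prefix', not its current value.
        Predicate<string> startsWithPrefix = delegate(string line)
        {
            return line.StartsWith(prefix, StringComparison.Ordinal);
        };

        int first = lines.FindAll(startsWithPrefix).Count;
        prefix = "b"; // the delegate sees this change on its next call
        int second = lines.FindAll(startsWithPrefix).Count;
        return new int[] { first, second };
    }
}
```

This is consistent with the generated-class picture: the local variable lives in a field of the hidden object, and both the method body and the delegate read and write that same field.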

Hope this helps.

elder_george
+2  A: 

To expand on some of the other answers here, here's a generic currying function for C#:

public static class DelegateUtils
{
    public static Predicate<T> ToPredicate<T>(this Func<T, Boolean> func)
    {
        return value => func(value);
    }

    public static Func<TResult> Curry<T1, TResult>(
        this Func<T1, TResult> func, T1 firstValue)
    {
        return () => func(firstValue);
    }

    public static Func<T2, TResult> Curry<T1, T2, TResult>(
        this Func<T1, T2, TResult> func, T1 firstValue)
    {
        return p2 => func(firstValue, p2);
    }

    public static Func<T2, T3, TResult> Curry<T1, T2, T3, TResult>(
        this Func<T1, T2, T3, TResult> func, T1 firstValue)
    {
        return (p2, p3) => func(firstValue, p2, p3);
    }

    // if you need more, follow the examples
}

In your example, you would switch the order of the arguments to your matching function, so that the parameter you want to match against is the first, like this:

private static bool DoesLineMatch(string pattern, string line)
{
    return Regex.IsMatch(line, pattern);
}

Then you would use currying to fix the first parameter, and obtain a delegate that you could then convert to a predicate, like this:

Func<String, String, Boolean> func = DoesLineMatch;
Func<String, Boolean> predicateCandidate = func.Curry("yourPattern");
Predicate<String> predicate = predicateCandidate.ToPredicate();
lines.RemoveAll(predicate);

Of course, you can inline it all:

lines.RemoveAll(new Func<String, String, Boolean>(DoesLineMatch)
    .Curry("yourPattern")
    .ToPredicate());
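
For a version that compiles on its own, here is a compact sketch that repeats only the two-argument Curry overload as a plain static method. CurryDemo and RemoveMatching are hypothetical names, and the lambda at the end is an assumption standing in for ToPredicate.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CurryDemo
{
    // Same shape as the two-argument Curry overload above, repeated so the sketch is self-contained.
    private static Func<T2, TResult> Curry<T1, T2, TResult>(
        Func<T1, T2, TResult> func, T1 firstValue)
    {
        return p2 => func(firstValue, p2);
    }

    // Pattern comes first so that it is the argument that gets fixed.
    private static bool DoesLineMatch(string pattern, string line)
    {
        return Regex.IsMatch(line, pattern);
    }

    public static int RemoveMatching(List<string> lines, string pattern)
    {
        Func<string, string, bool> twoArgs = DoesLineMatch;
        Func<string, bool> oneArg = Curry(twoArgs, pattern);
        // Func<string, bool> and Predicate<string> are distinct delegate types,
        // so the call is wrapped rather than passing oneArg directly.
        return lines.RemoveAll(line => oneArg(line));
    }
}
```

The wrapping step is the reason a ToPredicate helper is needed at all: two delegate types with identical signatures are still not interchangeable.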
Lasse V. Karlsen
+1  A: 

What you've been bitten by here is the phenomenon that C programmers don't usually consider functions with different arguments as differently typed - it doesn't occur to them that passing a pointer to a function with two string arguments where a pointer to a function with a single string argument is expected should generate a type error at compile time, as would happen in e.g. Algol 68.

The C language is only partially to blame for this: it can in fact properly type function pointers by argument and return types. But the notation for these types is really awkward, C compilers don't always require this, and when they do programmers tend to get around it by casting all pointers to (void *) anyway.

Learning C as a first language does teach you some bad habits.
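
By contrast, C# delegate types are checked strictly at compile time, which is exactly the error reported in the question. A small sketch, with SignatureCheck, MatchesPattern, HasDigit and Pick as made-up names:

```csharp
using System;
using System.Text.RegularExpressions;

public static class SignatureCheck
{
    private static bool MatchesPattern(string line, string pattern)
    {
        return Regex.IsMatch(line, pattern);
    }

    private static bool HasDigit(string line)
    {
        return Regex.IsMatch(line, "[0-9]");
    }

    public static Predicate<string> Pick()
    {
        // Predicate<string> wrong = MatchesPattern; // rejected at compile time: two parameters, one expected
        Predicate<string> ok = HasDigit;              // accepted: bool(string) matches exactly
        return ok;
    }
}
```

Uncommenting the first assignment makes the compiler refuse the program, and there is no cast that turns a two-parameter method into a Predicate<string>.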

reinierpost