tags:

views:

628

answers:

2

After reading a book on LINQ I'm thinking about re-writing a mapper class that I wrote in c# to use LINQ. I'm wondering if anyone can give me a hand. Note: its a bit confusing, but the User object is the local user and user (lowercase) is the object generated from the Facebook XSD.

Original Mapper

public class FacebookMapper : IMapper
{
    public IEnumerable<User> MapFrom(IEnumerable<User> users)
    {
      var facebookUsers = GetFacebookUsers(users);
      return MergeUsers(users, facebookUsers);
    }

    public Facebook.user[] GetFacebookUsers(IEnumerable<User> users)
    {
      var uids = (from u in users
        where u.FacebookUid != null
        select u.FacebookUid.Value).ToList();

      // return facebook users for uids using WCF
    }

    public IEnumerable<User> MergeUsers(IEnumerable<User> users, Facebook.user[] facebookUsers)
    {
      foreach(var u in users)
      {
        var fbUser = facebookUsers.FirstOrDefault(f => f.uid == u.FacebookUid);
        if (fbUser != null)
          u.FacebookAvatar = fbUser.pic_sqare;
      }
      return users;
    }
}

My first two attempts hit walls

Attempt 1

public IEnumerable<User> MapFrom(IEnumerable<User> users)
{
  // didn't have a way to check if u.FacebookUid == null
  return from u in users
    join f in GetFacebookUsers(users) on u.FacebookUid equals f.uid
    select AppendAvatar(u, f);
}

public void AppendAvatar(User u, Facebook.user f)
{
  if (f == null)
    return u;
  u.FacebookAvatar = f.pic_square;
  return u;
}

Attempt 2

public IEnumerable<User> MapFrom(IEnumerable<User> users)
{
  // had to get the user from the facebook service for each single user,
  // would rather use a single http request.
  return from u in users
    let f = GetFacebookUser(user.FacebookUid)
    select AppendAvatar(u, f);
}
+5  A: 

I would be inclined to write something like this instead:

public class FacebookMapper : IMapper
{
    public IEnumerable<User> MapFacebookAvatars(IEnumerable<User> users)
    {
        var usersByID =
            users.Where(u => u.FacebookUid.HasValue)
                 .ToDictionary(u => u.FacebookUid.Value);

        var facebookUsersByID =
            GetFacebookUsers(usersByID.Keys).ToDictionary(f => f.uid);

        foreach(var id in usersByID.Keys.Intersect(facebookUsersByID.Keys))
            usersByID[id].FacebookAvatar = facebookUsersByID[id].pic_sqare;

        return users;
    }

    public Facebook.user[] GetFacebookUsers(IEnumerable<int> uids)
    {
       // return facebook users for uids using WCF
    }
}

However, I wouldn't claim that was a big improvement over what you've got (unless the user or facebook user collections are very large, in which case you might wind up with a noticable performance difference.)

(I'd recommend against using Select like a foreach loop to perform an actual mutating action on an element of a set, the way you did in your refactoring attempts. You can do it, but people will be surprised by your code, and you'll have to keep lazy evaluation in mind the whole time.)

mquander
+1 Thanks for you answer it was helpful, Jon's grouping technique in LINQ was more along the lines of what I was looking for though.
bendewey
+5  A: 

Okay, it's not clear exactly what IMapper has in it, but I'd suggest a few things, some of which may not be feasible due to other restrictions. I've written this out pretty much as I've thought about it - I think it helps to see the train of thought in action, as that'll make it easier for you to do the same thing next time. (Assuming you like my solutions, of course :)

LINQ is inherently functional in style. That means that ideally, queries shouldn't have side-effects. For instance, I'd expect a method with a signature of:

public IEnumerable<User> MapFrom(IEnumerable<User> users)

to return a new sequence of user objects with extra information, rather than mutating the existing users. The only information you're currently appending is the avatar, so I'd add a method in User along the lines of:

public User WithAvatar(Image avatar)
{
    // Whatever you need to create a clone of this user
    User clone = new User(this.Name, this.Age, etc);
    clone.FacebookAvatar = avatar;
    return clone;
}

You might even want to make User fully immutable - there are various strategies around that, such as the builder pattern. Ask me if you want more details. Anyway, the main thing is that we've created a new user which is a copy of the old one, but with the specified avatar.

First attempt: inner join

Now back to your mapper... you've currently got three public methods but my guess is that only the first one needs to be public, and that the rest of the API doesn't actually need to expose the Facebook users. It looks like your GetFacebookUsers method is basically okay, although I'd probably line up the query in terms of whitespace.

So, given a sequence of local users and a collection of Facebook users, we're left doing the actual mapping bit. A straight "join" clause is problematic, because it won't yield the local users which don't have a matching Facebook user. Instead, we need some way of treating a non-Facebook user as if they were a Facebook user without an avatar. Essentially this is the null object pattern.

We can do that by coming up with a Facebook user who has a null uid (assuming the object model allows that):

// Adjust for however the user should actually be constructed.
private static readonly FacebookUser NullFacebookUser = new FacebookUser(null);

However, we actually want a sequence of these users, because that's what Enumerable.Concat uses:

private static readonly IEnumerable<FacebookUser> NullFacebookUsers =
    Enumerable.Repeat(new FacebookUser(null), 1);

Now we can simply "add" this dummy entry to our real one, and do a normal inner join. Note that this assumes that the lookup of Facebook users will always find a user for any "real" Facebook UID. If that's not the case, we'd need to revisit this and not use an inner join.

We include the "null" user at the end, then do the join and project using WithAvatar:

public IEnumerable<User> MapFrom(IEnumerable<User> users)
{
    var facebookUsers = GetFacebookUsers(users).Concat(NullFacebookUsers);
    return from user in users
           join facebookUser in facebookUsers on
                user.FacebookUid equals facebookUser.uid
           select user.WithAvatar(facebookUser.Avatar);
}

So the full class would be:

public sealed class FacebookMapper : IMapper
{
    private static readonly IEnumerable<FacebookUser> NullFacebookUsers =
        Enumerable.Repeat(new FacebookUser(null), 1);

    public IEnumerable<User> MapFrom(IEnumerable<User> users)
    {
        var facebookUsers = GetFacebookUsers(users).Concat(NullFacebookUsers);
        return from user in users
               join facebookUser in facebookUsers on
                    user.FacebookUid equals facebookUser.uid
               select user.WithAvatar(facebookUser.pic_square);
    }

    private Facebook.user[] GetFacebookUsers(IEnumerable<User> users)
    {
        var uids = (from u in users
                    where u.FacebookUid != null
                    select u.FacebookUid.Value).ToList();

        // return facebook users for uids using WCF
    }
}

A few points here:

  • As noted before, the inner join becomes problematic if a user's Facebook UID might not be fetched as a valid user.
  • Similarly we get problems if we have duplicate Facebook users - each local user would end up coming out twice!
  • This replaces (removes) the avatar for non-Facebook users.

A second approach: group join

Let's see if we can address these points. I'll assume that if we've fetched multiple Facebook users for a single Facebook UID, then it doesn't matter which of them we grab the avatar from - they should be the same.

What we need is a group join, so that for each local user we get a sequence of matching Facebook users. We'll then use DefaultIfEmpty to make life easier.

We can keep WithAvatar as it was before - but this time we're only going to call it if we've got a Facebook user to grab the avatar from. A group join in C# query expressions is represented by join ... into. This query is reasonably long, but it's not too scary, honest!

public IEnumerable<User> MapFrom(IEnumerable<User> users)
{
    var facebookUsers = GetFacebookUsers(users);
    return from user in users
           join facebookUser in facebookUsers on
                user.FacebookUid equals facebookUser.uid
                into matchingUsers
           let firstMatch = matchingUsers.DefaultIfEmpty().First()
           select firstMatch == null ? user : user.WithAvatar(firstMatch.pic_square);
}

Here's the query expression again, but with comments:

// "Source" sequence is just our local users
from user in users
// Perform a group join - the "matchingUsers" range variable will
// now be a sequence of FacebookUsers with the right UID. This could be empty.
join facebookUser in facebookUsers on
     user.FacebookUid equals facebookUser.uid
     into matchingUsers
// Convert an empty sequence into a single null entry, and then take the first
// element - i.e. the first matching FacebookUser or null
let firstMatch = matchingUsers.DefaultIfEmpty().First()
// If we've not got a match, return the original user.
// Otherwise return a new copy with the appropriate avatar
select firstMatch == null ? user : user.WithAvatar(firstMatch.pic_square);

The non-LINQ solution

Another option is to only use LINQ very slightly. For example:

public IEnumerable<User> MapFrom(IEnumerable<User> users)
{
    var facebookUsers = GetFacebookUsers(users);
    var uidDictionary = facebookUsers.ToDictionary(fb => fb.uid);

    foreach (var user in users)
    {
        FacebookUser fb;
        if (uidDictionary.TryGetValue(user.FacebookUid, out fb)
        {
            yield return user.WithAvatar(fb.pic_square);
        }
        else
        {
            yield return user;
        }
    }
}

This uses an iterator block instead of a LINQ query expression. ToDictionary will throw an exception if it receives the same key twice - one option to work around this is to change GetFacebookUsers to make sure it only looks for distinct IDs:

    private Facebook.user[] GetFacebookUsers(IEnumerable<User> users)
    {
        var uids = (from u in users
                    where u.FacebookUid != null
                    select u.FacebookUid.Value).Distinct().ToList();

        // return facebook users for uids using WCF
    }

That assumes the web service works appropriately, of course - but if it doesn't, you probably want to throw an exception anyway :)

Conclusion

Take your pick out of the three. The group join is probably hardest to understand, but behaves best. The iterator block solution is possibly the simplest, and should behave okay with the GetFacebookUsers modification.

Making User immutable would almost certainly be a positive step though.

One nice by-product of all of these solutions is that the users come out in the same order they went in. That may well not be important to you, but it can be a nice property.

Hope this helps - it's been an interesting question :)

EDIT: Is mutation the way to go?

Having seen in your comments that the local User type is actually an entity type from the entity framework, it may not be appropriate to take this course of action. Making it immutable is pretty much out of the question, and I suspect that most uses of the type will expect mutation.

If that's the case, it may be worth changing your interface to make that clearer. Instead of returning an IEnumerable<User> (which implies - to some extent - projection) you might want to change both the signature and the name, leaving you with something like this:

public sealed class FacebookMerger : IUserMerger
{
    public void MergeInformation(IEnumerable<User> users)
    {
        var facebookUsers = GetFacebookUsers(users);
        var uidDictionary = facebookUsers.ToDictionary(fb => fb.uid);

        foreach (var user in users)
        {
            FacebookUser fb;
            if (uidDictionary.TryGetValue(user.FacebookUid, out fb)
            {
                user.Avatar = fb.pic_square;
            }
        }
    }

    private Facebook.user[] GetFacebookUsers(IEnumerable<User> users)
    {
        var uids = (from u in users
                    where u.FacebookUid != null
                    select u.FacebookUid.Value).Distinct().ToList();

        // return facebook users for uids using WCF
    }
}

Again, this isn't a particularly "LINQ-y" solution (in the main operation) any more - but that's reasonable, as you're not really "querying"; you're "updating".

Jon Skeet
Jon, this is awesome +1, I love your work. I think I understand the grouping, I might go with that option. If I use the grouping method do I have to call the .Concat(NullFacebookUsers)? Also, my local User is a EF Object, do you know a good clone method for that?
bendewey
No, the grouping bit deals with "non-facebook users" by it not matching any of them, which is why we use DefaultIfEmpty().First(). And no, I don't know about cloning an EF entity I'm afraid :(
Jon Skeet
@bendewey: Given the EF nature of this, I've added an extra bit to my answer (at the bottom).
Jon Skeet
Oops - I've just seen why you asked about the .Concat(...) bit for the grouping operation. Sorry about accidentally including that! Fixed now...
Jon Skeet