views:

580

answers:

5

What is the recommended way to return ad hoc (custom case by case) data from repository which don't fit any model entities or which extend some?

The 101 example would be the ubiquitous hello word application: a blog system. Suppose you want to load a list of posts where post entry has some additional information which does not exists in the Post entity. Let’s say it is the number of comments and the date and time of the last comment. This would be highly trivial if one was using the plain old SQL and reading data directly from the database. How am I supposed to do it optimally using the repository pattern if I cannot afford loading the entire collections of Comments for each Post, and I want to do it in one database hit? Is there any commonly used pattern for this situation? Now imagine that you have moderately complex web application where each page needs a slightly different custom data, and loading full hierarchies is not possible (performance, memory requirements etc).

Some random ideas:

  1. Add a list of properties to each model which could be populated by the custom data.

  2. Subclass model entities case by case, and create custom readers for each subclass.

  3. Use LINQ, compose ad hoc queries and read anonymous classes.

Note: I’ve asked a similar question recently, but it seemed to be too general and did not attract much attention.

Example:

Based on suggestions in answers below, I am adding a more concrete example. Here is the situation I was trying to describe:

IEnumarable<Post> posts = repository.GetPostsByPage(1);
foreach (Post post in posts)
{

    // snip: push post title, content, etc. to view

    // determine the post count and latest comment date
    int commentCount = post.Comments.Count();
    DateTime newestCommentDate = post.Comments.Max(c => c.Date);

    // snip: push the count and date to view

}

If I don’t do anything extra and use an off the shelf ORM, this will result to n+1 queries or possibly one query loading all posts and comments. But optimally, I would like to be able to just execute one SQL which would return one row for each post including the post title, body etc. and the comment count and most recent comment date in the same. This is trivial in SQL. The problem is that my repository won’t be able to read and fit this type of data into the model. Where do the max dates and the counts go?

I’m not asking how to do that. You can always do it somehow: add extra methods to the repository, add new classes, special entities, use LINQ etc., but I guess my question is the following. How come the repository pattern and the proper model driven development are so widely accepted, but yet they don’t seem to address this seemingly very common and basic case.

A: 

Can't say I really see what the problem is, just firing in the air here:

  • Add a specific entity to encapsulate the info yo want
  • Add a property Comments to the Post. (I don't see why this would require you to fetch all comments - you can just fetch the comments for the particular post you're loading)
  • Use lazy loading to only fetch the comments when you access the property

I think you would have a greater chance of seeing your question answered if you would make platform, language and O/R mapper specific (seems to be .NET C# or VB, since you mentioned LINQ. LINQ 2 SQL? Entity framework? Something else?)

Kurt Schelfthout
Thank you for pointing me to that. I’ve added a simple concrete example and more explanation.
Jan Zich
+1  A: 

There's a lot to this question. Are you needing this specific data for a reporting procedure? If so, then the proper solution is to have separate data access for reporting purposes. Flattened databases, views, ect.

Or is it an ad-hoc query need? If so, Ayende has a post on this very problem. http://ayende.com/Blog/archive/2006/12/07/ComplexSearchingQueryingWithNHibernate.aspx

He utilizes a "Finder" object. He's using NHibernate, so essentially what he's doing is creating a detached query.

I've done something similar in the past by creating a Query object that I can fill prior to handing it to a repository (some DDD purist will argue against it, but I find it elegant and easy to use).

The Query object implements a fluent interface, so I can write this and get the results back:

IQuery query = new PostQuery()
   .WithPostId(postId)
   .And()
   .WithCommentCount()
   .And()
   .WithCommentsHavingDateLessThan(selectedDate);


Post post = _repository.Find(query);

However, in your specific case I have to wonder at your design. You are saying you can't load the comments with the post. Why? Are you just being overly worrisome about performance? Is this a case of premature optimization? (seems like it to me)

If I had a Post object it would be my aggregate root and it would come with the Comments attached. And then everything you want to do would work in every scenario.

Chris Holmes
Thank you. You suggestions seem like a good start. I wonder where do you actually store the comment counts? Surely, there is no separate data member for it in the Post entity.
Jan Zich
Regarding the performance, the example with the blog posts was just an example. The real application which I have in mind is already running, and we cannot really afford to load the entire collections.
Jan Zich
Chris Holmes
+1  A: 

Since we needed to urgently solve the issue I outlined in my original question, we resorted to the following solution. We added a property collection (a dictionary) to each model entity, and if the DAL needs to, it sticks custom data into to. In order to establish some kind of control, the property collection is keyed by instances of a designated class and it supports only simple data types (integers, dates, ...) which is all we need at movement, and mostly likely will ever need. A typical case which this solves is: loading an entity with counts for its subcollections instead of full populated collections. I suspect that this probably does not get any award for a software design, but it was the simplest and the most practical solution for our case.

Jan Zich
I think the other option would've been attaching some named queries to the entities and using those in the repositories. Interesting question though, too bad so few people seemed to grok it.
wds
A: 

If you aren't locked into a RDBMs then a Database like CouchDB or Amazons SimpleDB might be something to look at. What you are describing is trivial in a CouchDB View. This probably doesn't really answer you specific question but sometimes it's good to look at radically different options.

Jeremy Wall
A: 

For this I generally have a RepositoryStatus and a Status class that acts as my Data Transfer Object (DTO). The Status class is used in my application service layer (for the same reason) from which the RepositoryStatus inherits. Then with this class I can return error messages, response objects, etc. from the Repository layer. This class is generic in that it will accept any object in and cast it out for the receiver.

Here is the Status class:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using RanchBuddy.Core.Domain;
using StructureMap;

namespace RanchBuddy.Core.Services.Impl
{
    [Pluggable("Default")]
    public class Status : IStatus
    {
        public Status()
        {
            _messages = new List<string>();
            _violations = new List<RuleViolation>();
        }

        public enum StatusTypes
        {
            Success,
            Failure
        }

        private object _object;
        public T GetObject<T>()
        {
            return (T)_object;
        }
        public void SetObject<T>(T Object)
        {
            _object = Object;
        }

        private List<string> _messages;
        public void AddMessage(string Message)
        {
            _messages.Add(Message);
        }
        public List<string> GetMessages()
        {
            return _messages;
        }
        public void AddMessages(List<string> Messages)
        {
            _messages.AddRange(Messages);
        }

        private List<RuleViolation> _violations;
        public void AddRuleViolation(RuleViolation violation)
        {
            _violations.Add(violation);
        }
        public void AddRuleViolations(List<RuleViolation> violations)
        {
            _violations.AddRange(violations);
        }
        public List<RuleViolation> GetRuleViolations()
        {
            return _violations;
        }
        public StatusTypes StatusType { get; set; }
    }
}

And here is the RepositoryStatus:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using RanchBuddy.Core.Services.Impl;
using StructureMap;

namespace RanchBuddy.Core.DataAccess.Impl
{
    [Pluggable("DefaultRepositoryStatus")]
    public class RepositoryStatus : Status, IRepositoryStatus
    {

    }
}

As you can see the RepositoryStatus doesn't yet do anything special and just relies on the Status objects utilities. But I wanted to reserve the right to extend at a later date!

I am sure that some of the die-hards out there will state that this should not be used if you are to be a prueist...however I know your pain in that sometimes you need to pass out more than just a returned object!

Andrew Siemer