views:

161

answers:

5

I'm refactoring a little bit of C# data access code from a previous developer and am curious about a pattern he used.

The code initially exposed collections (arrays) of a variety of ActiveRecord-style business objects - essentially objects wrapping database fields. I'm changing the arrays to generic lists, but the aspect of the code I'm curious about is that the previous developer had Get methods for each type of object he was wrapping, thusly:

public Thing GetThing(int i) {
    return things[i];
}

There are several of these methods, and I cannot for the life of me think of any possible advantage of using that mechanism over simply referring to things[i] directly. Let's assume, for argument's sake, that things is a public property, not a public field (in this case it's actually an auto-implemented property, so that assumption is actually true).

Am I missing something obvious? Or even something esoteric?

UPDATE I should probably clarify that these collections are currently accessed from within for loops:

for (int i = 0; i < thingsCount; i== ) {
    dosomthing( GetThing(i) );
    dosomethingelse( GetThing(i) );
}

which I am refactoring to:

for (int i = 0; i < thingsCount; i== ) {
    Thing thing = things[i];
    dosomthing( thing );
    dosomethingelse( thing );
}

and perhaps even to use things.foreach().

+2  A: 

There are two things to note here. First is that you want to keep object variables private and use getters and setters to access them. This prevents the user from accidentally changing or modifying object variables.

Secondly, it is considered a good naming convention where the terms get/set must be used where an attribute is accessed directly. This helps improve readability.

Although it is a public property in this case, the use of getters and setters improve readability and help prevents unexpected behavior. It doesn't matter if you are accessing the object variables within a loop, you should continue to use the GetThing convention.

Finally if you are accessing the variables from within the object, you don't need to use the getter or setter.

NOTE: Its generally considered good style to keep object variables private and use getters/setters for all languages

C++ style guidelines

C# style guidelines

You may also be interested in the question "getter and setter for class in class c#"

Elpezmuerto
I can understand that logic (although I don't see anything referencing it in the C# example you point to). Thanks!
cori
@Cori Updated link
Elpezmuerto
+2  A: 

You probably want to expose the object as an IList<Thing>. This will give you the indexing capabilities you're looking for, but you also get to use the range of LINQ functions as well, such as creating a new list of items based on conditions. Plus IList<T> implements IEnumerable<T> so you're going to be able to use foreach to loop through the objects.

e.g. in your class:

public IList<Thing> Things { get; private set; }

e.g. usage:

Thing x = business.Things[3];

or

var x = business.Things.Where(t => t.Name == "cori");
Warren
That's exactly where I was headed with refactoring to List (and yes, IList would probably be a better choice).
cori
+6  A: 

I don't know if it's obvious, but I do think you're missing something.

Let's say things is an IList<Thing>. Then exposing it directly (as Things) would allow calling code to call Add, Insert, RemoveAt, etc. Maybe the previous developer didn't want to allow this (and I'm sure there are plenty of good reasons for that).

Even supposing it's a Thing[] (so Add, etc. wouldn't be available), exposing it as such would still allow calling code to do something like obj.Things[0] = new Thing(); which may be an operation that should not be allowed depending on the class's implementation.

You could expose Things as a ReadOnlyCollection<Thing> which would take care of most of these problems. But what it comes down to is this: if the developer only wants to allow calling code to access items by index -- nothing more -- then providing a single GetThing method as the means to do so, honestly, makes by far the most sense.

Now, granted, there's also this option: implementing a this[int] property with only a get accessor. But that only makes sense if the class in question is essentially a collection of Thing objects exclusively (i.e., there isn't also a collection of some other type of object you want to provide access to within the class).

All told, I think the GetThing approach is pretty sound.

That said, from the way you've worded your question, it does sound like the previous developer made some other pretty poor decisions:

  1. If he/she also exposed the things collection directly as a public property, well, then... that defeats the whole purpose of the GetThing method, doesn't it? The result is simply a bloated interface (I generally think it's not a great sign when you've got multiple methods to accomplish exactly the same thing, unless they're clearly documented as aliases for some justifiable reason). [Update: It appears the previous developer did not do this. Good.]
  2. It looks like the previous developer was also internally accessing items from things using the GetThing method, which is just silly (in my opinion). Why introduce the pointless overhead of extra method calls using a class's public interface from within the class itself? If you're in the class, you're already inside the implementation and can access private/protected data all you want -- no need to pretend otherwise.
Dan Tao
Thinking through @Elpezmuerto's response I was coming to exactly this conclusion - even though I am exposing the *collections* as properties with {get; private set;} once I get the collection I can modify it in calling code in ways that don't make any sense.The previous developer had these exposed as protected, so your point 1 isn't relevant in my case, but he was calling GetThing(i) multiple times in each loop, which I agree makes no sense.Seems like the best refactor is to leave the GetThing() methods as-is, but only call them once a loop.Thanks!
cori
@cori: Yes, a really common error developers make in designing their classes is exposing mutable collections directly, without taking into account what can happen if/when those collections are modified. If you only want to enable calling code to enumerate and not modify a collection, expose an `IEnumerable<T>` and not, e.g, a `List<T>`!
Dan Tao
@cori: I will admit it's pretty frustrating, though, when you want to allow random access by index but not all the other features of the `IList<T>` interface or even a `T[]` (which allows you to set items, not just get them). This is actually something I asked about in [an old question of my own](http://stackoverflow.com/questions/2110440/why-is-there-no-iarrayt-interface-in-net) a while back.
Dan Tao
This is a well-thought-out response, which presents many good points, but I would argue that the *best* solution would be to expose the `Things` property with a public getter, and make it implement an interface that exposes exactly the methods/indexers that make sense for this particular purpose. (This is assuming that the class is not exclusively a collection of Things.)
StriplingWarrior
@StriplingWarrior: I agree; the only issue there, however, is that there's no such interface built into the .NET framework (this is basically the crux of the question I linked to in my previous comment), which means that returning an instance of said interface isn't going to add a whole lot of benefit other than allowing you to implement the interface differently somewhere down the line (but then, you could also do that if all you're exposing is a `GetThing` method).
Dan Tao
@StriplingWarrior: I tend to think that in general, what's great about interfaces is that they establish a contract that guarantees specific functionality, which external code can leverage any way it wants. When you invent your own interface for a specific case in your own code, chances are, hardly anyone other than you is ever going to write code that depends on that particular contract.
Dan Tao
@Dan Tao: The interface I was referring to was actually more generic than my comment made it sound. See my answer's definition of `IIndexedReadonlyCollection`. I can easily envision this interface being reused in similar situations throughout the code base. Furthermore, even interfaces that only have a single implementation can still be useful for telling consumers what they're expected to do with it. For example, the implementation in my example would expose Add, Remove, etc, but the interface will only permit read-only operations.
StriplingWarrior
You should really avoid exposing collections as modifiable properties or fields (fields or properties doesn't really matter, usually, unless you're writing a library thats updated without recompiling library users - rare). There's almost no reason to prefer List over an array as a property type, AFAIK, and several to prefer an array over a List - or better yet, something cleaner. Basically, any property should be at most one of the following: settable, of mutable return type, or virtual.
Eamon Nerbonne
+1  A: 

If I had to guess, I'd say that this developer was used to some other language like Java, and wasn't fully aware of standard practice in C#. The "get[Property]" nomenclature is very heavily used in Java, javascript, etc. C# replaces this with properties and indexers. Properties are every bit as powerful as getters and setters, but are easier to write and use. The only time you typically see "Get[something]" in C# is if:

  • The operation is likely to be expensive enough that you really want to drive home the fact that this is no simple member access (e.g. GetPrimeNumbers()), or
  • Your collection actually includes multiple indexed collections. (e.g. GetRow(int i) and GetColumn(int i)). Even in this case, it's more common to simply expose each of these indexed collections as a property unto itself, which is of an indexed type ("table.Rows[2]").

If you are only accessing these values in for loops, the collection should implement IEnumerable<Thing>, which would give you access to LINQ methods and the foreach construct. If you still need to have indexed-based getters, you should consider using your own interface which extends IEnumerable<T>, but additionally provides:

T this[int i] { get; }

This way, you don't give consumers the impression that they can Add and Remove objects in this collection.

Update

I know this is mostly a matter of style, which is subject to debate, but I really think the GetThings solution is not the correct way to do things. The following strategy, while it takes a little more work, is far more in keeping with the way that the standard .NET classes and frameworks are designed:

public class ThingHolderDataAccess
{
    public ThingHolder GetThingHolderForSomeArgs(int arg1, int arg2)
    {
        var oneThings = GetOneThings(arg1);
        var otherThings = GetOtherThings(arg2);
        return new ThingHolder(oneThings, otherThings);
    }
    private IEnumerable<OneThing> GetOneThings(int arg)
    {
        //...
        return new List<OneThing>();
    }
    private IEnumerable<AnotherThing> GetOtherThings(int arg2)
    {
        //...
        return new List<AnotherThing>();
    }
}

public class ThingHolder
{
    public IIndexedReadonlyCollection<OneThing> OneThings
    {
        get;
        private set;
    }

    public IIndexedReadonlyCollection<AnotherThing> OtherThings
    {
        get;
        private set;
    }

    public ThingHolder(IEnumerable<OneThing> oneThings,
                       IEnumerable<AnotherThing> otherThings)
    {
        OneThings = oneThings.ToIndexedReadOnlyCollection();
        OtherThings = otherThings.ToIndexedReadOnlyCollection();
    }
}

#region These classes can be written once, and used everywhere
public class IndexedCollection<T> 
    : List<T>, IIndexedReadonlyCollection<T>
{
    public IndexedCollection(IEnumerable<T> items)
        : base(items)
    {
    }
}

public static class EnumerableExtensions
{
    public static IIndexedReadonlyCollection<T> ToIndexedReadOnlyCollection<T>(
        this IEnumerable<T> items)
    {
        return new IndexedCollection<T>(items);
    }
}

public interface IIndexedReadonlyCollection<out T> : IEnumerable<T>
{
    T this[int i] { get; }
}
#endregion

Using the code above might look something like this:

var things = _thingHolderDataAccess.GetThingHolderForSomeArgs(a, b);
foreach (var oneThing in things.OneThings)
{
    // do something
}
foreach (var anotherThing in things.OtherThings)
{
    // do something else
}

var specialThing = things.OneThings[c];
// do something to special thing
StriplingWarrior
@StriplingWarrior thanks for the well-documented response. Since my code is dealing only with homogenous collections (and its completely under my control and will only be consumed by code under my control) this is a little to omuch overhead for my rather simple needs..
cori
@cori: I totally understand. When your code is only being consumed locally, it's not worth the added effort.
StriplingWarrior
... and in that case I'd suggest exposing the IList like Warren says. I see no point to having a GetThing method.
StriplingWarrior
A: 

Like the other answers have indicated, it's an issue of restricting access to the array (or list) itself.

In general, you don't want clients of the class to be able to directly modify the array itself. Hiding the implementation of the underlying list also allows you to change the implementation in the future without affecting any client code that uses the class.

Loadmaster