I'm designing an API for the first time, and trying to follow SOLID guidelines. One of the things I find myself struggling with is balancing OCP and testability with simplicity and ease of extensibility.

This open-source API is geared toward scientific modeling and computation. The aim is that various groups will be able to easily import their particular models into this "pluggable" architecture. Thus, its success as a project will depend on the ease with which these scientists can contribute their domain-specific knowledge without unnecessary overhead or too steep a learning curve.

For example, our compute engine relies on "vectorized" computations - we rarely need just one scalar value computed. Many models can take advantage of this and perform "overhead" calculations to be re-used in each scalar sub-computation. BUT, I'd like a user to be able to define a simple scalar operation which will inherit (or otherwise be provided with) a default vectorization behavior.

My goals were to make it

1) as simple as possible for a novice user to implement their basic computational model

2) as simple as possible for an advanced user to override the vectorization behavior

...of course while maintaining SoC, testability, etc.

After a few revisions, I have something simple and object-oriented. Computation contracts are defined via interfaces, but users are encouraged to derive from an abstract ComputationBase class that will provide the default vectorization. Here's a scaled-down representation of the design:

public interface IComputation<T1, T2, TOut>
{
    TOut Compute(T1 a, T2 b);
}

public interface IVectorizedComputation<T1, T2, TOut>
{
    IEnumerable<TOut> Compute(IEnumerable<T1> a, IEnumerable<T2> b);
}

public abstract class ComputationBase<T1, T2, TOut> :
    IComputation<T1, T2, TOut>,
    IVectorizedComputation<T1, T2, TOut>
{
    protected ComputationBase() { }

    // the consumer must implement this core method
    public abstract TOut Compute(T1 a, T2 b);

    // the consumer can optimize by overriding this "dumb" vectorization
    // use an IVectorizationProvider for vectorization capabilities instead?
    public virtual IEnumerable<TOut> Compute(IEnumerable<T1> a, IEnumerable<T2> b)
    {
        return
            from ai in a
            from bi in b
            select Compute(ai, bi);
    }
}

public class NoobMADCalculator
    : ComputationBase<double, double, double>
{
    // novice user implements a simple calculation model
    // ComputationBase will use a "dumb" vectorization
    public override double Compute(double a, double b)
    {
        return a * b + 1337;
    }
}

public class PwnageMADCalculator
    : ComputationBase<double, double, double>
{
    public override double Compute(double a, double b)
    {
        var expensive = PerformExpensiveOperation();
        return ComputeInternal(a, b, expensive);
    }

    public override IEnumerable<double> Compute(IEnumerable<double> a, IEnumerable<double> b)
    {
        foreach (var ai in a)
        {
            // example optimization: only perform this operation once
            var expensive = PerformExpensiveOperation();
            foreach (var bi in b)
            {
                yield return ComputeInternal(ai, bi, expensive);
            }
        }
    }

    private static double PerformExpensiveOperation() { return 1337; }

    private static double ComputeInternal(double a, double b, double expensive)
    {
        return a * b + expensive;
    }
}

For the vectorized Compute in ComputationBase I had originally used a provider pattern (via constructor DI), but kept the scalar Compute as abstract. The rationale was that this was good "protected variation" - the base class would always "own" the vectorization operation, but delegate the computation to the injected provider. This also seemed generally beneficial from a testability and vectorization code re-use standpoint. I had the following problems with this, however:

1) The heterogeneity of the approaches for scalar (inheritance) and vector (provider) computations seemed likely to confuse the user, seemed overly complex for the requirements, and was a code smell in its own right.

2) Creating a "separate" provider for the vectorization was a leaky abstraction: if a provider was to do anything smart, it would typically need inside knowledge of the class's implementation. I found myself creating private nested classes to implement them, which told me it was a concern that couldn't be separated.
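For reference, the provider-based variant described above might have looked roughly like the sketch below. The name `IVectorizationProvider` comes from the comment in `ComputationBase`; its signature, and the names `CrossProductVectorizationProvider`, `ProviderComputationBase`, and `ProviderMADCalculator`, are assumptions made up for illustration, not the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical provider contract; the method shape is an assumption.
public interface IVectorizationProvider<T1, T2, TOut>
{
    IEnumerable<TOut> Vectorize(Func<T1, T2, TOut> scalar,
                                IEnumerable<T1> a, IEnumerable<T2> b);
}

// Default "dumb" provider: evaluates the scalar function on every (a, b) pair.
public sealed class CrossProductVectorizationProvider<T1, T2, TOut>
    : IVectorizationProvider<T1, T2, TOut>
{
    public IEnumerable<TOut> Vectorize(Func<T1, T2, TOut> scalar,
                                       IEnumerable<T1> a, IEnumerable<T2> b)
    {
        return from ai in a
               from bi in b
               select scalar(ai, bi);
    }
}

public abstract class ProviderComputationBase<T1, T2, TOut>
{
    private readonly IVectorizationProvider<T1, T2, TOut> _provider;

    // The provider arrives via constructor injection.
    protected ProviderComputationBase(IVectorizationProvider<T1, T2, TOut> provider)
    {
        if (provider == null) throw new ArgumentNullException("provider");
        _provider = provider;
    }

    // The scalar computation stays abstract (inheritance).
    public abstract TOut Compute(T1 a, T2 b);

    // The base class owns the vectorized entry point but delegates the work.
    public IEnumerable<TOut> Compute(IEnumerable<T1> a, IEnumerable<T2> b)
    {
        return _provider.Vectorize((x, y) => Compute(x, y), a, b);
    }
}

public sealed class ProviderMADCalculator
    : ProviderComputationBase<double, double, double>
{
    public ProviderMADCalculator(IVectorizationProvider<double, double, double> provider)
        : base(provider) { }

    public override double Compute(double a, double b) { return a * b + 1337; }
}

public static class Program
{
    public static void Main()
    {
        var calc = new ProviderMADCalculator(
            new CrossProductVectorizationProvider<double, double, double>());
        foreach (var r in calc.Compute(new[] { 1.0, 2.0 }, new[] { 10.0 }))
            Console.WriteLine(r); // 1347, then 1357
    }
}
```

In this sketch the provider only ever sees the scalar function as an opaque delegate, so it has no way to hoist something like `PerformExpensiveOperation()` out of the inner loop. That is the leak described in point 2: a smarter provider would need access to the calculator's internals.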

Is this a good approach w/r/t OCP vs testability vs simplicity? How have others designed their API for extension at various levels of complexity? Would you use more dependency injection mechanisms than I've included? I'm just as interested in good general references on API design as I am in answers to this particular example.

Thanks, David

A:

If you can live without inheritance, you can just use Funcs. They offer a simple way to pass around arbitrary code. Basically this:

// takes two doubles and returns one double
Func<double, double, double> pwnageComputation = (num1, num2) =>
{
    if (num1 + num2 > 1337)
        return 1;
    else if (num1 + num2 < 1337)
        return -1;
    return 0;
};

Func<> is a family of generic delegate types, and lambda expressions are a concise syntax for creating delegate instances (at least in C#). In this way, your users can write ad-hoc functions (similar to your code) without the complexity of a class definition; they only need to provide a function. You can learn more about them here (second half) or here.
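As a rough sketch of how this could still give novice users a default vectorization, a scalar Func can be lifted into a vectorized one with an extension method. The names `FuncVectorization` and `Vectorize` are hypothetical, and the cross-product behavior simply mirrors the default in the question's `ComputationBase`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FuncVectorization
{
    // Hypothetical helper: wraps a scalar function in the same "dumb"
    // cross-product vectorization the question's base class uses.
    public static Func<IEnumerable<T1>, IEnumerable<T2>, IEnumerable<TOut>>
        Vectorize<T1, T2, TOut>(this Func<T1, T2, TOut> scalar)
    {
        if (scalar == null) throw new ArgumentNullException("scalar");
        return (a, b) => from ai in a
                         from bi in b
                         select scalar(ai, bi);
    }
}

public static class Program
{
    public static void Main()
    {
        // Novice user: supplies only the scalar function.
        Func<double, double, double> mad = (a, b) => a * b + 1337;
        var vectorized = mad.Vectorize();

        foreach (var r in vectorized(new[] { 1.0, 2.0 }, new[] { 10.0, 20.0 }))
            Console.WriteLine(r); // 1347, 1357, 1357, 1377
    }
}
```

An advanced user could skip `Vectorize()` and hand over their own `Func<IEnumerable<double>, IEnumerable<double>, IEnumerable<double>>` with whatever optimization they need. The trade-off is that the scalar and vectorized versions are then two unrelated delegates rather than members of one type.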

RCIX