I'm designing an API for the first time, and trying to follow SOLID guidelines. One of the things I find myself struggling with is balancing OCP and testability with simplicity and ease of extensibility.
This open-source API is geared toward scientific modeling and computation. The aim is that various groups will be able to easily import their particular models into this "pluggable" architecture. Thus, its success as a project will depend on the ease with which these scientists can impart their domain-specific knowledge without unnecessary overhead or too steep a learning curve.
For example, our compute engine relies on "vectorized" computations - we rarely need just one scalar value computed. Many models can take advantage of this and perform "overhead" calculations to be re-used in each scalar sub-computation. BUT, I'd like a user to be able to define a simple scalar operation which will inherit (or otherwise be provided with) a default vectorization behavior.
My goals were to make it
1) as simple as possible for a novice user to implement their basic computational model
2) as simple as possible for an advanced user to override the vectorization behavior
...of course while maintaining SoC, testability, etc.
After a few revisions, I have something simple and object-oriented. Computation contracts are defined via interfaces, but users are encouraged to derive from an abstract ComputationBase class that will provide the default vectorization. Here's a scaled-down representation of the design:
    using System.Collections.Generic;
    using System.Linq;

    public interface IComputation<T1, T2, TOut>
    {
        TOut Compute(T1 a, T2 b);
    }

    public interface IVectorizedComputation<T1, T2, TOut>
    {
        IEnumerable<TOut> Compute(IEnumerable<T1> a, IEnumerable<T2> b);
    }

    public abstract class ComputationBase<T1, T2, TOut> :
        IComputation<T1, T2, TOut>,
        IVectorizedComputation<T1, T2, TOut>
    {
        protected ComputationBase() { }

        // the consumer must implement this core method
        public abstract TOut Compute(T1 a, T2 b);

        // the consumer can optimize by overriding this "dumb" vectorization,
        // which evaluates every (ai, bi) pair
        // use an IVectorizationProvider for vectorization capabilities instead?
        public virtual IEnumerable<TOut> Compute(IEnumerable<T1> a, IEnumerable<T2> b)
        {
            return
                from ai in a
                from bi in b
                select Compute(ai, bi);
        }
    }

    public class NoobMADCalculator
        : ComputationBase<double, double, double>
    {
        // novice user implements a simple calculation model
        // ComputationBase will use a "dumb" vectorization
        public override double Compute(double a, double b)
        {
            return a * b + 1337;
        }
    }

    public class PwnageMADCalculator
        : ComputationBase<double, double, double>
    {
        public override double Compute(double a, double b)
        {
            var expensive = PerformExpensiveOperation();
            return ComputeInternal(a, b, expensive);
        }

        public override IEnumerable<double> Compute(IEnumerable<double> a, IEnumerable<double> b)
        {
            foreach (var ai in a)
            {
                // example optimization: perform this operation once per ai,
                // not once per (ai, bi) pair
                var expensive = PerformExpensiveOperation();
                foreach (var bi in b)
                {
                    yield return ComputeInternal(ai, bi, expensive);
                }
            }
        }

        private static double PerformExpensiveOperation() { return 1337; }

        private static double ComputeInternal(double a, double b, double expensive)
        {
            return a * b + expensive;
        }
    }
For the vectorized Compute in ComputationBase I had originally used a provider pattern (via constructor DI), but kept the scalar Compute as abstract. The rationale was that this was good "protected variation" - the base class would always "own" the vectorization operation, but delegate the computation to the injected provider. This also seemed generally beneficial from a testability and vectorization code re-use standpoint. I had the following problems with this, however:
1) The heterogeneity of the approaches for scalar (inheritance) and vector (provider) computations seemed likely to confuse the user, seemed overly complex for the requirements, and had a bad code smell.
2) Creating a "separate" provider for the vectorization was a leaky abstraction - if a provider were to do anything smart, it would typically need inside knowledge of the class's implementation. I found myself creating private nested classes to implement them, which told me it was a concern that couldn't be separated.
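For readers trying to picture the provider-based revision, here is a minimal sketch of what it might have looked like. It is an illustration under assumptions, not the actual code: only the name IVectorizationProvider appears in the design above (in a comment), and its Vectorize signature, CrossProductProvider, ProvidedComputationBase, SimpleMADCalculator, ProvidedMADCalculator and HoistingProvider are all hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical contract: turn a scalar kernel into vectorized results.
public interface IVectorizationProvider<T1, T2, TOut>
{
    IEnumerable<TOut> Vectorize(Func<T1, T2, TOut> scalar, IEnumerable<T1> a, IEnumerable<T2> b);
}

// Hypothetical default provider: the same "dumb" every-pair vectorization.
public sealed class CrossProductProvider<T1, T2, TOut> : IVectorizationProvider<T1, T2, TOut>
{
    public IEnumerable<TOut> Vectorize(Func<T1, T2, TOut> scalar, IEnumerable<T1> a, IEnumerable<T2> b)
    {
        return from ai in a
               from bi in b
               select scalar(ai, bi);
    }
}

public abstract class ProvidedComputationBase<T1, T2, TOut>
{
    private readonly IVectorizationProvider<T1, T2, TOut> _provider;

    // Constructor injection: the provider is supplied from outside.
    protected ProvidedComputationBase(IVectorizationProvider<T1, T2, TOut> provider)
    {
        if (provider == null) throw new ArgumentNullException("provider");
        _provider = provider;
    }

    // Scalar computation is still customized through inheritance.
    public abstract TOut Compute(T1 a, T2 b);

    // Non-virtual: the base class "owns" vectorization and delegates to the provider.
    public IEnumerable<TOut> Compute(IEnumerable<T1> a, IEnumerable<T2> b)
    {
        return _provider.Vectorize((x, y) => Compute(x, y), a, b);
    }
}

// Novice case: scalar override plus the stock provider.
public sealed class SimpleMADCalculator : ProvidedComputationBase<double, double, double>
{
    public SimpleMADCalculator() : base(new CrossProductProvider<double, double, double>()) { }

    public override double Compute(double a, double b) { return a * b + 1337; }
}

// Advanced case: the "smart" provider ends up as a private nested class.
public sealed class ProvidedMADCalculator : ProvidedComputationBase<double, double, double>
{
    public ProvidedMADCalculator() : base(new HoistingProvider()) { }

    public override double Compute(double a, double b)
    {
        return ComputeInternal(a, b, PerformExpensiveOperation());
    }

    private static double PerformExpensiveOperation() { return 1337; }

    private static double ComputeInternal(double a, double b, double expensive)
    {
        return a * b + expensive;
    }

    private sealed class HoistingProvider : IVectorizationProvider<double, double, double>
    {
        // Ignores the scalar delegate and reaches into the outer class's private helpers.
        public IEnumerable<double> Vectorize(Func<double, double, double> scalar, IEnumerable<double> a, IEnumerable<double> b)
        {
            foreach (var ai in a)
            {
                var expensive = PerformExpensiveOperation(); // once per ai
                foreach (var bi in b)
                {
                    yield return ComputeInternal(ai, bi, expensive);
                }
            }
        }
    }
}
```

In this sketch both problems are visible: the scalar path is extended by overriding while the vector path is extended by injection, and HoistingProvider only works because it is nested inside the class whose private helpers it calls.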
Is this a good approach with respect to OCP vs. testability vs. simplicity? How have others designed their APIs for extension at various levels of complexity? Would you use more dependency injection mechanisms than I've included? I'm just as interested in good general references on API design as I am in answers to this particular example.
Thanks, David