views:

548

answers:

8

Assuming a largish template library with around 100 files containing around 100 templates with overall more than 200,000 lines of code. Some of the templates use multiple inheritance to make the usage of the library itself rather simple (i.e. inherit from some base templates and only having to implement certain business rules).

All that exists (grown over several years), "works" and is used for projects.

However, compilation of projects using that library consumes a growing amount of time and it takes quite some time to locate the source for certain bugs. Fixing often causes unexpected side effects or is quite difficult, because some interdependent templates need changing. Testing is nearly impossible due to the sheer amount of functions.

Now, I would really like to simplify the architecture to use less templates and more specialized smaller classes.

Is there any proven way to go about that task? What would be a good place to start?

+2  A: 

Well, the problem is that template way of thinking is very different from object-oriented inheritance-based way. It's hard to answer anything else than "redesign the whole thing and start from scratch".

Of course, there may be a simple way for a particular case. We can't tell without knowing more about what you have.

The fact that the template solution is so difficult to maintain is an indication of a poor design anyway.

ckarmann
I agree this sounds like the real problem is not templates but bad design.
Leon Timmermans
+7  A: 

You need automated tests, that way in ten years time when your succesor has the same problem he can refactor the code (probably to add more templates because he thinks it will simplify usage of the library) and know it still meets all test cases. Similarly the side effects of any minor bug fixes will be immediately visible (assuming your test cases are good).

Other than that, "divide and conqueor"

Patrick
+3  A: 

Write unit tests.

Where the new code must do the same as the old code.

That's one tip at least.

Edit:

If you deprecate old code that you have replaced with the new functionality you can phase over to the new code little by little.

Nailer
A: 

As mentioned, unit tests are a good idea. Indeed, rather than breaking your code by introducing "simple" changes that are likely to ripple out, just focus on creating a suite of tests, and fixing non-compliance with the tests. Have an activity to update the tests when bugs come to light.

Beyond that, I would suggest upgrading your tools, if possible, to help with debugging template-related problems.

Marcin
+2  A: 

Some points (but note: these are not evil indeed. If you want to change to non-template code, though, this can help out):


Lookup your static interfaces. Where do templates depend on what functions exist? Where do they need typedefs?

Put the common parts in an abstract base class. A good example is when you happen to stumble over the CRTP idiom. You can just replace it with an abstract base class having virtual functions.

Lookup integer lists. If you find your code uses integral lists like list<1, 3, 3, 1, 3>, you can replace them with std::vector, if all the codes using them can live with working with runtime values instead of constant expressions.

Lookup type traits. There is much code involved checking whether some typedef exists, or whether some method exists in typical templated code. Abstract baseclasses solve these two issues by using pure virtual methods, and by inheriting typedefs to the base. Often, typedefs are only needed to trigger hideous features like SFINAE, which would then be superfluous too.

Lookup expression templates. If your code uses expression templates to avoid creating temporaries, you will have to eliminate them and use the traditional way of returning / passing temporaries to the operators involved.

Lookup function objects. If you find your code uses function objects, you can change them to use abstract base classes too, and have something like void run(); to call them (or if you want to keep using operator(), better so! It can be virtual too).

Johannes Schaub - litb
+12  A: 

I'm not sure I see how/why templates are the problem, and why plain non-templated classes would be an improvement. Wouldn't that just mean even more classes, less type safety and so larger potential for bugs?

I can understand simplifying the architecture, refactoring and removing dependencies between the various classes and templates, but automatically assuming that "fewer templates will make the architecture better" is flawed imo.

I'd say that templates potentially allow you to build a much cleaner architecture than you'd get without them. Simply because you can make separate classes totally independent. Without templates, classes functions which call into another class must know about the class, or an interface it inherits, in advance. With templates, this coupling isn't necessary.

Removing templates would only lead to more dependencies, not fewer. The added type-safety of templates can be used to detect a lot of bugs at compile-time (Sprinkle your code liberally with static_assert's for this purpose)

Of course, the added compile-time may be a valid reason to avoid templates in some cases, and if you only have a bunch of Java programmers, who are used to thinking in "traditional" OOP terms, templates might confuse them, which can be another valid reason to avoid templates.

But from an architecture point of view, I think avoiding templates is a step in the wrong direction.

Refactor the application, sure, it sounds like that's needed. But don't throw away one of the most useful tools for producing extensible and robust code just because the original version of the app misused it. Especially if you're already concerned with the amount of code, removing templates will most likely lead to more lines of code.

jalf
Good points! Thanks for making me think in a different direction.
HS
A: 

I've often come across legacy templates that were huge and required a lot of time and memory to instantiate, but didn't need to be. In those cases, the easiest way to cut out the fat was to take all of the code that didn't rely on any of the template arguments and hide it in separate functions defined in a normal translation unit. This also had the positive side-effect of triggering fewer recompiles when this code had to be slightly modified or documentation changed. It sounds rather obvious, but it's really surprising how often people write a class template and think that EVERYTHING it does has to be defined in the header, rather than just the code that needs the templated information.

Another thing you might want to consider is how often you clean up the inheritance hierarchies by making the templates "mixin" style instead of aggregations of multiple inheritance. See how many places you can get away with making one of the template arguments the name of the base class that it should derive from (the way boost::enable_shared_from_this works). Of course this typically only works well if the constructors take no arguments, as you don't have to worry about initializing anything correctly.

Michel
+1  A: 

As I understand, you are most concerned with build times, and the maintainability of your library?

First, don't try to "fix" all at once.

Second, understand what you fix. Template complexity is there often for a reason, e.g. to enforce certain use, and make the compiler help you not make a mistake. That reason might sometimes be taken to far, but throwing out 100 lines because "noone really knows what they do" shouldn't be taken lightly. Everything I suggest here can introduce really nasty bugs, you have been warned.

Third, consider cheaper fixes first: e.g. faster machines or distributed build tools. At least, throw in all the RAM the boards will take, and throw out old disks. It does maike a difference. One drive for OS, one drive for build is a cheap mans RAID.

Is the library well documented? That's your best chance at making it Look into tools such as doxygen that help you create such a documentation.

All considered? OK, now some suggestions for the build times ;)


Understand the C++ build model: every .cpp is compiled individually. That means many .cpp files with many headers = huge build. This is NOT an advise to put everything into one .cpp file, though! However, one trick (!) that can speed up a build immensely is to create a single .cpp file that includes a bunch of .cpp files, and only feed that "master" file to the compiler. You can't do that blindly, though - you need to understand the types of errors this could introduce.

If you don't have one yet, get a separate build machine that you can remote into. You'll have to do a lot of almost-full builds to check if you broke some include. You will want to run this in another machine, that doesn't block you from working on something else. Long term, you'll need it for daily integration builds anyway ;)

Use precompiled headers. (scales better with fast machines, see above)

Check your header inclusion policy. While every file should be "independent" (i.e. include everything it needs to be included by someone else), don't include liberally. Unfortunately, I haven't yet found a tool to find unnecessary #incldue statements, but it might help to spend some time removing unused headers in "hotspot" files.

Create and use forward declarations for the templates you use. Often, you can incldue a header with forwad declarations in many places, and use the full header only in a few specific ones. This can greatly help compile time. Check the <iosfwd> header how the standard library does that for i/o streams.

overloads for templates for few types: If you have a complex function template that is useful only for a very few types like this:

// .h
template <typename FLOAT> // float or double only
FLOAT CalcIt(int len, FLOAT * values) { ... }

You can declare the overloads in the header, and move the template to the body:

// .h
float CalcIt(int len, float * values);
double CalcIt(int len, double * values);

// .cpp
template <typename FLOAT> // float or double only
FLOAT CalcItT(int len, FLOAT * values) { ... }

float CalcIt(int len, float * values) { return CalcItT(len, values); }
double CalcIt(int len, double * values) { return CalcItT(len, values); }

this moves the lengthy template to a single compilation unit.
Unfortunately, this is only of limited use for classes.

Check if the PIMPL idiom can move code from the headers into .cpp files.

The general rule that hides behind that is separate the interface of your library from the implementation. Use comments, detail namesapces and separate .impl.h headers to mentally and physically isolate what should be known to the outside from how it is accomplished. This exposes the real value of your library (does it actually encapsulate complexity?), and gives you a chance to replace "easy targets" first.


More specific advise - and how useful the one given is - depends largely on the actual library.

Good luck!

peterchen