Hello all,

You have a class that many libraries depend on. You need to modify the class for one application. Which of the following changes require recompiling all libraries before it is safe to build the application?

  • add a constructor
  • add a data member
  • change destructor into virtual
  • add an argument with default value to an existing member function

Thank you

+10  A: 

Strictly speaking, you end up in Undefined Behavior land as soon as you do not recompile for any of those reasons.

That said, in practice you might get away with a few of them:

  • add a constructor

Might be Ok to use as long as

  1. it's not the first user-defined constructor of the class
  2. it's not the copy constructor
  • add a data member

This changes the size of instances of the class. Might be Ok for anyone who just uses pointers or references.

  • change destructor into virtual

This changes the class' virtual table, so it needs recompilation.

  • add an argument with default value to an existing member function

Since default arguments are inserted at the call site, everyone using this needs to recompile. (However, using overloading instead of default arguments might let you get away with that.)
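To make the overloading suggestion concrete, here is a minimal sketch. The class `Logger` and its `format` functions are hypothetical names made up for illustration. The idea is to leave the existing declaration alone and add a second overload, rather than adding a defaulted parameter to the existing one. Whether old binaries keep working still depends on your platform's ABI, so treat this as an illustration, not a guarantee.

```cpp
#include <cassert>
#include <string>

// Hypothetical class; the names are illustrative, not from the thread.
class Logger {
public:
    // Existing function: declaration and exported symbol stay unchanged,
    // so already-compiled callers still resolve to it.
    std::string format(const std::string& msg) const;

    // New overload, added instead of rewriting the declaration above as
    // format(const std::string& msg, int indent = 0).
    std::string format(const std::string& msg, int indent) const;
};

// These definitions would live in the library's .cpp file.
std::string Logger::format(const std::string& msg) const {
    return format(msg, 0);  // old entry point forwards to the new one
}

std::string Logger::format(const std::string& msg, int indent) const {
    assert(indent >= 0);
    return std::string(static_cast<std::size_t>(indent), ' ') + msg;
}
```

Old callers keep calling `format(msg)`; only code that wants indentation uses the two-argument overload.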

Note that any inlined member function could render any of the above wrong, since the code of those is directly embedded (and optimized) in the clients' code.

However, the safest bet would be to just recompile everything. Why is this an issue?

sbi
Good answer. Some teachers are sometimes concerned about weird things. Anyone who uses a decent makefile system shouldn't care about recompilation and changes.
ereOn
I disagree (and downvoted). Constructor: it depends on which constructor was added; if it is the default or copy constructor, it will most probably break any code that used the implicitly defined versions. Data member: unless it is added at the end with the same access level, this changes the memory layout, and any code that depends on the layout will fail, even when accessing through references/pointers. Virtual destructor: this affects not only code using the destructor; adding a new virtual method changes the memory footprint of the vtable, so members may be assigned to other slots and function calls get dispatched to the wrong code.
David Rodríguez - dribeas
@ereOn - not so simple. If you supply a binary only library it is very important to know if you are breaking binary compatibility. Software for which you do not have source code may break if BC is broken
doron
@doron: What do you mean by "binary only library"? I guess you don't mean "a library without headers"? Anyway, since determining when changes in a header cause binary changes is difficult and error-prone, I assume that any decent library provider will recompile its library when some interface changes.
ereOn
@David: Re dctor/cctor: fair enough, I've addressed this now. Re class layout: I don't see what this might be except for inlined functions, which I have addressed. Re virtual dtor: You're right, I've fixed that, too.
sbi
@ereOn: Things are not so simple. One simple situation where this kind of question arises is plugins. As much as possible you would like the newest version of your program to accept existing decoders. API/ABI changes that break this compatibility imply that all of the existing plugins need to be recompiled for the newest version... and in some cases you will not have that possibility and will depend on third parties to update their side.
David Rodríguez - dribeas
@sbi: That is an improvement, but I still feel that 'add member' looks much simpler than it is. If the destructor is being made virtual, that seems to imply that the class is meant to be derived from. Any code that explicitly or implicitly casts derived objects to the class type will fail: in code seeing the old definition, the offset of the subobject is smaller than it is in real life, so the cast will end up adjusting the pointer incorrectly, and then anything can happen.
David Rodríguez - dribeas
@ereOn - I mean any time where you do not have full source code that you can recompile. Yes, when you change a header, you recompile the library, but sometimes you want to upgrade a library that interacts with a binary that cannot be recompiled. In those instances, you need to be really sure that the changes made will not break the other binary that cannot be recompiled. And it is jolly hard to be 100% sure.
doron
+3  A: 

All of them require recompiling all the libraries that use the class (provided they include the .h file).

Armen Tsirunyan
+1  A: 

By using an ordinal export .def file to maintain the Application Binary Interface, you can avoid client recompilation in many cases:

  • Add a constructor

    Export this constructor function at the end of the export table, with the largest ordinal number. Any client code that doesn't call this constructor need not be recompiled.

  • Add a data member

    This is a break if client code manipulates class object directly, not through pointer or reference.

  • Change destructor into virtual

    This is probably a break if your class doesn't have any other virtual function, because the class now needs a vptr, which increases the object size and changes the memory layout. If your class already has a vptr, moving the destructor to the end of the vtable won't affect object layout in terms of backward compatibility. But if a client class is derived from your class and defines its own virtual functions, then it breaks. Also, any client calling the original non-virtual destructor will break.

  • Add an argument with default value to an existing member function

    This is definitely a break.

Sheen
Sheen, #2 is not Ok when clients directly manipulate instances of the class (as opposed to manipulating pointers or references) or inlined member functions access data.
sbi
sbi, thanks and I've updated my answer.
Sheen
@Sheen - unless you know how your platform guarantees v-table ordering, you will find almost any change to the v-table will break BC
doron
doron, thanks for comment.
Sheen
Sheen, you need to properly @address people in comment replies, or your replies won't show up on their replies tab. (I only saw your reply by accident.)
sbi
+3  A: 

sbi's answer is pretty good (and deserves to be voted up to top). However I think I can expand the "maybe ok" into something more concrete.

  • Add a constructor

    If the constructor you've added is the default constructor (or indeed a copy constructor) then you have to be careful. If they were not previously declared, the compiler will have generated them automatically, so a recompilation is required to ensure clients use the constructor that has actually been implemented. For this reason I tend to always hide or define these constructors for classes that form part of an API.
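As a rough sketch of "hide or define", assuming C++11 or later (older code would declare the copy operations private instead of using `= delete`), something like the following could be used. The class name `Session` and its members are hypothetical, made up for this example.

```cpp
#include <type_traits>

// Hypothetical API class; the name and members are illustrative only.
class Session {
public:
    Session();                                    // "defined": declared here, body lives in the library
    ~Session();
    Session(const Session&) = delete;             // "hidden": clients cannot copy
    Session& operator=(const Session&) = delete;
    int id() const;

private:
    int id_;
};

// These definitions would live in the library's .cpp file, so clients call
// the library's code instead of a compiler-generated version.
Session::Session() : id_(0) {}
Session::~Session() {}
int Session::id() const { return id_; }

// Compile-time confirmation of what clients can and cannot do.
static_assert(std::is_default_constructible<Session>::value, "default ctor is available");
static_assert(!std::is_copy_constructible<Session>::value, "copying is hidden");
```

Because every special member is declared from day one, adding an implementation later changes only the library's object code, not what the compiler generates in client code.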

CodeButcher
Indeed. It is good practice to hide the famous implicit functions unless the design needs them.
Sheen
Ah, good point which I forgot about. That makes me wonder how much we overlook. Really, it's best not to do this at all.
sbi
Yes, best to get it right the first time (if that is ever possible ;) - speaking from experience, supporting a library with binary and source compatibility requirements for 3rd parties can put you in a situation where you have to know what you can (and cannot) get away with safely.
CodeButcher
+6  A: 

Classes are defined in the header file. The header file will be compiled into both the library that implements the class and the code that uses the class. I am assuming that you are taking as a given that you will need to recompile the class implementation after changing the class header file and that the question you are asking is whether you will need to recompile any code that references the class.

The problem that you are describing is one of binary compatibility (BC) and generally follows these rules:

  1. Adding non-virtual functions anywhere in the class does not break BC.
  2. Changing any function signature (e.g. adding parameters) will break BC.
  3. Adding virtual functions anywhere changes the v-table and therefore breaks BC.
  4. Adding data members will break BC.
  5. Changing a parameter from non-default to default will not break BC.
  6. Any changes to inline functions will break BC (inline function should therefore be avoided if BC is important.)
  7. Changing compiler (or sometimes even compiler versions) will probably break BC unless the compilers adhere strictly to the same ABI.

If BC is a major issue for the platform you are implementing it could well be a good idea to separate out the interface and implementation using the Bridge pattern.

As an aside, the C++ language does not deal with the Application Binary Interface (ABI). If binary compatibility is a major issue, you should probably refer to your platform's ABI specification for more details.

Edit: updated adding data members. This will break BC because more memory will now be needed for the class than before.

doron
That's a pretty good answer (`+1` from me). However, you might want to add something about inlined functions.
sbi
@doron, great comprehensive list. I have updated my answer accordingly. Thanks. COM was invented to deal with ABI issue.
Sheen
Using a different C++ compiler (or sometimes using different versions of the same C++ compiler) may break BC. G++ has changed ABIs a couple times in the last decade.
Ken Bloom
@Ken added comment to list
doron
@doron: re #7: so will changing the version of any library used, including the std lib.
sbi
@sbi yes. If the ABI changes, upgrading the standard library could well break compatibility, although people are a lot more careful about formalizing the ABI and ensuring it does not change. Similarly, using a newer compiler with an older std library can cause problems.
doron
A: 

I am clearly against @sbi's answer: in general you do need to recompile. Only under much stricter circumstances than the ones he posted might you get away with it.

  • add a constructor

If the constructor added is either the default constructor or the copy constructor, any code that used the implicitly defined version of it and does not get recompiled will fail to initialize the object, and that means that invariants required by other methods will not be set at construction, i.e. the code will fail.

  • add a data member

This modifies the layout of the object. Even code that only uses pointers or references needs to be recompiled to adapt to the change in layout. If a member is added at the beginning of the object, every access to any member of the object will be offset and fail.

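#include <iostream>

struct test {
   // int x; // added later: shifts y to a higher offset
   int y;
};
// Compiled against the old definition, this reads at the old offset of y.
void foo( test * t ) {
   std::cout << t->y << std::endl;
}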

If foo was not recompiled, then after uncommenting x it would print t->x instead of t->y. If the types did not match it would be even worse. Theoretically, even if the added member is at the end of the object, if there is more than one access specifier the compiler is allowed to reorder members, and the same issue arises.
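The effect can be simulated inside a single program without relying on undefined behavior. In this sketch, `test_v1` and `test_v2` are hypothetical stand-ins for the old and new definitions of `test`, and `stale_read_of_y` is a made-up helper that mimics what code compiled against the old layout would effectively do.

```cpp
#include <cstddef>
#include <cstring>

struct test_v1 { int y; };          // layout the stale code was compiled against
struct test_v2 { int x; int y; };   // layout after adding x in front

// Mimics what stale code effectively does: read an int at the offset
// that y had in test_v1, but on an object that really is a test_v2.
int stale_read_of_y(const test_v2& t) {
    int value;
    std::memcpy(&value,
                reinterpret_cast<const char*>(&t) + offsetof(test_v1, y),
                sizeof value);
    return value;
}
```

With a `test_v2` holding `x = 10` and `y = 20`, the stale read returns 10: the old offset of `y` now points at `x`.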

  • change destructor to virtual

If it is the first virtual method it will change the layout of the object and get all of the previous issues plus the addition that deleting through a reference to the base will call the base destructor and not be dispatched to the correct method. In most compilers (with vtable support) it can imply a change in the memory layout of the vtable for the type, and that means that the wrong method can be called and cause havoc.
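A small sketch of the layout change, using two hypothetical structs that stand in for the class before and after the edit. The size comparison is not guaranteed by the C++ standard. It reflects how mainstream compilers implement virtual dispatch, by storing a hidden vtable pointer in each object.

```cpp
#include <type_traits>

// Hypothetical "before" and "after" versions of the same class.
struct shape_v1 { ~shape_v1() {} int id; };          // non-virtual destructor
struct shape_v2 { virtual ~shape_v2() {} int id; };  // destructor made virtual

// True when the edit turned a non-polymorphic class into a polymorphic one.
bool became_polymorphic() {
    return !std::is_polymorphic<shape_v1>::value &&
            std::is_polymorphic<shape_v2>::value;
}

// On typical implementations the object grows by a hidden vtable pointer,
// so stale code would allocate too little memory and use wrong offsets.
bool object_grew() {
    return sizeof(shape_v2) > sizeof(shape_v1);
}
```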

  • add an argument with default value

This is a change in function signature, all code that used the method before will need to be recompiled to adapt to the new signature.

David Rodríguez - dribeas
In fact, you are right regarding "add a data member". But I guess @sbi, I, and other people have the implicit assumption that data members are hidden from client code, and that clients can only operate via member functions. Under this assumption there is no BC issue when a client accesses the object via pointer or reference and calls member functions.
Sheen
@Sheen: I usually don't publish the data members directly, but it is a common pattern to provide inlined accessors in the class definition: `class test { int m_size; public: int size() const { return m_size; } /*...*/ };` while there is no direct access to the member as such, the fact is that there is: the function will be inlined and the compiler will inject the offset inside the object at the place of call.
David Rodríguez - dribeas
Remember context of this discussion is serious library design. Implementation exposure is very bad. It implies that inline is very bad. I won't write any public inline function in this scenario.
Sheen
@David: Inlined member functions have been dealt with in my answer.
sbi
@Sheen: I probably would. The std lib is a seriously designed library and full of inlined accessors. You wouldn't want `std::vector::size()` not to be inline. (How well inlining works with polymorphism is another issue. But still, even polymorphic classes might have non-polymorphic parts that could benefit from inlining.)
sbi
@Sheen: lol... I cannot possibly consider the question of whether I need to recompile dancing in the edge of the blade trying not to get cut in the same context as "serious library design"
David Rodríguez - dribeas
A: 

As soon as you change anything in the header file (hpp file), you have to recompile everything that depends on it.

However, if you change the source file (cpp file), you have to recompile just the library that contains the definitions from this file.

The easy way to break physical dependencies, where all libraries in the upper tier need to recompile, is to use the pimpl idiom. Then, as long as you don't touch the header files, you only need to recompile the library whose implementation is being modified.
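A minimal sketch of the pimpl idiom, assuming C++11 for `std::unique_ptr`. The class `Widget` and its members are hypothetical, and the header and source parts are shown in one block only for brevity.

```cpp
#include <memory>
#include <string>

// ---- would live in the header (e.g. a hypothetical widget.hpp) ----
class Widget {
public:
    Widget();
    ~Widget();                                  // defined where Impl is complete
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;

    void set_name(const std::string& name);
    std::string name() const;

private:
    struct Impl;                                // only forward-declared for clients
    std::unique_ptr<Impl> impl_;                // the object holds just this pointer
};

// ---- would live in the source file (e.g. widget.cpp) ----
struct Widget::Impl {
    std::string name;
    // Members added here later do not change the layout of Widget itself.
};

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;
void Widget::set_name(const std::string& name) { impl_->name = name; }
std::string Widget::name() const { return impl_->name; }
```

Clients only ever see the forward declaration of `Impl`, so adding data members to `Impl` touches the source file alone. The cost is an extra allocation and one indirection per call.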

VJo