views:

332

answers:

9

Hi all.

Problem: I have a pretty big structure with POD variables, and I need to copy around some fields, but not others. Too lazy to write down a member-by-member copy function.

Solution: move the copy-able fields to the base, assign the base. Like this:

struct A
{
   int a, b, c;
};

struct B : public A
{
    int d, e, f;
};

//And copy:

B x, y;
(A&)x = y; //copies the part of B that is A

Now, this is dirty, I know. I had a live, livid discussion with co-workers re: this code, my competence, and my moral character. Yet the hardest specific charge I heard was "d, e, f are not initialized in the copy". Yes I know; that was the intent. Of course I initialize them elsewhere.

Another charge was "unsafe typecast". But this is a guaranteed-safe typecast to the base class! Is's almost like

((A*)&x)->operator=(b);

but less verbose. The derivation is public; so treating B as A is fair game. There's no undefined behavior, as far as I can see.

So, I'm appealing to the collective wizdom of SO. This is an invitation to criticism. Have a go at it, people.

EDIT: the final line of the snippet can be expanded into less offensive code in more than one way. For example:

void Copy(A& to, const A& from)
{
    to = from;
}

B x, y;
Copy(x, y);

Functionally the same. Or like this:

x.A::operator=(y);

EDIT2: there's no maintenance programmer but me. It's from a hobby project. So stop pitying that poor soul.

+2  A: 

Problem: I have a pretty big structure with POD variables, and I need to copy around some fields, but not others. Too lazy to write down a member-by-member copy function.

Then don't. (But then assignment will assign everything.)

Also, if you want slicing, you could do:

B b;
A a;
a = b;

Just having uninitialized members seems dirty indeed.

UncleBens
"Assign everything" is what I'm trying to avoid in the first place. And both source and target of copying are instances of B.
Seva Alekseyev
The target of the COPY is an A - you told it so!
Alex Brown
And why exactly does the target have to be B? - Well, perhaps it isn't that bad, if in the same function you are going to set the uninitialized fields in some other ways (e.g computed values).
UncleBens
+1 If you want only the fields of `A`, then use a `A`.
Matthieu M.
+1  A: 

The problem is that you are initialising a B using an A constructor. I'm surprised compilers allow this.

This means when you go to access the B, and use functions which expect it to have initialised c,d,e, it will not work as expected and may crash. Also if it's actually a class and has virtual functions, it will have a different vtable from that expected by the compiler.

Watch this:

#include <stdio.h>

void f(A& y)
{
  B x;
  x.d = 5;
  printf("%i\n",x.d);
}

void g(A& y)
{
  B x;
  (A&)x = y;                                                  
  printf("%i\n",x.d);                                         
}                                                             

main()                                                        
{                                                             
  B z;                                                        
  z.d = 3;                                                    
  f(z);                                                       
  g(z);                                                       
} 

and compile and run

g++ 1.cc
./a.out
5
5

explain that, please?

Alex Brown
I'm assigning, not initializing. And it does work as expected. :) Here's the train of thought: a reference to B can be safely cast to a reference to A. If you have two references to A, you can copy between them. The compiler-generated copy code for A does not care whether this A is actually a B.
Seva Alekseyev
updated with example of undesireable behaviour.
Alex Brown
Good one. Looks as if the copy operator is somehow virtual...
Seva Alekseyev
no, it's a problem with uninitialised data on the stack.
Alex Brown
I take that back. It's an artifact of stack layout in GCC. You're printf'ing the value of d, which is not initialized by design.
Seva Alekseyev
+3  A: 

Yes; this is dirty - you're intentionally slicing because you're too lazy to write B::CopyVariablesThatIWant(const B&). You're abusing the type system in a way that works for you, however you will most likely confuse and/or enrage any future programmers who have to look at your code and figure out it's intent.

Your coworkers are right, you should be ashamed of yourself.

Terry Mahaffey
Please explicate which part of the type system I'm abusing. The autogeneration of struct copy code? The public inheritance? The cast to the base? For the record, it's not work code, it's from a hobby project that no one will maintain but me.
Seva Alekseyev
@Seva: There are certain bodily functions that, however natural and desirable, I'd prefer you didn't do in public. Code like this is in the same category. If you're absolutely sure that the only person who will ever see this is you, then just don't tell me about it.
David Thornley
Around 20% of responders here don't consider this a bodily function. And the title actually says "dirty hack"; you've been warned :)
Seva Alekseyev
Terry Mahaffey
Well, so is overloading << for the purposes of stream I/O. I'm in a good company. :)
Seva Alekseyev
+3  A: 

Why not just do a member struct and do things explicitly:

struct A
{
    int a, b, c;
};

struct B
{
    A top_;
    int d, e, f;
};

//And copy:

B x, y;
x.top_ = y.top_;

And in my opinion the dirtiest part is unnecessary obfuscation of the code. Six month from now some poor soul will curse you trying to understand why in Batman's name it's done this way.

Nikolai N Fetissov
'Cause then every time I want to access a member of A given a reference to B, I have to type "_top.". And the code will be much heavier and harder to read. There's much more field access than copying in the program, so complicating field access to simplify copying is a bad idea.
Seva Alekseyev
Well, write a[n inline] function to do that then.
Nikolai N Fetissov
Thanks for the edit @atk
Nikolai N Fetissov
Seva, the decision might be more properly phrased as "which is true, A is-a B, or A has-a B." If it's the first then Nikolai's solution is incorrect; if it's the second then Nikolai's solution sounds more correct from a design standpoint. Either way, deciding whether to use inheritance or composition based on how much typing you have to do is unsavory. :)
Curt Nichols
A: 

You could get in trouble in the following situation:

struct A {
  char a;
  int b;
};

struct B : public A {
  char c;
};

a is allocated at offset 0 and b is allocated at offset 4 in the structure (padding to align b). c can be allocated at offset 1 because that space is not being used by A. If that happens, your assignment may clobber c (if A's copy constructor copies the unused padding, which it is allowed to do).

I'm not sure whether allocating c in A's padding space is allowed in C++. I wrote a Java compiler that does it, though, so it isn't unprecedented.

Keith Randall
Data members are required to be in order in C++, base class first, so this can't happen.
David Thornley
@Keith: No, that is not possible. The "sliced" copying is not allowed to clobber `B::c`. Once again, there's nothing "hackish" about what the OP does and the behavior is perfectly defined by the language. It is just that the specific use of that copying mentioned in the OP doesn't look too elegant, to say the least.
AndreyT
+2  A: 

What are the semantics that make some fields needed for the copy but not others?

The semantics of operator=() are that afterwards the two objects will be have equivalent observable state. (That is, after a = b;, a == b should return true.) Why you would want to violate those semantics and confuse your maintenance programmers is the real question. What possible long-term benefit do you see to not explicitly writing your MinimalClone() function, versus the long-term harm to ease of understanding your code?

Edit: There's always a maintenance programmer, unless you delete the code just after compilation. I can't count the number of times I've returned to something I wrote months prior and said "what was I thinking?!?" Be kind to your maintenance programmer, even if it's you.

Bill
There's no maintenance programmer; see around.
Seva Alekseyev
+6  A: 

It all depends on the context and on which part is supposedly "dirty" here.

Firstly, the "sliced copying" trick is technically legal. Formally, it is not really a "hack". You can also achieve the same result by using the qualified name of the assignment operator to refer to the operator from A

x.A::operator =(y); // same as `(A&) x = y` in your case

It is starting to look familiar, isn't it? Yes, that's exactly what you would do if you had to implement the assignment operator in the derived class, if you suddenly decided to do it manually

B& B::operator =(const B& rhs) 
{
  A::operator =(rhs); // or `this->A::operator =(rhs)`

  // B-specific part goes here
}

The A::operator =(rhs); part is exactly the same "sliced copying" trick as yours above, however in this case it is used in a different context. Nobody would, of course, blame you for the latter use, since that's how it is normally done and how it should be done. So, again, the "dirtiness" of the specific application of the trick depends on the context. It is perfectly fine as an integral part of the implementation of the derived assignment operator, but it might look highly questionable when used "by itself" as in your case.

However, secondly and more importantly, what I would call "dirty" in your case is not the use of the trick with "sliced copying" itself. From what you described in your question, it looks like you actually split your data structure into two classes (A and B) specifically for the purpose of being able to use the aforementioned trick. This is what I would call "dirty" in this case. Not the "sliced copying" trick itself, but rather the use of inheritance for the sole purpose of enabling the "sliced copying". If you did it just to avoid writing the assignment operator manually, that's an instance of blatant laziness. That's what's "dirty" here. I wouldn't recommend using inheritance for such purely utilitarian purposes.

AndreyT
Exactly; had I written x.A::operator =(y); no one would ever object :)) But I wasn't sure it's legit, since the struct does not override the assignment operator.
Seva Alekseyev
We would still object! It is only in the implementation of `B::operator=` that it is allowed. Why don't you simply do `x = y` anyway, the default generated version of `B::operator=` is made for those!
Matthieu M.
'Cause I don't want d, e, f to be copied.
Seva Alekseyev
To me the main difference of using the `A::operator=` within the `B::operator=` as compared to using it in outside code is that `B` has control over all its invariants. Consider that `B` could have data is dependent on `A` subobject, if `B::operator=` uses `A::operator=` as a step is a detail of implementation: it can still provide the same invariants in a later step (within the same operation). Doing it from outside will not allow the `B` part of the object to know that the `A` part has changed, possibly breaking invariants.
David Rodríguez - dribeas
... consider a `vehicle` that contains a `wheels` field, and is extended by `truck` that has a vector of `wheels` elements conaining the `nominal_tyre_pressure`. If external code changes the `vehicle` subobject without the `truck` subobject knowledge, it will be unable to resize the `nominal_tyre_pressure` vector breaking an invariant and possibly leading to UB (access elements beyond the end of the vector) since `truck` code will assume the appropriate size.
David Rodríguez - dribeas
Um-kay, that's legitimate. The technique does not generalize to arbitrary classes, specifically the ones that have invariants that involve both base and derived. Point taken.
Seva Alekseyev
A: 

How about:

struct A
{
    int a;
    int b;
    int c;
};
struct B: public A
{
    int d;
    int e;
    int f;
};

int main()
{
    B   x;
    B   y;
    static_cast<A&>(x)  = y;

}
Martin York
I think i'm missing something. Isn't that what he does?
Johannes Schaub - litb
It is, though using C-cast. I don't think that the `dirty` refer to the type of casting used though ;)
Matthieu M.
No difference. It just looks neater in C++ syntax rather than C (No C cast or pointers)
Martin York
A: 

The big issue I have against this is that it saves you only a little bit of work, and leaves a big "Huh?" for the next guy to come around. Commenting it is no better, as it still leaves something extra to be comprehended and is more typing than just doing it right would be.

After decades of poring over assorted code, I've come to really resent anything that gets in the way of just reading and understanding.

David Thornley
After decades of poring over code, you can't just read and understand a single assignment statement? :) Or you're thinking of some hypothetical "maintenance programmer"? Or you're making a general statement to the effect of "don't try to be smart, kid"?
Seva Alekseyev
Oh, I can understand it all right. However, since it's screwy, I have to make a small mental effort to figure out what it's doing and why. Individually, it's no big deal (except for the impression it gives me of the original coder). Start putting more of those in the code and reading it gets slower and more arduous. Generally, it's harder to understand code than it was to write it in the first place. If you're as clever and cutsie as you can be when writing it, comprehending it is going to be a real annoyance.
David Thornley
Which brings me back to the original question: what exactly is screwy here? You never answered to that. Do you think it's screwy 'cause I just told you so? Gimme a technical argument.
Seva Alekseyev