views:

478

answers:

4

union members may not have destructors or constructors. So I can't template the following class Foo on my own MyClass if MyClass has a constructor:

template<class T>
struct Foo {
  T val;
  Foo(T val_) : val(val_) {}
  size_t hash() const {
    union {T f; size_t s;} u = { val };
    return u.s;
  }
};
struct MyClass {
  bool a;
  double b;
  MyClass(bool a_, double b_) : a(a_), b(b_) {}
};

If I do it anyway I get this error:

member 'MyClass Foo<T>::hash() const 
[with T = MyClass]::<anonymous union>::f' with constructor 
not allowed in union

To get around it I created MyClass with an awkward construction function, which copies the thing around first:

struct MyClass {
  bool a;
  double b;
};
MyClass createMyClass(bool a, double b) {
  MyClass m;
  m.a = a;
  m.b = b;
  return m;
}

But I'm wondering if there is a better way than using this createMyClass function. A constructor would be more efficient, and as a critical component MyClass and Foo<MyClass> are constructed millions of times in my code.

std::pair

I'm also a bit surprised that it is possible to use std::pair in the union:

Foo<std::pair<bool, double> > f2(std::make_pair(true, 3.12));

To my knowledge, std::pair (see code) has a constructor?

+5  A: 

EDIT: My original stance on std::pair was wrong, it shouldn't be allowed in a union. For a class to be a valid member of a union it must have a trivial constructor according to standard 9.5.1. The definition of a trivial constructor is this, from paragraph 12.1.5:

If there is no user-declared constructor for class X, a default constructor is implicitly declared. An implicitly-declared default constructor is an inline public member of its class. A constructor is trivial if it is an implicitly-declared default constructor and if:

  • its class has no virtual functions and no virtual base classes, and
  • all the direct base classes of its class have trivial constructors, and
  • for all the nonstatic data members of its class that are of class type (or array thereof), each such class has a trivial constructor

Paragraph 20.2.2.2 states that the following constructor must be available in a pair:

pair(const T1& x, const T2& y);

as soon as this constructor is supplied no default constructor will be implicitly declared.

The funny thing here is that my compiler (Visual Studio 2008) seems to give std::pair special treatment. If I copy the code from the std::pair implementation and place it in my own namespace foo the unions don't work :)

namespace foo {
    template<class _Ty1, class _Ty2> struct pair {
        typedef _Ty1 first_type;
        typedef _Ty2 second_type;
        pair() : first(_Ty1()), second(_Ty2()) {
        }
    }
}

//This doesn't work in VC2008
union Baz {
    foo::pair<bool, double> a;
    int b;
}
//This works in VC2008
union Buz {
    std::pair<bool, double> a;
    int b;
}

Your solution is a common way of getting around this problem. I usually prepend the class name with a C (short for construct) to partially mimic the ordinary constructor syntax, this would in your case become CMyClass(a, b).

As Steve and Matthieu has pointed out you're not using a very good hash function though. Firstly there's no real guarantee (I think, please correct me if I'm wrong) that f and s in the union will even partially occupy the same memory space, and secondly even if they in practice will probably will share the first min(sizeof(s), sizeof(f)) bytes this means that for MyClass you're only hashing on part of the value. In this case you will hash on the value of the bool a, in this case there's two options:

  1. Your compiler uses int as the internal representation for the bool in which case your hash function will only return two values, one for true and one for false.
  2. Your compiler uses char as the internal representation for the bool. In this case the value will probably be padded to at least sizeof(int), either with zeroes in which case you have the same situation as 1. or with whatever random data is on the stack when MyClass is allocated which means you get random hash values for the same input.

If you need to hash by the entire value of T I would copy the data into a temporary buffer like Steve suggests and then use one of the variable-length hash functions discussed here.

Andreas Brinck
`std::pair` doesn't have a trivial constructor - it's not implicitly-declared. GCC (4.1.2) gives the same error when I try to put one in a union; perhaps some compilers are more lenient.
Mike Seymour
How can the constructor be trivial, since pair has a user-defined constructor taking two parameters?
Luc Touraille
@Mike Hmm, maybe I don't understand the definition of implicitly defined correctly. The standard says: "An implicitly-declared default constructor is an inline public member of its class" + a bunch of more stuff which seems to apply to `std::pair`'s constructor?
Andreas Brinck
@Luc the constructor taking two parameters isn't the default constructor.
Andreas Brinck
@Andreas: from 12.1.5, "If there is no user-declared constructor, a default constructor is implicitly declared". `pair`'s constructors are all user-declared, so not implicitly-declared, so non-trivial.
Mike Seymour
@Mike Just read the passage a little bit more careful. Thanks!
Andreas Brinck
So, to conclude, the fact that my compiler does not complain when I use the `pair` means I found a compiler bug. It's in `gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)`.
+2  A: 

Regarding the use of an std::pair as a union member, I think it should be disallowed. The standard says (§12.1):

A union member shall not be of a class type (or array thereof) that has a non-trivial constructor.

So any class with a user-defined constructor cannot be used in a union, since the default constructor will no longer be implicitly declared. Now in the specification of std::pair (§20.2.2), it is explicitly stated that pair implementations must provide a parameterized constructor to initialize both values. Consequently, either the pair implementation or union implementation you use does not comply to the standard.

N.B. : Testing the code you gave on Comeau gives the following error:

"ComeauTest.c", line 8: error: invalid union member -- class
          "std::pair<bool, double>" has a disallowed member function
      union {T f; size_t s;} u = { val };
               ^
          detected during instantiation of "unsigned int Foo<T>::hash() const
                    [with T=std::pair<bool, double>]" at line 22
Luc Touraille
Thanks for the explanation and trying it out with `Comeau`. I use `gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)`, and it does not complain when I use std::pair. It does have the union constructor with two args, so I guess its `union` implementation is buggy.
+2  A: 

I have only one question: why using a union ?

From what I understand, the hash should correspond to the first few bytes of your objects. If you are going to do this, why not:

size_t hash() const {
  return reinterpret_cast<size_t>(val);
}

which should accomplish the same trick (I think) with more efficiency since there is no allocation of an object of size sizeof(T) on the stack.

Matthieu M.
+2  A: 

I would replace this:

size_t hash() const {
    union {T f; size_t s;} u = { val };
    return u.s;
}

With this:

size_t hash() const {
    size_t s = 0;
    memcpy(&s, &val, std::min(sizeof(size_t), sizeof(T)));
    return s;
}

Copies the smaller of the two sizes rather than the larger, and if memcpy is an intrinsic on your compiler then you're looking good for optimisation. Most importantly, though, it doesn't matter what constructors T has.

It's not a good hash function, though, if T is a large type. In your example MyClass, you might find that bool and size_t are the same size in your implementation, hence the double doesn't participate in the hash at all so there are only two possible hashed values.

Still, it could be worse. If T has any virtual functions, you'll probably find that all instances hash to the same value: the address of the vtable...

Steve Jessop
+1 I was looking at the union-thing so intently that I totally missed the bad hash function.
Andreas Brinck