views:

89

answers:

4

I developed a generic "Unsigned" class, or really a class template Unsigned<size_t N> that models after the C (C++) built-in unsigneds using the amount of uint8_ts as a parameter. For example Unsigned<4> is identical to a uint32_t and Unsigned<32> would be identical to a uint256_t -- if it existed.

So far I have managed to follow most if not all of the semantics expected from a built-in unsigned -- in particular sizeof(Natural<N>)==N, (Natural<N>(-1) == "max_value_all_bits_1" == ~Natural<N>(0)), compatibility with abs(), sign(), div (using a custom div_t structure), ilogb() (exclusive to GCC it seems) and numeric_limits<>.

However I'm facing the issue that, since 1.- a class template is just a template so templated forms are unrelated, and 2.- the template non-typed parameter requires a "compile-time constant", which is way stricter than "a const", I'm essentially unable to create a Unsigned given an unknown N.

In other words, I can't have code like this:

...
( ... assuming all adequate headers are included ...)
using namespace std;
using lpp::Unsigned;
std::string str;
cout<< "Enter an arbitrarily long integer (end it with <ENTER>) :>";
getline(cin, str, '\n'); 
const int digits10 = log10(str.length()) + 1; 
const int digits256 = (digits10 + 1) * ceil(log(10)/log(256)); // from "10×10^D = 256^T"
// at this point, I "should" be able to, semantically, do this:
Unsigned<digits256> num; // <-- THIS I CAN'T -- num would be guaranteed 
                         // big enough to hold str's binary expression, 
                         // no more space is needed
Unsigned::from_str(num, str); // somehow converts (essentially a base change algo)
// now I could do whatever I wanted with num "as if" a builtin.
std::string str_b3 = change_base(num, 3); // a generic implemented somehow
cout<< "The number above, in base 3, is: "<< str_b3<< endl;
...

(A/N -- This is part of the testsuite for Unsigned, which reads a "slightly large number" (I have tried up to 120 digits -- after setting N accordingly) and does things like expressing it in other bases, which in and of itself tests all arithmethic functions already.)

In looking for possible ways to bypass or otherwise alleviate this limitation, I have been running into some concepts that I'd like to try and explore, but I wouldn't like to spend too much effort into an alternative that is only going to make things more complicated or that would make the behaviour of the class(es) deviate too much.

The first thing I thought was that if I wasn't able to pick up a Unsigned<N> of my choice, I could at least pick up from a set of pre-selected values of N which would lead to the adequate constructor being called at runtime, but depending on a compile-time value:

???? GetMeAnUnsigned (size_t S) {
  switch (S) {
    case 0: { throw something();  } // we can't have a zero-size number, right?
    case 1, 2, 3, 4: { return Unsigned<4>(); break; }
    case 5, 6, 7, 8: { return Unsigned<8>(); break; }
    case 9, 10, 11, 12, 13, 14, 15, 16: { return Unsigned<16>(); break; }
    ....
    default: { return Unsigned<128>(); break; } // wow, a 1Kib number!
    } // end switch
  exit(1); // this point *shouldn't* be reachable!
  } // end function

I personally like the approach. However I don't know what can I use to specify the return type. It doesn't actually "solve" the problem, it only degrades its severity by a certain degree. I'm sure doing the trick with the switch would work since the instantiations are from compile-time constant, it only changes which of them will take place.

The only viable help to declare the return type seems to be this new C++0(1?)X "decltype" construct which would allow me to obtain the adequate type, something like, if I understood the feature correctly:

decltype (Unsigned<N>) GetMeAnUnsigned (size_t S) {
  .. do some choices that originate an N
  return Unsigned<N>();
  }

... or something like that. I haven't entered into C++?X beyond auto (for iterators) yet, so the first question would be: would features like decltype or auto help me to achieve what I want? (Runtime selection of the instantiation, even if limited)

For an alternative, I was thinking that if the problem was the relation between my classes then I could make them all a "kind-of" Base by deriving the template itself:

template <size_t N>
class Unsigned : private UnsignedCommon { ...

... but I left that approach in the backburner because, well, one doesn't do that (make all a "kind-of") with built-ins, plus for the cases where one does actually treat them as a common class it requires initializing statics, returning pointers and leave the client to destruct if I recall correctly. Second question then: did I do wrong in discarding this alternative too early?

A: 

You can't directly do that. Each unsigned with a separate number has a separate type, and the compiler needs to know the return type of your method at compile time.

What you need to do is have an Unsigned_base base class, from which the Unsigned<t> items derive. You can then have your GetMeAnUnsigned method return a pointer to Unsigned_base. That could then be casted using something like dynamic_cast<Unsigned<8> >().

You might be better off having your function return a union of the possible unsigned<n> types, but that's only going to work if your type meets the requirements of being a union member.

EDIT: Here's an example:

struct UnsignedBase
{
    virtual ~UnsignedBase() {}
};

template<std::size_t c>
class Unsigned : public UnsignedBase
{
    //Implementation goes here.
};

std::auto_ptr<UnsignedBase> GiveMeAnUnsigned(std::size_t i)
{
    std::auto_ptr<UnsignedBase> result;
    switch(i)
    {
    case 42:
        result.reset(new Unsigned<23>());
    default:
        result.reset(new Unsigned<2>());
    };
    return result;
}
Billy ONeal
Aha, this pair of possible solutions seems interesting. I have considered the `UnsignedBase`approach but as I said I discarded it early because I though it was much "unlike" the way. You say by returning a pointer I could get somewhat around the issue? Also... what are the requirements for a union member? It is the first time I hear about something like that. Anything to do with storage being contiguous, the type being constructible, or the like?
Luis Machuca
@Luis: If you don't return a pointer, you'll run into the slicing problem. I believe the requirements for being a union member are the same as being a POD type, but I'm not positive. I do know you cannot have objects with non trivial destructors in a union, which might be a deal breaker, if `Unsigned<c>` needs to contain something like a Vector.
Billy ONeal
Billy: thanks for the clarifications. Unsigned<c>'s destructor is trivial AFAIK, as an Unsigned<c> essentially implements a uint8_t[c], there are nontrivial constructors however, I'm not sure about the semantics with arrays there. That might indicate that the OO approach might be the way to go; I'll wait for confirmation that the new C++0X features can/can't help to return the types (considering the switch case above) before considering this answer as the Definitive One (for my particular problem). Thanks!
Luis Machuca
@Luis: Keep in mind: Trivial means that your class, as well as any members of your class, must be using the *compiler generated destructor*, and that destructor must *not* be virtual. There must be NO virtual methods or members with virtual methods whithin. As a result, you can use the union in restricted circumstances, OR you can use the base class, but not both.
Billy ONeal
@Billy: a quick question -- seeing that only the destructor is declared explicitly in the base class: is it required to be virtual and does that requirement change upon the internal implementation of `Unsigned` (eg.: array members v/s POD members)?
Luis Machuca
@Luis: Once virtual, always virtual; It need only be virtual in the base class.
Billy ONeal
Excellent... now I'm on my way to something... thanks!
Luis Machuca
+1  A: 

The problem you are facing is quite common. Templates are resolved at compile time, while you need to change your behavior at runtime. As much as you might want to do that with the mythical one extra layer of indirection the problem won't go away: you cannot choose the return type of your function.

Since you need to perform the operations based on runtime information you must fall back to using dynamic polymorphism (instead of the static polymorphism that templates provide). That will imply using dynamic allocation inside the GetMeAnUnsigned method and possibly returning a pointer.

There are some tricks that you can play, like hiding the pointer inside a class that offers the public interface and delegates to an internal allocated object, in the same style as boost::any so that the user sees a single type even if the actual object is chosen at runtime. That will make the design harder, I am not sure how much more complex the code will be, but you will need to really think on what is the minimal interface that you must offer in the internal class hierarchy to fulfill the requirements of the external interface --this seems like a really interesting problem to tacke...

David Rodríguez - dribeas
Thanks, David. I too was thinking that "one extra layer of indirection" was going to blow at some point, but not *this* early. "boost::any" sounds interesting, there's many boost::things I have to explore yet.
Luis Machuca
A: 

It's a very common problem indeed, last time I saw it was with matrices (dimensions as template parameters and how to deal with runtime supplied value).

It's unfortunately an intractable problem.

The issue is not specific to C++ per se, it's specific to strong typing coupled with compile-time checking. For example Haskell could exhibit a similar behavior.

There are 2 ways to deal with this:

  • You use a switch not to create the type but actually to launch the full computation, ie main is almost empty and only serve to read the input value
  • You use boxing: you put the actual type in a generic container (either by hand-crafted class or boost::any or boost::variant) and then, when necessary, unbox the value for specific treatment.

I personally prefer the second approach.

The easier way to do this is to use a base class (interface):

struct UnsignedBase: boost::noncopyable
{
  virtual ~UnsignedBase() {}
  virtual UnsignedBase* clone() const = 0;

  virtual size_t bytes() const = 0;

  virtual void add(UnsignedBase const& rhs) = 0;
  virtual void substract(UnsignedBase const& rhs) = 0;
};

Then you wrap this class in a simple manager to ease memory management for clients (you hide the fact that you rely on heap allocation + unique_ptr):

class UnsignedBox
{
public:
  explicit UnsignedBox(std::string const& integer);

  template <size_t N>
  explicit UnsignedBox(Unsigned<N> const& integer);

  size_t bytes() const { return mData->bytes(); }

  void add(UnsignedBox const& rhs) { mData->add(rhs.mData); }
  void substract(UnsignedBox const& rhs) { mData->substract(rhs.mData); }

private:
  std::unique_ptr<UnsignedBase> mData;
};

Here, the virtual dispatch takes care of unboxing (somewhat), you can also unbox manually using a dynamic_cast (or static_cast if you know the number of digits):

void func(UnsignedBase* i)
{
  if (Unsigned<2>* ptr = dynamic_cast< Unsigned<2> >(i))
  {
  }
  else if (Unsigned<4>* ptr = dynamic_cast< Unsigned<4> >(i))
  {
  }
  // ...
  else
  {
    throw UnableToProceed(i);
  }
}
Matthieu M.
+1  A: 

In a nutshell, your problem is no different from that of the built-in integral types. Given a short, you can't store large integers in it. And you can't at runtime decide which type of integer to use, unless you use a switch or similar to choose between several predefined options (short, int, long, long long, for example. Or in your case, Unsigned<4>, Unsigned<8>, Unsigned<256>. The size cannot be computed dynamically at runtime, in any way.

You have to either define a dynamically sized type (similar to std::vector), where the size is not a template parameter, so that a single type can store any type of integer (and then accept the loss of efficiency that implies), or accept that the size must be chosen at compile-time, and the only option you have for handling "arbitrary" integers is to hardcode a set of predefined sizes and choose between them at runtime.

decltype won't solve your problem either. It is fairly similar to auto, it works entirely at compile-time, and just returns the type of an expression. (The type of 2+2 is int and the compiler knows this at compiletime, even though the value 4 is only computed at runtime)

jalf
Hah! Now _that_'s a point I was missing, runtime selection is something you can't do with the ints either. I could arrange for a set of hardcoded types following the "uintX_t" notation (for example a `Unsigned256_t` would be `Unsigned<32>`) and clients would be able to live with that, I mean I don't know of many people who use 3 or 7-byte integers... thanks for the hint on `decltype`.
Luis Machuca