I have a class that holds an array of elements, and I want to give it a GetSize member function. But what return type should I give that function?

I'm using the pimpl idiom, and so in the header file it is not known what the implementation will use to store the elements. So I cannot just say std::vector<T>::size_type, for example:

class FooImpl;

class Foo {
  FooImpl* impl_;
public:
  TYPE GetSize(); // what TYPE??
};
A: 

A size_type is usually used to hide the underlying integer type (short vs. long vs. long long, etc.). Just define your own Foo::size_type.

EricSchaefer
+2  A: 

If the client code can only see Foo (which is the purpose of the pimpl idiom), then there's no use in defining a specific size_type in the concrete implementation: it won't be visible or accessible to the client anyway. Standard containers can do that since they are built on so-called "compile-time polymorphism", while you are specifically trying to use a [potentially] run-time implementation hiding method.

In your situation the only option is to pick an integer type that "should be enough for all possible implementations" (unsigned long, for example) and stick with it.

Another possibility is to use the uintptr_t type, if it is available in your implementation (it is standardized in C99 and, since C++11, in <cstdint> as an optional type). This integer type is supposed to cover the entire storage address range available to the program, which means that it will always be sufficient for representing the size of any in-memory container. Note that other posters often use the same logic but incorrectly conclude that the appropriate type to use here is size_t. (This is usually a result of a lack of experience with non-flat memory model implementations.) If your containers are always based on physical arrays, size_t will work. However, if your containers are not always array-based, size_t is not even remotely the correct type to use here, since its range can be smaller than the maximum size of a non-contiguous (non-array-based) container.

But in any case, regardless of which type you end up using, it is a good idea to hide it behind a typedef-name, just as is done in the standard containers.
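
A minimal sketch of that advice, assuming unsigned long is the "wide enough" type and that the implementation happens to store its elements in a std::list. The Add member, the items field, and the split into foo.h / foo.cpp are hypothetical and only there to make the example complete; the original question shows only GetSize.

```cpp
#include <list>

// --- what would live in foo.h ---
class FooImpl;  // defined only in the implementation file

class Foo {
public:
    // Assumption: unsigned long is wide enough for every planned implementation.
    // Clients write Foo::size_type and never see the underlying choice.
    typedef unsigned long size_type;

    Foo();
    ~Foo();
    void Add(int value);        // hypothetical helper so the count can change
    size_type GetSize() const;

private:
    Foo(const Foo&);            // copying disabled to keep the sketch short
    Foo& operator=(const Foo&);
    FooImpl* impl_;
};

// --- what would live in foo.cpp ---
class FooImpl {
public:
    std::list<int> items;       // hypothetical storage; could be swapped for a vector
};

Foo::Foo() : impl_(new FooImpl) {}
Foo::~Foo() { delete impl_; }

void Foo::Add(int value) { impl_->items.push_back(value); }

Foo::size_type Foo::GetSize() const {
    // Convert from the container's own size_type to the public one.
    return static_cast<size_type>(impl_->items.size());
}
```

If the storage later changes, only the typedef and the .cpp file need to be revisited; client code that uses Foo::size_type keeps compiling.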

AndreyT
+1  A: 

The generic type for sizes in C++ is size_t. I'd use that.

I mean generic in the non-technical sense. This has nothing to do with templates.

Looks like this came up before: http://stackoverflow.com/questions/1951519/when-to-use-stdsize-t

edit

After much discussion, I'm going to slightly amend my answer.

If size_t is as wide as a pointer, use size_t. If not, use an unsigned integer type of the same width as a pointer.
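
One way to express that rule at compile time, as a rough sketch: it assumes a C++11 compiler and that the optional std::uintptr_t type exists and is at least as wide as void* on the target (true on common platforms, but not something this sketch can promise everywhere). The name count_type is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical name: count_type.
// Use size_t when it is at least as wide as a data pointer,
// otherwise fall back to uintptr_t (assumed to be available).
typedef std::conditional<(sizeof(std::size_t) >= sizeof(void*)),
                         std::size_t,
                         std::uintptr_t>::type count_type;

// Sanity checks for the assumptions stated above.
static_assert(std::is_unsigned<count_type>::value,
              "count_type should be an unsigned integer type");
static_assert(sizeof(count_type) >= sizeof(void*),
              "count_type should be at least as wide as a data pointer");
```

On a flat-memory platform this simply resolves to size_t; the fallback only matters where size_t is narrower than a pointer.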

Steven Sudit
In the OP's question "size" is not really size, but rather "container element count", as I understand it. `size_t` is a generic type for *object size*, and in the general case it cannot be used to represent the container element count. If the container is array-based, `size_t` will work, but only in this case.
AndreyT
Why can't it? It holds an array in his case, and arrays are limited to `size_t` as well, meaning that it can at most hold `size_t/sizeof(element_type)` elements, which will always be <= `size_t`
jalf
@jalf: I assume that the term "array" was used by the OP to mean "a linear container", not necessarily based on a physical C++ array. The OP explicitly states later that "it is not known what the implementation will use to store the elements".
AndreyT
@AndreyT: Ok, in that case you're right, although I think it's a somewhat academic objection. On most modern systems, and unless you're doing something extremely esoteric, `size_t` will be safe to use. And since the OP controls the pImpl class, modifying the code if it does require larger collections than `size_t` allows is fairly easy.
jalf
@jalf: "Enough" or "not enough" does not make a difference here. `size_t` represents a completely different concept, it is not intended and not supposed to be used to represent the container element count. When `size_t` is used for that purpose, it exposes the fact that the container is array-based. And if it is not really array-based, then using `size_t` is a major and obvious design error. Moreover, even if it is indeed array-based, `size_t` in interface notably defeats the purpose of using pimpl, since it "unhides" a major property of the implementation.
AndreyT
@AndreyT: By design, size_t is an unsigned integer big enough to hold the size of any allocation. As a result of this attribute and the sizeof(array)/sizeof(element) idiom, it has become a de facto standard for sizes, including element counts. In what way would it be preferable to use something along the lines of `unsigned __int64`?
Steven Sudit
@Steven: The key point here is *one* allocation. It is a de facto standard for containers where the element count is limited by a *single allocation*, i.e. for a *physical array*, as I said above several times. If the container stores its elements through multiple allocations, `size_t` immediately becomes out of the question. That's the whole point.
AndreyT
@AndreyT: Are you talking about virtual collections, where the items are too large to all fit into memory at once?
Steven Sudit
@Steven Sudit: No, I'm talking about ordinary in-memory collections based on multi-block allocation. Like an ordinary linked list, for example, with each element allocated independently. `size_t` is not the correct type to use to count elements in a linked list. `size_t` is only appropriate with arrays.
AndreyT
@AndreyT: *Why* is size_t inappropriate for a list?
Steven Sudit
every integral type "represents a different concept" though. The same could be said for `int` or `unsigned long` or anything else. Ultimately, he's going to have to pick a type that is able to represent the range of sizes he's dealing with, and use that. Of course it can be hidden under a typedef, giving it a more relevant name, but I don't really see how it's a worse choice than any other integral type.
jalf
Given that the `size_type` for the standard library containers is `size_t`, and this is the type returned by their `size()` member functions (which return the number of elements in the container), it seems that `size_t` is indeed appropriate for the count of the number of elements in a container.
James McNellis
@Steven Sudit: `size_t` is inappropriate for a list because it implements a completely different concept and, as a consequence, in the general case it is not large enough to hold the count. `size_t` can hold the size of the *largest array* (the exact limit is determined by the implementation), while the total number of elements is not limited by that. In short, the total number of elements in a list can be greater than `size_t` can accommodate.
AndreyT
@James McNellis: Absolutely incorrect. `std::allocator::size_type` is `size_t`. `std::vector::size_type` is *usually* `size_t` (for obvious reasons). The `size_type` of every other container has no relation to `size_t` whatsoever.
AndreyT
Apparently, many posters here still have the illusion that `size_t` is somehow supposed to be able to "cover" the entire address range of the platform, which is why they claim that it should be enough. Apparently, some posters believe that `size_t` is somehow supposed to work as an integer counterpart for the pointer type on the given platform. This is, of course, incorrect. `size_t` can be smaller than the pointer type (some IBM platforms with 128-bit pointers and 32-bit `size_t` come to mind as examples, not to mention Win16).
AndreyT
If you want to base your decisions on this specific logic, the appropriate type to use would be uintptr_t. *That* type is the type with the required range, which will always be sufficient. size_t, on the other hand, is completely and utterly inappropriate in this case. I also suggest you read the link in the post and think about why the accepted (and correct) answer explicitly states that it can also be used as an index in *arrays* specifically.
AndreyT
@AndreyT: It may well be that there are memory architectures where a single allocation can only return a subset of the total pool (due to partitioning) and there are certainly on-disk structures too large to be loaded into memory. For those two cases, you could make an argument that size_t isn't quite right, although I would argue that it's fine for everything else. Still, I'm not comfortable with uintptr_t, even if it's the right bit-width, simply because its name promises a pointer, not a size. I'm not sure that there's a good *general* answer available.
Steven Sudit
@Steven Sudit: There's no need to bring on-disk structures into it. A classic example, which I already mentioned, is the DOS/Win16 platform in the standard memory model, where `size_t` is/was 16 bits, while the number of entries in a linked list could easily exceed 65535. This is a clear and exact example of `size_t` being insufficient for an in-memory container. Once again, the OP does not need to represent *object size*. The OP needs to represent *object count*. These two concepts are completely different and unrelated.
AndreyT
@AndreyT: Ok, but object size is identical to object count when the object is a byte, and is proportional otherwise. To put it another way, any integer wide enough to hold the size is also guaranteed to be wide enough for the count. For that matter, this usually applies to any int wide enough to hold the pointer, which is why you suggested uintptr_t, I suspect.
Steven Sudit
@Steven Sudit: You are forgetting that in C and C++ the object size limit is *not guaranteed* to be as large as the entire addressable storage available on a given platform. Again, the *object size* can easily be limited to 64K bytes, while the *object count* can easily exceed 64K individual objects. Take your example with bytes. On a specific platform, the maximum size of a byte array can be limited to 64K, while if you `malloc` each byte individually (as in a linked list), you can create, say, 1 million individual byte objects.
AndreyT
As for `uintptr_t`, yes, I suggested it because it is guaranteed to be wide enough to hold the pointer, while `size_t` is *not guaranteed* to be wide enough to hold the pointer. That is why `size_t` is generally not sufficient: in the extreme case you can have as many individual objects as there are different pointer values (or a number proportional to that).
AndreyT
@AndreyT: I'm not sure you understood my comment. If a linked list is spread across 4 64k blocks, then a 16-bit size_t will only be big enough to hold the count if each object is at least 4 bytes wide (including a forward pointer). This is a safe bet, in this case. In general, the 32-bit pointer used to reference it is guaranteed to be wide enough to contain the count, even if each object is a byte. Having said that, a pointer type is not a good size type. So my conclusion is that we want an unsigned int that's as wide as the pointer.
Steven Sudit
@Steven Sudit: I don't understand the importance of your example with 4 64K blocks. What if the list occupies 16 or 32 64K blocks, while each object is, say, 8 bytes long? This can easily overflow a 16-bit `size_t`. I agree (as I did many times before) with the point about the unsigned integer that's as wide as a pointer. And that would be `uintptr_t`. Not `size_t`.
AndreyT
A: 

You could follow the STL's lead and make size_type a typedef that relies on FooImpl:

#include <stddef.h>  // size_t
#include <stdint.h>  // uintptr_t (optional type; assumed to be available)

template<typename T> class s_type {
    public:
    typedef size_t type; // good default
};

class FooImpl;

// I'm only doing this specialization to show how it's done
// not because I think it's needed.  In general I'd use size_t.
template<> class s_type<FooImpl> {
    public:
    typedef uintptr_t type;
};

class Foo {
  FooImpl* impl_;

  public:
  // typedef syntax is "typedef <existing type> <new name>;"
  typedef s_type<FooImpl>::type size_type;
  size_type GetSize();
};
Max Lybbert
If I understand correctly, this solves a similar but distinct problem.
Steven Sudit
The question is "what return type should I give that function?" The sticky part is that the return type relies on FooImpl's implementation. This solution gives a return type that relies on FooImpl's implementation. How is this a distinct problem from the OP?
Max Lybbert