I'm working with a low-level API that accepts a char* and a numeric value to represent a string and its length, respectively. My code uses std::basic_string and calls into these methods with the appropriate translation. Unfortunately, many of these methods accept string lengths of varying size (e.g. max(unsigned char), max(short), etc.), and I'm stuck writing code to make sure that my string instances do not exceed the maximum length prescribed by the low-level API.

By default, the maximum length of a std::basic_string instance is bounded by the maximum value of size_t (either max(unsigned int) or max(__int64)). Is there a way to manipulate the traits and allocator of a std::basic_string instantiation so that I may specify my own type to use in place of size_t? By doing so, I am hoping to leverage any existing bounds checks within the std::basic_string implementation so I don't have to perform them myself during the translation.

My initial investigation suggests that this is not possible without writing my own string class, but I'm hoping that I overlooked something :)

A: 

Can't you create a class with std::string as parent and override c_str()? Or define your own c_str16(), c_str32(), etc and implement translation there?
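A free-function version of the "implement translation there" idea might look like the sketch below. The names checked_length, low_level_write and write_string are made up for illustration, and the 8-bit length parameter is only an assumed example of the low-level API described in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of any library): returns s.size() converted
// to LenT, or throws std::length_error if the size does not fit in LenT.
template <typename LenT>
LenT checked_length(const std::string& s) {
    // Compare in a wide unsigned type so that neither side is truncated.
    const unsigned long long limit =
        static_cast<unsigned long long>(std::numeric_limits<LenT>::max());
    if (static_cast<unsigned long long>(s.size()) > limit) {
        throw std::length_error("string too long for the target length type");
    }
    return static_cast<LenT>(s.size());
}

// Stand-in for a low-level API taking an 8-bit length (assumed signature).
inline unsigned low_level_write(const char* data, unsigned char len) {
    assert(data != 0);
    return len;  // pretend that `len` bytes were written
}

// Wrapper that does the translation with the length check applied.
inline unsigned write_string(const std::string& s) {
    return low_level_write(s.data(), checked_length<unsigned char>(s));
}
```

This keeps std::string untouched and puts the bounds check in one place per length type, at the cost of checking on every call rather than when the string grows.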

Zepplock
Most standard library classes are not intended to be inherited from (they have no virtual destructors or other virtual methods), so this is not advisable.
Evan Teran
+4  A: 

You can pass a custom allocator to std::basic_string which has a max size of whatever you want. This should be sufficient. Perhaps something like this:

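#include <cstddef>   // std::size_t, std::ptrdiff_t
#include <new>       // std::bad_alloc, ::operator new/delete, placement new
#include <string>    // std::basic_string

template <class T>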
class my_allocator {
public:
    typedef T              value_type;

    typedef std::size_t    size_type;
    typedef std::ptrdiff_t difference_type;
    typedef T*             pointer;
    typedef const T*       const_pointer;
    typedef T&             reference;
    typedef const T&       const_reference;

    pointer address(reference r) const             { return &r; }
    const_pointer address(const_reference r) const { return &r; }

    my_allocator() throw() {}

    template <class U>
    my_allocator(const my_allocator<U>&) throw() {}

    ~my_allocator() throw() {}

    pointer allocate(size_type n, void * = 0) {
        // fail if we try to allocate too much (written as a division so that
        // n * sizeof(T) cannot overflow)
        if(n > max_size() / sizeof(T)) { throw std::bad_alloc(); }
        return static_cast<T *>(::operator new(n * sizeof(T)));
    }

    void deallocate(pointer p, size_type) {
        ::operator delete(p);
    }

    void construct(pointer p, const T& val) { new(p) T(val); }
    void destroy(pointer p)                 { p->~T(); }

    // max out at about 64k
    size_type max_size() const throw() { return 0xffff; }

    template <class U>
    struct rebind { typedef my_allocator<U> other; };

    template <class U>
    my_allocator& operator=(const my_allocator<U> &rhs) {
        (void)rhs;
        return *this;
    }
};

Then you can probably do this:

typedef std::basic_string<char, std::char_traits<char>, my_allocator<char> > limited_string;

EDIT: I've just done a test to make sure this works as expected. The following code tests it.

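#include <exception>
#include <iostream>

int main() {
    limited_string s;
    try {
        s = "AAAA";
        s += s;
        s += s;
        s += s;
        s += s;
        s += s;
        s += s;
        s += s; // 512 chars...
        s += s;
        s += s;
        s += s;
        s += s;
        s += s;
        s += s; // 32768 chars...
        s += s; // this will throw std::bad_alloc
    } catch (const std::exception &e) {
        // caught here so the sizes below still get printed; some library
        // versions may throw std::length_error instead of std::bad_alloc
        std::cout << "caught: " << e.what() << std::endl;
    }

    std::cout << s.max_size() << std::endl;
    std::cout << s.size() << std::endl;
}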

That last s += s will put it over the top and cause a std::bad_alloc exception (since my limit is just short of 64k). Unfortunately, gcc's std::basic_string::max_size() implementation does not base its result on the allocator you use, so it will still claim to be able to allocate more. (I'm not sure if this is a bug or not...)
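Since the string's own max_size() may or may not reflect the allocator, one cautious option is to ask both and take the smaller value. The sketch below assumes C++11 (for std::allocator_traits); effective_max_size and can_append are hypothetical helper names, not standard functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper: the largest size that both the string and its
// allocator claim to support.
template <class Ch, class Tr, class Al>
typename std::basic_string<Ch, Tr, Al>::size_type
effective_max_size(const std::basic_string<Ch, Tr, Al>& s) {
    typedef typename std::basic_string<Ch, Tr, Al>::size_type size_type;
    const Al alloc = s.get_allocator();
    // allocator_traits falls back to a default if Al has no max_size().
    const size_type alloc_limit = std::allocator_traits<Al>::max_size(alloc);
    return std::min(s.max_size(), alloc_limit);
}

// True if `extra` more characters fit without passing that limit.
template <class Ch, class Tr, class Al>
bool can_append(const std::basic_string<Ch, Tr, Al>& s, std::size_t extra) {
    const std::size_t limit = effective_max_size(s);
    return s.size() <= limit && extra <= limit - s.size();
}
```

With limited_string this would let you test can_append(s, s.size()) before the doubling step instead of relying on the exception.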

But this will definitely allow you to impose hard limits on the sizes of strings in a simple way. You could even make the max size a template parameter so you only have to write the code for the allocator once.
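As a rough sketch of the template-parameter idea, here is a C++11-style minimal allocator with the limit as a non-type parameter. The names bounded_allocator, string256 and append_past_limit_throws are invented for this example, and the limit here is counted in elements rather than bytes. Which exception you get (std::bad_alloc from the allocator or std::length_error from the string) depends on the standard library, so the demo catches std::exception.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <new>
#include <string>

// Hypothetical minimal allocator; MaxElems caps every single allocation.
template <class T, std::size_t MaxElems>
struct bounded_allocator {
    typedef T value_type;

    bounded_allocator() noexcept {}
    template <class U>
    bounded_allocator(const bounded_allocator<U, MaxElems>&) noexcept {}

    // Spelled out because of the non-type template parameter.
    template <class U>
    struct rebind { typedef bounded_allocator<U, MaxElems> other; };

    std::size_t max_size() const noexcept { return MaxElems; }

    T* allocate(std::size_t n) {
        if (n > MaxElems) { throw std::bad_alloc(); }
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

// The allocator is stateless, so all instances compare equal.
template <class T, class U, std::size_t N>
bool operator==(const bounded_allocator<T, N>&,
                const bounded_allocator<U, N>&) noexcept { return true; }
template <class T, class U, std::size_t N>
bool operator!=(const bounded_allocator<T, N>&,
                const bounded_allocator<U, N>&) noexcept { return false; }

typedef std::basic_string<char, std::char_traits<char>,
                          bounded_allocator<char, 256> > string256;

// Returns true if growing past the limit throws and leaves the string intact.
inline bool append_past_limit_throws() {
    string256 s(100, 'A');
    assert(s.size() == 100);
    try {
        s.append(1000, 'B');  // far more than 256 elements
    } catch (const std::exception&) {
        return s.size() == 100;
    }
    return false;
}
```

Each limit then only needs a typedef, for example a second typedef with 0xffff in place of 256.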

Evan Teran
I would have added an int template parameter <typename T, int N> and had max_size() return N; then he can do typedef std::basic_string<char, std::char_traits<char>, my_allocator<char, 256> > my256string;
jmucchiello
I think exposing `size_type` as a template argument would be beneficial as then the user is free to choose which `size_type` is best suited for a given string instance. Then, partial template specialization will help with making this type play nice with `basic_string`.
Steve Guidi
+4  A: 

I agree with Evan Teran about his solution. This is just a modification of his solution, nothing more:

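#include <memory>  // std::allocator
#include <new>     // std::bad_alloc, ::operator new

template <typename Type, typename std::allocator<Type>::size_type maxSize>
struct myalloc : std::allocator<Type>
{
    // dependent names need 'typename'
    typedef typename std::allocator<Type>::size_type size_type;
    typedef Type*                                    pointer;

    // rebind must yield myalloc, not the inherited std::allocator
    template <typename U>
    struct rebind { typedef myalloc<U, maxSize> other; };

    myalloc() throw() {}
    template <typename U>
    myalloc(const myalloc<U, maxSize>&) throw() {}

    // hide std::allocator[ max_size() & allocate(...) ]

    size_type max_size() const throw()
    {
        return maxSize;
    }
    pointer allocate(size_type n, void * = 0)
    {
        // fail if we try to allocate too much
        if((n * sizeof(Type)) > max_size()) { throw std::bad_alloc(); }
        return static_cast<Type *>(::operator new(n * sizeof(Type)));
    }
};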

Be aware you should not use polymorphism at all with myalloc. So this is disastrous:

// std::allocator doesn't have a virtual destructor
std::allocator<char>* alloc = new myalloc<char, 1024>;

Just use it as if it were a separate type. It is safe in the following case:

myalloc<char, 1024> alloc; // max size == 1024
AraK
Yeah, I just thought of making the max size a template parameter. Unfortunately, I don't think your solution will work as is, because (at least gcc's) string implementation does not base its max size on the allocator's max size. So I had to make `allocate(...)` throw if it requested more than `max_size` bytes.
Evan Teran
@Evan Your solution is nice, you got my +1 btw. I'll add more code for `allocate` as you clarified :)
AraK