tags:

views:

312

answers:

5
+14  A: 

Where's the destructor of ustring? Shouldn't it free the allocated memory?


Let's look e.g. at the assignment operator:

ustring operator=(ustring input)
{
    ustring result(input);
    ...
    return *this;
}

You pass a ustring parameter by value. This will likely result in a copy being created via the copy constructor. You invoke copy construction another time to initialize result. I suspect that you invoke copy construction yet another time when you return *this as a ustring (again by value, instead of by reference).

Let's look at one of these three cases, namely result: When this local variable goes out of scope (ie. at the end of the function), it will be automatically destroyed. But if you don't provide a destructor (~ustring) that frees the allocated memory, you'll get a memory leak.

And since you apparently have lots of copy construction and pass-by-value going on without having a destructor for freeing the allocated memory, you'll get lots and lots and lots of unfreed memory.


Besides: Why do you use malloc and free instead of new[] and delete[]? You could get rid of the ugly sizeof(int) * ... calculations and (int*) type casts. Instead of:

values = (int *) malloc(sizeof(int) * len);

you'd simply write:

values = new int[len];
stakx
And why would you use `new[]` and `delete[]` when you could use `std::vector`?
jamesdlin
Of course one could, it's the OP's decision what's best in his case. I simply figured that the transition from `malloc/free` to `new[]/delete[]` is probably more gentle than going straight to the STL (which IMO has quite a steep learning curve if you're just starting out).
stakx
The answer to why he doesn't use new/delete is all that reallocing in the middle. I doubt it's worth re-writing that just to get slightly shorter calls to allocate memory. Use of a vector would greatly improve things, so is worth the curve. Or even better ditch this class entirely, use a `std::string` for UTF-8 strings, and either a `std::wstring` or `vector<uint32_t>` for wide-character data depending whether you agree with your platform's definition of a wide character, and make `encode` a free function.
Steve Jessop
@Steve Jessop: You're right. (Although I think one can easily get around `realloc` with `new/delete`, it just takes a *little* extra work.) I'd also agree that the OP is sort of reinventing the wheel by writing his own Unicode strings library from scratch... but I guess it's more fun understanding and programming something yourself than just re-using something that already exists.
stakx
@stakx: oh, I don't mean to object to DIY as a learning experience - when I say use `std::string`, if he wants the experience of writing his own string class then sure, do that instead. I disagree with mixing up resource management in the same class as character encoding conversion, though.
Steve Jessop
I need malloc/realloc, because the number of characters parsed is unknown (UTF-8 is variable-length and can have errors parsed). new and delete don't allow this, and I'd like to avoid the STL.
Delan Azabani
+2  A: 
  ustring operator=(ustring input) {
    ustring result(input);
    free(values);
    len = input.len;
    values = input.values;
    return * this;
  }

Why is result being declared in this?

Amber
-1 This is not an answer, this is a question.
Space_C0wb0y
Technically, it's both in a sense. http://en.wikipedia.org/wiki/Socratic_method
Amber
+1, not because it answers the question (it doesn't), but because it hints at a possible improvement of the OP's code.
stakx
A: 
  1. Try using RAII-techniques to avoid problems with memory leaks. If you wrap your arrays in something like a boost::shared_array you will not have to write an explicit destructor to free the memory.
  2. ustring operator=(ustring input) should be ustring operator=(const ustring & input) to avoid a copy of the argument. This should always be done if you pass a non-integral type as read-only parameter.
  3. Using shared_array are something like it also gets rid of the problem of calculating size.

Just some hints. On another note, your code is incredibly hard to read and understand. You should really do something about that if you expect other persons to ever work with that. Make your names more descriptive, refactor recurring code into functions that express what it does. For instance instead of using a for-loop to copy your values, you could use std::copy. You may also consider to replace all your magic numbers (e.g. 0x1f) by constants whose names express the semantics of the value.

Space_C0wb0y
+2  A: 

As soon as You implement the destructor, this implementation of assignment will bite You

  ustring operator=(const char * input) {
    ustring result(input);
    free(values);
    len = result.len;
    values = result.values;
    return * this;
  }

No, it is not an argument for leaving the destructor not implemented.

EDIT:

ustring object owns an array allocated with malloc and should free that memory when it gets destroyed.

If you create a local object

    ustring result(input);

It will get destroyed when the function returns.

In the line

    values = result.values;

You copy the pointer from Your local object - result, but result doesn't know about it, so it should destroy the array and leave *this with a dangling pointer. You have to change the state of result in a way, that prevents it from freeing the memory, because You took the ownership of the array from it. One way to do that could be setting it's pointer to null. Calling free on a null pointer is legal, so You don't have to worry about an attempt to free memory by result's destructor.

ustring& operator=(const char * input) {
    ustring result(input);
    free(values);
    len = result.len;
    values = result.values;
    result.values = 0;
    return * this;
}
Maciej Hehl
+1 But may-be explain how it bites You?
UncleBens
Great advice, but unfortunately your solution doesn't work. The memory usage still grows heavily. I've also tried free(result.values) in place of "result.values = 0" but that deleted my string.
Delan Azabani
@Delan Azabani It was an advice. just that. I didn't imply that my fix will solve Your whole problem with all leaks. But the problem I was talking about was real and could be a source of confusion after fixing the rest of the code. values and result.values after the assignment point to the same piece of memory so of course freeing one frees what the other points to ( the same thing). Setting result.values to 0 prevents freeing the memory it pointed to before. The function ustring operator=(ustring input) has the same problem. There are many things to fix in Your code and others pointed it out.
Maciej Hehl
+4  A: 

Basically, you create far too many ustrings, you need a LOT more references, and you didn't implement a destructor so when they all fall off the stack, they don't get freed.

Also, when in your assignment operator, you need to set result.values to NULL, else the memory will be deleted. You could use a move operator to make this a nice fast operation, although I still don't understand why you would.

DeadMG