How do you determine the length of an unsigned char*?

+6  A: 

For the actual size of the pointer:

size_t s = sizeof(unsigned char*);

If you want the length of the string:

unsigned char* bla = (unsigned char*)"blabla";
size_t s = strlen((char*)bla);
Joce
"blabla" yields a read-only string, so bla should be const unsigned char*.
Bastien Léonard
This should not compile. "blabla" is a const char*, and you can't assign a const char* to an unsigned char* without casting.
Brian Neal
That's not assignment - it's initialization - done all the time.
Steve Fallows
Brian was right. That did not compile. I've edited it so it does now.
Joce
There's a third option missing: the length of the allocated array of chars. This differs from the number of chars before the zero-terminator.
xtofl
After looking around, actually your "string" interpretation is wrong. There's a nice answer about that, too: http://stackoverflow.com/questions/75191/unsigned-char/87648#87648 .
xtofl
xtofl: For the 3rd option, there's no way to know if the array was dynamically allocated other than keeping the size value around. For a statically allocated array, sizeof would work.
Joce
xtofl: Unsigned char string is just to give an example. In any case, strlen would return the correct value, whether the chars are signed or not.
Joce
A: 

By unsigned char * I suppose you mean the string located at that pointer. In that case it would be:

strlen(your_string_pointer)

However, this only finds the position of the first \0. There is no guarantee that this matches the size of the allocated memory block.
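To illustrate the difference between the string length and the size of the memory block, here is a minimal sketch. The names text_length and unused_bytes are made up for this example, and the 64-byte buffer is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: number of bytes before the first '\0'.
std::size_t text_length(const char* s) {
    assert(s != nullptr);
    return std::strlen(s);
}

// Hypothetical demo: how many bytes of a fixed buffer are NOT counted by strlen.
std::size_t unused_bytes() {
    char buffer[64] = "abc";   // 64 bytes reserved; 3 hold text, the rest are zero
    // sizeof works here only because buffer is a real array in this scope,
    // not a pointer.
    return sizeof(buffer) - text_length(buffer);   // 64 - 3
}
```

If buffer were passed to another function as a char*, sizeof would give the pointer size instead, so the 64 would have to be passed along separately.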

Coincoin
http://www.cplusplus.com/reference/clibrary/cstring/strlen... strlen takes a "const char*", not an unsigned.
xtofl
+4  A: 

There could be two meanings to this. Do you just want to know how big the pointer type is? If so, then Joce's answer is correct:

size_t size = sizeof(unsigned char*);

If you want to know how many elements the pointer points to, that's a bit more complex. If this is a C-style string, then strlen or some variant is your best option.

However, if this is just a pointer to unsigned char which has no relation to a C-style string, then there is no way to reliably achieve what you're looking for. C and C++ do not associate a length field with a pointer. You'll need to pass the length around with the pointer, or use a class like vector which stores both the pointer and the length.
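A rough sketch of both approaches, assuming C++. The function name sum_bytes is invented for illustration: the first overload takes the length as a separate parameter, the second relies on std::vector knowing its own size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical function: the caller must supply the element count,
// because the pointer alone does not carry it.
unsigned sum_bytes(const unsigned char* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    unsigned total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}

// Same operation on a std::vector, which stores its length alongside the data.
unsigned sum_bytes(const std::vector<unsigned char>& bytes) {
    return sum_bytes(bytes.data(), bytes.size());
}
```

Byte data such as { 1, 2, 3, 0, 4 } contains an embedded zero, so strlen would stop after three elements; an explicit count or a vector covers all five.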

JaredPar
You're right about not being able to retrieve the allocated length - a bad language design decision in many people's eyes. You're wrong about there being two meanings: it might be that you need to know the length of the contained zero-terminated string, though it would be better to use a signed char. Make it two-and-a-half :)
xtofl
A: 

Do you want the size of the pointer itself, which would be an int? Or do you want the length of the string that is being pointed to? For the latter, use strlen. For example:

Size of the pointer: sizeof(unsigned char*)
Size of the string: strlen(unsigned char*)

strlen counts bytes, so a multibyte character is reported as several bytes rather than as one character.
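A small sketch of the multibyte point, assuming the text is UTF-8 encoded (the original answer does not name an encoding). The names byte_count and e_acute_utf8 are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: strlen counts bytes, not characters.
std::size_t byte_count(const char* s) {
    assert(s != nullptr);
    return std::strlen(s);
}

// The single character "é" written out as its two UTF-8 bytes.
// The array also holds the terminating '\0', so it is 3 bytes in total.
const char e_acute_utf8[] = "\xC3\xA9";
```

Here byte_count(e_acute_utf8) is 2 even though a reader sees one character.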

Rohit
Actually it would be a size_t
Tom
Right. I should've said size_t
Rohit
A: 

If you're using C++, and it's a string in an unsigned char*, you're better off first putting it into a std::string before manipulating it. That way you can do all kinds of things to it and still be able to get the length() and/or capacity() of it whenever you want.

I'm assuming that you're doing things to said array to make its size non-constant. If you're just allocating, setting, and forgetting, you can always store the actual allocation size of the array in a separate variable - or better, make a struct/class.

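// WARNING: memory issues (freeing, copying) are not addressed here.
#include <cstdlib>  // std::malloc, std::size_t

struct myStringStruct
{
  unsigned char * string;
  std::size_t len;  // allocated size in bytes

  void allocate(std::size_t size) {
    len = size;
    // malloc returns void*, which C++ does not convert implicitly
    string = static_cast<unsigned char *>(std::malloc(sizeof(unsigned char) * len));
  }
};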

Any more complex than that and you're re-inventing std::string.
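For comparison, a sketch of the std::string route. The helper name to_string_copy is made up here, and it assumes the byte count is already known to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: copy `count` bytes from an unsigned char buffer
// into a std::string, which then tracks its own length and capacity.
std::string to_string_copy(const unsigned char* data, std::size_t count) {
    assert(data != nullptr);
    // std::string stores char, so the pointer type has to be converted.
    return std::string(reinterpret_cast<const char*>(data), count);
}
```

After the copy, length() gives the number of bytes stored, capacity() gives the space currently reserved, and both stay correct as the string is modified.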

cyberconte
+1  A: 

In an ideal world, you don't. You use char* for C-style strings (which are NUL-terminated and you can measure the length of), and unsigned char* only for byte data (which comes with its length in another parameter or whatever, and which you probably get into an STL container ASAP, such as vector<unsigned char> or basic_string<unsigned char>).

The root problem is that you can't make portable assumptions about whether the storage representations of char and unsigned char are the same. They usually are, but they're allowed not to be. So there are no string-like library functions which operate on unsigned char*, only on char*, and it is not in general safe to cast unsigned char* to signed char* and treat the result as a string. Since char might be signed, this means no casting unsigned char* to char*.

However, 0 is always the same value representation in unsigned char and char. So in a non-ideal world, if you've got a C-style string from somewhere but it has arrived as an unsigned char*, then you (a) cast it to char* and get on with it, but also (b) find out who did this to you, and ask them please to stop.
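Step (a) might look like the sketch below. The wrapper name ustrlen is hypothetical, and it assumes the data really is a NUL-terminated string that merely arrived with the wrong pointer type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: length of a NUL-terminated string that was
// handed over as unsigned char*.
std::size_t ustrlen(const unsigned char* s) {
    assert(s != nullptr);
    // strlen only looks for the zero byte, and zero has the same
    // representation in char and unsigned char.
    return std::strlen(reinterpret_cast<const char*>(s));
}
```

Keeping the cast in one small function at least confines it to a single place in the code.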

Steve Jessop