I've been working on an unfamiliar codebase that uses UChar* for strings. According to gdb, UChar is defined as follows:

(gdb) ptype UChar
type = short unsigned int

However, when I try to print these in gdb, I just get the address. I can also index into the pointer and retrieve the values of each character.

Is there any way to print a variable of type UChar* from within gdb and get back a meaningful string?

Also, this is on OS X, if that makes any difference.

A: 

You first need to figure out what a UChar actually represents. It is most likely UTF-16 or UCS-2, in either big-endian or little-endian byte order. Once you know that, add a debug function to your program that converts the string to UTF-8 (you can probably reuse existing code, such as iconv) and call it from gdb. See http://www.skynet.ie/~caolan/TechTexts/GdbUnicodePrinting.html for details.
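
As a rough illustration of such a debug function, here is a minimal sketch in C. It assumes UChar holds native-endian UTF-16 code units, as the ptype output in the question suggests. The function name uchar_debug_utf8 and the fixed 1024-byte buffer are made up for this example; they do not come from the linked page or from any library.

```c
#include <stddef.h>

typedef unsigned short UChar;  /* assumption: matches the ptype output in the question */

/* Hypothetical debug helper: converts a NUL-terminated, native-endian UTF-16
   string to UTF-8 in a static buffer. Not thread-safe, and the output is
   silently truncated if it does not fit. */
const char *uchar_debug_utf8(const UChar *s)
{
    static char buf[1024];
    size_t n = 0;

    /* Keep room for one 4-byte sequence plus the terminating NUL. */
    while (s && *s && n + 4 < sizeof buf) {
        unsigned long cp = *s++;

        if (cp >= 0xD800 && cp <= 0xDBFF && *s >= 0xDC00 && *s <= 0xDFFF) {
            /* High surrogate followed by low surrogate: combine the pair. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (*s++ - 0xDC00);
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  /* unpaired surrogate: emit the replacement character */
        }

        if (cp < 0x80) {
            buf[n++] = (char)cp;
        } else if (cp < 0x800) {
            buf[n++] = (char)(0xC0 | (cp >> 6));
            buf[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            buf[n++] = (char)(0xE0 | (cp >> 12));
            buf[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            buf[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            buf[n++] = (char)(0xF0 | (cp >> 18));
            buf[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            buf[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            buf[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    buf[n] = '\0';
    return buf;
}
```

If something like this is compiled into the program being debugged, it should be callable from gdb with "print uchar_debug_utf8(varname)", and a UTF-8 terminal would then show the text.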

Matthew Flaschen
Note that current CVS versions of GDB integrate libiconv, so external hacks like above should no longer be necessary.
Employed Russian
Employed Russian, care to post details as an answer? I haven't seen that before. Although I don't know how new the OS X gdb is.
Matthew Flaschen
A: 

If it is an ASCII string, you might try telling gdb to reinterpret the pointer:

(gdb) print (char*) theUcharPtr
Demi
That probably won't work: in UTF-16 the high byte of common characters (i.e. ASCII) is null, and C strings are null-terminated.
Matthew Flaschen
A: 

x takes the same format specifiers as print. "x/1s 0x1234" prints the memory at that address as a string; if you keep hitting carriage return, it prints the next string, and so on.

If you want to monitor something continually, use display/ with the same format specifier as x (or print). For example, after "display/1s 0x1234", gdb prints the updated value every time you stop at a breakpoint or single-step.

RandomNickName42
+1  A: 

Just define this command in your .gdbinit and type uc varname (uc will likely work as a short form of the ucharprint command you define):

define ucharprint
echo "
set $c = (unsigned short*)$arg0
while ( *$c )
  if ( *$c > 0x7f )
    printf "[%x]", *$c
  else
    printf "%c", *$c
  end
  set $c++
end
echo "\n
end

You don't need to worry about endianness, since each unsigned short in your UTF-16 UChar string holds a code unit (a whole code point, or one half of a surrogate pair) as a native binary integer.
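
To make the endianness point concrete, here is a small sketch in C; the two helper names are invented for illustration. Reading a unit through a 16-bit pointer, as the gdb command above does, yields the same integer on any host, while the first byte in memory depends on the host byte order.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned short UChar;  /* assumption: matches the ptype output in the question */

/* Hypothetical helper: the value seen when indexing the string as 16-bit
   integers, which is what the gdb command does with *$c. */
unsigned unit_as_integer(const UChar *s, size_t i)
{
    return s[i];
}

/* Hypothetical helper: the first byte in memory of unit i. This is what a
   (char*) view would see first, and it depends on host byte order. */
unsigned first_byte_of_unit(const UChar *s, size_t i)
{
    unsigned char bytes[sizeof(UChar)];
    memcpy(bytes, &s[i], sizeof bytes);
    return bytes[0];
}
```

For the unit 0x0041 ('A'), unit_as_integer returns 0x41 everywhere, whereas first_byte_of_unit returns 0x41 on a little-endian host and 0x00 on a big-endian one.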

Ben Bryant