I have a DTrace probe catching calls to a function, and one of the function's arguments is a CFStringRef. This is a private structure that holds a pointer to a Unicode string. But the CFStringRef is not itself a char*, so normal DTrace methods like copyinstr() just return ?cp?, which isn't exactly helpful.

So how can I print out the string in the DTrace action?

A: 

I believe that you can't do this directly, but you can create a custom static probe that feeds in the CFString / NSString as a char *, which you can use with copyinstr(). I describe how to do this in an article here.

Brad Larson
Unfortunately, I'm trying to use this to probe some compiled code that I don't have control over, so changing the source isn't possible.
TALlama
A: 

As far as I know, there is no built-in support for this kind of thing. Usually a library would publish a probe that decodes the string for you (as Brad mentions). Since in your case you can't modify the library, you'll need to use the pid provider to hook into a user function and decode the string yourself.

The solution (which is very similar to the approach you would use in C++ to dump a std::string) is to read the pointer stored at a 2-word offset from the base CFStringRef pointer and print the C string it points to. Note that since a CFString can store strings internally in a variety of formats and representations, this is an implementation detail that is subject to change.

Given the trivial test application:

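#include <CoreFoundation/CoreFoundation.h>
#include <stdio.h>   /* printf */
#include <string.h>  /* strlen */

/* Returns the length of the string, or 0 if no C-string pointer is available. */
int mungeString(CFStringRef someString)
{
    /* May return NULL if the string is not stored in this encoding. */
    const char* str = CFStringGetCStringPtr(someString, kCFStringEncodingMacRoman);
    if (str)
        return (int)strlen(str);
    else
        return 0;
}

int main(int argc, char* argv[])
{
    CFStringRef data = CFSTR("My test data");

    printf("%d\n", mungeString(data));

    return 0;
}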

The following DTrace script will print the string value of the first argument, assuming it is a CFStringRef:

#!/usr/sbin/dtrace -s

/*
    Dumps a CFStringRef parameter to a function,
    assuming MacRoman or ASCII encoding.
    The C-style string is found at an offset of
    2 words past the CFStringRef pointer.
    This appears to work in 10.6 in 32- and 64-bit
    binaries, but is an implementation detail that
    is subject to change.

    Written by Gavin Baker <gavinb.antonym.org>
*/

#pragma D option quiet

/* Uncomment for LP32 */
/* typedef long ptr_t; */
/* Uncomment for LP64 */
typedef long long ptr_t;

pid$target::mungeString:entry
{
    printf("Called mungeString:\n");
    printf("arg0 = 0x%p\n",arg0);

    this->str = *(ptr_t*)copyin(arg0+2*sizeof(ptr_t), sizeof(ptr_t));
    printf("string addr = %p\n", this->str);
    printf("string val  = %s\n", copyinstr(this->str));

}

And the output will be something like:

$ sudo dtrace -s dump.d -c ./build/Debug/dtcftest 
12
Called mungeString:
arg0 = 0x2030
string addr = 1fef
string val  = My test data

Simply uncomment the right typedef depending on whether you are running against a 32-bit or 64-bit binary. I have tested this against both architectures on 10.6 and it works fine.

gavinb
Using this program and this probe file, I just get a big list of this:

dtrace: error on enabled probe ID 1 (ID 93815: pid11402:sc:mungeString:entry): invalid address (0x7c8) in action #5 at DIF offset 12

Taking out the line that prints the string, I see that all the string addrs are a tad unusual:

Called mungeString:
arg0 = 0x100001068
string addr = 7c8

Adding a second, different constant string and mungeString'ing it, I get the same string addr for both strings.
TALlama
OK, I can tell from the memory addresses that you must be using 10.6 and building a 64-bit app. I wrote the test app (in a hurry!) on 10.5, as it was all I had access to at the time. I should have used sizeof(intptr_t) for the offset in the DTrace script to make it arch-neutral, rather than hardcoding 8, which would now be 16 in a 64-bit app. I'll have a look on a 10.6 machine.
gavinb
@TALlama Please try the updated script above. I tested it on both 32-bit and 64-bit binaries and it works fine.
gavinb
I found that when running on 10.6 with a 64-bit kernel, you need to use 'typedef int ptr_t' (int rather than long), because long changes size depending on whether the kernel is 32-bit or 64-bit. See http://wikis.sun.com/display/DTrace/Types,+Operators+and+Expressions#Types%2COperatorsandExpressions-CHPTYPEOPEXPR2
Nick Dowell