Can someone show me a really simple Python ctypes example involving Unicode strings including the C code?

For example, I'd like to take a Python Unicode string and pass it to a C function that concatenates it with itself and returns the result to Python, which prints it.

A: 

Untested, but I think this should work.

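from ctypes import *

# mydll is assumed to be a library already loaded, e.g. with cdll.LoadLibrary(...)
s = "inputstring"
mydll.my_c_fcn.restype = c_char_p
result = mydll.my_c_fcn(s)
print result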

As for memory management, my understanding is that your C code needs to manage the memory it creates. That is, it should not free the input string, but the string it returns eventually needs to be freed.
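Here is a minimal sketch of that ownership rule, written in Python 3 syntax. It assumes a POSIX system where ctypes.CDLL(None) exposes the C library, and it uses strdup() as a stand-in for a C function that returns malloc'd memory. The helper name call_and_free is made up for illustration. The point is that restype is set to c_void_p so the raw address survives and can be passed to free() after the bytes have been copied.

```python
import ctypes

# Assumption: a POSIX system, where CDLL(None) exposes the C library.
libc = ctypes.CDLL(None)

# strdup() stands in for any C function that returns malloc'd memory.
libc.strdup.argtypes = (ctypes.c_char_p,)
libc.strdup.restype = ctypes.c_void_p  # keep the raw address so it can be freed
libc.free.argtypes = (ctypes.c_void_p,)
libc.free.restype = None


def call_and_free(data):
    """Call the allocating function, copy its result, then free the C memory."""
    # The input bytes are owned by Python; the C side must not free them.
    ptr = libc.strdup(data)
    if not ptr:
        raise MemoryError("strdup returned NULL")
    try:
        # string_at copies up to the NUL terminator into a new bytes object.
        return ctypes.string_at(ptr)
    finally:
        # Release the C allocation exactly once, after the copy.
        libc.free(ptr)


result = call_and_free(b"inputstring")
print(result)
```

If restype were c_char_p instead, ctypes would hand back a Python bytes copy and the original pointer would be lost, so there would be nothing left to pass to free().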

tom10
Can you post the C code?
mike
This answer includes ctypes and processing, but omits the unicode!
joeforker
He posted that request well after I wrote this code. A bit harsh to downvote 3 months after the question and its modification. (There should be a rule against this: the fact that I answer the question shouldn't guarantee a perpetual commitment from me to answer whatever question the OP eventually morphs it into.)
tom10
Sorry for the harshness. I'd change my vote but it won't let me.
joeforker
A: 
from ctypes import *

buffer = create_string_buffer(128)
cdll.msvcrt.strcat(buffer, "blah")
print buffer.value

Note: I understand that the Python code is easy, but what I'm struggling with is the C code. Does it need to free its input string? Will its output string get freed by Python on its behalf?

No, you need to free the buffer manually. What people normally do is copy the Python string out of buffer.value immediately and then free the buffer.
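A small sketch of the "copy immediately" step, in Python 3 syntax. It assumes a POSIX system and calls strcat from the C library through ctypes.CDLL(None) instead of cdll.msvcrt. Note that a buffer made with create_string_buffer is owned by Python and is released when it is garbage collected, so explicit freeing only applies to memory the C side allocated. Either way, buffer.value gives you an independent copy.

```python
import ctypes

# Assumption: POSIX; on Windows the answer above uses cdll.msvcrt instead.
libc = ctypes.CDLL(None)
libc.strcat.argtypes = (ctypes.c_char_p, ctypes.c_char_p)
libc.strcat.restype = ctypes.c_char_p

buf = ctypes.create_string_buffer(128)  # zero-filled, writable, owned by Python
libc.strcat(buf, b"blah")
libc.strcat(buf, b"blah")

copied = buf.value      # an independent Python bytes object
buf.value = b"changed"  # overwriting the buffer does not affect the copy
print(copied)
```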

Can you post the C code? – mike 2 hours ago

#include <string.h>

char* mystrcat(char* buffer) {
    strcat(buffer, "blah");
    return buffer;
}
Unknown
Can you post the C code?
mike
A: 

This program uses ctypes to call wcsncat from Python. It concatenates a and b into a buffer that is not quite long enough for a + b + (null terminator) to demonstrate the safer n version of concatenation.

You must pass create_unicode_buffer() instead of passing a regular immutable u"unicode string" for non-const wchar_t* parameters, otherwise you will probably get a segmentation fault.

If the function you need to talk to returns UCS-2 and sizeof(wchar_t) == 4, then you will not be able to use create_unicode_buffer(), because it converts between wchar_t and Python's internal Unicode representation. In that case you might be able to use some combination of create_string_buffer() and result.decode('utf-16'), or just create an array of c_ushort and u''.join(unichr(c) for c in buffer). I had to do that to debug an ODBC driver.
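The 16-bit workaround could look roughly like this. It is a sketch in Python 3 syntax (chr and int.to_bytes rather than unichr), and the helper name decode_utf16_units is made up for illustration. The array of c_uint16 stands in for a buffer that a C function has filled with UCS-2 code units. No C library is needed to try it.

```python
import ctypes


def decode_utf16_units(units):
    """Decode 16-bit code units into a str, stopping at the first NUL."""
    codes = []
    for unit in units:
        if unit == 0:
            break
        codes.append(unit)
    # Serialize each unit as little-endian and let the codec do the rest.
    # UCS-2 is a subset of UTF-16, so the UTF-16 codec handles both.
    raw = b"".join(code.to_bytes(2, "little") for code in codes)
    return raw.decode("utf-16-le")


# Stand-in for a buffer filled by a C function that writes 16-bit units.
buf = (ctypes.c_uint16 * 8)()
for i, ch in enumerate("hi \u2603"):
    buf[i] = ord(ch)

text = decode_utf16_units(buf)
print(text)
```

Using an unsigned element type matters here: with a signed 16-bit type, code units above 0x7FFF would come back negative and could not be turned into characters.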

example.py:

#!/usr/bin/env python
#-*- encoding: utf-8 -*-
import sys
from ctypes import *
example = cdll.LoadLibrary(".libs/libexample.so")
example.its_an_example.restype = c_wchar_p
example.its_an_example.argtypes = (c_wchar_p, c_wchar_p, c_size_t) # n is a size_t in example.c
buf = create_unicode_buffer(19) # writable, unlike u"example".
buf[0] = u"\u0000"
a = u"あがぃいぅ ☃ "
b = u"個人 相命理 網上聯盟"
print example.its_an_example(buf, a, len(buf) - len(buf.value) - 1)
print example.its_an_example(buf, b, len(buf) - len(buf.value) - 1)
print buf.value # you may have to .encode("utf-8") before printing
sys.stdout.write(buf.value.encode("utf-8") + "\n")

example.c:

#include <stdlib.h>
#include <wchar.h>

wchar_t *its_an_example(wchar_t *dest, const wchar_t *src, size_t n) {
    return wcsncat(dest, src, n);
}

Makefile (ensure the indentation is one tab character, not spaces):

all:
    libtool --mode=compile gcc -g -O -c example.c
    libtool --mode=link gcc -g -O -o libexample.la example.lo \
            -rpath /usr/local/lib
joeforker
great answer! This is the best "getting started" example on ctypes I've ever found! Thanks a lot! Note that I had to change the load to cdll.LoadLibrary("./.libs/libexample.so.0") but everybody should be able to fix similar details on his/her machine
Davide