ansaurus

Question

why are these C / Cython arrays defined as character, not integer arrays?

Answer 1

+8 A:

Despite the misleading name, char types in C language are ordinary integral types, just like short, int, long and such. Of all integral types, chars have the smallest range and occupy the smallest amount of memory. So, if in your application it is important to save as much memory as possible, it might make sense to use char instead of int.

On some hardware platforms it might turn out that int types work faster than char types, so the selection of the specific type becomes a speed-vs-memory trade-off, but, once again, in many cases when the range of char is naturally sufficient, it might make more sense to use char instead of int.

AndreyT 2010-07-31 21:14:51

Just a minor side note: C99's int_fast8_t type (and derivatives) leaves the mentioned speed vs. size trade-off to the implementation. Usually it gives you the fastest type with a minimum of 8 bits (one byte on most systems, i.e. a 'char') available.

RWS 2010-07-31 21:46:32

at least *if* this is meant as a small integer one should never, never ever use `char` as the type, since its signedness is undefined and might vary from platform to platform. Use `signed char`, `unsigned char` or better the mentioned `int_fast8_t` or `int8_T` etc.

Jens Gustedt 2010-07-31 22:00:24

Answer 2

+1 A:

Quite simply, to save memory -- but please note carefully that declaring these arrays as char limits the result distance to either 127 or 255, depending on whether the C compiler defaults to signed char or unsigned char respectively. In C, char is an integer type -- you don't need an ord() to get its integer value.

Your original code contains no mention of this limitation. Note that if a char overflows, it does so silently and the code will produce incorrect results -- 127 + 1 -> -128 (signed); 255 + 1 -> 0 (unsigned).

You didn't respond to my comment on your original question: """What are the (a) maximum (b) average sizes of your strings? Do you really need to do the whole O(M*N) thing if the two strings are nothing like each other?""" ..... Please answer that now (edit your question); had you done so then, you would have had this question answered then.

Update: Reading the original post again, I've noticed a problem: The code that reads

m1, m2 = m2, m1
strcpy( m3, m2 )

is WRONG on three grounds: (1) it doesn't shuffle the rows properly (should do strcpy() before swapping m1 and m2) (2) strcpy() will not copy anything beyond the first null (zero byte) (3) there is no need to copy anything, just shuffle the pointers

m3, m2, m1 = m2, m1, m3

John Machin 2010-07-31 21:28:01

thx; your comment is not visible to me. i have added an edit to answer this point.

flow 2010-07-31 21:53:33

@flow: It IS visible to you (1) at the end of the comments you will see "show 1 more comment" (2) you would have been notified of a new comment by the little envelope icon at the top of the page next to your name being highlited

John Machin 2010-07-31 22:31:37

you're right. this is so subtle.

flow 2010-08-01 22:05:23

@flow: What precisely am I right about? What is so subtle? Are you talking about SO's comment visibility or the strcpy error or both?

John Machin 2010-08-01 22:09:46

i meant the half-greyed smallish "show 1 more comment" is 'subtle'. thx for your code fixes, i already incorporated them. right now running tests to see whether that gives correct results.

flow 2010-08-03 01:42:55

ansaurus

tags:

views:

answers:

why are these C / Cython arrays defined as character, not integer arrays?

related questions