views:

286

answers:

8

Hello, I am looking for an extremely small way of turning a string like "123" into an integer like 123 and vice-versa.

I will be working in a freestanding environment. This is NOT a premature optimization. I am creating code that must fit in 512 bytes, so every byte does actually count. I will take both x86 assembly(16 bit) and C code though(as that is pretty easy to convert)

It does not need to do any sanity checks or anything..

I thought I had seen a very small C implementation implemented recursively, but I can't seem to find anything for size optimization..

So can anyone find me(or create) a very small atoi/itoa implementation? (it only needs to work with base 10 though)

Edit: (the answer) (edited again because the first code was actually wrong) in case someone else comes upon this, this is the code I ended up creating. It could fit in 21 bytes!

;ds:bx is the input string. ax is the returned integer
_strtoint:
    xor ax,ax
    .loop1:
        imul ax, 10 ;ax serves as our temp var
        mov cl,[bx]
        mov ch,0
        add ax,cx
        sub ax,'0'
        inc bx
        cmp byte [bx],0
    jnz .loop1
ret

Ok, last edit I swear! Version weighing in at 42 bytes with negative number support.. so if anyone wants to use these they can..


;ds:bx is the input string. ax is the returned integer
_strtoint:
    cmp byte [bx],'-'
    je .negate
    ;rewrite to negate DX(just throw it away)
    mov byte [.rewrite+1],0xDA
    jmp .continue
    .negate:
    mov byte [.rewrite+1],0xD8
    inc bx
    .continue
    xor ax,ax
    .loop1:
        imul ax, 10 ;ax serves as our temp var
        mov dl,[bx]
        mov dh,0
        add ax,dx
        sub ax,'0'
        inc bx
        cmp byte [bx],0
    jnz .loop1
    ;popa
    .rewrite:
    neg ax ;this instruction gets rewritten to conditionally negate ax or dx
ret

A: 

Write it yourself. Note that subtracting '0' from a digit gets the power-of-ten. So, you loop down the digits, and every time you multiply the value so far by 10, subtract '0' from the current character, and add it. Codable in assembly in no time flat.

bmargulies
+3  A: 

With no error checking, 'cause that's for wussies who have more than 512B to play with:

#include <ctype.h>
// alternative:
// #define isdigit(C) ((C) >= '0' && (C) <= '9')

unsigned long myatol(const char *s) {
  unsigned long n = 0;
  while (isdigit(*s)) n = 10 * n + *s++ - '0';
  return n;
}

gcc -O2 compiles this into 47 bytes, but the external reference to __ctype_b_loc is probably more than you can afford...

Norman Ramsey
see my edited answer. I got it down to 18 bytes!
Earlz
if this is without error checking, why are you checking every character is a digit? just loop until you see a \0.
Idan K
I don't think you need to fit the C source into 512 bytes.
starblue
A: 

And here is another one without any checking. It assumes a null terminated string. As a bonus, it checks for a negative sign. This takes 593 bytes with a Microsoft compiler (cl /O1).

int myatoi( char* a )
{
   int res = 0;
   int neg = 0;

   if ( *a == '-' )
      {
      neg = 1;
      a++;
      }

   while ( *a )
      {
      res = res * 10 + ( *a - '0' );
      a++;
      }

   if ( neg )
      res *= -1;

   return res;
}
Mark Wilkins
A: 

atoi(p) register char *p; { register int n; register int f;

n = 0;
f = 0;
for(;;p++) {
    switch(*p) {
    case ' ':
    case '\t':
        continue;
    case '-':
        f++;
    case '+':
        p++;
    }
    break;
}
while(*p >= '0' && *p <= '9')
    n = n*10 + *p++ - '0';
return(f? -n: n);

}

deadpoint
A: 

Are any of the sizes smaller if you use -Os (optimize for space) instead of -O2 ?

Kristopher Ives
A: 

You could try packing the string into BCD(0x1234) and then using x87 fbld and fist instructions for a 1980s solution but I am not sure that will be smaller at all as I don't remember there being any packing instruction.

jbcreix
A: 

How in the world are you people getting the executables so small?! This code generates a 316 byte .o file when compiled with gcc -Os -m32 -c -o atoi.o atoi.c and a 8488 byte executable when compiled and linked (with an empty int main(){} added) with gcc -Os -m32 -o atoi atoi.c. This is on Mac OS X Snow Leopard...

int myatoi(char *s)
{
 short retval=0;
 for(;*s!=0;s++) retval=retval*10+(*s-'0');
 return retval;
}
Chinmay Kanchi
debugging symbols or ELF headers are common things... you'd just have to try using a cross compiler targetted to create flat binary files in a stand alone environment... or just disassemble it and count the bytes...
Earlz
+1  A: 

I don't have an assembler on my laptop to check the size, but offhand, it seems like this should be shorter:

; input: zero-terminated string in DS:SI
; result: AX
atoi proc
        xor cx, cx
        mov ax, '0'
    @@:
        imul cx, 10
        sub al, '0'
        add cx, ax
        lodsb
        jnz @b
        xchg ax, cx
        ret
atoi endp
Jerry Coffin
oooh.. neat.. I forgot about the `lods` instruction.. I'll have to check this out later after work...
Earlz