views:

245

answers:

3

According to answers to that question: Which, if any, C++ compilers do tail-recursion optimization? it seems, that compiler should do tail-recursion optimization.

But I've tried proposed options and it seems that compiler can't do this optimization in case of template functions. Could it be fixed somehow?

+1  A: 

I don't use the MS compilers, but GCC can certainly do tail-recursion optimisation for templates. Given this function:

template <typename T>
T f( T t ) {
   cout << t << endl;
   if ( t == 0 ) {
      return t;
   }
   return f( t - 1 );
}

The code produced is:

    5   T f( T t ) {
    6       cout << t << endl;
-   0x401362    <main+22>:      mov    %esi,0x4(%esp)
-   0x401366    <main+26>:      movl   $0x4740c0,(%esp)
-   0x40136d    <main+33>:      call   0x448620 <_ZNSolsEi>
-   0x401372    <main+38>:      mov    %eax,%ebx
    7      if ( t == 0 ) {
-   0x4013a5    <main+89>:      test   %esi,%esi
-   0x4013a7    <main+91>:      je     0x4013c8 <main+124>
    8         return t;
    9      }
    10     return f( t - 1 );
-   0x4013a9    <main+93>:      dec    %esi
-   0x4013aa    <main+94>:      jmp    0x401362 <main+22>
    11  }

You can see that the recursive call has been turned into a jump back to the start of the function. This optimisation is only performed by GCC if the code is compiled with optimisations enabled (-O2 in this case) - perhaps the same is true for MS C++?

anon
What g++ options did you use? My code with no options did not do tail-call optimisation.
philcolbourn
@philcolbourn g++ -g -O2 -Wall -pedantic tr.cpp
anon
@Neil Butterworth Thank you. And to get the inline source with the assembler?
philcolbourn
@philcolbourn The code I posted was taken from the output of running the gdb http://www.gnu.org/software/gdb/ debugger. In this case using the Insight GUI.
anon
@Neil Butterworth Ah! The joys of Linux I presume. I don't think Insight runs on Apple's special gdb.
philcolbourn
@philcolbourn Insight is written in Tcl, so it should work on Apple - in fact I'm running it on Windows, I don't do too much Linux programming these days.
anon
A: 

I'm guessing here, but it might be possible to do them manually.

The first fill template uses recursion to fill a buffer. The second uses a hand crafted tail recursion to do the same thing.

This may be bad for some reason, so I am suggesting it's use with caution.

eg.

#include <stdio.h>

template <class myType>

// fill a buffer with n v's

void fill( myType *p , int n , myType v ){
    if ( n <= 0 ) return;
    *p = v;
    fprintf( stderr , "[%x] = %d\n" , (unsigned) p , *p );
    fill( p+1 , n-1 , v );
}

template <class myType>

// fill a buffer with n v's

void fillTail( myType *p , int n , myType v ){
    tail:
    if ( n <= 0 ) return;
    *p = v;
    fprintf( stderr , "[%x] = %d\n" , (unsigned) p , *p );
    // hand crafted tail call
    p++;
    n--;
    goto tail;
}

int main(){
  int   buf[100];
  int   v = 12;
  fill( buf , 10 , v );
  for ( int i=0; i<10 ; i++ ){
    fprintf( stderr , "[%d] = %d\n" , i , buf[i] );
  }
  v = 13;
  fill( buf , 10 , v );
  for ( int i=0; i<10 ; i++ ){
    fprintf( stderr , "[%d] = %d\n" , i , buf[i] );
  }
}

Edit:

Good idea to add the assembler. I have changed some labels to make it clearer.

I simply used g++ file.cpp to compile and g++ -S file.cpp to get the assembler.

fill:
        pushl   %ebp
LCFI0:
        movl    %esp, %ebp
LCFI1:
        subl    $24, %esp
LCFI2:
        cmpl    $0, 12(%ebp)
        jle     L4
        movl    8(%ebp), %edx
        movl    16(%ebp), %eax
        movl    %eax, (%edx)
        movl    12(%ebp), %edx
        decl    %edx
        movl    8(%ebp), %ecx
        addl    $4, %ecx
        movl    16(%ebp), %eax
        movl    %eax, 8(%esp)
        movl    %edx, 4(%esp)
        movl    %ecx, (%esp)
        call    fill
L4:
        leave
        ret

fillTail:
        pushl   %ebp
LCFI3:
        movl    %esp, %ebp
LCFI4:
        subl    $8, %esp
LCFI5:
        jmp     L6
L10:
        movl    8(%ebp), %edx
        movl    16(%ebp), %eax
        movl    %eax, (%edx)
        addl    $4, 8(%ebp)
        leal    12(%ebp), %eax
        decl    (%eax)
L6:
        cmpl    $0, 12(%ebp)
        jg      L10
L9:
        leave
        ret
philcolbourn
A: 

The Llvm project is a framework to create compilers which has an extensive set of optimization mechanism (tail call optimization among them). It provides c and c++ compilers, although c++ is not considered complete.

Ismael