As I understand it, when GCC is given the -O3 flag, the main optimizations it adds are loop unrolling and function inlining.

However, when I compiled an application containing the following function, GCC did not inline it. Profiling with gprof and gcov showed that the function (comp_t_delay) is still called as a separate function. It is not called directly from main; it is called from a function that is called by another function that main calls.

Here is the code of comp_t_delay(int in, int ip). Why didn't GCC inline it at -O3? Any help is appreciated!

static float
comp_t_delay(int in,int ip)
{

    int sb, sib,dx, dy;
    t_type_ptr st, sit;
    float d_ssi;

    d_ssi = 0.;

    sb = net[in].node_block[0];
    st = block[sb].type;

    sib = net[in].node_block[ip];
    sit = block[sib].type;

    assert(st != NULL);
    assert(sit != NULL);

    dx = abs(block[sib].x - block[sb].x);
    dy = abs(block[sib].y - block[sb].y);

    if(st == T_IO)
    {
        if(sit == T_IO)
            d_ssi = de_io[dx][dy];
        else
            d_ssi = de_iof[dx][dy];
    }
    else
    {
        if(sit == T_IO)
            d_ssi = de_fio[dx][dy];
        else
            d_ssi = de_fb[dx][dy];
    }
    if(d_ssi < 0)
    {
        printf("Error1\n");
        exit(1);
    }

    if(d_ssi < 0.)
    {
        printf("Error2\n");
        exit(1);
    }

    return (d_ssi);
}
+6  A: 

Most probably GCC did not inline it because the function is too long. Inlining a long function can actually make the code slower: the caller gets bigger, and more variables are live at the same time, which increases register pressure and can force values to be spilled to the stack. In this particular case GCC's heuristics decided that the code would be faster without inlining the function.

dark_charlie
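
If you want to check whether GCC's size heuristic is really the reason, one option is to override it and compare profiles. The following is only a sketch: `manhattan_forced`, `manhattan_called`, `total_forced` and `total_called` are hypothetical names, not part of the question's code. It uses GCC's `always_inline` and `noinline` function attributes; the `-Winline` warning flag can additionally report functions declared `inline` that GCC did not inline.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: always_inline asks GCC to inline this function
   regardless of its size heuristics. */
static inline __attribute__((always_inline)) int
manhattan_forced(int x0, int y0, int x1, int y1)
{
    return abs(x1 - x0) + abs(y1 - y0);
}

/* Same body, but noinline keeps it as a real call so it stays visible
   as a separate entry in a profile. */
static __attribute__((noinline)) int
manhattan_called(int x0, int y0, int x1, int y1)
{
    return abs(x1 - x0) + abs(y1 - y0);
}

/* Sum of distances from point 0 to every other point, inlined variant. */
int total_forced(const int *xs, const int *ys, int n)
{
    int sum = 0;
    assert(n >= 1);
    for (int i = 1; i < n; i++)
        sum += manhattan_forced(xs[0], ys[0], xs[i], ys[i]);
    return sum;
}

/* Same computation, but every distance is an out-of-line call. */
int total_called(const int *xs, const int *ys, int n)
{
    int sum = 0;
    assert(n >= 1);
    for (int i = 1; i < n; i++)
        sum += manhattan_called(xs[0], ys[0], xs[i], ys[i]);
    return sum;
}
```

Both variants compute the same result, so timing them against each other on real data gives a rough idea of whether forcing the inline is worth it at all.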
Not to mention that when a function is this long and involved, the call overhead is a trivial percentage of the time spent in it, so inlining won't give any measurable performance benefit. The only time inlining can benefit a large function is when an argument is constant and knowing that constant allows the compiler to eliminate the majority of the function body.
R..
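
The constant-argument case from the comment above can be sketched as follows. This is an illustrative example with made-up names and formulas (`scaled_delay`, `io_delay`, `fb_delay`), not the question's code. Each wrapper passes a compile-time constant for `kind_is_io`, so if GCC inlines `scaled_delay` into the wrapper, it can drop the branch that the constant rules out.

```c
#include <assert.h>

/* Hypothetical function with two unrelated code paths selected by a flag. */
static float scaled_delay(int kind_is_io, int dx, int dy)
{
    assert(dx >= 0 && dy >= 0);
    if (kind_is_io)
        return 0.5f * (float)(dx + dy);            /* "IO" path */
    return 1.5f * (float)dx + 2.0f * (float)dy;    /* "non-IO" path */
}

/* The flag is the constant 1 here: once inlined, only the IO path is needed. */
float io_delay(int dx, int dy)
{
    return scaled_delay(1, dx, dy);
}

/* The flag is the constant 0 here: once inlined, only the non-IO path is needed. */
float fb_delay(int dx, int dy)
{
    return scaled_delay(0, dx, dy);
}
```

Compiling this with `gcc -O3 -S` and reading the generated assembly for `io_delay` and `fb_delay` shows whether the unused branch was actually removed. In comp_t_delay both arguments are runtime values, so this kind of saving would not be available there.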