Why does GCC not auto-vectorize this loop?

I am attempting to optimize a loop that accounts for a lot of my program's computation time.

But when I turn on auto-vectorization with -O3 -ffast-math -ftree-vectorizer-verbose=6, GCC reports that it cannot vectorize the loop.

I am using GCC 4.4.5.

The code:

/// Find the point in the path with the largest v parameter
void prediction::find_knife_edge(
    const float * __restrict__ const elevation_path,
    float * __restrict__ const diff_path,
    const float path_res,
    const unsigned a,
    const unsigned b,
    const float h_a,
    const float h_b,
    const float f,
    const float r_e
) const
{
    float wavelength = (speed_of_light * 1e-6f) / f;

    float d_ab = path_res * static_cast<float>(b - a);

    for (unsigned n = a + 1; n <= b - 1; n++)
    {
        float d_an = path_res * static_cast<float>(n - a);
        float d_nb = path_res * static_cast<float>(b - n);

        float h = elevation_path[n] + (d_an * d_nb) / (2.0f * r_e) - (h_a * d_nb + h_b * d_an) / d_ab;
        float v = h * std::sqrt((2.0f * d_ab) / (wavelength * d_an * d_nb));

        diff_path[n] = v;
    }
}

The messages from GCC:

note: not vectorized: number of iterations cannot be computed.
note: not vectorized: unhandled data-ref 

On the page about auto-vectorization ( http://gcc.gnu.org/projects/tree-ssa/vectorization.html ) it states that it supports unknown loop bounds.

If I replace the for with

for (unsigned n = 0; n <= 100; n++)

then it vectorizes it.

What am I doing wrong?

The lack of detailed documentation on exactly what these messages mean and the ins/outs of GCC auto-vectorization is rather annoying.

EDIT:

Thanks to David I changed the loop to this:

 for (unsigned n = a + 1; n < b; n++)

Now GCC attempts to vectorize the loop but gives up with these notes:

 note: not vectorized: unhandled data-ref
 note: Alignment of access forced using peeling.
 note: Vectorizing an unaligned access.
 note: vect_model_induction_cost: inside_cost = 1, outside_cost = 2 .
 note: not vectorized: relevant stmt not supported: D.76777_65 = (float) n_34;

What does "D.76777_65 = (float) n_34;" mean?

Asked By: ljbade

Answer #1:

I may have slightly botched the details, but this is the way you need to restructure your loop to get it to vectorize. The trick is to precompute the number of iterations and iterate from 0 to one short of that number. Do not change the for statement. You may need to fix the two lines before it and the two lines at the top of the loop. They're approximately right. ;)

const unsigned it=(b-a)-1;   // number of interior points a+1 .. b-1
const unsigned diff=b-a;
for (unsigned n = 0; n < it; n++)
{
    // n now starts at 0, so the path index being evaluated is a + 1 + n
    float d_an = path_res * static_cast<float>(n + 1);
    float d_nb = path_res * static_cast<float>(diff - n - 1);

    float h = elevation_path[a + 1 + n] + (d_an * d_nb) / (2.0f * r_e) - (h_a * d_nb + h_b * d_an) / d_ab;
    float v = h * std::sqrt((2.0f * d_ab) / (wavelength * d_an * d_nb));

    diff_path[a + 1 + n] = v;
}
Answered By: ljbade
The answers/resolutions are collected from stackoverflow, are licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0 .


