Hello everyone,

I am currently working on 3D scanning for robotic pick-and-place applications.

To get started, I'm using an ICP (iterative closest point) algorithm to match the position of a reference object to that of the actual object. For this purpose I am using Octave/Matlab with the following code: http://www.mathworks.com/matlabcentral/fileexchange/12627-iterative-closest-point-method

After some tries, the algorithm seems to deliver satisfactory accuracy in reasonable time. Matching about 6000 data points against 6000 data points takes about 15 seconds of computation time with 100 iterations.

Currently I'm trying to extract this Matlab/Octave code into my own application so that I can try parallelizing the algorithm. When I run the unchanged code from my own C application, the computation time increases by a factor of about 10 to 20 (same datasets!).

I have turned on function inlining and optimization level -O3. Are there any other optimizations Octave performs when generating a .oct file? I have no idea why there is such a big difference in performance.

The ICP algorithm does a massive amount of double-precision addition, multiplication and division!

Thanks for all your help!

Greetings, jodel

A: 

I expect that Octave, like Matlab, uses an implementation of BLAS which is tuned for the hardware you are using. Does your C application use one? If not, this could account for the difference in speed.

High Performance Mark
That Octave/Matlab internally uses BLAS to speed up certain operations seems quite logical. I am using C code extracted from the Matlab environment, and the computation time increases by a factor of about 20. Do you think gcc/g++ speeds up the external C code through BLAS? I am not using any hardware acceleration.
jodel
@jodel: GCC (and most other compilers I've heard of) will only use BLAS if you tell it to -- which you would do by calling BLAS functions and by linking to libblas (or whatever it's called on your system). Yes, I would expect a BLAS implementation of a procedure to be faster than your own implementation of the same procedure unless you have spent a lot of time optimising your own implementation, in particular optimising it for use of the memory hierarchy.
High Performance Mark
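To illustrate the point about the memory hierarchy in the comment above, here is a minimal sketch in plain C. It is not the ICP code from the File Exchange link and it does not call BLAS; the function names `matmul_ijk` and `matmul_ikj` are made up for this example, and row-major square matrices are assumed. Both functions compute the same product, but the second orders the loops so that the inner loop walks memory contiguously, which is the kind of access pattern a tuned BLAS takes much further.

```c
#include <stddef.h>

/* C = A * B for n x n row-major matrices, textbook i-j-k loop order.
 * The inner loop reads B with stride n (column-wise), which tends to
 * make poor use of the cache for large n. */
void matmul_ijk(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Same product with i-k-j loop order (hypothetical helper, for
 * illustration only). The inner loop now reads B and updates C
 * contiguously, row by row. */
void matmul_ikj(size_t n, const double *a, const double *b, double *c)
{
    /* C is accumulated in place, so it has to start at zero. */
    for (size_t i = 0; i < n * n; ++i)
        c[i] = 0.0;

    for (size_t i = 0; i < n; ++i) {
        for (size_t k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (size_t j = 0; j < n; ++j)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}
```

Timing both versions on large matrices with the same compiler flags would show how much of the gap on a given machine comes from memory access order alone. The outcome depends on cache sizes and on what the compiler does with each loop nest, so no figure is claimed here.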
@High Performance Mark: The code I'm using does not get any speedup from an external library. Because of that, I think there may be another reason why the performance is so bad outside Octave.
jodel