I have to develop an application for Windows that lets the user control the mouse through a webcam by recognizing hand gestures. I will be using VC++ 2008 for development, but I am unsure whether to go with the .NET Framework or the core Win32 APIs. Performance is very important for my application. According to the book "Beginning Visual C++ 2008" by Ivor Horton, there is a small performance penalty associated with using the .NET Framework. I would like to know which factors that penalty depends on, and whether it is feasible to use the .NET Framework for my application.
If you are acquainted with the Win32 API, then go with the Win32 API. It is the natural choice in your case, since most of your source code will be video capture, image processing, algorithms, and the interface to the mouse in Windows. When performance matters, stay closer to the hardware and avoid thick layers like .NET.
I believe that .NET is for complex business applications, not for real-time applications or device drivers.
A quick way to put it: the performance difference between the native API and .NET can be compensated for by buying a more expensive processor. You would pay somewhere between $1 and $100 more per CPU, with $10 being a reasonable estimate. So, if you expect more than a million users, do choose the native API. If you expect to use it on 2-3 demo PCs, it really doesn't matter at all.
.NET is nice for GUIs and for general programming in non-performance-intensive areas. If you need to do anything more than a trivial GUI, I would suggest writing at least that part in a .NET language.
From what you've described of your program, recognizing hand gestures is going to be the only computationally intensive part. The actual process of controlling the mouse is trivial. So as long as the gesture recognition part performs well enough for your needs, it probably won't matter what the rest of the program is written in.
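One way to check whether the recognition part "performs well enough" is to time it against the period between camera frames. The following is only a minimal sketch: the names `averageMsPerCall` and `fitsFrameBudget` are made up for illustration, and it assumes a C++11 compiler for `std::chrono` (VC++ 2008 predates it; `QueryPerformanceCounter` would be the Win32 equivalent there).

```cpp
#include <cassert>
#include <chrono>

// Average wall-clock time of one call to `work`, in milliseconds.
// `work` stands in for a (hypothetical) per-frame gesture recognition step.
template <typename Work>
double averageMsPerCall(Work work, int iterations) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// True if the measured per-frame cost is shorter than the time between two
// frames of a camera delivering `framesPerSecond` frames.
bool fitsFrameBudget(double msPerFrame, double framesPerSecond) {
    assert(framesPerSecond > 0.0);
    return msPerFrame < 1000.0 / framesPerSecond;
}
```

With a recognition function of your own (say, a hypothetical `recognizeFrame`), `fitsFrameBudget(averageMsPerCall(recognizeFrame, 100), 30.0)` would tell you whether it keeps up with a 30 fps camera, whichever language the rest of the program is written in.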
As a first step, you should research which libraries are out there for gesture recognition or similar image processing. (I would hope that you're not intending to write that part from scratch anyway.) If you find any .NET-based libraries that claim performance good enough for your needs, you could give them a try. Otherwise, you will probably end up with a library written in C, C++ or similar. Either way, it's possible to integrate such a library with a .NET-based program.
Although this is not what you asked, I think it's worth considering the development time and effort needed to use the Win32 API. Actually, I'm a fan of the Windows SDK, aka the Win32 API, but for certain tasks it's much too complicated (compared with the .NET Framework and .NET libraries).
My own (little) experience shows that Win32 allows you to do everything, which includes writing a very error-prone program. Try to use speech output, for example: it takes four lines of C# code in .NET, whereas you're already lost in COM if you try to do it natively. (Of course, you can write error-prone code in .NET too.)
Apart from that, it may be worth putting some kind of HAL (hardware abstraction layer) into your program. Using DLLs has its advantages, and one of them is that you can call them from .NET via P/Invoke.
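To make the DLL idea concrete, here is a rough sketch of what a native entry point callable via P/Invoke might look like. The function name `gesture_count_bright` and the `GESTURE_API` macro are invented for this example, and the pixel counting is only a stand-in for real image processing.

```cpp
#include <cassert>
#include <cstddef>

// Export macro: __declspec(dllexport) applies when building on Windows;
// elsewhere the macro only requests C linkage so the name is not mangled.
#if defined(_WIN32)
#define GESTURE_API extern "C" __declspec(dllexport)
#else
#define GESTURE_API extern "C"
#endif

// Hypothetical entry point: counts pixels brighter than `threshold` in an
// 8-bit grayscale frame. Only plain C types cross the boundary, which keeps
// the signature easy to declare on the managed side.
GESTURE_API int gesture_count_bright(const unsigned char* gray,
                                     int width, int height,
                                     unsigned char threshold) {
    if (gray == NULL || width <= 0 || height <= 0) {
        return -1;  // error code instead of a C++ exception
    }
    int count = 0;
    const std::size_t total = static_cast<std::size_t>(width) * height;
    for (std::size_t i = 0; i < total; ++i) {
        if (gray[i] > threshold) {
            ++count;
        }
    }
    // Debug-only sanity check: the count can never exceed the pixel count.
    assert(static_cast<std::size_t>(count) <= total);
    return count;
}
```

The function uses the compiler's default `__cdecl` convention, so a `DllImport` declaration on the .NET side would need `CallingConvention.Cdecl`, because P/Invoke assumes `StdCall` on Windows unless told otherwise.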
Guessing what you'd have to implement:
- capture video
- video processing
- mouse input emulation
The first one will be easy in .NET using DirectX and I think it wouldn't have a big overhead (correct me if I'm wrong). The second one might be the performance part, and the last one is only a small exercise (even in Win32 API, five lines of code).
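For the mouse part, a sketch along these lines is roughly what the "five lines" look like with the Win32 `SendInput` function. The helper names `toAbsoluteCoord` and `moveCursorTo` are made up for illustration, and scaling by `extent - 1` is one common convention for the 0..65535 absolute range, assumed here rather than taken from the discussion above.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Convert a pixel coordinate to the 0..65535 range that absolute mouse input
// expects. Out-of-range pixels are clamped to the screen edge.
long toAbsoluteCoord(int pixel, int extent) {
    assert(extent > 1);
    if (pixel < 0) pixel = 0;
    if (pixel > extent - 1) pixel = extent - 1;
    return static_cast<long>(pixel) * 65535L / (extent - 1);
}

#ifdef _WIN32
// Move the cursor to pixel (x, y) on the primary display.
// Returns true if the input event was injected.
bool moveCursorTo(int x, int y) {
    INPUT input = {};
    input.type = INPUT_MOUSE;
    input.mi.dx = toAbsoluteCoord(x, GetSystemMetrics(SM_CXSCREEN));
    input.mi.dy = toAbsoluteCoord(y, GetSystemMetrics(SM_CYSCREEN));
    input.mi.dwFlags = MOUSEEVENTF_MOVE | MOUSEEVENTF_ABSOLUTE;
    return SendInput(1, &input, sizeof(INPUT)) == 1;
}
#endif
```

The coordinate conversion is kept separate from the Windows call so it can be tested without moving a real cursor.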
Now, there's a difference between having years of C++ experience plus 10 co-workers developing a mainstream hand-to-mouse library, and being a beginner who wants or has to write an app that gets things done. You can make everything fast, nice, correct and extensible, but you'll almost certainly need twice the time you had considered a worst-case scenario. Or you can create something reliable that is easy to maintain because the code is simple, fast enough for its specified purpose, and easy to use.
For the first one, you could use Win32 API. If you go the second way, .NET will certainly fit your needs. It will most probably reduce your development time so you can concentrate on the important things like algorithms for hand recognition.
Now, leaving behind the black-and-white attitude, there are already libraries out there. Tons of 'em. Using them may ease the use of native code and save you a lot of effort. But still, you can use 5 lines of C# code in .NET to create a GUI and show your latest results.
Besides all of this: the fastest webcam I know of gives you 90 frames per second (at roughly 640x480). Plus, USB is NOT real-time. So what you can do is reduce response time. Modern mice (according to their vendors) reach up to 10 times that rate, with a 1 ms response time.
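To put numbers on that: at 90 frames per second a new frame arrives only about every 11 ms, roughly ten times the 1 ms quoted for mice. The sketch below does the arithmetic. The function names are made up, and the latency model is deliberately simplified (it ignores USB transfer and driver delays).

```cpp
#include <cassert>

// Time between two consecutive frames, in milliseconds.
double framePeriodMs(double framesPerSecond) {
    assert(framesPerSecond > 0.0);
    return 1000.0 / framesPerSecond;
}

// Simplified worst case: a hand movement just misses a frame, waits one full
// period for the next one, and then waits for that frame to be processed.
double worstCaseLatencyMs(double framesPerSecond, double processingMs) {
    assert(processingMs >= 0.0);
    return framePeriodMs(framesPerSecond) + processingMs;
}
```

Under this model, `framePeriodMs(90.0)` is about 11.1 ms, so even with zero processing time a camera-driven pointer cannot match a 1 ms mouse. The processing time is the only part of the delay you control.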
My 2 cents.
I think you should limit .NET usage to building the GUI and try to do the rest of the work in Win32. As for the remaining question of object recognition, there is a nice library called OpenCV (Open Source Computer Vision Library). It contains the methods you would require in this project. There is also Intel's hardware-specific library, IPP, which boosts OpenCV's performance.
Unless you are a very experienced C++ programmer, C# is a much more productive language (speaking as someone with over 15 years of C++ experience and 2 years of C#).
The .NET libraries offer a wealth of high-quality functionality that's easier to use than the standard C++ library.
So, I'd go with .NET.
I also recommend using C++/CLI, rather than P/Invoke, to call complex native libraries directly if you have to integrate them. P/Invoke is fine for the odd call, but it doesn't let you easily access data structures or mix native and managed code.