views:

1141

answers:

12

I know that computer vision involves a lot of math, but I need some tips about how programmers gain that knowledge. I've started to use the OpenCV library but I have some major problems in understanding how the math works in the algorithms.

In college I studied some math, and we worked with matrices and derivatives, but I didn't pay too much attention to the subject. It seemed so difficult and useless from a programmer's point of view. I suppose there has to be some easy way to understand what a second derivative is without calculating an equation. (Derivatives are just an example.)

Do you have any tips for me about how I can gain such knowledge? A forum, book, link, advice, anything?

+3  A: 

Well, there is this calculus book. Strangely it's ridiculously expensive in the US. Maybe you can find an edition in an (online) used book store.

You might want to visit a book store with departments for math and tech books. Books on introductory engineering mathematics mostly introduce calculus of several variables in a more concrete way, with regard to applications.

msiemeri
+3  A: 

Depending on which parts of computer vision you are looking at, Numerical Recipes in x, where x covers the languages most commonly used for vision, can offer a deeper and more thorough examination of some of the more complex issues in vision, such as non-linear optimisation. Chapters are available online as PDFs, and you'll find hard copies in most good online bookstores.

Colin Desmond
+4  A: 

Dover textbooks.

Inexpensive (in many cases under $20), but often well written.

My bookshelf is full of them. After a while their distinctive form factor and binding catch the eye from across the room.

dmckee
Dover books are great cheap resources. However, most of them are re-releases of out-of-print books, so they're typically too old to have examples related to computing. However, if you're really serious about learning the necessary math, Dover books are high-quality texts.
"they're typically too old to have examples related to computing" Well, numeric methods have been around for a long time. For a while, they were "computing", and you can find some good books on that topic from Dover. Just don't look for them to help you with Ruby on Rails...
dmckee
+4  A: 

Check out OpenCourseWare. Here's one on machine vision. See if you understand the math there; if not, work your way back down by looking it up in math textbooks. You can check out what prerequisites are needed for that course as well.

yx
+7  A: 

A lot of universities have online courses (e.g. MIT's OpenCourseWare). This would be a good place to look; you can probably find a computer vision course that has its prerequisites listed. As others have mentioned, you'll need to understand calculus and vectors/matrices, but you will also need to read up on statistics and Bayes' theorem for more advanced vision work, as computer vision often uses probabilistic techniques.

Bishop wrote an excellent book on the topic, but it is very expensive and really written for an advanced level. I certainly wouldn't start there, but if you really get into the topic it's a good resource.

Steve Haigh
+2  A: 

A lot of universities around the world have been posting lectures to YouTube and their own websites, so it shouldn't be hard to find decent lectures on most mathematical subjects. You can actually learn quite a bit about a lot of interesting subjects this way.

mandaleeka
+4  A: 

The most relevant math topics you need to cover for computer vision are calculus (specifically multivariate calculus), Fourier analysis, linear algebra, and statistics.

Calculus and Fourier analysis are probably the most difficult ones, but you need them for low-level image processing. An image is a discrete function of x and y, so you can talk about its partial derivatives, which help you detect edges and corners and describe textures. Also, you can think of an image as a 2-dimensional signal and use the Fourier transform to analyze it. The way to really get a feel for it is to implement the Fast Fourier Transform yourself a couple of different ways (e.g. recursively and iteratively), run it on a few images, and see what the results look like.
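As a rough illustration of the two ideas in the paragraph above, here is a minimal pure-Python sketch; it is not from the original answer. The function names (`gradient_magnitude`, `fft_recursive`, `fft_iterative`, `fft_2d`) are made up for this example, an image is assumed to be a list of rows of numbers, and the FFT assumes the input length is a power of two.

```python
import cmath

def gradient_magnitude(image):
    """Approximate partial derivatives with central differences.
    Border pixels are simply left at 0 (an assumption of this sketch)."""
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx = (image[y][x + 1] - image[y][x - 1]) / 2.0  # d/dx
            dy = (image[y + 1][x] - image[y - 1][x]) / 2.0  # d/dy
            mag[y][x] = (dx * dx + dy * dy) ** 0.5
    return mag

def fft_recursive(x):
    """Radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_recursive(x[0::2])
    odd = fft_recursive(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft_iterative(x):
    """Same transform without recursion: bit-reversal, then butterflies."""
    n = len(x)
    a = [complex(v) for v in x]
    j = 0
    for i in range(1, n):  # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    size = 2
    while size <= n:  # combine blocks of size 2, 4, 8, ...
        w_step = cmath.exp(-2j * cmath.pi / size)
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(size // 2):
                u = a[start + k]
                t = w * a[start + k + size // 2]
                a[start + k] = u + t
                a[start + k + size // 2] = u - t
                w *= w_step
        size *= 2
    return a

def fft_2d(image):
    """2-D transform: 1-D FFT over every row, then over every column."""
    rows = [fft_recursive(r) for r in image]
    cols = [fft_recursive(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

Running `gradient_magnitude` on an image with a sharp vertical edge gives large values only near the edge, and `fft_2d` on a constant image puts all the energy in the `[0][0]` (zero-frequency) entry, which is a quick way to check that an implementation behaves sensibly.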

For higher level stuff, such as object recognition, you really need to get into statistics and machine learning. You would need to know what a histogram is, understand the meaning of the mean and the variance of a probability distribution, and lots of other stuff...
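To make the terms above concrete, here is a small sketch (again not from the original answer) that builds an intensity histogram and treats it as a probability distribution to get its mean and variance. The names `histogram` and `mean_and_variance` are hypothetical, and pixels are assumed to be integers in `0..bins-1`.

```python
def histogram(pixels, bins=256):
    """Count how many pixels fall on each intensity value."""
    counts = [0] * bins
    for p in pixels:
        counts[p] += 1
    return counts

def mean_and_variance(counts):
    """Normalize the histogram into probabilities, then compute
    the mean and variance of that distribution."""
    total = sum(counts)
    probs = [c / total for c in counts]
    mean = sum(value * p for value, p in enumerate(probs))
    var = sum((value - mean) ** 2 * p for value, p in enumerate(probs))
    return mean, var
```

For example, `mean_and_variance(histogram([0, 0, 2, 2], bins=4))` describes an image that is half black and half mid-gray: the mean sits between the two values and the variance measures how far the pixels spread around it.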

If you have access to Matlab, it makes it very easy to implement various image processing and vision algorithms, and try them out. IMHO, this is the best way to really understand how they work.

I would also suggest reading papers published in computer vision conferences and journals. Most of them are available on the web, and you can find them with Google Scholar. Look up topics like object recognition, image retrieval, object tracking in video, or 3D reconstruction to see what kinds of problems computer vision actually deals with. Reading these papers will probably be difficult at first, but they can give you an idea of which mathematical techniques are being used.

Dima
+5  A: 

Bishop's book, which was recommended by someone else, is on the more general topic of machine learning, not vision in particular. Nevertheless, I consider it required reading, as the current state of the art in CV relies heavily on the concepts of machine learning. The other required text is Forsyth's book on computer vision. Not the most readable, but fairly current and comprehensive.

To get to the point where you can understand these two books, you'll need to polish up on linear algebra, probability, and computer graphics. A strong foundation in calculus will be necessary to understand the optimization algorithms in Bishop. Physics comes in handy as well, because there are a lot of algorithms that pose the problem in terms of an analogous physical system to be optimized.

In other words, it's a lot of math; it's all math. Every one of the math courses I took over the span of 3 years as an undergrad has come into play in my computer vision studies, so if your math is weak, be ready for a big commitment to become stronger at it.

redmoskito
+1 about computer vision being extremely math heavy
kigurai
+3  A: 

I have an MSc as a "programmer mathematician". I have been implementing computer vision algorithms for 1-2 years. You should know that to do computer vision professionally, you need an amount of math that cannot be learned in 1 or 2 years. However, you can be a "not-so-professional" computer vision developer (I was one too) even with little math knowledge, so you do not have to give up your plans in this really beautiful area of computer science.

I would also say that the most used math topics in this area are linear algebra, calculus, statistics, Fourier analysis, wavelets, etc.

I strongly recommend that you start by reading this book: Robert Beezer, "A First Course in Linear Algebra". It is not only suitable for total beginners in math, but is also the best-quality math book I have ever seen, and it's free as in freedom.

I also recommend starting with linear algebra, and especially with this book, because that way you can quickly gain practically useful knowledge that you will eventually need anyway.

+2  A: 

I've recently completed my 3rd Calculus class at my university. While this by no means makes me an expert in math or even Calculus, I like to think I have been given a solid introduction to a lot of the basic concepts as a non-math major (our CS degree requires it).

I personally feel your best bet is to bite the bullet and take math at a local university. Not online, not self-study, but with a real life teacher in a real life classroom.

For the longest time I thought upper level math like Calculus and Numerical Analysis were beyond me. I was never a math person growing up, in large part due to an unwarranted fear of it.

However, after finally getting the courage to dive into our program I've found math below a certain level is more about brute force learning than it is about Mensa intelligence. It is about listening to the lecture, asking questions and doing the homework. After taking it in incremental steps and taking the homework seriously I found getting A's in Calculus was just about consistent effort, nothing more.

Without that consistent effort and the nightly homework, you are going to be hard pressed to learn the requisite math you need for computer vision in a reasonable amount of time (less than 3 years).

This is why I think taking them in a classroom setting is ideal for getting your base in more advanced math. You are assigned homework, and it keeps some pressure on you to push through the material (a section a day), even if there are times when you don't want to. With self-study I find it way too easy to gloss over material that is uninteresting or annoying. Plus you have your professor as a terrific resource during office hours to shed light on things you didn't understand in the previous night's homework.

After you have the solid base to work from, self-study becomes a more realistic option.

Simucal
A: 

For using OpenCV, I would suggest just using the internet or browsing its source code; it contains most of the information, and you save money on a book.

"Multiple View Geometry in Computer Vision"

This is a book you must have to understand basic things like finding the homography, the fundamental matrix, the camera calibration matrix, and the absolute dual quadric. I know it might be considered expensive, but it will save you a lot of time.
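To give a feel for what "finding the homography" means, here is a simplified pure-Python sketch. It is not the method from the book (which works with more points and a normalized, SVD-based solution); it assumes exactly four point correspondences, fixes the bottom-right entry of the matrix to 1, and solves the resulting 8x8 linear system. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def homography_from_4_points(src, dst):
    """Return a 3x3 matrix H (H[2][2] = 1) mapping each src point to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two equations per correspondence, from u = (h1 x + h2 y + h3) / w
        # and v = (h4 x + h5 y + h6) / w with w = h7 x + h8 y + 1.
        a.append([float(e) for e in (x, y, 1, 0, 0, 0, -u * x, -u * y)])
        b.append(float(u))
        a.append([float(e) for e in (0, 0, 0, x, y, 1, -v * x, -v * y)])
        b.append(float(v))
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]

def apply_homography(hm, point):
    """Map a 2-D point through H, dividing by the homogeneous coordinate."""
    x, y = point
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)
```

With the corners of a unit square as `src` and the corners of the same square shifted by (2, 3) as `dst`, the result is a pure translation, so `apply_homography` sends (0.5, 0.5) to roughly (2.5, 3.5).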

"Matrix Computations" (Gene H. Golub)

This one is a really good book, and it is not expensive at all considering what is in it. Another is "Schaum's Outlines: Matrix Operations" by Richard Bronson. The book is fairly thin, but he (Richard) writes straight to the point. I have two books written by him. It won't disappoint. Don't buy "Numerical Recipes in C" like the guy above says. "Matrix Computations" applies directly to LAPACK, and don't buy the LAPACK book either. Just use the source code. I can't tell you what an eye-opener "Matrix Computations" is. It is one of my newest books. Very usable today, yet published in 1996 (third edition).

Now I'm looking for a book about pattern recognition for computer vision. Anybody know any good books? I'm thinking like graph theory and computer vision. But things are starting to become very expensive (above $100), and I don't want to buy the wrong one.

Sclytrack

sclytrack
A: 

As for learning the math, after taking an introductory linear algebra class, I think getting exposure to linear algebra in the computer vision context should be good enough. Getting a good foundation in graphics (OpenGL) should be helpful too.

Our university used this book:

As a beginner I liked these books:

and of course (the best):

Wikipedia has been a great resource; here are some computer vision topics. More links here and here.

Also, IEEE, ACM, and your university have access to lots of research papers. Finding old computer vision lectures, like this course at UNC, has also been useful.

srand