It's unfortunate that much of the literature on the subject is very dense. I too was in your shoes. I got my first introduction to the subject from Programming Languages: Applications and Interpretation
http://www.plai.org/
I'll try to summarize the abstract idea followed by details that I did not find immediately obvious. First, type inference can be thought of generating and then solving constraints. To generate constraints, you recurse through the syntax tree and generate one or more constraints on each node. For example, if the node is a '+' operator, the operands and the results must all be numbers. A node that applies a function has the same type as the result of the function, and so on.
For a language without 'let', you can blindly solve the above constraints by substitution. For example:
(if (= 1 2)
1
2)
here, we can say that the condition of the if statement must be boolean, and that the type of the if statement is the same as the type of its "then" and "else" clauses. Since we know 1 and 2 are numbers, by substitution, we know the "if" statement is a number.
Where things get nasty, and what I couldn't understand for a while, is dealing with let:
(let ((id (lambda (x) x)))
(id id))
Here, we've bound 'id' to a function that returns whatever you've passed in, otherwise known as the identity function. The problem is the type of the function's parameter 'x' is different on each usage of id. The second 'id' is a function from a->a, where a can be anything. The first is from (a->a)->(a->a) This is known as let-polymorphism. The key is to solve constraints in a particular order: first solve constraints for the definition of 'id'. This will be a->a. Then fresh, separate copies of the type of id can be substituted into the constraints for each place 'id' is used, for example a2->a2 and a3->a3.
That wasn't readily explained in online resources. They'll mention algorithm W or M but not how they work in terms of solving constraints, or why it doesn't barf on let-polymorphism: each of those algorithms enforce an ordering on solving the constraints.
I found this resource extremely helpful to tie Algorithm W, M, and the general concept of constraint generation and solving all together. It's a little dense, but better than many:
http://www.cs.uu.nl/research/techreps/repo/CS-2002/2002-031.pdf
Many of the other papers there are nice too:
http://people.cs.uu.nl/bastiaan/papers.html
I hope that helps clarify a somewhat murky world.