I'm trying to get my head around how type inference is implemented. In particularly, I don't quite see where/why the heavy lifting of "unification" comes into play.
I'll give an example in "pseudo C#" to help clarify:
The naive way to do it would be something like this:
Suppose you "parse" your program into an expression tree such that it can be executed with:
interface IEnvironment
{
object lookup(string name);
}
interface IExpression
{
// Evaluate this program in this environment
object Evaluate(IEnvironment e);
}
So something like "Multiplication" might be implemented with:
class Multiply : IExpression
{
IExpression lhs;
IExpression rhs;
// etc.
public object Evaluate(IEnvironment e)
{
// assume for the moment C# has polymorphic multiplication
return lhs.Evaluate(e) * rhs.Evaluate(e);
}
}
Then to "implement" type inference, you could just do something like:
interface ITypeEnvironment
{
Type getType(string name);
}
interface IExpression
{
//as before
object Evaluate(IEnvironment e);
// infer type
Type inferType(ITypeEnvironment typeEnvironment);
}
Then "Multiplication"'s type inference might just be something like:
class Multiply : IExpression
{
IExpression lhs;
IExpression rhs;
// ...
public Type inferType(ITypeEnvironment typeEnvironment)
{
Type lhsType = lhs.inferType(typeEnvironment);
Type rhsType = rhs.inferType(typeEnvironment);
if(lhsType != rhsType)
throw new Exception("lhs and rhs types do not match");
// you could also check here that lhs/rhs are one of double/int/float etc.
return lhsType;
}
}
lhs and rhs might be simple constants, or "variables" which are looked up in the environment:
class Constant : IExpression
{
object value;
public Type inferType(ITypeEnvironment typeEnvironment)
{
return value.GetType(); // The type of the value;
}
public object Evaluate(IEnvironment environment)
{
return value;
}
}
class Variable : IExpression
{
string name;
public Type inferType(ITypeEnvironment typeEnvironment)
{
return typeEnvironment.getType(name);
}
public object Evaluate(IEnvironment environment)
{
return environment.lookup(name);
}
}
But nowhere in this do we end up with the need for a "unification" algorithm.
So, clearly, my example is not complex enough. Does it need higher order functions? Do we need "parametric polymorphism"?
What is the simplest possible example where "unification" is actually needed to correctly infer the type of an expression.
An example in Scheme would be ideal (i.e. an example of a very small Scheme program where you require unification to correctly infer the type of the s-expression).