What you're describing is depth mapping (also called 'disparity mapping'), which is the basis of stereoscopic computer vision. The OpenCV project has libraries that do this. I don't know whether they convert the result directly into a rotatable 3D object, which may be what you are looking for, but they probably come close.
http://opencv.willowgarage.com/wiki/
http://en.wikipedia.org/wiki/OpenCV
The dimension part is a little harder. The libraries can identify objects, but that would only give you the 'common' (typical) dimensions for that kind of object, not a measurement of the one in front of you. If you were using stereoscopic imaging with two cameras, you could determine real depth, and therefore dimensions, from multiple samples.
Without two cameras, the problem is really difficult (i.e., impossible).
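To make the two-camera case concrete, here is a minimal sketch of how depth and then a real dimension fall out of a disparity value. It assumes an idealized, rectified stereo pair (pinhole cameras, parallel optical axes) where depth is focal length times baseline divided by disparity. The function names and all the numbers are hypothetical and only for illustration; this is not OpenCV's API.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth (metres) of a point in a rectified stereo pair: Z = f * B / d.

    focal_px     -- focal length in pixels (assumed known from calibration)
    baseline_m   -- distance between the two camera centres, in metres
    disparity_px -- horizontal pixel shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


def real_size(extent_px, depth_m, focal_px):
    """Real-world extent (metres) of something spanning extent_px pixels at depth_m."""
    return extent_px * depth_m / focal_px


# Hypothetical values, chosen only to illustrate the arithmetic.
focal_px = 700.0       # focal length in pixels
baseline_m = 0.12      # cameras 12 cm apart
disparity_px = 42.0    # object shifts 42 px between left and right image

depth_m = depth_from_disparity(focal_px, baseline_m, disparity_px)
width_m = real_size(175.0, depth_m, focal_px)  # object is 175 px wide in the image

print(f"depth: {depth_m:.2f} m, width: {width_m:.2f} m")
```

With a single camera there is no disparity to plug in, which is why the dimensions can't be recovered that way.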