I'd like to statistically profile my C code at the instruction level. I need to know how many additions, multiplications, divisions, etc I'm performing.
This is not your usual run of the mill code profiling requirement. I'm an algorithm developer and I want to estimate the cost of converting my code to hardware implementations. For this, I'm being asked the instruction call breakdown during run-time (parsing the compiled assembly isn't sufficient as it doesn't consider loops in the code).
After looking around, it seems VMware may offer a possible solution, but I still couldn't find the specific feature that will allow me to trace the instruction call stream of my process.
Are you aware of any profiling tools which enable this?