Hi,
Interesting problem! I've not done any low-level (think Win32) Windows programming in a while, but here's what I would do.
Use a named pipe and have your application listen on it. Using this named pipe as a communication channel, implement a really simple protocol that lets you query the application for the name of a control given its HWND, or for anything else you find useful. Make sure the protocol is rich enough that your application and the test framework can exchange sufficient information. Also make sure the test hooks do not trigger too much "special behavior" in the app, because then you would be testing your test framework rather than the actual features.
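To make the protocol idea a bit more concrete, here is a minimal sketch of the request handling only, leaving out the named pipe transport. Everything in it is an assumption made for illustration: the command names (GET_NAME, GET_CLASS), the OK/ERR reply format, and the CONTROLS registry are hypothetical, and a real application would look controls up from its own window hierarchy.

```python
# Hypothetical registry mapping a window handle (HWND, as an int) to control info.
# A real application would build this from its own window hierarchy.
CONTROLS = {
    0x1A2B: {"name": "okButton", "class": "Button"},
    0x1A2C: {"name": "fileNameEdit", "class": "Edit"},
}


def handle_request(line, controls=CONTROLS):
    """Answer one request line, e.g. 'GET_NAME 0x1A2B' -> 'OK okButton'."""
    parts = line.strip().split()
    if len(parts) != 2:
        return "ERR malformed request"
    command, raw_handle = parts
    try:
        hwnd = int(raw_handle, 0)  # base 0 accepts decimal or 0x-prefixed hex
    except ValueError:
        return "ERR bad handle"
    control = controls.get(hwnd)
    if control is None:
        return "ERR unknown handle"
    if command == "GET_NAME":
        return "OK " + control["name"]
    if command == "GET_CLASS":
        return "OK " + control["class"]
    return "ERR unknown command"
```

The application would read one line from the pipe, pass it to handle_request, and write the reply back. Keeping the handler free of any pipe code also means you can unit test the protocol on its own.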
There are probably far more elegant and cooler ways to implement this, but this is what I remember off the top of my head, using only simple Win32 API calls.
Another approach, which we have implemented for our product at work, is to record user events, such as mouse clicks and key presses, in an event script. The script should be rich enough that the application can play it back, artificially injecting those events into the message queue, and behave the same way it did when you first recorded the script. You basically simulate the user when you play back the script.
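As a rough illustration of the record/playback idea (not the implementation we use at work), the sketch below stores each event as one JSON line and replays the script through a callback. The EventRecorder class, the event fields, and the inject callback are all hypothetical; in a real Win32 application the callback would post the event to the message queue instead of appending to a list.

```python
import json


class EventRecorder:
    """Collects user events in order so they can be saved as a script."""

    def __init__(self):
        self.events = []

    def record(self, kind, **details):
        # 'kind' and the detail fields are illustrative, not a fixed schema.
        self.events.append({"kind": kind, **details})

    def to_script(self):
        # One JSON object per line keeps the script easy to read and diff.
        return "\n".join(json.dumps(e, sort_keys=True) for e in self.events)


def play_back(script, inject):
    """Feed every recorded event to 'inject', in order; return the count."""
    count = 0
    for line in script.splitlines():
        if line.strip():
            inject(json.loads(line))
            count += 1
    return count


recorder = EventRecorder()
recorder.record("mouse_click", x=120, y=48, button="left")
recorder.record("key_press", key="A")
script = recorder.to_script()

replayed = []  # stand-in for the application's message queue
played = play_back(script, replayed.append)
```

Real scripts usually also need timing information and some way to identify the target window, which this sketch leaves out.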
In addition, you can capture any important state (the user's document, preferences, GUI control hierarchy, etc.) once when you record the script and once when you play it back. This gives you two sets of data to compare, for instance to make sure that everything stays the same. This approach gives you tests that are not easy to modify (you have to re-record if your GUI changes), but that provide awesome regression testing.
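Comparing the two state captures can be as simple as a recursive diff. The sketch below assumes, purely for illustration, that each snapshot is a nested dictionary; the flatten and diff_snapshots helpers and the sample keys are hypothetical.

```python
MISSING = "<missing>"  # marks a value present in only one snapshot


def flatten(state, prefix=""):
    """Turn a nested dict into {'a/b/c': value} so paths are easy to compare."""
    flat = {}
    for key, value in state.items():
        path = prefix + "/" + str(key) if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_snapshots(recorded, replayed):
    """Return {path: (recorded_value, replayed_value)} for every difference."""
    a, b = flatten(recorded), flatten(replayed)
    return {
        path: (a.get(path, MISSING), b.get(path, MISSING))
        for path in sorted(set(a) | set(b))
        if a.get(path, MISSING) != b.get(path, MISSING)
    }


recorded = {"document": {"title": "Report"}, "prefs": {"font_size": 10}}
replayed = {"document": {"title": "Report"}, "prefs": {"font_size": 12}}
differences = diff_snapshots(recorded, replayed)
```

An empty result means the playback reproduced the recorded state; anything else points at the exact path that changed.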
(EDIT: This is also a terrific QA tool, for instance during beta testing: just have your users record their actions, and if there's a crash, you have a good chance of reproducing the problem simply by playing back the script.)
Good luck!
Carl