X works enough differently that a raw Windows program and a raw X program probably wouldn't be able to share much UI code at all.
If you want portability between the two, chances are pretty good that you want to use something like Qt or wxWidgets. Of the two, wxWidgets is more similar to MFC, so it would probably require less rewriting, but would maintain (more or less) the same "disconnect" you're seeing between what you want and what it provides.
Without knowing more about your application, and why it doesn't fit well with MFC, it's impossible to guess whether Qt would be a better fit or not. An immediate guess would be "probably not".
MFC uses a "document/view" architecture, where Qt uses the original Model-View-Controller architecture. For the most part, MFC's Document class is equivalent basically a Model and a Controller rolled into one -- so if your Document contains nothing useful, in Qt you'd apparently have both a Model and a Controller, neither of which did much that was useful.
That said, I have to raise a question about why your Document currently doesn't do much. The MVC pattern has proven applicable to a wide variety of problems, so while it's possible it can't work well for your problem, it's also possible that it could work well, and you're simply not using it. Without knowing more about what you're doing, it's impossible to even guess at that though.
Edit: Okay, the clarification helps quite a bit. The first thing to realize is that a Document does not necessarily equate to a file. Quite the contrary, a document can perfectly reasonably relate to an arbitrary number of files.
Just for example, consider a web browser. All the data needed to compose the page its currently displaying would reasonably be part of the same document. Depending on your viewpoint, that's either zero files, or a whole bunch of them (it will start as an arbitrary number of files coming from the server(s), but won't necessarily be stored as files locally at all). Storing any of it as a file locally will be a (more or less) accidental by-product of caching, and mostly unrelated to browsing per se.
In your case, you're presumably reading the two (or three?) files into memory and storing them along with some sort of data structure to hold the result of the comparison. After the comparison is complete, you might or might not discard the contents of the files themselves. I think it's safe to say that the "normal" separation of responsibilities would be for that data and the code that produces that data to be in the Document.
The View should contain only the code to take that result from that data structure, and display it on screen. Nearly the only data you normally want to store in the View would be things related to how the data is presented (e.g., things like a zoom level or current scroll position). Likewise, the code in the view should relate only to displaying the result and reacting to user input, NOT to "creating" the data in the first place.
As such, I think your program could be rewritten to use the Document/View pattern more effectively, or could be rewritten to use MVC. That, in turn, means a port to Qt could/would probably work just fine -- provided you're willing to put some time and effort into understanding how it's intended to work and then make what may be fairly substantial changes to your code to work the way it's designed to.
As I commented previously, wxWidgets is more like MFC in this respect -- it uses a Document and View, not a Model, View, and Controller. It's also going to work best if you do some rewriting to separate responsibilities the way it's designed for. The good point is that it's probably a bit easier to do that one step at a time: rewrite the code in MFC, which which you're already familiar, and then port it to wxWidgets -- but given the similarity between the two, that "Port" will probably be little more than minor editing -- often just changing some names from C* to wx* is just about enough. To my recollection, the only place I've run into much work was in creating menus -- with MFC they're normally handled via resources, but (at least a few years ago when I used it) wxWidgets normally directly exposed the code that created the menu entries.
Porting to Qt would probably be more work -- you pretty much have to learn a new framework, and substantially reorganize your code at the same time. The good point is that when you're done, the result will probably be somewhat cleaner, though given what you're doing, the difference may be pretty minor. In a Document/View, the View displays data, and reacts to user input. In a Model/View/Controller, the View only displays data, but user input (that modifies the underlying data) goes through the Controller. Since you (presumably) don't expect to modify the underlying data, the only user input involved probably belongs in the view in any case (e.g., things like scrolling). It's barely possible you might have a few things you could put in the Document/Model that would be open to change (e.g., things like the current font or colors the user has selected).