On a virtual memory system, virtual pages can be mapped to any physical page, so you don't need large contiguous blocks of physical memory. If the problem is fragmentation of your virtual address space, however, you may need a different memory management strategy.
Most options would require your application code to be aware of the memory management strategy at some level. I don't believe there is a quick fix for this problem - you are probably in for reasonably major surgery to fix it. None of these options is simple to implement, and you will have to find the one most likely to work in your particular case.
The major options that I can see are: custom memory allocators, something involving AWE (see below) or rebuilding the memory allocation strategy within the application.
Option 1: Custom memory allocators
Custom memory allocators are not uncommon in C and C++ circles. You might be able to implement something similar. Two possibilities are open to you:
Build a memory allocator with a mechanism that attempts to merge adjacent free blocks into a single larger block (you could run this as a part of attempting to recover from a failed memory allocation). This might allow you to transparently manage the memory without the application having to be aware. Implementing this would be fiddly and technical but is probably feasible.
The principal benefit of this approach is that it is the only one that would not require you to change existing application code. The downside is that it is not guaranteed to work; it is still possible that the merge operation can fail to consolidate a block of memory large enough to fulfil the request. The merge operation may also cause significant pauses in application response while it runs.
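To make the merge idea a bit more concrete, here is a minimal sketch, not a real memory manager. It assumes a fixed-size arena that hands out offsets rather than pointers, keeps the free list in a `std::map` sorted by offset, and only merges neighbours when an allocation fails. The names `CoalescingArena` and `kNoBlock` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Sentinel returned when no free block is large enough (hypothetical name).
constexpr std::size_t kNoBlock = static_cast<std::size_t>(-1);

// Hypothetical arena that hands out offsets into a fixed-size region.
class CoalescingArena {
public:
    explicit CoalescingArena(std::size_t capacity) { free_[0] = capacity; }

    // First-fit allocation; on failure, merge free neighbours and retry once.
    std::size_t allocate(std::size_t size) {
        if (size == 0) return kNoBlock;
        std::size_t offset = tryAllocate(size);
        if (offset != kNoBlock) return offset;
        coalesce();
        return tryAllocate(size);
    }

    // Return a block to the free list. Merging is deferred until needed.
    void release(std::size_t offset, std::size_t size) {
        assert(size > 0);
        free_[offset] = size;
    }

    // Merge every run of adjacent free blocks into one larger block.
    void coalesce() {
        auto it = free_.begin();
        while (it != free_.end()) {
            auto next = std::next(it);
            if (next != free_.end() && it->first + it->second == next->first) {
                it->second += next->second;  // absorb the neighbour
                free_.erase(next);           // stay on 'it' and look again
            } else {
                it = next;
            }
        }
    }

    std::size_t freeBlockCount() const { return free_.size(); }

private:
    std::size_t tryAllocate(std::size_t size) {
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            if (it->second < size) continue;
            std::size_t offset = it->first;
            std::size_t remaining = it->second - size;
            free_.erase(it);
            if (remaining > 0) free_[offset + size] = remaining;
            return offset;
        }
        return kNoBlock;
    }

    std::map<std::size_t, std::size_t> free_;  // offset -> size, sorted by offset
};
```

Note that merging only helps when the free blocks happen to be adjacent. If live blocks sit between them, the request still fails, which is exactly the "not guaranteed to work" caveat above.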
Alternatively, build your application in a way that allows the data structure to be compacted. This would require you to maintain handles that survive the objects being moved, i.e. a double indirection mechanism. I'm guessing that only one, or a fairly small number, of different data structures cause this fragmentation problem, so it may be possible to localise any re-architecture work within your application.
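A rough sketch of the double indirection idea, assuming a single fixed-size byte buffer and trivially copyable payloads. `HandleHeap` and its methods are hypothetical names, and handle slots are never recycled here to keep the example short. Clients hold a handle, and the table behind it is what gets updated when blocks move.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical compacting heap: clients hold handles, never raw pointers.
class HandleHeap {
public:
    using Handle = std::size_t;

    explicit HandleHeap(std::size_t capacity) : storage_(capacity), top_(0) {}

    // Bump-allocate at the end of the used region; compact once if that fails.
    // Returns false if there is not enough room even after compaction.
    bool allocate(std::size_t size, Handle& out) {
        if (top_ + size > storage_.size()) compact();
        if (top_ + size > storage_.size()) return false;
        entries_.push_back(Entry{top_, size, true});
        top_ += size;
        out = entries_.size() - 1;
        return true;
    }

    // Mark the block as dead; its space is reclaimed by the next compact().
    void release(Handle h) { entries_[h].live = false; }

    // Resolve a handle to its current address. The pointer is only valid
    // until the next allocate() or compact() call.
    unsigned char* deref(Handle h) {
        assert(entries_[h].live);
        return storage_.data() + entries_[h].offset;
    }

    // Slide live blocks towards the start of storage and update the table.
    void compact() {
        std::size_t dest = 0;
        for (Entry& e : entries_) {  // live entries are in ascending offset order
            if (!e.live) continue;
            std::memmove(storage_.data() + dest, storage_.data() + e.offset, e.size);
            e.offset = dest;
            dest += e.size;
        }
        top_ = dest;
    }

private:
    struct Entry {
        std::size_t offset;
        std::size_t size;
        bool live;
    };

    std::vector<unsigned char> storage_;  // the memory being managed
    std::vector<Entry> entries_;          // handle -> current location
    std::size_t top_;                     // first unused byte
};
```

The cost is visible in `deref`: every access goes through the table, and any raw pointer cached across an allocation may be stale. That is the part your application code has to be written around.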
Option 2: AWE
Windows does support facilities to directly manipulate the MMU, and there are a couple of possibilities where this could apply to your application. This would definitely require explicit architectural support from your application, but offers the possibility of using a pool of memory that is much larger than 2 GB.
Look into AWE (Address Windowing Extensions), the Windows API set that lets you allocate physical memory and manually re-map chunks of it into your address space. On 32-bit server versions of Windows it is typically combined with PAE to reach memory beyond 4 GB. This might be helpful to you in one of two ways:
You could build the manager for the data structure in a way that uses this mechanism as an inherent part of managing the data.
If you can fit the items in your data structure to page boundaries you may be able to use this as a way to consolidate memory.
However, this approach would require you to re-engineer your application so that object references had enough information to manage the explicit swap-in process (possibly some sort of overlay manager with a proxy mechanism for the objects being referenced through this system). This means that any solution involving AWE is not a drop-in replacement for FastMM - you would have to modify the application to explicitly support AWE.
On the other hand, a proxy mechanism of this sort might mean that this subsystem could be relatively transparent to clients. Apart from the overhead of managing the indirection and overlays (which may or may not be a significant issue), the proxies could be virtually indistinguishable from the original API. This type of approach would work best for a relatively small number of large, heavyweight objects with minimal interconnection - the canonical application for this is disk caching. The proxies would have to remain in a fixed location in memory, with the larger objects being accessed through the overlay mechanism. YMMV.
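To give a feel for the proxy and overlay arrangement, here is a portable simulation rather than real Windows code. On Windows the mapping step would be done with calls along the lines of `AllocateUserPhysicalPages` and `MapUserPhysicalPages`, which remap pages without copying. This sketch fakes that by copying pages between a large backing vector and a small set of windows, with least-recently-used eviction. `OverlayManager`, `PageProxy`, `kPageSize` and `kUnmapped` are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kUnmapped = static_cast<std::size_t>(-1);

// Hypothetical overlay manager. The backing store stands in for memory that
// is not currently visible; the windows stand in for the mapped region.
class OverlayManager {
public:
    OverlayManager(std::size_t pageCount, std::size_t windowCount)
        : store_(pageCount * kPageSize, 0),
          windows_(windowCount * kPageSize, 0),
          mapped_(windowCount, kUnmapped),
          lastUse_(windowCount, 0),
          clock_(0),
          faults_(0) {}

    // Return a pointer to the page's bytes inside a window, mapping it in
    // first if necessary. The pointer is valid until the next map() call.
    unsigned char* map(std::size_t page) {
        assert(page < store_.size() / kPageSize);
        std::size_t victim = 0;
        for (std::size_t w = 0; w < mapped_.size(); ++w) {
            if (mapped_[w] == page) {  // already visible: just touch it
                lastUse_[w] = ++clock_;
                return window(w);
            }
            if (lastUse_[w] < lastUse_[victim]) victim = w;  // track the LRU window
        }
        ++faults_;
        if (mapped_[victim] != kUnmapped) {  // write the evicted page back
            std::memcpy(&store_[mapped_[victim] * kPageSize], window(victim), kPageSize);
        }
        std::memcpy(window(victim), &store_[page * kPageSize], kPageSize);
        mapped_[victim] = page;
        lastUse_[victim] = ++clock_;
        return window(victim);
    }

    std::size_t faults() const { return faults_; }

private:
    unsigned char* window(std::size_t w) { return &windows_[w * kPageSize]; }

    std::vector<unsigned char> store_;    // all pages
    std::vector<unsigned char> windows_;  // the few pages visible at once
    std::vector<std::size_t> mapped_;     // window -> page currently in it
    std::vector<std::size_t> lastUse_;    // window -> last access tick
    std::size_t clock_;
    std::size_t faults_;
};

// Fixed-location proxy: clients keep this small object, not the page pointer.
class PageProxy {
public:
    PageProxy(OverlayManager& mgr, std::size_t page) : mgr_(&mgr), page_(page) {}
    unsigned char* data() { return mgr_->map(page_); }

private:
    OverlayManager* mgr_;
    std::size_t page_;
};
```

Client code only ever calls `data()` on the proxy, so it looks much like a normal accessor. The catch is the one noted in the comments: the returned pointer must not be held across another access, because the window behind it may be reused.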
Option 3: Fix the problem at source
One possibility is that your object allocation strategy can be optimised from within the application code (perhaps from pools of objects allocated in bulk and then managed from within the application). This might allow you to deal with memory fragmentation from within your application without attempting to re-write the memory manager.
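As an illustration of the pooling idea, here is a minimal fixed-capacity pool, assuming all objects of a given type are the same size and that running out of slots is reported by returning `nullptr`. `ObjectPool` is a hypothetical name. The storage is allocated once, in bulk, so creating and destroying objects never touches the general-purpose heap.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical fixed-capacity pool: one bulk allocation, slots reused in place.
template <typename T>
class ObjectPool {
public:
    explicit ObjectPool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        // Push in reverse so the lowest slot is handed out first.
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(&slots_[i - 1]);
    }
    ObjectPool(const ObjectPool&) = delete;
    ObjectPool& operator=(const ObjectPool&) = delete;

    // Construct a T in a free slot, or return nullptr if the pool is full.
    // In this sketch a throwing constructor leaks the slot from the pool.
    template <typename... Args>
    T* create(Args&&... args) {
        if (free_.empty()) return nullptr;
        Slot* slot = free_.back();
        free_.pop_back();
        return new (slot->bytes) T(std::forward<Args>(args)...);
    }

    // Run the destructor and make the slot available again.
    void destroy(T* obj) {
        assert(obj != nullptr);
        obj->~T();
        free_.push_back(reinterpret_cast<Slot*>(obj));
    }

    std::size_t available() const { return free_.size(); }

private:
    struct Slot {
        alignas(T) unsigned char bytes[sizeof(T)];
    };

    std::vector<Slot> slots_;   // bulk storage, never resized
    std::vector<Slot*> free_;   // slots not currently holding an object
};
```

Because every slot is the same size, freeing and reallocating inside the pool cannot fragment it. The pool does not destroy objects that are still live when it goes away, so the caller is responsible for calling `destroy` on everything it created.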
Again, this approach means that you will have to re-build parts of your application, and the applicability of the approach really depends on the nature of your application. Only you can be the judge of how well this might work.