Reducing the size of XML has been a concern for well over a decade, especially in mobile communications, where both bandwidth and client processing power are scarce. The solution that wireless communications eventually settled on, and the one I prefer when I control both the client and the server, is WBXML (the WAP Binary XML specification).
The spec defines how to convert XML into a binary format that is not only compact but also easy to parse. This contrasts with general-purpose compression methods such as gzip, where the receiver must spend CPU time and memory decompressing the content and must then still parse the full XML text. The only downside is that both sides must share an application token table: a statically defined code table that assigns a binary value to every tag and attribute that can appear in the application-specific XML. Today the format is widely used in mobile communications for transmitting configuration and data in many applications, such as OTA configuration and contact, note, calendar, and email synchronization.
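To make the token-table idea concrete, here is a minimal sketch in Python. It is not a complete WBXML encoder: it skips the document header, attributes, code pages, and the string table, and it ignores text that follows a child element. The END and STR_I values and the 0x40 "has content" bit are the global tokens from the WBXML spec; the TAG_TOKENS table and the contact/name/phone tags are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Global tokens defined by WBXML.
END = 0x01          # closes the current element
STR_I = 0x03        # inline string, terminated by a zero byte
HAS_CONTENT = 0x40  # bit set on a tag token when the element has content

# Hypothetical application token table; client and server must share it.
TAG_TOKENS = {"contact": 0x05, "name": 0x06, "phone": 0x07}


def encode_element(elem):
    """Encode one attribute-less element into a WBXML-style body."""
    token = TAG_TOKENS[elem.tag]
    text = (elem.text or "").strip()
    children = list(elem)
    if not text and not children:
        return bytes([token])  # empty element: bare tag token, no END
    out = bytearray([token | HAS_CONTENT])
    if text:
        out += bytes([STR_I]) + text.encode("utf-8") + b"\x00"
    for child in children:
        out += encode_element(child)
    out.append(END)
    return bytes(out)


xml_text = "<contact><name>Ann</name><phone>555</phone></contact>"
encoded = encode_element(ET.fromstring(xml_text))
print(len(xml_text), "bytes of XML ->", len(encoded), "bytes encoded")
```

Each tag costs one byte instead of its full name twice, and the receiver can decode the stream with a simple table lookup per byte rather than a text parser.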
To transmit large XML content in this format, you can use a chunking mechanism similar to the one proposed in the SyncML protocol. You can find a design document here, describing this mechanism in section "2.6. Large Objects Handling". As a brief intro:
This feature provides a means to synchronize an object whose size exceeds that which can be transmitted within one message (e.g. the maximum message size – declared in MaxMsgSize element – that the target device can receive). This is achieved by splitting the object into chunks that will each fit within one message and by sending them contiguously. The first chunk of data is sent with the overall size of the object and a MoreData tag signaling that more chunks will be sent. Every subsequent chunk is sent with a MoreData tag, except from the last one.
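The chunking rule described above can be sketched roughly as follows. This is only an illustration of the idea, not the SyncML wire format: the dictionary keys and both function names are my own, and max_chunk_size is assumed to be the payload room left in one message after protocol overhead has been subtracted from MaxMsgSize.

```python
def split_into_chunks(payload: bytes, max_chunk_size: int):
    """Yield one message dict per chunk of a large object.

    'more_data' stands in for the MoreData tag; 'size' is the overall
    object size and is attached to the first chunk only.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    total = len(payload)
    for offset in range(0, total, max_chunk_size):
        message = {
            "data": payload[offset:offset + max_chunk_size],
            # True on every chunk except the last one.
            "more_data": offset + max_chunk_size < total,
        }
        if offset == 0:
            message["size"] = total
        yield message


def reassemble(messages):
    """Rebuild the object and check it against the declared size."""
    expected = None
    buffer = bytearray()
    for index, message in enumerate(messages):
        if index == 0:
            expected = message["size"]
        buffer += message["data"]
        if not message["more_data"]:
            break  # last chunk received
    if len(buffer) != expected:
        raise ValueError("received size does not match declared size")
    return bytes(buffer)


payload = b"x" * 2500
messages = list(split_into_chunks(payload, 1024))
print([(len(m["data"]), m["more_data"]) for m in messages])
```

Sending the overall size up front lets the receiver reject an object it cannot store before any further chunks arrive, and lets it verify the result once the final chunk (the one without MoreData) comes in.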