Make a binary tree representing the document before and after applying all the changes. Each node represents either original text or inserted/deleted text; the latter kind of node includes both the amount of original text to delete (possibly 0) and the string of text to insert (possibly empty).
Initially the tree has just one node, "0 to end: original text". Apply all the changes to it, merging changes as you go wherever possible. Then walk the tree from beginning to end, emitting the final set of edits. This is guaranteed to produce the optimal result.
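To make the node layout and the final walk concrete, here is a minimal sketch. The names `Node`, `REST` and `emit_edits` are my own assumptions for illustration, and the tree is built by hand and left unbalanced for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

REST = 1 << 62  # assumed stand-in for "the rest of the document"

@dataclass
class Node:
    kind: str                       # "orig" or "edit"
    orig_len: int                   # "orig": bytes kept; "edit": bytes of original deleted
    text: str = ""                  # "edit" only: text to insert (may be empty)
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def emit_edits(root: Optional[Node]) -> List[Tuple[int, int, str]]:
    """In-order walk returning (original offset, bytes deleted, text inserted)."""
    edits: List[Tuple[int, int, str]] = []
    offset = 0  # current position in the ORIGINAL document

    def walk(node: Optional[Node]) -> None:
        nonlocal offset
        if node is None:
            return
        walk(node.left)
        if node.kind == "edit":
            edits.append((offset, node.orig_len, node.text))
        offset += node.orig_len  # both kinds consume this much original text
        walk(node.right)

    walk(root)
    return edits

# The tree after a single insert of "ab" at position 2.
root = Node("edit", 0, "ab", left=Node("orig", 2), right=Node("orig", REST))
print(emit_edits(root))  # [(2, 0, 'ab')]
```

Each emitted tuple is expressed in original-document offsets, so the edits can be applied to the original text in one pass.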
Applying an insert: Find the appropriate point in the tree. If it's in the middle of or adjacent to inserted text, just change that node's text-to-insert string. Otherwise add a node.
Applying a delete: Find the starting and ending points in the tree—unlike an insert, a delete may cover a whole range of existing nodes. Modify the starting and ending nodes accordingly, and kill all the nodes in between. After you're done, check to see if you have adjacent "inserted/deleted text" nodes. If so, join them.
The only tricky bit is making sure you can find points in the tree, without updating the entire tree every time you make a change. This is done by caching, at each node, the total amount of text represented by that subtree. Then when you make a change, you only have to update these cached values on nodes directly above the nodes you changed.
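The caching idea can be sketched roughly like this. It assumes each node keeps a parent pointer, and the names `Node`, `own_len`, `size` and `set_own_len` are hypothetical; the 100-byte tail is just a finite number chosen so the arithmetic is easy to check.

```python
from typing import Optional

class Node:
    def __init__(self, label: str, own_len: int,
                 left: Optional["Node"] = None, right: Optional["Node"] = None):
        self.label = label
        self.own_len = own_len          # text this node contributes to the current document
        self.left, self.right = left, right
        self.parent: Optional["Node"] = None
        self.size = own_len             # cached length of the whole subtree
        for child in (left, right):
            if child is not None:
                child.parent = self
                self.size += child.size

def set_own_len(node: Node, new_len: int) -> None:
    """Change one node's length and fix the caches on the path to the root."""
    delta = new_len - node.own_len
    node.own_len = new_len
    cur: Optional[Node] = node
    while cur is not None:              # only ancestors are touched: O(tree height)
        cur.size += delta
        cur = cur.parent

# A small tree: an insert of "cde" between two 1-byte originals, then "ab", then a tail.
cde = Node('insert "cde"', 3, Node("orig", 1), Node("orig", 1))
root = Node('insert "ab"', 2, cde, Node("orig rest", 100))
print(cde.size, root.size)   # 5 107
set_own_len(cde, 5)          # e.g. the inserted text grew to "cdeXY"
print(cde.size, root.size)   # 7 109
```

Sibling subtrees are never visited, which is what keeps each update proportional to the tree height.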
This looks strictly O(n log n) to me for all input, if you bother to implement a balanced tree and use ropes for the inserted text. If you ditch the whole tree idea and use vectors and strings, it's O(n^2) but might work fine in practice.
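For reference, here is a rough sketch of the simpler vector-and-string variant, not the balanced-tree version. All names (`Orig`, `Edit`, `split`, `normalize`, `insert`, `delete`, `emit`, `REST`) are assumptions made for this illustration. Each operation scans the list, hence the quadratic bound.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

REST = 1 << 62  # assumed stand-in for "rest of the original document"

@dataclass
class Orig:
    length: int        # bytes of untouched original text

@dataclass
class Edit:
    deleted: int       # bytes of original text removed at this point
    inserted: str      # text inserted at this point (may be empty)

Segment = Union[Orig, Edit]

def cur_len(seg: Segment) -> int:
    """Length the segment occupies in the current (edited) document."""
    return seg.length if isinstance(seg, Orig) else len(seg.inserted)

def split(segs: List[Segment], pos: int) -> int:
    """Ensure a segment boundary at current position `pos`; return its index."""
    start = 0
    for i, seg in enumerate(segs):
        if start == pos:
            return i
        size = cur_len(seg)
        if pos < start + size:
            k = pos - start
            if isinstance(seg, Orig):
                segs[i:i + 1] = [Orig(k), Orig(seg.length - k)]
            else:
                segs[i:i + 1] = [Edit(seg.deleted, seg.inserted[:k]),
                                 Edit(0, seg.inserted[k:])]
            return i + 1
        start += size
    if start == pos:
        return len(segs)
    raise IndexError("position past end of document")

def normalize(segs: List[Segment]) -> None:
    """Drop empty segments and join neighbours of the same kind."""
    out: List[Segment] = []
    for seg in segs:
        if isinstance(seg, Orig):
            if seg.length == 0:
                continue
            if out and isinstance(out[-1], Orig):
                out[-1] = Orig(out[-1].length + seg.length)
            else:
                out.append(seg)
        else:
            if seg.deleted == 0 and not seg.inserted:
                continue
            if out and isinstance(out[-1], Edit):
                out[-1] = Edit(out[-1].deleted + seg.deleted,
                               out[-1].inserted + seg.inserted)
            else:
                out.append(seg)
    segs[:] = out

def insert(segs: List[Segment], pos: int, text: str) -> None:
    segs.insert(split(segs, pos), Edit(0, text))
    normalize(segs)

def delete(segs: List[Segment], pos: int, count: int) -> None:
    if count <= 0:
        return
    i = split(segs, pos)
    j = split(segs, pos + count)
    # Deleted inserted text just vanishes; original text is counted as deleted.
    gone = sum(s.length if isinstance(s, Orig) else s.deleted for s in segs[i:j])
    segs[i:j] = [Edit(gone, "")]
    normalize(segs)

def emit(segs: List[Segment]) -> List[Tuple[int, int, str]]:
    """Return (original offset, bytes deleted, text inserted) for each edit."""
    edits, offset = [], 0
    for seg in segs:
        if isinstance(seg, Edit):
            edits.append((offset, seg.deleted, seg.inserted))
        offset += seg.length if isinstance(seg, Orig) else seg.deleted
    return edits

segs: List[Segment] = [Orig(REST)]
insert(segs, 2, "ab")
insert(segs, 1, "cde")
delete(segs, 4, 1)
print(emit(segs))  # [(1, 1, 'cdeab')]
```

The three operations above collapse into a single edit: at original offset 1, delete 1 byte and insert "cdeab".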
Worked example. Here is how this algorithm would apply to your example, step by step. Instead of drawing complicated ASCII art, I'll turn the tree on its side, show the nodes in order, and show the tree structure by indentation. I hope it's clear.
Initial state:
*: orig
I said above we would cache the amount of text in each subtree. Here I just put a * for the number of bytes because this node contains the whole document, and we don't know how long that is. You could use any large-enough number, say 0x4000000000000000L.
After inserting "ab" at position 2:
    2: orig, 2 bytes
*: insert "ab", delete nothing
    *: orig, all the rest
After inserting "cde" at position 1:
        1: orig, 1 byte
    5: insert "cde", delete nothing
        1: orig, 1 byte
*: insert "ab", delete nothing
    *: orig, all the rest
The next step is to delete a character at position 4. Pause here to see how we find position 4 in the tree.
Start at the root. Look at the first child node: that subtree contains 5 characters, so position 4 must be in there. Move to that node. Look at its first child node. This time it contains only 1 character, so position 4 is not in there. The current node's own edit contains 3 characters (positions 1 through 3), so position 4 is not in it either; it's immediately after. Move to the second child node. (This algorithm is about 12 lines of code.)
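The descent just described might look something like the sketch below. `Node`, `find` and the labels are hypothetical names, and the tree is the five-node one shown before the delete step, built by hand.

```python
from typing import Optional, Tuple

class Node:
    def __init__(self, label: str, own_len: int,
                 left: Optional["Node"] = None, right: Optional["Node"] = None):
        self.label = label
        self.own_len = own_len          # text this node itself contributes
        self.left, self.right = left, right
        # Cached length of the whole subtree rooted here.
        self.size = own_len + sum(c.size for c in (left, right) if c is not None)

def find(root: Node, pos: int) -> Tuple[Node, int]:
    """Return the node containing current-document position `pos`
    and the offset of `pos` inside that node."""
    node: Optional[Node] = root
    while node is not None:
        left_size = node.left.size if node.left is not None else 0
        if pos < left_size:
            node = node.left                    # position is in the left subtree
        elif pos < left_size + node.own_len:
            return node, pos - left_size        # position is in this node
        else:
            pos -= left_size + node.own_len     # skip left subtree and this node
            node = node.right
    raise IndexError("position past end of document")

REST = 1 << 62  # assumed stand-in for "all the rest"
cde = Node('insert "cde"', 3, Node("orig #1", 1), Node("orig #2", 1))
root = Node('insert "ab"', 2, cde, Node("orig rest", REST))
node, offset = find(root, 4)
print(node.label, offset)  # orig #2 0
```

Nodes that only delete text have zero length and are never returned by this lookup, so a fuller implementation would need to decide how boundaries next to them are handled.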
After deleting 1 character at position 4 (the second 1-byte "orig" node, which disappears; the deleted byte is recorded on the neighboring insert node), you get this...
    4: orig, 1 byte
        3: insert "cde", delete 1 byte
*: insert "ab", delete nothing
    *: orig, all the rest
...and then, noticing two adjacent insert nodes, you merge them. (Note that given two adjacent nodes, one is always somewhere above the other in the tree hierarchy. Merge the data into that higher node; then delete the lower one and update the cached subtree sizes in between.)
    1: orig, 1 byte
*: insert "cdeab", delete 1 byte
    *: orig, all the rest