You can't anticipate every kind of malformed markup that some browser somewhere will still interpret and that would therefore slip past a blacklist, so don't blacklist. There are many more structures you would need to remove than just script/embed/object and event handlers.
Instead, try to parse the HTML into a hierarchy of elements and attributes, then check every element and attribute name against a whitelist that is as small as possible. Also check any URL attributes you let through against a whitelist of protocols (remember there are more dangerous protocols than just javascript:).
If the input is well-formed XHTML, the first part of the above is much easier.
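To make the parse-then-whitelist idea concrete, here is a minimal sketch in Python using only the standard library. The allowed tags, attributes and protocols, and the names `WhitelistSanitizer`, `clean_url` and `sanitize`, are assumptions chosen for illustration, not a vetted policy. It also rejects relative URLs outright, which is stricter than many sites need. Treat it as an outline of the approach rather than something to deploy.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Illustrative policy only; keep these sets as small as your use case allows.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "a", "ul", "ol", "li", "br"}
VOID_TAGS = {"br"}                      # elements that never get a closing tag
ALLOWED_ATTRS = {"a": {"href", "title"}}
URL_ATTRS = {"href"}
ALLOWED_SCHEMES = {"http", "https", "mailto"}
DROP_CONTENT = {"script", "style"}      # drop these tags and the text inside them


def clean_url(value):
    """Return the URL if it uses a whitelisted scheme, else None."""
    # Drop whitespace and control characters, which browsers may ignore
    # inside a scheme (e.g. "jav\tascript:").
    cleaned = "".join(ch for ch in value if ch > " " and ch != "\x7f")
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        return None
    # Relative URLs have an empty scheme and are rejected in this sketch.
    return cleaned if scheme in ALLOWED_SCHEMES else None


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of the rebuilt, sanitised markup
        self.open_tags = []  # whitelisted tags we have emitted and not yet closed
        self.skip = 0        # greater than zero while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip += 1
            return
        if tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag itself, keep its text content
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, ()):
                continue
            value = value or ""
            if name in URL_ATTRS:
                value = clean_url(value)
                if value is None:
                    continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip = max(0, self.skip - 1)
            return
        if tag in self.open_tags:
            # Close everything up to and including the matching open tag,
            # so the output stays properly nested.
            while self.open_tags:
                top = self.open_tags.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break

    def handle_data(self, data):
        if not self.skip:
            self.out.append(escape(data, quote=False))


def sanitize(html):
    parser = WhitelistSanitizer()
    parser.feed(html)
    parser.close()
    while parser.open_tags:  # close anything the input left open
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)
```

The key design choice is that the output is rebuilt from scratch out of whitelisted names and re-escaped values, so nothing from the input is ever copied through verbatim. Comments, doctypes and processing instructions are dropped simply because the sketch never emits them.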
As always with HTML sanitisation, if you can find any other way to avoid doing it, do that instead. There are many, many potential holes. If the major webmail services are still finding exploits after this many years, what makes you think you can do better?