Here's my wild and whacky psuedo-code. Anyone know how to make this real?
Background:
This dynamic content comes from a ckeditor. And a lot of folks paste Microsoft Word content in it. No worries, if I just call the attribute untouched it loads pretty. But the catch is that I want it to be just 125 characters abbreviated. When I add truncation to it, then all of the Microsoft Word scripts start popping up. Then I added simple_format, and sanitize, and truncate, and even made my controller start spotting out specific variables that MS would make and gsub them out. But there's too many of them, and it seems like an awfully messy way to accomplish this. Thus so! Realizing that by itself, its clean. I thought, why not just slice it. However, the microsoft word text becomes blank but still holds its numbered position in the string. So I came up with this (probably awful) solution below.
It's in three steps.
- When the text parses, it doesn't display any of the MSWord junk. But that text still holds a number position in a slice statement. So I want to use a regexp to find the first actual character.
- Take that character and find out what its numbered position is in the total string.
Use a slice statement to cut it from.
def about_us_truncated x = self.about_us.find.first(regExp representing first actual character) x.charCount = y self.about_us[y..125] end
The only other idea i got, is a regex statement that allows it to explicitly slice only actual characters like so :
about_us([a-zA-Z][0..125])
, but that is definately not how it is written.
Here is some sample text of MS Word junk :
≪! [If Gte Mso 9]>≪Xml>≪Br /> ≪O:Office Document Settings>≪Br /> ≪O:Allow Png/>≪Br /> ≪/O:Off...