We deal with a lot of user-generated content (1M+/mo), and sometimes our users enter long strings with no spaces, which causes web browsers to display the content strangely and breaks the UI here and there.
I am trying to find a way to intelligently and quickly process text up to 50k and insert tags where appropriate.
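
To make the goal concrete, here is a minimal sketch of the kind of processing meant, not the actual implementation. It assumes the input is plain text with no existing markup, a hypothetical limit of `maxRun` consecutive non-whitespace characters, and a caller-supplied break tag (for example `<wbr>`). The class and method names are made up for illustration.

```java
// Hypothetical helper: names and parameters are illustrative assumptions.
final class LongWordBreaker {
    private LongWordBreaker() {}

    /**
     * Inserts breakTag into any run of more than maxRun consecutive
     * non-whitespace characters. Single pass, linear in the input length.
     * Assumes plain text: existing HTML tags or entities are NOT skipped.
     */
    static String insertBreaks(String text, int maxRun, String breakTag) {
        if (maxRun < 1) {
            throw new IllegalArgumentException("maxRun must be >= 1");
        }
        StringBuilder out = new StringBuilder(text.length() + 16);
        int run = 0; // length of the current unbroken run, in code points
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i); // avoids splitting surrogate pairs
            if (Character.isWhitespace(cp)) {
                run = 0; // whitespace is a natural break opportunity
            } else {
                if (run == maxRun) {
                    out.append(breakTag); // break before the next character
                    run = 0;
                }
                run++;
            }
            out.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Called as `LongWordBreaker.insertBreaks(input, 20, "<wbr>")`, this walks the string once and appends to a `StringBuilder`, so it avoids regex backtracking and repeated string concatenation, both of which can get expensive on inputs in the tens of thousands of characters.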
I have already built this, but the JVM seems to crap out on larger strings (it chokes somewhere around 20k), so I was thinking about using a Perl script to do the modification and calling it from Java, but I do not know how to write Perl :(
Are there any libraries out there that do this? Has anyone run into this issue?