Hi, when I get text from a textarea in HTML like this:

wase&
;#101;m

the correct decoded result is waseem.

Notice the newline. When I decode it I get

wase&;#101;m

The newline causes errors here. Can I fix it? I use JavaScript for the decoding.

I use this function for decoding:

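function html_entity_decode(str) {
  // A detached textarea lets the browser's HTML parser decode the entities.
  var ta = document.createElement("textarea");
  // Escape angle brackets so the input cannot be parsed as markup.
  ta.innerHTML = str.replace(/</g, "&lt;").replace(/>/g, "&gt;");
  return ta.value;
}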
A: 

You could run it through the following regex first. Replace

&[\s\r\n]+;(?=#\d+;)

with

&

globally. Your HTML entity format is simply broken: HTML entities cannot contain whitespace or newlines, and they cannot contain semicolons in the middle either.
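As a rough sketch of that replacement in JavaScript, applied to the sample from the question: the helper names `repairSplitEntities` and `decodeNumericEntities` are made up for illustration, and the second one is only a DOM-free stand-in for the textarea-based decoder so the snippet can run outside a browser. It handles decimal numeric entities and nothing else.

```javascript
// Hypothetical helper: rejoin numeric entities that were split as "&<whitespace>;#NNN;".
function repairSplitEntities(str) {
  // \s already covers \r and \n, so [\s\r\n]+ is equivalent to \s+.
  // The lookahead limits the fix to cases followed by "#digits;".
  return str.replace(/&\s+;(?=#\d+;)/g, "&");
}

// Hypothetical stand-in decoder: decimal numeric entities only, no named entities.
function decodeNumericEntities(str) {
  return str.replace(/&#(\d+);/g, function (match, code) {
    return String.fromCodePoint(Number(code));
  });
}

var broken = "wase&\n;#101;m";
var repaired = repairSplitEntities(broken);
console.log(repaired);                        // wase&#101;m
console.log(decodeNumericEntities(repaired)); // waseem
```

In a browser, the repaired string could be passed to the question's `html_entity_decode` instead of the stand-in decoder.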

Tomalak
The question is why is this even necessary?
AnthonyWJones
I have no idea. I just saw the broken input.
Tomalak
A: 

Your input text may simply be wrong, in which case the decoder is working as intended: garbage in, garbage out.

I suspect the &\n; should be something else. But if not:

str.replace(/&\s*;/g, "");
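To see what this does to the sample from the question, here is a small sketch. Note that `replace` returns a new string, so the result has to be assigned to something. The second variant, which keeps the ampersand, is an assumption about what the asker wants (the stated goal is waseem) and is not part of the line above.

```javascript
var input = "wase&\n;#101;m";

// As written above: the whole "&<whitespace>;" run is deleted.
var stripped = input.replace(/&\s*;/g, "");
console.log(stripped); // wase#101;m

// Assumed variant: keep the ampersand so "#101;" becomes an entity again.
var rejoined = input.replace(/&\s*;/g, "&");
console.log(rejoined); // wase&#101;m
```

Only the second result contains a well-formed entity that an HTML decoder can turn into "e".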
cdm9002