I'm interested in unescaping HTML text, for example '&#92;' -> '\', in C. Does anyone know of a good library?
By HTML escape I mean all the entity references.
I had some free time today and wrote a decoder from scratch: entities.c, entities.h.
The only function with external linkage is
size_t decode_html_entities_utf8(char * dest, const char * src);
If src is a null pointer, the string will be taken from dest, i.e. the entities will be decoded in place. Otherwise, the decoded string will be put in dest, which should point to a buffer big enough to hold strlen(src) + 1 characters, and src will be unchanged.
The function will return the length of the decoded string.
Please note that I haven't done any extensive testing, so there's a high probability of bugs...
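To make the calling convention concrete, here is a minimal sketch with the same shape as the function described above. It is not the code from entities.c: the names decode_entities_sketch, parse_entity and put_utf8 are made up for illustration, and it only handles numeric references plus five named entities, whereas a full decoder needs the complete entity table. In-place decoding is safe here because every reference it accepts is at least as long as its UTF-8 output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encode code point cp as UTF-8, return the byte count. */
static size_t put_utf8(unsigned long cp, char *out)
{
    if (cp < 0x80) { out[0] = (char)cp; return 1; }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Hypothetical helper: s points at '&'. On success, store the decoded bytes
 * in buf, their count in *n, and return the length of the reference.
 * Return 0 if s does not start a reference this sketch understands. */
static size_t parse_entity(const char *s, char *buf, size_t *n)
{
    /* Deliberately tiny table; the real list of named entities is far longer. */
    static const struct { const char *name; char ch; } named[] = {
        { "&amp;", '&' }, { "&lt;", '<' }, { "&gt;", '>' },
        { "&quot;", '"' }, { "&apos;", '\'' }
    };
    for (size_t i = 0; i < sizeof named / sizeof named[0]; i++) {
        size_t len = strlen(named[i].name);
        if (strncmp(s, named[i].name, len) == 0) {
            buf[0] = named[i].ch;
            *n = 1;
            return len;
        }
    }
    if (s[1] != '#')
        return 0;

    int hex = (s[2] == 'x' || s[2] == 'X');
    const char *start = s + 2 + hex;
    const char *p = start;
    unsigned long cp = 0;
    for (;; p++) {
        int d;
        if (*p >= '0' && *p <= '9') d = *p - '0';
        else if (hex && *p >= 'a' && *p <= 'f') d = *p - 'a' + 10;
        else if (hex && *p >= 'A' && *p <= 'F') d = *p - 'A' + 10;
        else break;
        cp = cp * (hex ? 16 : 10) + (unsigned long)d;
        if (cp > 0x10FFFF) return 0;            /* outside Unicode */
    }
    /* Require at least one digit and a closing ';'; skip NUL and surrogates. */
    if (p == start || *p != ';' || cp == 0 || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    *n = put_utf8(cp, buf);
    return (size_t)(p - s) + 1;
}

/* Decode src into dest, or decode dest in place when src is NULL.
 * Returns the length of the decoded string. */
size_t decode_entities_sketch(char *dest, const char *src)
{
    assert(dest != NULL);
    if (src == NULL)
        src = dest;                             /* in-place mode */
    char *out = dest;
    while (*src != '\0') {
        char buf[4];
        size_t n = 0, used = 0;
        if (*src == '&')
            used = parse_entity(src, buf, &n);
        if (used != 0) {
            memcpy(out, buf, n);                /* n <= used, so out never passes src */
            out += n;
            src += used;
        } else {
            *out++ = *src++;                    /* copy everything else unchanged */
        }
    }
    *out = '\0';
    return (size_t)(out - dest);
}
```

Calling decode_entities_sketch(buf, "a &amp; b") writes "a & b" into buf and returns 5, while decode_entities_sketch(buf, NULL) rewrites the contents of buf in place. Unrecognised references such as "&bogus;" are copied through unchanged.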