How can I decode HTML entities in C++?

For example:

HTML: &quot;Music&quot; &amp; &quot;video&quot;

Decoded: "Music" & "video"

Thanks.

+1  A: 

If you're comfortable with using C-strings, you might be interested in my answer to a similar question.


There's no need to compile the code as C++: compile entities.c with -std=c99 and link the object file with your C++ code, e.g. if you have the following example program foo.cpp

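#include <cstddef>   // size_t
#include <iostream>

// C linkage, so the linker looks for the unmangled symbol from entities.o
extern "C" size_t decode_html_entities_utf8(char *dest, const char *src);

int main()
{
    char line[100];
    std::cout << "Enter encoded line: ";
    std::cin.getline(line, sizeof line);
    decode_html_entities_utf8(line, 0);   // src is 0: line is decoded in place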
    std::cout << line;
    return 0;
}

use

g++ -o foo foo.cpp entities.o
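If you would rather stay in C++ and only need the basics, something like the following might be enough. This is a minimal sketch and not the code from entities.c: it assumes C++11 or later, handles only `&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;` and numeric references, and the names `decode_basic_entities`, `parse_code_point` and `append_utf8` are made up for this example.

```cpp
#include <string>

// Append code point cp to out as UTF-8 (caller guarantees cp <= 0x10FFFF).
static void append_utf8(std::string &out, unsigned long cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Parse the inside of a numeric reference, e.g. "#38" or "#x26".
// Returns false for malformed input, zero, surrogates and out-of-range values.
static bool parse_code_point(const std::string &body, unsigned long &cp)
{
    const bool hex = body.size() > 1 && (body[1] == 'x' || body[1] == 'X');
    std::string::size_type pos = hex ? 2 : 1;
    if (pos >= body.size()) return false;          // no digits at all
    cp = 0;
    for (; pos < body.size(); ++pos) {
        const char c = body[pos];
        unsigned digit;
        if (c >= '0' && c <= '9') digit = c - '0';
        else if (hex && c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (hex && c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return false;
        cp = cp * (hex ? 16 : 10) + digit;
        if (cp > 0x10FFFF) return false;           // beyond Unicode range
    }
    return cp != 0 && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Hypothetical helper: decodes the five predefined entities and numeric
// references; anything it does not recognise is copied through unchanged.
std::string decode_basic_entities(const std::string &in)
{
    struct NamedEntity { const char *name; char ch; };
    static const NamedEntity kNamed[] = {
        {"amp", '&'}, {"lt", '<'}, {"gt", '>'}, {"quot", '"'}, {"apos", '\''}
    };

    std::string out;
    out.reserve(in.size());

    for (std::string::size_type i = 0; i < in.size(); ) {
        // Only look for a terminating ';' when we are sitting on an '&'.
        const std::string::size_type semi =
            (in[i] == '&') ? in.find(';', i + 1) : std::string::npos;
        bool decoded = false;

        if (semi != std::string::npos) {
            const std::string body = in.substr(i + 1, semi - i - 1);
            unsigned long cp = 0;
            if (!body.empty() && body[0] == '#') {
                if (parse_code_point(body, cp)) {
                    append_utf8(out, cp);
                    decoded = true;
                }
            } else {
                for (const NamedEntity &e : kNamed) {
                    if (body == e.name) {
                        out += e.ch;
                        decoded = true;
                        break;
                    }
                }
            }
        }

        if (decoded) i = semi + 1;   // skip past the whole entity
        else out += in[i++];         // plain character or unknown entity
    }
    return out;
}
```

Calling `decode_basic_entities("&quot;Music&quot; &amp; &quot;video&quot;")` gives `"Music" & "video"`. For the full set of named entities you would still want a complete table such as the one in entities.c.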
Christoph
Eduardo
@Eduardo: neither `scanf()` nor my decode function will allocate memory for you; I'll add some example code
Christoph
Thanks, now it works.
Eduardo
Now, I need the code in C++. I can't compile it with g++.
Eduardo
there's no need to compile it as C++: you can link C and C++ object files
Christoph
If I try to do that, I get this:

entdec.o: In function `main':
entdec.cpp:(.text+0x51): undefined reference to `decode_html_entities_utf8(char*, char const*)'
collect2: ld returned 1 exit status
Eduardo
@Eduardo: I changed the example to C++ and added the command to create the executable
Christoph
Thanks, now it works.
Eduardo
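A note on the undefined reference from the comments: the linker message spells out the parameter types, `decode_html_entities_utf8(char*, char const*)`, which suggests it was looking for a C++-mangled name, most likely because the function was declared without `extern "C"`. The sketch below shows the idea with a made-up function, `count_ampersands`, that is not part of entities.c; the definition is a stand-in so the example links on its own.

```cpp
#include <cstddef>

// Declaration as the C++ side sees it. extern "C" makes the compiler refer to
// the plain symbol name `count_ampersands`, which is what a C compiler emits,
// instead of a mangled name that encodes the parameter types.
extern "C" std::size_t count_ampersands(const char *text);

// Stand-in definition so this sketch is self-contained. In a mixed C/C++
// build this part would live in a .c file compiled separately with gcc.
extern "C" std::size_t count_ampersands(const char *text)
{
    std::size_t n = 0;
    for (; *text != '\0'; ++text) {
        if (*text == '&') ++n;
    }
    return n;
}
```

Without the `extern "C"` on the declaration, the C++ compiler would ask the linker for a mangled symbol that a C object file never defines, which produces the kind of error quoted above.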
A: 

Thanks for that quick answer, but I got this error:

$ gcc -c entities.c -std=c99

entities.c: In function ‘parse_entity’:

entities.c:328: error: ‘errno’ undeclared (first use in this function)

entities.c:328: error: (Each undeclared identifier is reported only once

entities.c:328: error: for each function it appears in.)

I solved that by adding int errno; at line 324. Now I'm going to try it.

Eduardo
it seems I actually forgot to include `errno.h`; curiously, I'm not getting any warnings...
Christoph
I did some minor cleanup; please try again: http://mercurial.intuxication.org/hg/cstuff/raw-file/tip/entities.c
Christoph
The problem of errno is solved, thanks.
Eduardo