I have a QString with some HTML in it... is there an easy way to strip the HTML from it? I basically want just the actual text content.

<i>Test:</i><img src="blah.png" /><br> A test case

Would become:

Test: A test case

I'm curious to know if Qt has a string function or utility for this.

A: 

I won't do the legwork, but if you google "QString" and "regex" and "regex remove html" you'll find what you need. I've done it before.
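For reference, the regex idea can be sketched roughly as below. This is a minimal illustration in standard C++ (std::regex) so that it compiles without Qt; the function name stripTags is made up for this example. In Qt the same pattern could presumably be passed to QString::remove() as a regular expression. It assumes simple, trusted markup: it does not decode entities such as &amp;amp; and it will misbehave on a '>' inside an attribute value, on comments, or on script blocks.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes anything that looks like a tag,
// i.e. a '<', then any run of characters other than '>', then '>'.
// Text between tags is left untouched; entities are not decoded.
std::string stripTags(const std::string& html) {
    static const std::regex tagPattern("<[^>]*>");
    return std::regex_replace(html, tagPattern, "");
}
```

Applied to the string from the question, stripTags should return "Test: A test case", since every tag is replaced by an empty string and the surrounding text is kept as-is.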

San Jacinto
-1 I just wanted to know if there was a Qt-provided method for doing this.
George Edison
@george I think my "legwork" comment came across wrong. It wasn't intended to be rude but rather to give you the recipe for how I accomplished this in the past. At any rate, glad that it looks like you got what you needed.
San Jacinto
+2  A: 

You could iterate through the string with the QXmlStreamReader class and extract all the text (if your HTML string is guaranteed to be well-formed XML).

Something like this:

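#include <QString>
#include <QXmlStreamReader>

QXmlStreamReader xml(htmlString);
QString textString;
while (!xml.atEnd()) {
    // Keep only character data; tags and attributes are skipped.
    if (xml.readNext() == QXmlStreamReader::Characters) {
        textString += xml.text();
    }
}
// xml.hasError() reports whether parsing stopped on malformed input.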

but I'm not sure this is 100% valid usage of the QXmlStreamReader API, since I used it quite a long time ago and may have forgotten something.

VestniK
Thanks. I'm not trying to validate it or extract it. I just want rid of it.
George Edison
Oh, and the text I'm getting *is* from a trusted source (not user input) so I should be fine.
George Edison