I downloaded the latest version (TidyPas_Delphi2010.zip) from the official homepage (http://sourceforge.net/projects/curlpas/files/).

But to my surprise, the unit is full of AnsiString declarations instead of string (UnicodeString).

Does anybody use this? Is there no Unicode version?

Thanks

+3  A: 

TidyPas is just a wrapper around the HTML Tidy library API. It does not provide a UnicodeString facade over that API; it exposes the API as-is.

As far as I can tell from the docs, HTML Tidy itself only supports a limited range of character sets, but these do include the UTF-8 encoding of Unicode, which, with a bit of care, I think should work with the AnsiString and AnsiChar types used by the API.

Any further inquiries about Unicode support in HTML Tidy other than with UTF8 would probably be best directed at the HTML Tidy community itself. It doesn't seem to have been updated for a while though (since 2008).

Deltics
Thanks. It seems the only choice is to modify TidyPas to make it Unicode, right?
Paul
No, Paul, that's not the only choice. Another choice is to encode your HTML as UTF-8 (which is a wise thing to do anyway) and pass that to HTML Tidy as an AnsiString. Delphi's UTF8String type is already an AnsiString type, so there should be no problem with that at all.
Rob Kennedy
@Rob, I guess there might be some implicit conversion when passing a UTF8String to an AnsiString parameter in newer Delphi versions. I will test it. Thank you.
Paul
I tested it in D2010, and a conversion does take place when passing a UTF8String to an AnsiString parameter. It doesn't happen if the parameter is declared as RawByteString.
Paul
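
For anyone who wants to reproduce the behaviour described in the last comment, here is a minimal sketch, assuming Delphi 2009 or later. TakesAnsi and TakesRaw are hypothetical stand-ins for a wrapper routine and are not part of TidyPas. Each prints the code page and byte length of the string it actually receives, so a conversion at the call site should show up as a change in the reported code page.

```delphi
program Utf8ParamDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical stand-in for a routine declared with an AnsiString parameter.
procedure TakesAnsi(const S: AnsiString);
begin
  Writeln('AnsiString param:    code page ', StringCodePage(S),
    ', ', Length(S), ' bytes');
end;

// Hypothetical stand-in for the same routine declared with RawByteString.
procedure TakesRaw(const S: RawByteString);
begin
  Writeln('RawByteString param: code page ', StringCodePage(S),
    ', ', Length(S), ' bytes');
end;

var
  Html: string;      // UnicodeString in Delphi 2009 and later
  Utf8: UTF8String;  // AnsiString tagged with code page 65001
begin
  Html := '<p>caf'#$00E9'</p>';  // contains one non-ASCII character
  Utf8 := UTF8String(Html);      // explicit UTF-16 to UTF-8 conversion

  TakesAnsi(Utf8);  // the compiler may convert to the default ANSI code page here
  TakesRaw(Utf8);   // the UTF-8 bytes are passed through unchanged
end.
```

If the wrapper's declarations cannot be changed, casting the UTF8String to PAnsiChar at the call site is another way to hand over the raw UTF-8 bytes, provided the routine accepts a PAnsiChar.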