htmltidy

xslt: remove duplicate xml header

HTML Tidy gives this as output for some reason: <?xml version="1.0" encoding="utf-16"?> <?xml version="1.0" encoding="utf-16"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta name="generator" content= "HTML ...

How to keep my non-breaking spaces with HTML Tidy (PHP)?

I've just noticed that tidy_repair_string() is removing my non-breaking spaces from empty elements, causing my table to collapse. Basically I've put in: <td>&nbsp;</td> and HTML Tidy is reducing it to: <td> </td> which may or may not contain some Unicode space character, but either way the cell is collapsing. The only &nbsp; related option I'v...

HTML Tidy, Don't move those comments!

I was working with html-tidy and a couple of my comments were moved from the head to the root of the document. Is there any way to avoid this behavior? (I'm trying to turn some really, really bad markup into XHTML-compliant code.) Oh, and additionally it uses an in-house developed semi-server-side scripting language that uses comments to pl...

Using libtidy for iPhone app

I'm trying to use libtidy for an iPhone app (since the iPhone 2.2 SDK doesn't include NSXMLDocument which has tidy functionality) but I get a linker error saying "library not found for -ltidy" when I build the app. As for other framework/library references, I've added the libtidy.dylib to my list of referenced frameworks and I've added ...

Installing Html Tidy

I'm running Mac OS X with Apache/2.0.59 (Unix) PHP/5.2.5 DAV/2. I've never administered Apache or PHP before, so some things aren't really that obvious to me. I'm trying to get PHP Tidy to run as mentioned here: http://th.php.net/manual/en/tidy.installation.php It says: "In PHP 5 you need only to compile using the --with-tidy opt...

Screenscraping the ugliest HTML you've ever seen in your life

I'm using PHP and libtidy to attempt to screen scrape what might possibly be the most horrendous and malformed use of HTML tables in history. The site closes few table, tr, td, font, or bold tags and consistently nests many different layers of tables within tables. Example snippet: <center> <table border="1" bordercolor="#000000" cells...

How do I set vim 7's html tidy options for "utf-8" encoding?

I need to run tidy in vim, using the usual: :compile tidy :make However, if I'm on a UTF-8 file, I get errors that I don't see if I run tidy outside of vim; i.e., with tidy -e -q -utf8 I get what I'm expecting. TIA! ...

Notepad++ HTML Tidy

Is HTML Tidy for Notepad++ broken? None of the commands except Tidy (the first one) work. They don't show any message, even with all text selected. I really need Tidy to work, or is it just a limitation of the newest version of N++, or lack of support? Also, the custom syntax dialog freezes whenever I select a color from the color dialo...

How do I stop HTML Tidy from removing my closing tag?

Currently HTML Tidy is collapsing any empty HTML element into a single self-closing tag, for example: <script src="somescript.js"></script> Turns into: <script src="somescript.js" /> This is a problem because including JavaScript files in the "head" of my HTML is now not working in some browsers that explicitly need this closing tag to be se...

Strange problem calling a .DLL from C#

Hey all, I'm trying to call the HtmlTidy library DLL from C#. There are a few examples floating around on the net but nothing definitive... and I'm having no end of trouble. I'm pretty certain the problem is with the P/Invoke declaration... but danged if I know where I'm going wrong. I got the libtidy.dll from http://www.paehl.com/open_s...

Configure HTML Tidy to ignore PHP short start and end tags when inside html attributes

How can I keep HTML Tidy from converting PHP short tags when used as values in html attributes? Here's an example of what it currently does. It converts this: <input value='<?=$variable?>'> to this: <input value='&lt;?=$variable?&gt;'> I want HTML Tidy to ignore PHP short tags. Any config options that change this? == To simplify...

How to use the HTML Tidy .Net DLL wrapper in PowerShell?

I'm trying to use the HTML Tidy .Net wrapper in PowerShell 2.0. Here is a working example using C# (TestIt.cs included in the wrapper distribution): using Tidy; Document tdoc = new Document(); I'm doing this in PowerShell: [Reflection.Assembly]::LoadFile("C:\Users\e-t172\Desktop\Tidy.NET\Tidy.dll") New-Object Tidy.Document I get t...

HTML Tidy license question

I'm looking to use the HTML Tidy source code and modify it to add a few more features. Having read the license, I'm not quite sure whether I will have to release the source of my modified application. I don't want to. ...

HTML Tidy and Rails apps - erb or rhtml files. Ruby alternative?

Hi. I just installed the HTML Tidy plugin for Eclipse. I added the html.erb file type and now it will do its magic on my erb files. However, it puts in the title tag and changes a lot of my characters to escape characters. How can I stop this from happening - or is there a Ruby alternative which will go through my code, reindent, and stick in...

How to fix noncompliant HTML so Expat will parse it (htmltidy not working)

I'm trying to scrape information from http://www.nfl.com/scores (in particular, find out when a game is over so my computer can stop recording it). I can download HTML easily enough, and it makes this claim about compliance with standards: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/D...

.NET version of HTML Tidy?

Does anyone know if there's a native port of HTML Tidy available for .NET? On SourceForge, there's a TidyNet project, which hasn't been updated since 2005 and seems to be a wrapper only. A Java port seems to exist as the recent JTidy project. HTML Tidy project page: http://tidy.sourceforge.net/ ...

Using Tidy to clean HTML, HTML content is being changed, encoding problem?

I am fetching HTML from a smarty template and need to clean it (simply want to remove extra whitespace, and format / indent the HTML nicely), I'm using tidy to do something like: $html = $smarty->fetch('foo.tmpl'); $tidy = new tidy; $tidy->parseString($html, array( 'hide-comments' => TRUE, 'output-xhtml' => TRUE, 'indent'...

Tidy.NET for rendering ASP.NET pages (webforms)

I'm using tidy.NET for correcting (indenting) ASP.NET pages, but it creates two html tags, one with the viewstate and the other with the rest of the page. Could I change this to have only one html tag? Thanks in advance, ...

Feed rendered jsp pages through htmltidy

I have a Java project running on Glassfish that renders some ugly-looking HTML. It's a side effect of using various internal and external JSP libraries. I would like to set up some sort of post-render filter that would feed the final HTML through HTMLTidy so that the source is nice and neat to aid debugging. Is this possible? Is there ...

Is there an alternative to HTML Tidy?

Hi, I have embedded HTML Tidy in my application to clean incoming HTML. But Tidy has a huge number of bugs, and fixing them directly in the source is my worst nightmare. The Tidy source code is an unreadable abomination: thousand-plus-line functions, poor variable naming, spaghetti code, etc. It's truly horrible. Worse yet, official development ...